Number of individuals per visit and omic type

Natural History Samples Catalogue

Definitions:

ND: Newly Diagnosed Participants. All ND participants had a baseline assessment within 6 weeks from diagnosis of T1D (based on the ADA criteria, defined as the time at which insulin therapy was started).

UFM: Unaffected Family Members. Participants who have a first-degree relative with T1D and have tested positive for islet autoantibodies (IAb+).

ND Participants

  • Visit 1: Baseline, < 6 weeks from diagnosis
  • Visit 2: 3 months
  • Visit 3: 6 months
  • Visit 4: 9 months
  • Visit 5: 12 months

UFM Participants

  • Visit 1: Baseline, within 3 months after IAb result
  • Visit 2: 6 months
  • Visit 3: 12 months
  • Visit 4: 18 months
  • Visit 5: 24 months
  • Visit 6: 36 months
  • Visit 7: 48 months

Summary of Available Data & Samples

Description of clinical variables, types of biological samples, and omics data collected.

Demographic and clinical data were collected as part of the INNODIA project. The eCRF (electronic case report form) comprises the following variables:

  • Weight
  • Height
  • BMI
  • BMI SDS
  • Age
  • Ethnicity
  • Country
  • Date of visit
  • Age at visit
  • Weeks from diagnosis (only for ND)
  • Sample date
  • Autoantibodies
  • Glucose reading
  • HbA1C values
  • Fasting C-peptide
  • Fasting glucose
  • Fasting C-peptide/Glucose ratio
  • Insulin average daily dose
  • Insulin dose/kg
  • MMTT and OGTT metadata (compliance with the MMTT/OGTT protocol)
  • MMTT and OGTT C-peptide and glucose values
  • MMTT AUC C-peptide
  • MMTT AUC Glucose
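Several of the variables above are derived from others in the list (BMI, insulin dose/kg, the fasting C-peptide/glucose ratio, and the AUC values). The following is a minimal sketch of how such quantities are commonly computed, not a description of the INNODIA pipeline. The function names, the trapezoidal rule for the AUC, the time points, and the example values are all assumptions for illustration; the catalogue does not state the units, scaling, or integration method actually used.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2 (standard definition)."""
    return weight_kg / height_m ** 2


def insulin_dose_per_kg(daily_dose_units, weight_kg):
    """Average daily insulin dose divided by body weight."""
    return daily_dose_units / weight_kg


def cpeptide_glucose_ratio(fasting_cpeptide, fasting_glucose):
    """Plain ratio; any unit scaling applied in the real dataset is unknown."""
    return fasting_cpeptide / fasting_glucose


def auc_trapezoid(times_min, values):
    """Area under the curve by the trapezoidal rule (assumed method)."""
    if len(times_min) != len(values) or len(times_min) < 2:
        raise ValueError("need at least two matching time/value pairs")
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        area += dt * (values[i] + values[i - 1]) / 2
    return area


# Made-up C-peptide readings at hypothetical sampling times (minutes).
example_times = [0, 30, 60, 90, 120]
example_cpeptide = [0.2, 0.5, 0.8, 0.6, 0.4]
example_auc = auc_trapezoid(example_times, example_cpeptide)
```

When comparing against the AUC columns in the dataset, check the time points and units in the MMTT/OGTT metadata first, since a different sampling schedule or a baseline-subtracted AUC would give different numbers.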

The available samples are listed below with their collection tubes and aliquot volumes:

  • Serum: FluidX, 0.5 ml/aliquot
  • EDTA Plasma: FluidX, 0.2 ml/aliquot
  • DNA: FluidX (from EDTA plasma), 0.5 ml/aliquot
  • Lithium-heparin Plasma: FluidX, 0.3 ml/aliquot
  • Urine: FluidX, 1 ml/aliquot
  • Whole Blood: PAXgene, 10 ml/aliquot
  • Stool: OMNIgene-GUT, 10 ml/aliquot

All sample types follow INNODIA SOPs. The available data types are summarised below:

  • Immunomic data: Flow cytometry data as raw FCS files. Extracted cell population counts based on manual gating are available as CSV or Excel files. Gating strategy information is saved as pictures embedded in PDF files.
  • Genotyping data: Raw genotyping data in PLINK or VCF format and HLA data.
  • Lipidomic data: The plasma lipidomics data contain a total of 403 molecular lipids from major lipid classes such as glycerolipids, phospholipids and sphingolipids. The samples were analysed at Steno Diabetes Center Copenhagen with ultra-high-performance liquid chromatography coupled to quadrupole time-of-flight mass spectrometry (UHPLC-QToF-MS) in two complementary analyses using the positive and negative ion modes (detecting 260 and 143 lipids, respectively).
  • Metabolomic data: The metabolomics data contain 81 metabolites from major metabolite classes such as amino acids, free fatty acids and molecules in energy metabolism. The samples were analysed at Steno Diabetes Center Copenhagen with two-dimensional gas chromatography coupled to time-of-flight mass spectrometry (GCxGC-ToF-MS).
  • Metagenomic data: Stool samples were collected from a subset of individuals participating in the INNODIA Natural History Study, including 98 people newly diagnosed with type 1 diabetes and 198 autoantibody-positive family members. Samples were collected using tubes that preserve DNA at room temperature and then stored at -80°C. Bulk microbial DNA (that is, all DNA from bacteria and other microbes in the sample) was extracted and assessed for quality. The DNA was sequenced using Illumina technology, which generates millions of short DNA reads. These sequences were compared to a large reference database to identify which microbial species were present and what functions they might perform. Both the raw sequencing reads and the processed microbiome profiles are available for downstream analysis.
  • Proteomic data: The proteomics data for INNODIA serum samples were generated at the University of Turku using targeted mass spectrometry. The dataset includes analyses from two consecutive groups of individuals newly diagnosed (ND) with type 1 diabetes: "the first 100" and "the next 150". Additionally, there are data from 460 unaffected first-degree relatives (UFM). The data from the ND individuals include the available longitudinal samples collected within 6 weeks of diagnosis and then at 3, 6 and 12 months; the UFM data consist of a single sample from each individual. There are also data from three QC samples that were measured periodically. For both selected groups, liquid chromatography (LC) coupled with mass spectrometry (MS) was used for selected reaction monitoring (SRM) analysis. With this approach, measurements were confined to pre-selected targets. As part of a follow-up validation study, the "next 150" measurements were conducted using a different, faster LC system and focused on fewer protein targets (70 out of 105, plus 7 additional). In total, data were recorded for 250 peptides, with 130 peptides common to both datasets.
  • Transcriptomic data: The samples analysed were from "the first 100" and "the next 150" newly diagnosed (ND) INNODIA cohorts. The analysis was carried out at the University of Turku, Finland. The first 100 ND cohort included 94 patients. Whole blood PAXgene samples were collected at visit 1, within 6 weeks of diagnosis (baseline), and visit 4 (at 12 months after diagnosis), with 46 patients having samples at both time points. The next 150 ND cohort included 155 patients with samples collected at baseline and 12 months after diagnosis. Additionally, the analysis included four whole blood PAXgene INNODIA QC samples, collected from two anonymous donors as per INNODIA SOP. For both cohort analyses, total RNA, including small RNA fractions, was purified using the PAXgene Blood miRNA Kit (PreAnalytix/QIAGEN, Cat# 763134) following the protocol supplied by the kit manufacturer. Library preparation and sequencing were carried out at the Finnish Functional Genomics Centre (https://bioscience.fi/functional-genomics/services/). Before starting library preparation, ERCC Spike-in control Mix 1 (Invitrogen P/N 4456739) was added to 100 ng RNA according to the kit's protocol. RNA-seq libraries were prepared using the TruSeq stranded mRNA HT kit and protocol # 15031047 (Illumina). Pooled libraries were sequenced on an Illumina NovaSeq 6000 instrument, using 2 × 50 bp (first 100 cohort) or 2 × 100 bp (next 150 cohort) paired-end sequencing with about 30 million single-end reads per sample.
  • mi/smallRNA data: The study design involved the analysis of two cohorts of individuals with type 1 diabetes (T1DM): an initial screening cohort, the INNODIA first cohort, consisting of n=115 T1DM individuals, and a validation cohort, the INNODIA second cohort, consisting of n=147 T1DM individuals. All subjects were followed up with scheduled visits at 3 (visit 2), 6 (visit 3) and 12 months (visit 4) after clinical diagnosis of T1DM. In both cohorts, blood samples were collected to isolate EDTA plasma, which was analysed at baseline (visit 1). The collected blood samples were processed within 2 hours of the blood draw and centrifuged to separate plasma from contaminant cells and platelets. The plasma samples were then aliquoted (200 μL) and stored at -80°C in a centralized biobank (see SOP plasma microRNAs version 5). For the INNODIA first cohort, the plasma samples were subjected to miRNA profiling using two different sequencing platforms: (A) HTG-miRNA EdgeSeq on the Illumina NextSeq550 platform (High Output kit v2 cat. FC-404-2005) and (B) Small RNA-seq using the QIAseq miRNA Library Kit on the Illumina NovaSeq 6000 platform [NovaSeq 6000 SP Reagent Kit (100 cycles) cat. 20027464, NovaSeq XP 2-Lane Kit cat. 20021664, Illumina] using the XP protocol with 75x1 single reads. For the INNODIA second cohort, the plasma samples were analysed exclusively by Small RNA-seq using the QIAseq miRNA Library Kit and Illumina NovaSeq 6000 sequencing. HTG-miRNA EdgeSeq is a targeted RNase protection-based assay designed to detect a total of 2083 miRNAs (miRBase v.21). Small RNA-seq using the QIAseq miRNA Library Kit allows the unbiased detection of virtually all small RNAs (< 50 nt) present in the plasma sample. In both the INNODIA first and second cohorts, a subset of samples was included as duplicates. Data are provided as FASTQ files for both Small RNA-seq and HTG-miRNA EdgeSeq.
  • CGM data: Continuous glucose monitoring data will be hosted in the INNODIA database. Additionally, CSV files with custom versions of the database extracts will be available.
  • C-peptide data: This will include plasma C-peptide (fasting and serial C-peptide during MMTT and OGTT) as well as dried blood spot (DBS) C-peptide. The Core Biochemical Assay Laboratory (CBAL) in Cambridge was the core laboratory for all plasma C-peptide measurements as well as for DBS C-peptide analyses.
    • Plasma C-peptide: Fasting plasma C-peptide and serial C-peptide samples taken during MMTT/OGTT were assayed in singleton on a DiaSorin Liaison XL automated immunoassay analyser using a sandwich chemiluminescence immunoassay (DiaSorin S.p.A, 13040 Saluggia [VC], Italy).
    • DBS C-peptide: DBS C-peptide was analysed using an in-house assay based on the Meso Scale Discovery (MSD) electrochemical immunoassay technology. Four 3.2 mm dried blood spot discs from DBS quality controls and unknowns were eluted in assay buffer overnight at +2–8°C with shaking and brought to room temperature before proceeding with the immunoassay. Commercial liquid calibrator, serum quality controls, and the eluted DBS quality controls and unknowns were added in duplicate to an MSD standard bind plate coated with a mouse monoclonal anti-C-peptide capture antibody. After incubation at room temperature and washing, a biotinylated mouse monoclonal anti-C-peptide detector antibody was added. After a second incubation and wash, streptavidin SULFO-TAG was added. After a third incubation and wash, MSD Read Buffer T (diluted 1:2) was added and the plate was read using the MSD s600 reader. The C-peptide concentration was calculated using MSD Discovery Workbench software. For DBS samples, this measured concentration was converted to a plasma equivalent using an in-house derived factor.
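The final DBS step, converting a measured concentration to a plasma equivalent, can be sketched as below. This is only an illustration under stated assumptions: the catalogue does not give the value of the in-house factor, does not say that the conversion is a simple multiplication, and does not say how the duplicate wells were combined. The function name, the averaging of duplicates, and the numbers in the example are all hypothetical.

```python
from statistics import mean


def dbs_plasma_equivalent(duplicate_readings, conversion_factor):
    """Convert duplicate DBS readings to a plasma-equivalent value.

    Assumptions (not confirmed by the source): duplicates are averaged,
    and the in-house factor is applied as a multiplier.
    """
    if not duplicate_readings:
        raise ValueError("at least one reading is required")
    return mean(duplicate_readings) * conversion_factor


# Placeholder numbers only; the real factor is not published in this catalogue.
example_value = dbs_plasma_equivalent([1.0, 3.0], conversion_factor=2.0)
```

The DBS C-peptide values in the dataset should be used as provided; the sketch only shows the shape of the calculation described above.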

Help & Frequently Asked Questions

🧭 General Navigation

  • Data Explorer: Explore clinical variables and HLA genotyping for ND and UFM participants. Use filters to focus on age, ethnicity, country, and longitudinal variables. Note: the numbers displayed correspond to complete cases across all visits.
  • Data Explorer/Omics: Once you select clinical variables for ND or UFM participants, you will see, for each omic type, the data points that have omics data available. Note: the numbers displayed correspond to complete cases across all visits.
  • Samples Explorer: Browse biological samples (Stool, EDTA, DNA, etc.) by Sample Type, Branch, Visit, and number of aliquots. Includes barplots and interactive tables.
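The "complete cases" note in the Data Explorer entries above can be illustrated with a short sketch. It assumes that a complete case means a participant with a non-missing value for the chosen variable at every selected visit; the catalogue does not define the term more precisely. The record layout, field names, and participant IDs are hypothetical.

```python
def complete_cases(records, variable, visits):
    """Return sorted IDs of participants with a value at every listed visit."""
    visits_seen = {}
    for rec in records:
        if rec["variable"] == variable and rec["value"] is not None:
            visits_seen.setdefault(rec["participant"], set()).add(rec["visit"])
    required = set(visits)
    return sorted(pid for pid, seen in visits_seen.items() if required <= seen)


# Made-up records: P1 is complete, P2 lacks visit 3, P3 has a missing value.
example_records = [
    {"participant": "P1", "visit": 1, "variable": "HbA1c", "value": 58},
    {"participant": "P1", "visit": 2, "variable": "HbA1c", "value": 52},
    {"participant": "P1", "visit": 3, "variable": "HbA1c", "value": 50},
    {"participant": "P2", "visit": 1, "variable": "HbA1c", "value": 61},
    {"participant": "P2", "visit": 2, "variable": "HbA1c", "value": 55},
    {"participant": "P3", "visit": 1, "variable": "HbA1c", "value": 49},
    {"participant": "P3", "visit": 2, "variable": "HbA1c", "value": None},
    {"participant": "P3", "visit": 3, "variable": "HbA1c", "value": 47},
]
example_complete = complete_cases(example_records, "HbA1c", [1, 2, 3])
```

Under this reading, the counts shown in the explorer can be smaller than the number of participants with data at any single visit.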

📊 Filters & Interactions

  • Filters are applied after clicking "Apply Filters".
  • Tables and plots update automatically after filters are applied.
  • Dynamic pickers: available values (e.g. Visit) depend on previous selections (e.g. Sample Type, Branch).
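The dependency between pickers can be sketched as follows: the options offered for one field are the values that remain after the earlier selections have been applied. This is an illustrative sketch, not the application's code; the function name, field names, and inventory rows are hypothetical.

```python
def available_values(rows, field, selections):
    """Values of `field` still present after applying earlier selections.

    `selections` maps a field name to the set of values chosen for it.
    """
    return sorted(
        {
            row[field]
            for row in rows
            if all(row[key] in allowed for key, allowed in selections.items())
        }
    )


# Made-up inventory rows.
example_rows = [
    {"sample_type": "Stool", "branch": "ND", "visit": 1},
    {"sample_type": "Stool", "branch": "ND", "visit": 5},
    {"sample_type": "Stool", "branch": "UFM", "visit": 1},
    {"sample_type": "Serum", "branch": "ND", "visit": 1},
    {"sample_type": "Serum", "branch": "ND", "visit": 2},
    {"sample_type": "Serum", "branch": "UFM", "visit": 3},
]

branches_for_stool = available_values(
    example_rows, "branch", {"sample_type": {"Stool"}}
)
visits_for_stool_nd = available_values(
    example_rows, "visit", {"sample_type": {"Stool"}, "branch": {"ND"}}
)
```

In this example, choosing Stool and then ND leaves only visits 1 and 5 in the Visit picker, which is why a value can disappear from a picker after an earlier selection changes.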

🧬 HLA Genotyping

  • HLA results are grouped into Class I, Class II, and individual genes.
  • You can select one or more genes to view the number of patients with available data.

🧪 Sample Types & Collection

📁 Downloading Data

  • Filtered tables (in Samples Explorer) can be downloaded as CSV using the download button.
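Once a filtered table has been downloaded, it can be summarised with a few lines of Python. The sketch below assumes hypothetical column names (`sample_type`, `branch`, `visit`, `aliquots`); the actual headers in the exported CSV may differ, so adjust them to match the file.

```python
import csv
import io


def aliquots_by_sample_type(csv_file):
    """Sum the (assumed) `aliquots` column per (assumed) `sample_type`."""
    totals = {}
    for row in csv.DictReader(csv_file):
        sample_type = row["sample_type"]
        totals[sample_type] = totals.get(sample_type, 0) + int(row["aliquots"])
    return totals


# Stand-in for a downloaded file; with a real export, use
# open("your_download.csv", newline="") instead of io.StringIO.
example_csv = io.StringIO(
    "sample_type,branch,visit,aliquots\n"
    "Serum,ND,1,4\n"
    "Serum,ND,2,3\n"
    "Stool,UFM,1,1\n"
)
example_totals = aliquots_by_sample_type(example_csv)
```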

🧩 Need Help?

  • If you see no results, try adjusting filters or clearing them.
  • Still stuck? Contact the INNODIA data team at data@innodia.org.