-
GDI Synthetic Genomics Dataset
Synthetic genomic dataset for testing the GDI starter kit. Contains simulated VCF data with allele frequency information for colorectal cancer cases. -
CRG_AFBeacon_GoE_synthetic_dataset
This is a subset of the GDI MS8 synthetic dataset (COVID pop13_1) containing aggregated allele frequency (AF) statistics for variants on chr21 across 42,312 samples. It has been... -
GDI synthetic dataset (Population 11 Finland, Subgroup 2)
This dataset contains the pheno-clinical and genomic information of 42046 individuals from COVID Population 11 Finland, Subgroup 2. 2010 are affected by Phenotype 1, 2010 are... -
Exome Sequencing
All variants detected by whole exome sequencing of 2628 Dutch healthy elderly individuals -
Genome of Europe Luxembourgish dummy dataset
Synthetic data used to demonstrate GDI capabilities -
CRG AFBeacon GoE synthetic dataset
CRG_AFBeacon_GoE_synthetic_dataset -
GoE Small sample data
Slovenian small size dataset with real GoE data -
1+MG COVID dataset
The 1+MG COVID dataset is located at CSC's Allas service and it is available via Dylan Spalding (dylan.spalding@csc.fi) by request for the GDI project. In the future, the... -
Finnish Legacy Dataset from THL
Finnish Dataset produced from legacy data held at THL -
Estonian dataset, 578 samples.
Dataset containing 578 of 579 Estonian samples after the AF_bcftools pipeline -
Estonian dataset, 578 samples, pgx_pilot AF
Dataset containing 578 of 579 Estonian samples after the pgx_pilot AF pipeline -
Tiny GoE synthetic data
Example GoE synthetic data from https://raw.githubusercontent.com/GenomicDataInfrastructure/starter-kit-synthetic-... -
Czech PILOT Aggregated Allelic Frequencies
Czech PILOT Aggregated Allelic Frequencies -
CINECA Synthetic Cohort EUROPE UK1 referencing fake samples
Please note: This synthetic data set (with cohort “participants” / “subjects” marked with FAKE) has no identifiable data and cannot be used to make any inference about cohort... -
SweGen Aggregated Allele Frequency Dataset
Test dataset of aggregated allele frequencies representing parts of genomes from healthy Swedish individuals included in the SweGen project. The aggregated whole-genome dataset... -
1+MG COVID synthetic dataset
The 1+MG COVID dataset is located at CSC's Allas service and it is available via Dylan Spalding (dylan.spalding@csc.fi) by request for the GDI project. In the future, the... -
Genome of Europe Latvian Genome Reference Project (LGRP) Pharmacogenomics...
Aggregated allele frequencies from the 514 individuals representing the Latvian population by sex and country of birth at Pharmacogenetic (PGx) sites. -
Genome of Europe Norwegian Synthetic Dataset
SYNTHETIC This dataset contains aggregated allele frequencies of simulated normal variations in chr21 of XXX people in national population, originally generated in the GDI... -
GDI PT Pop12 sub1 (ITA)
Synthetic dataset containing CSV and VCF files -
GDI PT INSA Beacon dataset 1
GDI Beacon dataset from INSA