GeMSTONE


File Upload

Click to upload VCF file...* Download Sample VCF
A .vcf or .vcf.gz file is required (maximum size: 500 MB).
Make sure the file follows Variant Call Format (VCF) v4.0, v4.1, or v4.2.

Click to upload supplementary VCF control... Download Sample Control VCF File
Optional VCF containing variants to be excluded from analysis.
Must be a .vcf file, a .vcf.gz file, or a formatted text file listing variants to ignore.

The text file should be a tab-delimited .txt file with 4 columns:

Chromosome Number/Identifier
Position
Reference Allele
Alternate Allele

Note: Please make sure your chromosome identifiers are in [1, 2, 3, 4] format rather than [chr1, chr2, chr3] format.
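As a rough illustration of the text-file format described above, the sketch below reads a 4-column, tab-delimited variant list and strips any "chr" prefix from the chromosome identifier. The function names (normalize_chrom, read_exclusion_variants) are hypothetical and not part of GeMSTONE; this is only one way to prepare such a file before upload.

```python
import csv
import io


def normalize_chrom(chrom: str) -> str:
    """Strip a leading 'chr' prefix so that 'chr1' becomes '1'."""
    chrom = chrom.strip()
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def read_exclusion_variants(handle):
    """Read tab-delimited rows into (chrom, pos, ref, alt) tuples.

    Assumes exactly 4 columns per non-blank line and no header row.
    """
    variants = []
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) != 4:
            raise ValueError(f"line {line_no}: expected 4 columns, got {len(row)}")
        chrom, pos, ref, alt = (cell.strip() for cell in row)
        variants.append((normalize_chrom(chrom), int(pos), ref, alt))
    return variants


# Made-up example rows, one with a 'chr' prefix and one without.
example = io.StringIO("chr1\t12345\tA\tG\n2\t67890\tCT\tC\n")
variants = read_exclusion_variants(example)
print(variants)
```

Writing the normalized tuples back out with tabs between the fields gives a file in the expected layout.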


Click to upload pedigree file... Download Sample PED

The PED file is tab delimited with 6 mandatory columns:

Family ID
Individual ID
Paternal ID (0=unknown)
Maternal ID (0=unknown)
Sex (1=male; 2=female; other=unknown)
Phenotype (1=unaffected; 2=affected, other=unknown)

(NO header is expected)

IDs are alphanumeric. Individuals in the same family should have the same family ID. The individual ID should

uniquely identify a person regardless of family ID, and
match one of the sample IDs in the VCF file(s) referring to the same person.

(!) Any sample IDs found in the VCF file but NOT in the pedigree will be treated as controls.
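The PED rules above can be checked locally before upload. The following is a minimal sketch, assuming the six mandatory columns come first on each line and any further columns are ignored at this stage; PedRecord, parse_ped, and split_samples are hypothetical names, and the sample IDs are made up.

```python
from typing import NamedTuple


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    paternal_id: str   # "0" means unknown
    maternal_id: str   # "0" means unknown
    sex: str           # "1" male, "2" female, anything else unknown
    phenotype: str     # "1" unaffected, "2" affected, anything else unknown


def parse_ped(text: str) -> dict:
    """Parse headerless, tab-delimited PED text into {individual_id: PedRecord}."""
    records = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 6:
            raise ValueError(f"line {line_no}: expected at least 6 columns")
        record = PedRecord(*fields[:6])
        if record.individual_id in records:
            # Individual IDs must be unique regardless of family ID.
            raise ValueError(f"line {line_no}: duplicate ID {record.individual_id}")
        records[record.individual_id] = record
    return records


def split_samples(vcf_sample_ids, records):
    """Separate VCF samples into pedigree members and controls."""
    in_pedigree = [s for s in vcf_sample_ids if s in records]
    controls = [s for s in vcf_sample_ids if s not in records]
    return in_pedigree, controls


ped_text = (
    "FAM1\tchild\tfather\tmother\t1\t2\n"
    "FAM1\tfather\t0\t0\t1\t1\n"
    "FAM1\tmother\t0\t0\t2\t1\n"
)
records = parse_ped(ped_text)
in_pedigree, controls = split_samples(["child", "father", "mother", "ctrl01"], records)
print(in_pedigree, controls)
```

Here "ctrl01" appears in the VCF sample list but not in the pedigree, so it ends up in the control list.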

An optional seventh column can specify the ethnicity of the person. This column determines which population-specific allele frequency is used when filtering for variants that are rare in that person's ethnic group. An ethnicity identifier consists of a frequency database name and a population code joined by "_". For example, exac_AMR refers to the Latino population in the ExAC database, and exac_ALL refers to the overall frequency in the ExAC database.

The ethnicity identifiers available in our analysis are:

ExAC	1000 Genomes	ESP6500	TAGC
exac_ALL: Overall	kg_ALL: Overall	esp_ALL: Overall	tagc_AJ: Ashkenazi
exac_AFR: African/African American	kg_AFR: African	esp_AA: African American
exac_AMR: Latino	kg_AMR: Ad Mixed American	esp_EA: European American
exac_EAS: East Asian	kg_EAS: East Asian
exac_FIN: Finnish	kg_EUR: European
exac_NFE: Non-Finnish European	kg_SAS: South Asian
exac_SAS: South Asian
exac_OTH: Other

If none of the above identifiers is found in the seventh column, exac_ALL will be used as the default frequency database and ethnicity group for allele frequency filtering.
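The identifier rule and its fallback can be sketched as follows. This is an illustration only: resolve_ethnicity is a hypothetical helper, and it assumes identifiers must match the listed spellings exactly (including case), which the text above does not state.

```python
# Identifiers copied from the table above.
KNOWN_IDENTIFIERS = {
    "exac_ALL", "exac_AFR", "exac_AMR", "exac_EAS",
    "exac_FIN", "exac_NFE", "exac_SAS", "exac_OTH",
    "kg_ALL", "kg_AFR", "kg_AMR", "kg_EAS", "kg_EUR", "kg_SAS",
    "esp_ALL", "esp_AA", "esp_EA",
    "tagc_AJ",
}
DEFAULT_IDENTIFIER = "exac_ALL"


def resolve_ethnicity(ped_fields):
    """Return (database, population) for one PED row given as a list of fields.

    Falls back to exac_ALL when the seventh column is missing or unrecognized.
    """
    identifier = ped_fields[6].strip() if len(ped_fields) > 6 else ""
    if identifier not in KNOWN_IDENTIFIERS:
        identifier = DEFAULT_IDENTIFIER
    database, population = identifier.split("_", 1)
    return database, population


with_column = resolve_ethnicity(["FAM1", "child", "0", "0", "1", "2", "kg_EUR"])
without_column = resolve_ethnicity(["FAM1", "child", "0", "0", "1", "2"])
unrecognized = resolve_ethnicity(["FAM1", "child", "0", "0", "1", "2", "gnomad_XX"])
print(with_column, without_column, unrecognized)
```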


Human Genome Build

GRCh38
GRCh37

*** Data submitted to GeMSTONE expires after 2 months, at which point it is deleted from our servers. Data is not shared with, or visible to, any third parties. The Yu Lab does not keep any submitted files past the expiration date.

Site-basis

Phred-Scaled Quality Score Lowerbound

*The QUAL score in the VCF file.

Allele Frequency Upperbound (%)

*Defaults to the overall allele frequency in ExAC (exac_ALL) unless a population is specified in the PED file.

Ignore Variants Without PASS Flag

Genotype-basis

Genotype Quality Lowerbound

*Requires the "GQ" FORMAT tag to be specified for all sites.

Individual Read Depth Lowerbound

*Requires the "DP" FORMAT tag to be specified for all sites.
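To make the site-level and genotype-level thresholds concrete, here is a minimal sketch of how such filters could be applied to already-parsed values. It is not GeMSTONE's implementation. The names SiteThresholds, site_passes, and genotype_passes are hypothetical, and three behaviors are assumptions: bounds are inclusive, a variant absent from the frequency database is kept, and a missing QUAL, GQ, or DP value fails the filter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteThresholds:
    min_qual: float        # Phred-scaled QUAL lower bound
    max_af_percent: float  # allele frequency upper bound, in percent
    require_pass: bool     # ignore variants without the PASS flag


def site_passes(qual: Optional[float], filter_field: str,
                af: Optional[float], t: SiteThresholds) -> bool:
    """af is a fraction in [0, 1], or None if the variant is not in the database."""
    if qual is None or qual < t.min_qual:
        return False
    if t.require_pass and filter_field != "PASS":
        return False
    # Assumption: variants with no recorded frequency are treated as rare.
    if af is not None and af * 100.0 > t.max_af_percent:
        return False
    return True


def genotype_passes(gq: Optional[int], dp: Optional[int],
                    min_gq: int, min_dp: int) -> bool:
    """Check one sample's GQ and DP FORMAT values against the lower bounds."""
    if gq is None or dp is None:
        return False
    return gq >= min_gq and dp >= min_dp


thresholds = SiteThresholds(min_qual=30.0, max_af_percent=1.0, require_pass=True)
print(site_passes(50.0, "PASS", 0.002, thresholds))    # rare, high quality
print(site_passes(50.0, "PASS", 0.05, thresholds))     # too common (5%)
print(site_passes(50.0, "LowQual", None, thresholds))  # no PASS flag
print(genotype_passes(gq=40, dp=12, min_gq=20, min_dp=10))
```

Note the unit conversion: the form takes the frequency bound as a percentage, while frequency databases usually report fractions.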

Inheritance Model

Dominant
Recessive Homozygous
Recessive Compound Heterozygous
X-linked Dominant
X-linked Recessive
Y-linked
No Inheritance
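For orientation, two of the models listed above can be expressed as simple segregation checks within one family. This sketch is a textbook simplification, not GeMSTONE's logic: it assumes full penetrance, encodes genotypes as alternate-allele counts (0, 1, or 2), treats a missing genotype as 0, and uses the PED phenotype codes (1 = unaffected, 2 = affected). Both function names are hypothetical.

```python
def fits_dominant(alt_counts: dict, phenotypes: dict) -> bool:
    """Every affected member carries the variant; no unaffected member does."""
    affected = [s for s, p in phenotypes.items() if p == 2]
    unaffected = [s for s, p in phenotypes.items() if p == 1]
    return (
        bool(affected)
        and all(alt_counts.get(s, 0) >= 1 for s in affected)
        and all(alt_counts.get(s, 0) == 0 for s in unaffected)
    )


def fits_recessive_homozygous(alt_counts: dict, phenotypes: dict) -> bool:
    """Every affected member is homozygous alternate; no unaffected member is."""
    affected = [s for s, p in phenotypes.items() if p == 2]
    unaffected = [s for s, p in phenotypes.items() if p == 1]
    return (
        bool(affected)
        and all(alt_counts.get(s, 0) == 2 for s in affected)
        and all(alt_counts.get(s, 0) < 2 for s in unaffected)
    )


# Made-up trio: affected child and father, unaffected mother.
dominant_ok = fits_dominant(
    {"child": 1, "father": 1, "mother": 0},
    {"child": 2, "father": 2, "mother": 1},
)
# Made-up trio: affected child with two carrier parents.
recessive_ok = fits_recessive_homozygous(
    {"child": 2, "father": 1, "mother": 1},
    {"child": 2, "father": 1, "mother": 1},
)
print(dominant_ok, recessive_ok)
```

Compound heterozygous and sex-linked models need phase or sex information and are left out of this sketch.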

Remove Non Pseudo-Autosomal Regions in Sex Chromosomes

Recurrence

Multiple Occurrences Across All Samples

LowerBound:

UpperBound:

Variant Must Occur In At Least 1 Non-Sporadic Family

Multiple Occurrences Across Families
(Each Sporadic Sample Counts as a Family)

LowerBound:

UpperBound:
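The two recurrence counts can be illustrated with a short sketch. It rests on assumptions that the form itself does not spell out: each sporadic sample has its own unique family ID in the PED file, a "non-sporadic" family is one with more than one member in the pedigree, and bounds are inclusive. All function names are hypothetical.

```python
from collections import Counter


def recurrence_counts(carriers, family_of):
    """Return (number of carrier samples, number of distinct carrier families)."""
    samples = set(carriers)
    families = {family_of[s] for s in samples}
    return len(samples), len(families)


def in_non_sporadic_family(carriers, family_of):
    """True if at least one carrier belongs to a family with more than one member."""
    family_sizes = Counter(family_of.values())
    return any(family_sizes[family_of[s]] > 1 for s in carriers)


def within_bounds(count, lower, upper):
    """Inclusive bounds check for a recurrence count."""
    return lower <= count <= upper


# Made-up pedigree: one two-member family and two sporadic samples.
family_of = {"a1": "F1", "a2": "F1", "s1": "S1", "s2": "S2"}

n_samples, n_families = recurrence_counts(["a1", "a2", "s1"], family_of)
print(n_samples, n_families)
print(in_non_sporadic_family(["a1", "a2", "s1"], family_of))
print(in_non_sporadic_family(["s1", "s2"], family_of))
```

In this example the variant is seen in three samples but only two families, because a1 and a2 share family F1 while s1 counts as a family of its own.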

Allele Frequency Databases

These options are only for annotation, not for filtering variants

ExAC	1000 Genomes	ESP6500	TAGC
exac_ALL: Overall	kg_ALL: Overall	esp_ALL: Overall	tagc_AJ: Ashkenazi
exac_AFR: African/African American	kg_AFR: African	esp_AA: African American
exac_AMR: Latino	kg_AMR: Ad Mixed American	esp_EA: European American
exac_EAS: East Asian	kg_EAS: East Asian
exac_FIN: Finnish	kg_EUR: European
exac_NFE: Non-Finnish European	kg_SAS: South Asian
exac_SAS: South Asian
exac_OTH: Other

Cannot submit without a project name and VCF file. Please check any orange tabs above for anything you might've missed.

Versioning Options

dbNSFP version

ClinVar version

SnpEff Version

VT Version

VCFtools Version

BCFtools Version

GEMINI Version

ExAC Version

ESP Version

1000 Genomes Version

TAGC Version

Rosetta ddG Version

Pfam Version

MSigDB Version

HGMD Version

OMIM Version

MGI Version

GDI Version

RVIS Version

PLINKSEQ Version

GTEx Version

HPA Version

HINT Version

Submit a New Job

Set Job Parameters from Recipe

or

Check Job Status


Variant Consequence

Coding Transcript Variant

Splicing Variant

Intergenic Variant

Non-Coding Variant

Others

Transcript Biotype

Protein Coding

Pseudogene

Short Noncoding

Long Noncoding

Custom Transcript File

Functional Predictions

Conservation Scores

Protein Stability Prediction

Deleteriousness Filter

Deleteriousness Thresholds

Gene Ontology (GO)

Genotype-phenotype Databases

Disease Gene File

Protein Domain Annotation

Protein Domain Filter

Protein-protein Interactions

Pathway Databases

Pathway Enrichment Analysis

Calculate and report enriched pathways with q-value ≤

From the following databases:

KEGG

BioCarta

Reactome

GO Biological Processes

GO Cellular Components

GO Molecular Function

Background Gene List
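The page does not say which statistical test produces the q-values. One common approach to pathway enrichment, shown here purely as an assumption, is a one-sided hypergeometric test per pathway against the background gene list, followed by Benjamini-Hochberg correction. The function names and numbers below are illustrative.

```python
from math import comb


def hypergeom_sf(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for a hypergeometric draw.

    M: background genes, n: background genes in the pathway,
    N: genes in the candidate list, k: candidate genes in the pathway.
    """
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Toy case: all 5 candidate genes fall in a 5-gene pathway, background of 10.
p_toy = hypergeom_sf(k=5, M=10, n=5, N=5)
q_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(p_toy, q_toy)
```

Pathways whose q-value is at or below the chosen cutoff would then be reported as enriched.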


GTEx (The Genotype-Tissue Expression Project)

HPA (The Human Protein Atlas)