variation-scan
2.0
Scan variant sequences with position-specific scoring matrices (PSSMs) and report the variations that affect the binding score, in order to predict regulatory variants.
variation-scan takes as input a .varseq file in the format produced by retrieve-variation-seq. For details about this format and how to create a .varseq file, see retrieve-variation-seq output format.
A list of matrices in a valid format. Supported formats: alignace, assembly, cis-bp, clustal, cluster-buster, consensus, encode, feature, footprintDB, gibbs, homer, infogibbs, jaspar, meme, meme_block, motifsampler, mscan, sequences, stamp, stamp-transfac, tab, transfac and uniprobe.
A background model file in oligo-analysis format, as produced by oligo-analysis. For details about this format and how to create an oligo-analysis file, see oligo-analysis output format.
A tab-delimited file with the following columns:
Name of the matrix (generally the transcription factor name)
ID of the variation
Variation type, according to Sequence Ontology (SO) nomenclature
Coordinates of the variation
Best weight for the putative site
Worst weight for the putative site
Difference between best and worst weight
P-value of the best putative site
P-value of the worst putative site
Ratio between the worst and best P-values (pval_ratio = worst_pval / best_pval)
Allele in the best putative site
Allele in the worst putative site
Offset of the best putative site
Offset of the worst putative site
Minimal offset difference between best and worst putative site
Strand of the best putative site
Strand of the worst putative site
Indicates whether the strand changes between the best and worst sites at the offsets used for min_offset_diff
Sequence of the best putative site
Sequence of the worst putative site
Minor allele frequency
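The column layout above can be read with a few lines of Python. This is a minimal sketch, not part of variation-scan itself. The column names are assumptions: only best_w, w_diff, pval_ratio and min_offset_diff are named in this documentation, and the others are hypothetical labels in the listed order. It also assumes that lines starting with ";" or "#" are comments or headers. The sample record is invented purely for illustration.

```python
import csv
import io

# Assumed column names, in the order of the list above. Only best_w, w_diff,
# pval_ratio and min_offset_diff appear in the documentation; the rest are
# hypothetical labels chosen for this sketch.
COLUMNS = [
    "matrix", "var_id", "var_class", "var_coord",
    "best_w", "worst_w", "w_diff",
    "best_pval", "worst_pval", "pval_ratio",
    "best_allele", "worst_allele",
    "best_offset", "worst_offset", "min_offset_diff",
    "best_strand", "worst_strand", "strand_change",
    "best_seq", "worst_seq", "minor_allele_freq",
]
NUMERIC = ("best_w", "worst_w", "w_diff", "best_pval", "worst_pval", "pval_ratio")


def read_variation_scan(handle):
    """Parse tab-delimited records into dictionaries keyed by column name."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and (assumed) comment/header lines.
        if not fields or fields[0].startswith((";", "#")):
            continue
        row = dict(zip(COLUMNS, fields))
        for name in NUMERIC:
            row[name] = float(row[name])
        rows.append(row)
    return rows


# Invented example record, for illustration only.
sample = (
    "; illustrative comment line\n"
    "MyTF\tvar1\tSNV\tchr1:100-100\t5.0\t1.0\t4.0\t0.0001\t0.01\t100\t"
    "A\tG\t-3\t-3\t0\tD\tD\t0\tACGTACGT\tACGTGCGT\t0.12\n"
)
rows = read_variation_scan(io.StringIO(sample))
first = rows[0]
# The derived columns follow from the definitions in the list above.
print(first["best_w"] - first["worst_w"], first["w_diff"])
print(first["worst_pval"] / first["best_pval"], first["pval_ratio"])
```

Keeping each record as a dictionary makes later filtering on the weight and P-value columns straightforward.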
A matrix or a list of matrices in a valid format. Supported formats: alignace, assembly, cis-bp, clustal, cluster-buster, consensus, encode, feature, footprintDB, gibbs, homer, infogibbs, jaspar, meme, meme_block, motifsampler, mscan, sequences, stamp, stamp-transfac, tab, transfac and uniprobe.
Enter your own motif collection as text, or upload the motifs as a file.
Select a single motif from a collection in the list.
Select an available motif collection from the list.
variation-scan takes as input a .varseq file in the format produced by retrieve-variation-seq. For details about this format and how to create a .varseq file, see retrieve-variation-seq output format.
Background model estimation
Estimate background model with genome sequences from a specific organism.
Markov order of the background model estimation.
Name of the organism whose genome is used to estimate the background model.
Sequence type of the selected organism genome that will be used to estimate the background model.
Input a previously estimated background model file.
Specify the background model file format.
Upload the background model file.
Input the background model file from a URL.
Length of the longest matrix in the input list used to scan the current variant sequences. This value must be consistent with the value used when retrieving the variant sequences, and no matrix in the input list may be longer than it. For details, see the "Length of flanking sequence on each side of the variant" option of retrieve-variation-seq.
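The length constraint can be checked before submitting a scan. The following is a hedged sketch: check_matrix_lengths and the matrix names and widths are hypothetical, and the widths are assumed to have been obtained separately, since parsing the matrix formats is outside the scope of this example.

```python
def check_matrix_lengths(matrix_widths, flank_length):
    """Return the names of matrices that are wider than flank_length.

    matrix_widths: dict mapping a matrix name to its number of columns.
    flank_length: the length given when the variant sequences were retrieved.
    """
    return sorted(
        name for name, width in matrix_widths.items() if width > flank_length
    )


# Hypothetical matrix names and widths, for illustration only.
widths = {"motif_A": 12, "motif_B": 30, "motif_C": 35}
too_long = check_matrix_lengths(widths, flank_length=30)
print(too_long)  # matrices that violate the constraint
```

A non-empty result means the variant sequences should be retrieved again with a larger value, or the listed matrices removed from the input.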
Only output hits with values passing the thresholds.
This value will be used as a lower threshold and all scanned variation records with best weight (best_w) greater than this value will be reported.
This value will be used as a lower threshold and all scanned variation records with weight difference (w_diff) between alleles greater than this value will be reported.
This value will be used as an upper threshold and all scanned variation records with best p-value smaller than this value will be reported.
This value will be used as a lower threshold and all scanned variation records with p-value ratio (pval_ratio) between alleles greater than this value will be reported.
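The four thresholds can be pictured as a filter applied to each scanned record. This is a minimal sketch and not the tool's implementation: filter_hits and the best_pval key are hypothetical names, the records are invented, and the comparisons are strict because the descriptions above say "greater than" and "smaller than". Whether variation-scan itself includes the boundary value is not stated here.

```python
def filter_hits(rows, min_best_w=None, min_w_diff=None,
                max_best_pval=None, min_pval_ratio=None):
    """Keep records that pass every threshold that is set (None = not applied)."""
    kept = []
    for row in rows:
        if min_best_w is not None and row["best_w"] <= min_best_w:
            continue  # best weight not above the lower threshold
        if min_w_diff is not None and row["w_diff"] <= min_w_diff:
            continue  # weight difference not above the lower threshold
        if max_best_pval is not None and row["best_pval"] >= max_best_pval:
            continue  # best p-value not below the upper threshold
        if min_pval_ratio is not None and row["pval_ratio"] <= min_pval_ratio:
            continue  # p-value ratio not above the lower threshold
        kept.append(row)
    return kept


# Invented records, for illustration only.
records = [
    {"var_id": "v1", "best_w": 6.0, "w_diff": 4.0, "best_pval": 1e-4, "pval_ratio": 100.0},
    {"var_id": "v2", "best_w": 2.0, "w_diff": 0.5, "best_pval": 1e-2, "pval_ratio": 2.0},
    {"var_id": "v3", "best_w": 5.0, "w_diff": 1.0, "best_pval": 1e-3, "pval_ratio": 10.0},
]
kept = filter_hits(records, min_best_w=1.0, min_w_diff=1.0,
                   max_best_pval=1e-3, min_pval_ratio=10.0)
print([row["var_id"] for row in kept])
```

Leaving a threshold at None disables it, which mirrors leaving the corresponding field unset.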