random-genome-fragments
Select a set of fragments at random positions in a given genome, and return their coordinates and/or sequences. The supported organisms are either installed in RSAT or available from Ensembl. For EnsEMBL genomes, the program makes use of the EnsEMBL API (www.ensembl.org).
sequences
random-genome-fragments -org organism -l length -r repetitions [-o outputfile] [-v # -rm -lf length_file] [..]
The program outputs a file containing the genomic coordinates or the sequences.
Level of verbosity (detail in the warning messages during execution)
Display full help message
Same as -h
If no output file is specified, the standard output is used. This allows the command to be used within a pipe.
Type of data to return. Supported values: seq | coord
By default, coordinates (coord) are returned. For RSAT organisms,
the return type can be 'seq' to retrieve sequences. The sequence
format is fasta. For Ensembl organisms, use the coordinate file (in
ft format) as input to retrieve-ensembl-seq.pl with the options
-ftfile YourCoordFile -ftfileformat ft. You can also use the tools
of sequence providers (UCSC, Galaxy, Ensembl) to efficiently extract
the sequences from the coordinates.
Supported values: ft | bed
Default is ft. To convert to another supported feature format, type
the following command: convert-features -h
For very big files, consider using the BED output format, which is
the format used by the UCSC database. You can then use the tools of
sequence providers (UCSC, Galaxy, Ensembl) to efficiently extract the
sequences. The genomic intervals in this BED file are 0-based, as
specified by UCSC. Chromosomes thus start at position 0 (not 1). This
BED file is compatible with UCSC, Galaxy and Ensembl (on the Ensembl
website, the BED file is automatically converted from 0-based to
1-based).
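The difference between the two coordinate conventions can be illustrated with a short Python sketch. It assumes the standard UCSC BED convention (0-based start, end excluded) and a 1-based convention in which both ends are included. The function names are hypothetical and are not part of RSAT.

```python
def bed_to_one_based(chrom, bed_start, bed_end):
    """Convert a 0-based, half-open BED interval to 1-based, inclusive coordinates."""
    if bed_start < 0 or bed_end <= bed_start:
        raise ValueError("invalid BED interval")
    # Only the start shifts: the excluded BED end equals the included 1-based end.
    return chrom, bed_start + 1, bed_end


def one_based_to_bed(chrom, start, end):
    """Convert a 1-based, inclusive interval to a 0-based, half-open BED interval."""
    if start < 1 or end < start:
        raise ValueError("invalid 1-based interval")
    return chrom, start - 1, end


# The first 100 bases of a chromosome are 0..100 in BED and 1..100 in 1-based form.
print(bed_to_one_based("chr1", 0, 100))
```

In both conventions the fragment length is unchanged: end - start in BED, end - start + 1 in 1-based coordinates.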
Will use the version of genome with repeat masked
Specifies an organism installed in RSAT. To list the organisms supported in RSAT, type the following command: supported-organisms
Specifies an organism from the EnsEMBL database. Use lowercase, with an underscore between words (e.g. 'homo_sapiens').
Uses a local EnsEMBL server (for advanced users).
Number of repetitions. Generates a set of r sequences, each of length l.
Sequence length of random genomic fragments.
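As an illustration of what drawing r fragments of length l means, here is a minimal Python sketch. It is not the RSAT implementation: the function name, the chromosome-size dictionary and the weighting scheme (each chromosome weighted by its number of valid start positions) are assumptions made for this example. Coordinates are 1-based and inclusive.

```python
import random


def sample_fragments(chrom_sizes, length, repetitions, rng=None):
    """Draw `repetitions` random fragments of `length` bp (hypothetical helper)."""
    rng = rng or random.Random()
    # Number of valid start positions per chromosome; too-short chromosomes are skipped.
    starts = {c: size - length + 1 for c, size in chrom_sizes.items() if size >= length}
    if not starts:
        raise ValueError("no chromosome is long enough for the requested length")
    names = list(starts)
    weights = [starts[name] for name in names]
    fragments = []
    for _ in range(repetitions):
        # Weighting by valid starts makes every possible fragment equally likely.
        chrom = rng.choices(names, weights=weights, k=1)[0]
        start = rng.randint(1, starts[chrom])
        fragments.append((chrom, start, start + length - 1))
    return fragments


# Example with made-up chromosome sizes: 5 fragments of 100 bp.
for fragment in sample_fragments({"chrA": 1000, "chrB": 50}, 100, 5):
    print(*fragment, sep="\t")
```

This sketch samples with replacement, so fragments may overlap. Whether the program itself allows overlaps is not stated here.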
Generates random sequences with the same lengths as a set of reference sequences. Unlike the -lf option, the sequence lengths are calculated automatically from the reference sequences.
Generates random sequences with the same lengths as a set of reference sequences. The sequence length file can be obtained with the command sequence-lengths.
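To make the notion of reference sequence lengths concrete, the Python sketch below computes the length of each sequence in a FASTA input. It is only an illustration: the function name is hypothetical, and the exact file layout produced by sequence-lengths is not reproduced here.

```python
def fasta_lengths(lines):
    """Return {sequence_id: length} for FASTA-formatted lines (hypothetical helper)."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'; unnamed records get a placeholder.
            parts = line[1:].split()
            current = parts[0] if parts else "seq%d" % (len(lengths) + 1)
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


reference = [">peak1 some description", "ACGTACGT", "ACG", ">peak2", "GGCC"]
for seq_id, length in fasta_lengths(reference).items():
    print(seq_id, length, sep="\t")
```

Each length obtained this way corresponds to one random fragment of the same size.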