8405 Annotation for this gene includes both automatic annotation from Ensembl and Havana manual curation, see article. Ensembl/Havana merge 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8406 Sequences from various databases are matched to Ensembl transcripts using Exonerate. These are external references, or 'Xrefs'. DNA match 0 \N 8407 Manual annotation (determined on a case-by-case basis) from the Havana project. Havana 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8408 Alignment of human ESTs (expressed sequence tags) to the genome using the program Est2genome. ESTs are from dbEST Human EST (EST2genome) 0 {'type' => 'est'} 8409 Positions of vertebrate mRNAs along the genome. mRNAs are from the European Nucleotide Archive database. Initial alignments are performed using TBLASTN of Genscan-predicted peptides against the European Nucleotide Archive mRNAs. Vertebrate cDNAs (ENA) 0 {'type' => 'cdna','default' => {'contigviewbottom' => 'stack'}} 8410 Homo sapiens cDNAs from NCBI RefSeq and EMBL are aligned to the genome using the Exonerate cdna2genome model. 
Human cDNAs (cdna2genome) 0 {'type' => 'cdna'} 8411 Human cDNAs from NCBI RefSeq and ENA are aligned to the genome using Exonerate. Human cDNAs 0 {'type' => 'cdna'} 8412 Proteins from the UniProtKB Swiss-Prot database, aligned to the genome by Havana. UniProt proteins 0 \N 8413 Human protein sequences from UniProtKB and NCBI RefSeq are aligned to the genome using GeneWise or Exonerate. Human proteins 1 \N 8414 Proteins from the UniProtKB TrEMBL database, aligned to the genome by Havana. TrEMBL proteins 0 \N 8415 match Protein 0 \N 8416 Xref mapping based on checksum equivalency Xref checksum 0 \N 8417 Protein domains and motifs in the Pfam database. Pfam domain 1 {'type' => 'domain'} 8418 Protein domains and motifs in the SUPERFAMILY database. Superfamily domains 1 {'type' => 'domain'} 8419 Protein domains and motifs in the SMART database. SMART domains 1 {'type' => 'domain'} 8420 Identification of peptide low complexity sequences by Seg. Low complexity (Seg) 1 \N 8421 Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database. PIRSF domain 1 {'type' => 'domain'} 8422 Protein domains and motifs from the PROSITE profiles database are aligned to the genome. PROSITE profiles 1 {'type' => 'domain'} 8423 Prediction of signal peptide cleavage sites by SignalP. Cleavage site (Signalp) 1 \N 8424 Protein coding sequences agreed upon by the Consensus Coding Sequence project, or CCDS. CCDS set 1 {'dna_align_feature' => {'do_not_display' => '1'},'type' => 'cdna','default' => {'contigviewbottom' => 'normal'}} 8425 Transcript where the Ensembl genebuild transcript and the Vega manual annotation have the same sequence, for every base pair. See article. 
Ensembl/Havana merge 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8426 Annotation produced by the Ensembl genebuild. Ensembl 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8427 Protein fingerprints (groups of conserved motifs) are aligned to the genome. These motifs come from the PRINTS database. Prints domain 1 {'type' => 'domain'} 8428 Alignment of mouse ESTs (expressed sequence tags) to the genome using the program Est2genome. ESTs are from dbEST Mouse EST (EST2genome) 0 {'type' => 'est'} 8429 Prediction of coiled-coil regions in proteins is by Ncoils. Coiled-coils (Ncoils) 1 \N 8430 Non-coding RNAs (ncRNAs) predicted using sequences from RFAM and miRBase. See article. 
ncRNAs 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8431 Positions of ncRNAs (non-coding RNAs) from the Rfam database are shown. Initial BLASTN hits of genomic sequence to RFAM ncRNAs are clustered and filtered by E value. These hits are supporting evidence for ncRNA genes. RFAM ncRNAs 0 \N 8432 Homo sapiens 'Expressed Sequence Tags' (ESTs) from dbEST are aligned to the genome using Exonerate. Human ESTs 0 {'type' => 'est'} 8433 Non-coding RNAs (ncRNAs) predicted using sequences from RFAM and miRBase. See article. These were projected to the alternate locus via a mapping from the primary assembly. Projected ncRNA 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8434 Transcript that was projected from the primary assembly, aligned to the alternate locus version as supporting evidence. Projected transcript 0 {'type' => 'cdna'} 8435 Manual annotation (determined on a case-by-case basis) from the Havana project, projected to the alternate locus via a mapping from the primary assembly. 
Projected Havana 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'} 8436 Data from LRG database LRG 0 {'multi_name' => 'LRG genes','colour_key' => 'rna_[status]','caption' => 'LRG gene','name' => 'LRG Genes','label_key' => '[text_label] [display_label]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'}} 8437 First Exon Finder (First EF) predicts positions of the first exons of transcripts, both coding and non-coding, using the sequence to identify features such as CpG islands and promoter regions. First EF 1 \N 8438 Transcription start sites predicted by Eponine-TSS. TSS (Eponine) 1 \N 8439 CpG islands are regions of nucleic acid sequence containing a high number of adjacent cytosine guanine pairs (along one strand). Usually unmethylated, they are associated with promoters and regulatory regions. They are determined from the genomic sequence using a program written by G. Micklem, similar to newcpgreport in the EMBOSS package. CpG islands 1 \N 8440 Ab initio prediction of protein coding genes by Genscan. The splice site models used are described in more detail in C. Burge, Modelling dependencies in pre-mRNA splicing signals. 1998 In Salzberg, S., Searls, D. and Kasif, S., eds. Computational Methods in Molecular Biology, Elsevier Science, Amsterdam, 127-163. Genscan predictions 1 \N 8441 Short non-coding gene density as calculated by ShortNonCodingDensity.pm. 
Short non-coding genes (density) 1 \N 8442 Coding gene density as calculated by gene_density_calc.pl. Coding genes (density) 1 \N 8443 Percentage of repetitive elements for top level sequences (such as chromosomes, scaffolds, etc.) Repeats (percent) 1 \N 8444 Percentage of G/C bases in the sequence. GC content 1 \N 8445 Pseudogene density as calculated by PseudogeneDensity. Pseudogenes (density) 1 \N 8446 Long non-coding gene density as calculated by LongNonCodingDensity.pm. Long non-coding genes (density) 1 \N 8447 Markers, or sequence tagged sites (STS), from UniSTS are aligned to the genome using Electronic PCR (e-PCR). Marker 1 \N