(1) Molecular biologists who need to interpret RNA sequence and probing
data to produce plausible 3D models for functional RNAs they study;
(2) biologists seeking to catalogue and understand the diversity of life and the inter-relationships of living things;
(3) biochemists and nano-technologists seeking to understand the
mechanisms of the most ancient “molecular machines” – RNA-containing
supramolecular structures such as the ribosome and spliceosome;
(4) genomicists seeking to discover non-coding RNAs in genomes; and
(5) academic, government, and industry scientists who research and develop RNA pharmaceuticals or drugs that target RNA.
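1. Introduction to the RNA Ontology Consortium
The RNA Ontology Consortium (ROC) is
Ontology and Sequence Ontology resources to jointly build a broader, integrated ontology that advances RNA research.
Gene Ontology makes it possible to use biological knowledge about a gene or protein obtained in one species to interpret the corresponding gene or protein in other species. Gene Ontology
function), biological process, and the cellular component of gene products (cellular
annotation database, the Gene Ontology Annotation (GOA) database.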
e. enable reuse of domain knowledge
f. support automated reasoning and inference over domain knowledge.
4 Knowledge domains covered by the RNA ontology
1 RNA sequence information (1D): coding and noncoding, and their identification in genomes (to be incorporated within the Sequence Ontology).
2 RNA secondary structure and Watson-Crick base pairing
3 RNA 3D structure and motifs: backbone conformations, base stacking, and tertiary interactions.
6 RNA–RNA, RNA–protein, and RNA–ligand (metabolite, drug, metal and other ion, and water) interactions.
7 RNA conformational changes and dynamics of functional significance.
8 RNA molecular biology (RNA processing, maturation, splicing, and so on).
9 Biochemical and biophysical experimental data relating to RNA structure and structure–function relationships.
10 RNA as regulator of biological networks and pathways.
RNA Bioinformatics: RNA informatics tools
a. Non-Coding RNA database
Non-translatable RNA transcripts that appear to work at the RNA level.
Database of structure-annotated multiple sequence alignments, covariance
models and family annotation for a number of non-coding RNA families
The Structural Classification of RNA (SCOR) is a database designed to
provide a comprehensive perspective and understanding of RNA motif
structure, function, tertiary interactions and their relationships
tRNAscan-SE allows you to search for tRNA genes in genomic sequence. (site hosted by Eddy Lab at WashU)
NDB (Nucleic Acid Database) is a repository of three-dimensional structural information about nucleic acids.
b. RNA folding Servers
List of RNA folding servers and related web sites maintained by Hervé Isambert.
c. RNA Informatics Links
An exhaustive list of RNA links; from the experts in the Major lab.
RNAbase is a searchable and annotated database of all publicly available RNA structures.
e. The RNA World
An RNA resource hub.
f. The Zuker Group
Algorithms, thermodynamics and databases for RNA secondary structure.
a. Riboswitch finder
RNA motif search program that identifies RNA motifs called riboswitches
which are metabolic binding domains in mRNA that regulate gene
expression. The program was originally designed around a set of
riboswitches found in Bacillus subtilis.
Fast RNA motif/pattern searcher; from the authors: If you're looking
for an RNA motif that fits a hard consensus pattern -- a la PROSITE
patterns, but with base-pairing -- you might check out RNABOB; not a
Web-tool; based on RNAMOT.
RNA motif search program; not a Web-tool.
d. Transterm UTR Motif Search
Transterm is an interactive database providing access to RNA sequences
and their associated motifs. The RNA sequences are derived from all gene
sequence data in Genbank, including complete genomes, divided into
putative 5' and 3'UTRs, initiation and term
Company web site with very good technical resources including an
excellent links page, summaries of recent papers on RNA-related topics,
and free access to review articles and web features on RNA-related
Basic Local Alignment Search Tool (BLAST) finds regions of local
similarity between sequences. The program compares nucleotide or
protein sequences to sequence databases and calculates the statistical
significance of matches. BLAST can be used to infer functional and
evolutionary relationships between sequences as well as help identify
members of gene families.
2. EBI Tools
EBI Tools is a project that aims to provide programmatic access to the
various databases and retrieval and analysis services that the European
Bioinformatics Institute (EBI) provides through Simple Object Access
Protocol (SOAP) and other related web service technologies.
Diverse suite of tools for sequence analysis; many programs analogous to GCG; context-sensitive help for each tool.
NCBI information retrieval system, including GenBank, MMDB (structures), genomes, population sets, OMIM, taxonomy and PubMed.
The FeatureExtract server extracts sequence and feature annotations,
such as intron/exon structure, from GenBank entries and other GenBank
A portal to the human genome. Query by text or BLAST, to access heaps
of info from primary and secondary databases of genomic resources,
transcripts, protein sequences, function, associated diseases,
It goes to the library. You go to the pub; receive email alerts for
current contents of PubMed and GenBank; e.g. use accession number of
htg record as query to receive sequence updates (as the version number
8. Ribosomal Database Project
Highly curated database of aligned and annotated rRNA sequences with accompanying phylogenies; data available for download.
Seqhound is a sequence retrieval system that provides access to
biological sequence, structure and functional annotation data. Seqhound
can be accessed via the web interface, through the remote API, or by
10. WU BLAST
Washington University Basic Local Alignment Search Tool
Server which predicts conserved secondary structure elements of
homologous RNAs. The input RNA sequences do not need to be
aligned beforehand.
Tool which aids in the design and quality control of small interfering
RNAs (siRNAs) for RNA interference (RNAi) and gene silencing. It
evaluates the inhibitory potency of potential siRNA sequences as well as
identifying gene regions that have a high sil
ERPIN (Easy RNA Profile IdentificatioN) takes as input an RNA sequence
alignment and secondary structure annotation and will identify a wide
variety of known RNA motifs (such as tRNAs, 5S rRNAs, SRP RNA, C/D box
snoRNAs, hammerhead motifs, miRNAs and others
Server which provides iterated loop matching and maximum weighted
matching algorithms for pseudoknot containing RNA secondary structure
prediction. Algorithms can apply thermodynamic and comparative
information, and thus can be used for either aligned
Kinefold calculates (and animates) the folding kinetics of RNA sequences including pseudoknots.
Predict RNA secondary structure from sequence; does not predict pseudoknots
The Database of Macromolecular Movements (MolMovDB) contains a collection
of animated protein and RNA structures to assist in the exploration of
macromolecular flexibility. Software for structure analysis is also
MOLPROBITY is a structure analysis and validation program that can
calculate and display steric, H-bonding, and van der Waals interactions
for known structures of proteins, nucleic acids, and complexes.
Predict pseudoknot structures in RNA sequence; source code only.
An RNA secondary structure prediction program which implements two
methods, one based on random stacking and the other based on helical
Predict RNA secondary structure from sequence; note sequence length limit.
Software for RNA/DNA secondary structure prediction and design
Server with three tools for the rational design of small interfering
RNAs (Sirna), antisense oligonucleotides (Soligo), and trans-cleaving
ribozymes (Sribo). A fourth tool, Srna, returns output including
general folding features.
Server for computing small interfering RNA (siRNA) sequences which are
best suited for mammalian RNA interference (RNAi). The site accepts a
sequence as input and returns a list of siRNA candidates.
j. siRNA Selection Server
Server aiding the design of short interfering RNAs (siRNAs) by providing
information on stability, SNPs and specificity of a potential siRNA.
This resource includes siSearch, AOSearch, and a siRNAdb which provides
a platform for mining an siRNA database, and searching for non-specific
matches to your siRNA (small interfering RNAs).
RNA secondary structure viewer applet; must be integrated into web page
to be implemented; can link to multiple computational backends.
T7 RNAi Oligo Designer (TROD) aids in the design of DNA
oligonucleotides for short interfering RNA (siRNA) synthesis with T7
RNA polymerase. It takes an input of a cDNA sequence and outputs a list
of DNA oligos for ordering.
n. Vienna RNA Package
Comprises a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures.
5. RNA: Three-Dimensional (3-D) Structures
a. Ribosome Images (Wadsworth Center Microscope 3D Database)
b. RNase P 3D models
c. SCOR: Structural Classification of RNA
d. The Nucleic Acid Database (NDB)
UTRBlast is an online tool that runs BLAST on your untranslated region (UTR) and compares its similarity to other UTRs.
UTR Home. A collection of UTR resources and online tools.
UTRdb. A database of UTR sequences. Find your UTR RNA or DNA sequence of interest.
d. UTRScan
The program UTRscan looks for UTR functional elements
by searching through user submitted sequence data for the patterns
defined in the UTRsite collection.
UTRSite is a collection of functional sequence patterns located in 5' or 3' UTR sequences.
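islands, and so on; in addition, ncRNA genes are small and the motifs used by gene-finding software vary considerably. As a result, there is so far no efficient, general-purpose ncRNA
searches for tRNAs, snoScan searches for C/D box snoRNAs, SnoGps searches for H/ACA box snoRNAs, mirScan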
ncRNA Identification Methods Examples:
1. (Sequence homology methods)
2. (Pattern matching and covariance models)
For the identification of P/MRP RNA as well as IREs we used a
combination of pattern searches and secondary structure profile
searches with cmsearch of the Infernal package. Nuclear P RNA and MRP
RNA sequences are poorly conserved in sequence. However, three conserved
regions are shared: CR-I, CR-IV and CR-V. For nuclear P RNA there are
also conserved elements in domain 2 to take into account: CR-II and
CR-III. Therefore, for the identification of P and MRP RNA we used a
pattern based on consensus features including the CR-I, CR-IV and CR-V
motifs as well as base-pairing rules consistent with the helix P2. When
a P or MRP RNA gene was not found using these patterns, new searches
were carried out where mismatches were allowed. After the pattern
matching procedure, sequences fitting the secondary structure template
were further analyzed with Rfam covariance models. High-scoring
candidates were further analyzed for characteristics typical of P/MRP
RNA secondary structure: base pairing between the CR-I and CR-V motifs,
presence of CR-IV as well as the helices P1, P2 and P3. IREs were also
identified using a combination of pattern matching and covariance
models. To identify as many potential IREs as possible we primarily
searched available mRNA sequences. Where no mRNA was available,
genomic sequences were searched for regions homologous to available
proteins/mRNAs. Whenever an IRE candidate was found in a genomic
sequence it was checked for reasonable proximity to the protein/mRNA
match. Candidate sequences were checked for conserved primary sequence
motifs and the ability to fold into a secondary structure typical of
the iron responsive element.
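The idea of a pattern search constrained by base-pairing rules, relaxed by allowing mismatches when nothing is found, can be illustrated with a small sketch. This is not the pattern used in the study above: the function names, the stem length and the loop-size range are hypothetical values chosen only for illustration.

```python
# Pairs allowed in an RNA helix: Watson-Crick pairs plus the G-U wobble.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_mismatches(five_prime, three_prime):
    """Count positions where the two strands of a helix cannot pair.

    The 3' strand is read backwards so that the first base of the
    5' strand is compared with the last base of the 3' strand.
    """
    return sum((a, b) not in PAIRS for a, b in zip(five_prime, reversed(three_prime)))


def find_helices(seq, stem_len=4, min_loop=3, max_loop=8, max_mismatch=0):
    """Return (start, loop_length, mismatches) for each stem-loop candidate.

    stem_len, min_loop and max_loop are illustrative defaults (assumptions),
    not values taken from any published P/MRP RNA or IRE pattern.
    """
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        left = seq[i:i + stem_len]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            right = seq[j:j + stem_len]
            if len(right) < stem_len:
                break  # ran off the end of the sequence
            mismatches = count_mismatches(left, right)
            if mismatches <= max_mismatch:
                hits.append((i, loop, mismatches))
    return hits


strict = find_helices("GGGGAAAACCCC")
relaxed = find_helices("GGGGAAAACCCC", max_mismatch=1)
```

Running the strict search first and the relaxed search only when it returns nothing mirrors the two-pass strategy described in the text.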
3. Profile HMMs of highly conserved regions in P and MRP RNA
For prediction of P and MRP RNAs we also used profile HMMs created from
CR-I and CR-V multiple alignments. We further analyzed all genomic
sequences that contained the CR-I and CR-V motifs and where the distance
between the two motifs was less than 3000 bases. The advantages of this
method are that large genomes can be searched quickly (100 Mbases in a
few minutes) and that it identifies the P and MRP RNA genes in a highly
specific manner. Candidates identified in the search based on HMM profiles
were further analyzed to check that other conserved features of the RNA
were present.
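The distance filter applied to the two motif hits can be sketched as follows. The sketch assumes that the HMM search has already produced start coordinates for each motif on the same strand, and that one motif lies upstream of the other; the function and parameter names are hypothetical.

```python
def pair_motif_hits(upstream_hits, downstream_hits, max_distance=3000):
    """Pair each upstream motif hit with downstream hits closer than max_distance.

    Both arguments are lists of start coordinates on the same strand.
    The 3000-base default follows the cutoff quoted in the text; the
    requirement that the second motif lies downstream is an assumption.
    """
    pairs = []
    for a in sorted(upstream_hits):
        for b in sorted(downstream_hits):
            if 0 < b - a < max_distance:
                pairs.append((a, b))
    return pairs


candidates = pair_motif_hits([100, 9000], [2500, 5000, 9100])
```

Each returned pair marks a genomic region that would then be checked for the other conserved features of the RNA.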
4. Identification of protein homologues
An efficient method for protein identification is PSI-BLAST (Position
Specific Iterative BLAST). PSI-BLAST can repeatedly search the target
databases, using a multiple alignment of high scoring sequences found
in each search round to generate a new more sensitive scoring matrix
able to find distantly related sequences that are sometimes missed in a
BLAST search. Multiple PSI-BLAST searches with different query
sequences were carried out in order to identify as many homologues as
possible belonging to a certain protein family. The NCBI Genbank protein
set was used as the primary source, but additional proteins were
identified from individual genome projects or identified from TBLASTN
searches of genome sequences. Whenever relevant, these novel sequences
were included in the set of sequences used as database in the PSI-BLAST
search. We also used profile HMMs at the Pfam database for Pop1, Pop3
(Rpp38), Pop5, Rpp14, Rpp20, Rpp25, Rpp40, Rpr2 (Rpp21) to identify
homologues. In cases where available Pfam models were not sufficient or
present, new models were created from multiple alignments and used with
the HMMER package to find additional homologues.
To identify homologues to previously known proteins whose mRNAs are
known to contain IREs we mainly used BLAST to search the NCBI Genbank
set of proteins. Some gene sequences that were not in Genbank were
identified by Genewise. Genewise uses a combination of comparative
analysis (aligns proteins to genomic sequences) together with
statistical signals to predict genes. For classification of proteins we
also made use of phylogenetic analysis, including methods of parsimony,
maximum likelihood and neighbour-joining.
5. ncRNA prediction using de novo methods
As opposed to the methods that detect new members of already known
ncRNA families described previously (IRE and MRP/P RNA identification),
we have also used two de novo methods, QRNA and RNAz, to computationally
screen the S. cerevisiae genome for ncRNAs.
QRNA makes a prediction of ncRNA based on pairwise alignments. It
compares the score of three distinct models of sequence evolution to
decide which one best describes the given alignment: a pair SCFG is used
to model the evolution of secondary structure, a pair hidden Markov model
(HMM) describes the evolution of protein coding sequence, and a
different pair HMM implements the independent model of a sequence with
an evolutionary random pattern not consistent with either a secondary
structure or protein coding sequence. QRNA is currently limited to
pairwise alignments, and rather slow for ncRNA gene prediction at a
genomic scale. A program similar to QRNA, which tests for complementary
mutations in three-sequence multiple alignments, is ddbRNA. It
searches for common stems in the multiple alignments in a greedy
fashion. The assessment of the significance of the conserved structure
is based on shuffled alignments.
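A significance test based on shuffled alignments can be sketched in a few lines. The text does not say how ddbRNA shuffles, so column permutation is used here as an assumption, and the scoring function is left to the caller. All names are hypothetical.

```python
import random


def shuffle_columns(alignment, rng):
    """Permute the columns of an alignment given as equal-length strings.

    Each column is moved as a unit, so per-column conservation and gap
    patterns are kept while any signal from paired columns is destroyed.
    """
    columns = list(zip(*alignment))
    rng.shuffle(columns)
    return ["".join(row) for row in zip(*columns)]


def empirical_p_value(alignment, score, n_shuffles=200, seed=0):
    """Fraction of shuffled alignments scoring at least as high as the original.

    `score` is any caller-supplied function mapping an alignment to a number,
    for example a count of compensatory mutations in candidate stems.
    """
    rng = random.Random(seed)
    observed = score(alignment)
    as_good = sum(
        score(shuffle_columns(alignment, rng)) >= observed
        for _ in range(n_shuffles)
    )
    # Add one to both counts so the estimate is never exactly zero.
    return (as_good + 1) / (n_shuffles + 1)
```

A small p-value means that the structural signal in the real alignment is rarely matched once the column order is randomized.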
The program RNAz makes a prediction of ncRNA based on multiple sequence
alignments. It uses two independent criteria for classification: a
z-score measuring thermodynamic stability of individual sequences, and a
structure conservation index obtained by comparing folding energies of
the individual sequences with the predicted consensus folding. The two
criteria are then combined to detect conserved and stable RNA secondary
structures with high sensitivity and specificity. Yet another
application suitable for multiple alignments is MSARI. The approach
uses information from a larger set of sequence-aligned orthologs to
detect significant ncRNA secondary structures. Primary sequence
alignments are often inaccurate. In MSARI, one part of the method tries
to correct errors in multiple alignments through energy minimisation
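The two RNAz criteria described above can be written down as simple formulas once folding energies are available. The sketch below assumes the energies (in kcal/mol) have already been computed by a folding program. Estimating the z-score from shuffled sequences and defining the index as a ratio of energies are stated here as assumptions about one common formulation, and the function names are hypothetical.

```python
from statistics import mean, stdev


def stability_z_score(native_mfe, shuffled_mfes):
    """Standard deviations by which the native MFE differs from the background.

    shuffled_mfes holds minimum free energies of randomized versions of the
    sequence; a negative z-score means the native fold is more stable.
    """
    return (native_mfe - mean(shuffled_mfes)) / stdev(shuffled_mfes)


def structure_conservation_index(consensus_energy, individual_mfes):
    """Ratio of the consensus folding energy to the mean individual MFE.

    Values near 1 suggest the sequences share one structure; values near 0
    suggest that no common fold exists.
    """
    return consensus_energy / mean(individual_mfes)


z = stability_z_score(-30.0, [-20.0, -22.0, -18.0])
sci = structure_conservation_index(-25.0, [-30.0, -20.0])
```

A classifier would then combine the two numbers; how RNAz does so is not specified in the text and is not reproduced here.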
In 2005, A. T. Willingham et al. used an arrayed shRNA library to knock
down 512 evolutionarily conserved ncRNAs and analyzed the effects in
cell-based assays. They identified an ncRNA repressor of the
nuclear factor of activated T cells (NFAT), which interacts with multiple
proteins including members of the importin-beta superfamily and likely
functions as a specific regulator of NFAT nuclear
1. Reference: A. T. Willingham et al., Science 309, 1570 (2005).
2. Reference: A. T. Willingham, Q. L. Deveraux, G. M. Hampton, P. Aza-Blanc, Oncogene 23, 8392 (2004).
Tiling array design for ncRNA research
1. The publicly available genome databases were searched using blastn
against all pre-miRNAs of the mir17 family. Conversely, the entire
MicroRNA Registry was compared against the genomic sequences near the
putative family members.
2. Exact locations of homologs of known miRNAs were identified using
clustalw alignments and subsequent prediction of the secondary
structure using the Vienna RNA Package, in particular the programs
RNAfold, RNAalifold, RNALfold, and alidot, in order to verify the hairpin
structure of the precursor.
3. Phylogenetic trees were reconstructed both with Maximum Parsimony and
Neighbor-joining using the phylip package with standard parameters. The
phylogeny of the entire clusters was computed using a concatenation of
the alignments of the individual paralogous microRNAs according to their
order in the cluster, and treating microRNAs that are not present in a
particular cluster as missing data. This ensures that distances are
measured based on nucleic acid substitution frequencies, not based on
changes of cluster organization. In order to identify distant sequence
similarities between pre-miRNAs from different paralog groups we
compute a similarity score based on the significance of the alignment.
This method produces robust similarity scores in regimes where reliable global alignments cannot be obtained.
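The hairpin check in step 2 can be automated once a folding program has returned a structure in dot-bracket notation, the format written by RNAfold. The sketch below is a minimal illustration; the function name and the default threshold of 16 base pairs are hypothetical choices, not values from the study.

```python
def is_hairpin(dot_bracket, min_pairs=16):
    """Return True if the structure is a single stem-loop with enough pairs.

    dot_bracket uses "(" and ")" for paired bases and "." for unpaired ones.
    Bulges and interior loops are allowed; a second hairpin or a
    multi-branch loop is not. min_pairs is an illustrative threshold.
    """
    if set(dot_bracket) - set("()."):
        return False  # unexpected characters, e.g. pseudoknot brackets
    opens = dot_bracket.count("(")
    closes = dot_bracket.count(")")
    if opens != closes or opens < min_pairs:
        return False
    # In a single stem-loop every "(" comes before every ")".
    return dot_bracket.rfind("(") < dot_bracket.find(")")
```

For example, `is_hairpin("((((...))))", min_pairs=4)` accepts a simple stem-loop, while a structure with two separate hairpins is rejected.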
The duplication history of the mir17 family was reconstructed by hand based on the following assumptions: Edit operations are
a. duplications of individual microRNAs within a linked cluster,
b. the deletion of a microRNA, and
c. the duplication of an entire cluster.
In other words, we explicitly exclude the possibility of recombination
between paralog clusters within an organism and copying of individual
microRNAs from one cluster to another. The available data do not contain
any evidence that such processes might play a role.
Part 2: Mapping miRNA genes
1. miRNA Map is an integrated database developed to store known miRNA genes, putative miRNA genes, known miRNA targets, and putative miRNA targets (Hsu et al. 2006).
Evolutionary rates vary among rRNA structural elements
Nucleic Acids Research, 2007, Vol. 35, No. 10 3339-3354
S. Smit, J. Widmann and R. Knight*
Department of Chemistry and Biochemistry, University of Colorado, Boulder, CO 80309
Understanding patterns of rRNA evolution is critical for a number of
fields, including structure prediction and phylogeny. The standard
model of RNA evolution is that compensatory mutations in stems make up
the bulk of the changes between homologous sequences, while unpaired
regions are relatively homogeneous. We show that considerable
heterogeneity exists in the relative rates of evolution of different
secondary structure categories (stems, loops, bulges, etc.) within the
rRNA, and that in eukaryotes, loops actually evolve much faster than
stems. Both rates of evolution and abundance of different structural
categories vary with distance from functionally important parts of the
ribosome such as the tRNA path and the peptidyl transferase center. For
example, fast-evolving residues are mainly found at the surface; stems
are enriched at the subunit interface, and junctions near the peptidyl
transferase center. However, different secondary structure categories
evolve at different rates even when these effects are accounted for. The
results demonstrate that relative rates and patterns of evolution are
lineage specific, suggesting that phylogenetically and structurally
specific models will improve evolutionary and structural predictions.
RNAmmer: consistent and rapid annotation of ribosomal RNA genes
Nucleic Acids Research, 2007, Vol. 35, No. 9 3100-3108
Karin Lagesen, Peter Hallin, Einar Andreas Rødland,
Hans-Henrik Stærfeldt, Torbjørn Rognes and David W. Ussery
The publication of a complete genome sequence is usually accompanied by
annotations of its genes. In contrast to protein coding genes, genes
for ribosomal RNA (rRNA) are often poorly or inconsistently annotated.
This makes comparative studies based on rRNA genes difficult. We have
therefore created computational predictors for the major rRNA species
from all kingdoms of life and compiled them into a program called
RNAmmer. The program uses hidden Markov models trained on data from the
5S ribosomal RNA database and the European ribosomal RNA database
project. A pre-screening step makes the method fast with little loss of
sensitivity, enabling the analysis of a complete bacterial genome in
less than a minute. Results from running RNAmmer on a large set of
genomes indicate that the location of rRNAs can be predicted with a very
high level of accuracy. Novel, unannotated rRNAs are also predicted in
many genomes. The software as well as the genome analysis results are
available at the CBS web server.
Microarray analysis of newly synthesized RNA in cells and animals
M. Kenzelmann, S. Maertens, M. Hergenhahn, S. Kueffer, A.
Hotz-Wagenblatt, L. Li, S. Wang, C. Ittrich, T. Lemberger, R.
Arribas, S. Jonnakuty, M. C. Hollstein, W. Schmid, N. Gretz, H.
J. Gröne, and G. Schütz
PNAS, April 10, 2007, vol. 104, no. 15, 6164–6169
Current methods to analyze gene expression measure steady-state levels
of mRNA. To specifically analyze mRNA transcription, we have developed a
technique that can be applied in vivo in intact cells and animals. Our
method makes use of the cellular pyrimidine salvage pathway and is based
on affinity-chromatographic isolation of thiolated mRNA. When combined
with data on mRNA steady-state levels, this method is able to assess the
relative contributions of mRNA synthesis and degradation/stabilization.
It overcomes limitations associated with currently available methods such
as mechanistic intervention that disrupts cellular physiology, or the
inability to apply the techniques in vivo. Our method was first tested in
serum response of cultured fibroblast cells and then
applied to the study of renal ischemia reperfusion injury,
demonstrating its applicability for whole organs in vivo. Combined with
data on mRNA steady-state levels, this method provided a detailed
analysis of regulatory mechanisms of mRNA expression and the relative
contributions of RNA synthesis and turnover within distinct pathways,
and identification of genes expressed at low abundance at the
Specificity, duplex degradation and subcellular localization of antagomirs
Nucleic Acids Research, 2007, Vol. 35, No. 9 2885–2892
Jan Krützfeldt, Satoru Kuwajima, Ravi Braich, Kallanthottathil G. Rajeev,
John Pena, Thomas Tuschl, Muthiah Manoharan and Markus Stoffel
MicroRNAs (miRNAs) are an abundant class of 20–23-nt long regulators of
gene expression. The study of miRNA function in mice and potential
therapeutic approaches largely depend on modified oligonucleotides. We
recently demonstrated silencing miRNA function in mice using chemically
modified and cholesterol-conjugated RNAs termed ‘antagomirs’. Here, we
further characterize the properties and function of antagomirs in mice.
We demonstrate that antagomirs harbor optimized phosphorothioate
modifications, require >19-nt length for highest efficiency and can
discriminate between single nucleotide mismatches of the targeted
miRNA. Degradation of different chemically protected miRNA/antagomir
duplexes in mouse livers and localization of antagomirs in a cytosolic
compartment that is distinct from processing (P)-bodies indicates a
degradation mechanism independent of the RNA interference (RNAi)
pathway. Finally, we show that antagomirs, although incapable of
silencing miRNAs in the central nervous system (CNS) when injected
systemically, efficiently target miRNAs when injected locally into the
mouse cortex. Our data further validate the effectiveness of antagomirs
in vivo and should facilitate future studies to silence miRNAs for
functional analysis and in clinically relevant settings.
Non-coding RNAs: Small Inhibitory RNA (siRNA; RNAi; microRNA; miRNA)
Animations of Inhibitory RNA Action:
Nature Reviews - A high quality movie describing inhibitory RNA events and mechanisms.
Nature Reviews Genetics - A flash animation [Nature Reviews Genetics 2;
110-119 (2001) "Post-Transcriptional Gene Silencing by Double-Stranded
RNA." Figure 1.]