Methods


This page briefly describes the methods used to identify plant tRNAs and the analyses subsequently carried out on the data.


Data Retrieval

All nuclear genome datasets of plants were retrieved from NCBI via the File Transfer Protocol (FTP). Organellar genome datasets of plants were fetched with the EDirect utilities of NCBI, using the following command:

esearch -db nucleotide -query "(enter_id_to_be_retrieved)" | efetch -format fasta > file
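
When many records are needed, the same EDirect pipeline can be wrapped in a loop. The following is a minimal sketch, not the script used for this database: it assumes a hypothetical text file with one accession per line and writes one FASTA file per accession. The function name, the list format and the output naming are illustrative assumptions.

```shell
# fetch_accessions LIST OUTDIR
# LIST   : hypothetical text file, one accession per line
#          (blank lines and lines starting with '#' are skipped).
# OUTDIR : directory that receives one <accession>.fasta file per record.
fetch_accessions() {
    list=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    while IFS= read -r acc || [ -n "$acc" ]; do
        case $acc in
            ''|'#'*) continue ;;
        esac
        # Same esearch | efetch pipeline as above, with the accession as the query.
        # stdin is redirected so that esearch cannot consume the rest of the list.
        esearch -db nucleotide -query "$acc" </dev/null |
            efetch -format fasta >"$outdir/$acc.fasta"
    done <"$list"
}
```

It could be called as fetch_accessions ids.txt fasta_out, where both names are placeholders.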


Detection of tRNA

tRNA genes were detected with tRNAscan-SE (v2.0). The following command, which runs in eukaryotic mode (-E), was used for the nuclear genome dataset and the organellar genome dataset:

tRNAscan-SE --detail -H -E -y -f# -s# -m# -b# -a# -l# -d 1>>plant_name.report 2>>plant_name.report -o# --thread 20 seq.fa

The organellar genome dataset was analysed in organellar mode (-O):

tRNAscan-SE --detail -H -O -y -f# -m# -b# -a# -l# -d 1>>plant_name.report 2>>plant_name.report -o# --thread 20 seq.fa
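
The two commands differ only in the search mode and in the -s# option, so they can be expressed as one wrapper. The sketch below is an illustration under assumptions: the function name and the TRNASCAN and THREADS override variables are hypothetical, and all options are copied unchanged from the commands above.

```shell
# run_trnascan MODE PLANT FASTA
# MODE  : "nuclear" (eukaryotic mode, -E) or "organellar" (-O).
# PLANT : prefix of the report file, as in plant_name.report above.
# TRNASCAN and THREADS are hypothetical override variables for this sketch.
run_trnascan() {
    mode=$1
    plant=$2
    fasta=$3
    case $mode in
        nuclear)    mode_flags='-E -s#' ;;   # -s# appears only in the nuclear command
        organellar) mode_flags='-O' ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    # $mode_flags is deliberately unquoted so that it splits into separate options.
    # stdout and stderr are both appended to the report, as in the commands above.
    "${TRNASCAN:-tRNAscan-SE}" --detail -H $mode_flags -y -f# -m# -b# -a# -l# -d \
        -o# --thread "${THREADS:-20}" "$fasta" >>"$plant.report" 2>&1
}
```

For example, run_trnascan organellar plant_name seq.fa reproduces the organellar command.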


Isoacceptor- and isodecoder-wise consensus sequence analysis

The tRNA sequences were grouped by isoacceptor and isodecoder, and a consensus sequence analysis was then carried out for each group with the mlocarna utility of the LocARNA package:

/path/to/mlocarna input.fa --stockholm
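
Running mlocarna once per group can be scripted. The sketch below assumes a hypothetical directory that holds one FASTA file per isoacceptor or isodecoder group; the function name, the .fa suffix and the MLOCARNA override variable are assumptions made for illustration. Groups with a single sequence are only reported, since a multiple alignment needs at least two sequences.

```shell
# align_groups DIR
# DIR : hypothetical directory with one FASTA file (*.fa) per
#       isoacceptor or isodecoder group.
# MLOCARNA is a hypothetical override for the path to mlocarna.
align_groups() {
    dir=$1
    for fa in "$dir"/*.fa; do
        [ -e "$fa" ] || continue          # the glob matched nothing
        # Count FASTA records by counting header lines.
        n=$(awk '/^>/ { c++ } END { print c + 0 }' "$fa")
        if [ "$n" -ge 2 ]; then
            "${MLOCARNA:-mlocarna}" "$fa" --stockholm
        else
            # Nothing to align: report the group on stderr and move on.
            echo "single-sequence group: $fa" >&2
        fi
    done
}
```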


tRNA secondary structure prediction

For isoacceptors and isodecoders represented by only a single sequence in the respective plant family, the minimum free energy (MFE) secondary structure was predicted with the RNAfold program, together with the relplot.pl and rotate_ss.pl utilities from the ViennaRNA package.

The partition function and the base-pairing probability matrix are calculated with the -p option, and the maximum expected accuracy (MEA) structure with the --MEA option:

RNAfold -p --MEA foo.fa

The above command generates a PostScript secondary structure plot (foo_ss.ps) and a dot plot (foo_dp.ps) containing the pair probabilities for each FASTA sequence in the input file.

relplot.pl, a Perl script from ViennaRNA, reads a PostScript secondary structure plot and a dot plot containing pair probabilities, as produced by "RNAfold -p", and writes a new structure plot that is color-annotated with reliability information, either pair probabilities or positional entropy (the default).

relplot.pl foo_ss.ps foo_dp.ps > foo_rss.ps
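
When RNAfold has produced plots for many sequences, relplot.pl can be applied to each pair in turn. The sketch below assumes that the plots follow the name_ss.ps and name_dp.ps naming shown above and sit in one directory; the function name and the RELPLOT override variable are hypothetical.

```shell
# annotate_plots DIR
# Runs relplot.pl on every <name>_ss.ps / <name>_dp.ps pair found in DIR
# and writes <name>_rss.ps next to them.
# RELPLOT is a hypothetical override for the location of relplot.pl.
annotate_plots() {
    dir=$1
    for ss in "$dir"/*_ss.ps; do
        [ -e "$ss" ] || continue          # the glob matched nothing
        base=${ss%_ss.ps}
        dp=${base}_dp.ps
        if [ ! -e "$dp" ]; then
            # Without the dot plot there is no reliability information to add.
            echo "missing dot plot for $ss" >&2
            continue
        fi
        "${RELPLOT:-relplot.pl}" "$ss" "$dp" >"${base}_rss.ps"
    done
}
```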


tRNA Infernal models

Based on the isoacceptors and isodecoders of each plant species, Infernal models were built from the Stockholm files generated by the mlocarna runs. Three Infernal utilities were used to build complete models: cmbuild, cmcalibrate and cmpress; the commands are given below. cmscan was integrated into the "ANALYZE" module of the database.

cmbuild outputfile.cm file.stk

cmcalibrate outputfile.cm

cmpress outputfile.cm
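
The three steps can be chained over several Stockholm files to obtain a single searchable model file. The sketch below shows one possible arrangement and is not necessarily how the database models were assembled: the function name and the file layout are assumptions, each model is named after its alignment file with cmbuild -n, and -F is used to allow overwriting existing output.

```shell
# build_cm_db OUT_CM STK...
# Builds one covariance model per Stockholm file, concatenates the models
# into OUT_CM, then calibrates and presses the combined file.
build_cm_db() {
    out=$1
    shift
    : >"$out"                             # start from an empty model file
    for stk in "$@"; do
        name=$(basename "$stk" .stk)      # model name taken from the file name
        tmp_cm="$out.$name.tmp"
        cmbuild -F -n "$name" "$tmp_cm" "$stk" >/dev/null || return 1
        cat "$tmp_cm" >>"$out"
        rm -f "$tmp_cm"
    done
    cmcalibrate "$out" || return 1        # calibrates every model in the file
    cmpress -F "$out"                     # writes the binary index files used by cmscan
}
```

A file prepared this way can then be searched with cmscan, for example cmscan --tblout hits.tbl models.cm query.fa, where all three file names are placeholders.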