Assembly and Annotation of the YO transcriptome

The YO/I/DRESA transcriptome was assembled at the Oklahoma State University High Performance Computing Center, accessed via SSH. Computations required for assembly preparation, steady-state abundance estimation, and annotation were performed on local departmental computing resources. The protocols and command-line programs used for pipeline integration are described in the Supplementary scripts. The raw Illumina data sets have been deposited in the CyVerse Discovery Environment.

The transcriptome assembly was generated with Trinity (version 2.0.1) using default settings.
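As an illustration of what a default-settings Trinity run can look like, the sketch below builds a command line in Python. It is not the command used for this assembly: the paired-end FASTQ layout, the file names, the memory limit, and the CPU count are all assumptions, and the helper `build_trinity_command` is hypothetical. Only the inputs and resource limits that Trinity needs are set; all assembly parameters are left at their defaults.

```python
def build_trinity_command(left_reads, right_reads, output_dir,
                          max_memory="50G", cpu=8):
    """Return a Trinity argument list for paired-end FASTQ reads.

    File names and resource values are placeholders (assumptions),
    not the values used for the YO assembly.
    """
    return [
        "Trinity",
        "--seqType", "fq",           # input reads are FASTQ
        "--left", left_reads,        # first mate of each pair
        "--right", right_reads,      # second mate of each pair
        "--max_memory", max_memory,  # memory limit for the run
        "--CPU", str(cpu),           # number of threads
        "--output", output_dir,      # directory for assembly output
    ]


# Hypothetical file names, for illustration only.
cmd = build_trinity_command("reads_1.fq", "reads_2.fq", "trinity_out")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where Trinity is installed.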

For annotation, both nucleotide sequences and predicted peptide sequences were used as BLAST queries against the Swiss-Prot (SP), TrEMBL, and UniProt UniRef90 protein databases. A prediction program (version 2.0.1) was used to predict coding peptide sequences from the YO/I/DRESA transcriptome contig sequences. These peptide sequences were annotated via BLASTp against known databases with an E-value cutoff of 1e−5. To identify conserved protein families among the predicted peptide sequences, HMMER hmmscan was used to search the sequences against the Pfam-A database (downloaded on April 19, 2016). In addition, transmembrane helical domains and signal peptide cleavage sites were identified using TMHMM (version 2.0c) and SignalP (version 4.1), respectively.
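The 1e−5 cutoff can also be applied after the search, when filtering saved BLAST results. The sketch below assumes the results are in BLAST's standard 12-column tabular format (`-outfmt 6`), where the E-value is the 11th column, and that hits at or below the cutoff are kept. The function name `filter_blast_hits` and the example records (including the accession strings) are made up for illustration and do not come from the YO data.

```python
import io

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


def filter_blast_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with evalue <= cutoff.

    Assumes tab-separated BLAST output in the default 12-column
    layout, with the E-value in column 11 (index 10).
    """
    hits = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:
            hits.append((fields[0], fields[1], evalue))
    return hits


# Two invented records: one strong hit and one weak hit.
example = io.StringIO(
    "pep1\tsp|P12345|X\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t350\n"
    "pep2\tsp|Q99999|Y\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
)
kept = filter_blast_hits(example)
print(kept)
```

Only the first record passes the filter, because the E-value of the second (0.5) is above the cutoff.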

The KEGG Automatic Annotation Server (KAAS) was used to obtain KEGG Orthology (KO) numbers and to summarize the KO terms associated with annotated transcripts in the YO/I/DRESA_filtered transcriptome.
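A KO summary of this kind can be sketched as a simple tally. The example below assumes a tab-separated query-to-KO list with one transcript per line, where the KO column is absent or empty for transcripts that received no assignment. That layout, the function name `summarize_ko`, and the transcript identifiers are assumptions for illustration, not the project's actual files.

```python
import io
from collections import Counter


def summarize_ko(handle):
    """Count transcripts per KO number and transcripts without a KO.

    Assumes lines of the form "<transcript_id>\t<KO>", where the
    KO field may be missing or empty for unannotated transcripts.
    """
    counts = Counter()
    unassigned = 0
    for line in handle:
        fields = [f.strip() for f in line.rstrip("\n").split("\t")]
        if not fields[0]:
            continue  # skip blank lines
        if len(fields) > 1 and fields[1]:
            counts[fields[1]] += 1
        else:
            unassigned += 1
    return counts, unassigned


# Invented transcript identifiers, for illustration only.
example = io.StringIO("t1\tK00001\nt2\tK00001\nt3\nt4\tK00002\n")
ko_counts, n_unassigned = summarize_ko(example)
for ko, n in ko_counts.most_common():
    print(ko, n)
print("unassigned", n_unassigned)
```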

The manuscript describing the pathway analysis is under review (Shyamal et al., 2017).

The Trinity assembly and its annotations are available here.

A website for bidirectional BLAST searches against the YO database is in progress.