Crepidula atrasolea Developmental Transcriptome Data
Sequence information is also available through NCBI at dbEST and the Trace Archives.
The RNA-Seq library was prepared with Illumina's TruSeq Stranded mRNAseq Sample Prep kit (Illumina) with one modification: fragmentation was done at 94 °C for one minute. Sequencing was performed on an Illumina HiSeq 2500. Reads are 250 nt in length. Read 1 aligns to the antisense strand; Read 2 aligns to the sense strand. Library adaptors have been trimmed from the 3' end of the reads.
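The read orientation described above can be illustrated with a short Python sketch. It assumes only what is stated here (Read 1 is antisense, Read 2 is sense); the function names are hypothetical and are not part of any tool used for this dataset. In Trinity's terminology this layout corresponds to the RF stranded library type, although the exact option used for this assembly is not stated on this page.

```python
# Complement table for DNA bases (N stays N); case is preserved.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def to_sense(seq: str, read_number: int) -> str:
    """Orient a read so that it matches the sense (transcript) strand.

    Assumption taken from the library description: Read 1 is antisense,
    so it is reverse-complemented; Read 2 is already sense.
    """
    if read_number == 1:
        return reverse_complement(seq)
    if read_number == 2:
        return seq
    raise ValueError("read_number must be 1 or 2")
```

For example, a Read 1 sequence of `GCAT` and a Read 2 sequence of `ATGC` both represent the transcript sequence `ATGC`.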
Paired-end, stranded 250 bp reads were assembled using TrinityRNASeq v2.2.0 with default settings, including the defaults for read trimming (Trimmomatic) and normalization, both as implemented in TrinityRNASeq. Assemblies were generated using k-mer sizes of 25 bp (default) and 31 bp.
The assembly was filtered using kallisto to remove all transcripts with a TPM lower than 1. The filtered k31 assembly was further cleaned using the MCSC pipeline, a decontamination method that uses hierarchical clustering and taxon identification based on sequence similarity to remove contaminants without prior knowledge of which contaminants are present. The pipeline was used as described in Lafond-Lapalme et al., 2016, Bioinformatics.
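The TPM filter described above could be reproduced roughly as follows. This is a minimal sketch, not the script used for this dataset: it assumes kallisto's standard `abundance.tsv` output (columns `target_id` and `tpm`), and the function names and file paths are hypothetical.

```python
import csv


def transcripts_passing_tpm(abundance_tsv, min_tpm=1.0):
    """Return the set of target_ids whose TPM is at least min_tpm."""
    keep = set()
    with open(abundance_tsv, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            if float(row["tpm"]) >= min_tpm:
                keep.add(row["target_id"])
    return keep


def filter_fasta(in_fasta, out_fasta, keep_ids):
    """Copy only the FASTA records whose ID is in keep_ids.

    The ID is taken as the first whitespace-delimited word of the header.
    Returns the number of records written.
    """
    writing = False
    kept = 0
    with open(in_fasta) as src, open(out_fasta, "w") as dst:
        for line in src:
            if line.startswith(">"):
                header = line[1:].split()
                writing = bool(header) and header[0] in keep_ids
                kept += writing
            if writing:
                dst.write(line)
    return kept
```

A typical call would be `filter_fasta("Trinity.fasta", "filtered.fasta", transcripts_passing_tpm("abundance.tsv"))`, where the file names are placeholders.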
Trinotate annotation report incorporating the results of blastx searches of the assembled contigs and blastp searches of the longest open reading frames predicted by TransDecoder (v2.0.1, https://transdecoder.github.io/) against the UniProt Swiss-Prot database (rel 30-Nov-2016) using BLAST+ (v2.6.0). Only the top hits to the UniProt database were retained. Protein domains predicted by HMMER (v3.0) were also incorporated, as were the top hits from a Diamond blastx search (v0.8.5) against the UniRef90 database (Release 2016-11), as implemented in the MCSC decontamination pipeline. Trinotate adds Gene Ontology and KEGG pathway information to the output based on the Swiss-Prot database.
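As an illustration of what retaining only the top hits means, the sketch below keeps one hit per query from a BLAST+ tabular file. It assumes the standard 12-column `-outfmt 6` layout and takes "top" to mean the highest bit score; how top hits were actually selected for this report is not stated here, and the function name is hypothetical.

```python
import csv

# Standard 12-column BLAST+ tabular layout (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def top_hits(blast_tsv):
    """Return {query_id: hit_row}, keeping the highest-bitscore hit per query.

    On ties, the hit that appears first in the file is kept.
    """
    best = {}
    with open(blast_tsv, newline="") as handle:
        reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
        for hit in reader:
            current = best.get(hit["qseqid"])
            if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
                best[hit["qseqid"]] = hit
    return best
```

Each value in the returned dictionary is the full row for that hit, so the subject ID is available as `best[query]["sseqid"]`.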
This research is funded by N.S.F. Grant IOS 1558061 to J.J.H.