Item title:
Euglena gracilis PacBio sequencing reads corresponding to the partially spliced transcripts of the gapC and tubA genes (in fasta format)
This deposit contains analyzed sequencing data for the amplicons obtained by RT-PCR of the partially spliced gapC and tubA transcripts of Euglena gracilis. The PCR products were sequenced commercially on a PacBio RS II (Pacific Biosciences) instrument. The sequencing institution performed the initial quality control and trimming of the reads.
The analyzed sequencing reads for the three amplicons (gapC, i6-tubA, and i5-tubA) were assembled into contigs (c) and assigned to one of two categories: accepted (reads judged suitable and kept for further analysis) or discarded (reads rejected from further analysis). They are organized accordingly in the deposited folders (f). For the i6- and i5-tubA amplicons only, some of the discarded reads were common to both amplicons as a result of the preliminary quality examination (mapping of the sequences to the reference based on their length). Those reads are provided in a separate folder (f: tubA-i6i5_discarded_reads).
The folders are structured as follows:
f: gapC -
accepted_reads (29 files in fasta format),
discarded_reads (29 files in fasta format);
f: i5-tubA -
accepted_reads (21 files in fasta format),
discarded_reads (10 files in fasta format);
f: i6-tubA -
accepted_reads (20 files in fasta format),
discarded_reads (10 files in fasta format);
f: i6i5-tubA_discarded_reads (23 files in fasta format).
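After downloading, the listing above can be used to check that the deposit is complete. The following is a minimal sketch in Python, not part of the deposit itself. It assumes that accepted_reads and discarded_reads are subfolders of each amplicon folder, that each folder contains only the read files, and that the shared folder is named as in the listing. The relative paths and the function names count_files and check_deposit are illustrative assumptions and may need adjusting to the actual layout on disk.

```python
from pathlib import Path

# Expected file counts, taken from the folder listing above.
# The relative paths are assumptions about how the folders nest on disk.
EXPECTED_COUNTS = {
    "gapC/accepted_reads": 29,
    "gapC/discarded_reads": 29,
    "i5-tubA/accepted_reads": 21,
    "i5-tubA/discarded_reads": 10,
    "i6-tubA/accepted_reads": 20,
    "i6-tubA/discarded_reads": 10,
    "i6i5-tubA_discarded_reads": 23,
}


def count_files(folder):
    """Count regular files directly inside folder (0 if the folder is missing)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(1 for entry in folder.iterdir() if entry.is_file())


def check_deposit(root, expected=EXPECTED_COUNTS):
    """Return {relative_folder: (found, expected)} for every folder whose count differs."""
    root = Path(root)
    mismatches = {}
    for rel, want in expected.items():
        found = count_files(root / rel)
        if found != want:
            mismatches[rel] = (found, want)
    return mismatches
```

Calling check_deposit on the download directory returns an empty dictionary when every folder holds the listed number of files; any other result names the folders that differ, together with the found and expected counts.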
The files are named according to the scheme shown in the following examples:
c1_gapC_accepted#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the accepted category in analysis #1;
c1_gapC_discarded#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the discarded category in analysis #1.
For the gapC amplicon we performed three equivalent analyses (#1, #2, #3) in order to examine the sequencing reads effectively and reliably; for the i6- and i5-tubA amplicons we performed two (#1, #2).
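The naming scheme can be split into its parts programmatically, for example to group files by amplicon or by analysis number. The sketch below is one possible way to do this in Python. The pattern is inferred from the two example names only; the optional file extension, the assumption that amplicon labels contain no underscore, and the names NAME_PATTERN, ReadFileName, and parse_read_file_name are illustrative and not part of the deposit.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the examples "c1_gapC_accepted#1" and "c1_gapC_discarded#1".
# An optional extension such as ".fasta" is tolerated; this is an assumption.
NAME_PATTERN = re.compile(
    r"^c(?P<contig>\d+)_(?P<amplicon>[^_]+)_"
    r"(?P<category>accepted|discarded)#(?P<analysis>\d+)"
    r"(?:\.\w+)?$"
)


class ReadFileName(NamedTuple):
    contig: int      # contig number, e.g. 1 for "c1"
    amplicon: str    # amplicon label, e.g. "gapC"
    category: str    # "accepted" or "discarded"
    analysis: int    # analysis number, e.g. 1 for "#1"


def parse_read_file_name(name: str) -> Optional[ReadFileName]:
    """Split a file name into its parts; return None if it does not fit the scheme."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    return ReadFileName(
        contig=int(match["contig"]),
        amplicon=match["amplicon"],
        category=match["category"],
        analysis=int(match["analysis"]),
    )
```

For the first example name, parse_read_file_name("c1_gapC_accepted#1") yields contig 1, amplicon "gapC", category "accepted", and analysis 1. Names that do not follow the scheme return None, so they can be skipped or reported rather than silently misread.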