Euglena gracilis PacBio sequencing reads corresponding to the partially spliced transcripts of the gapC and tubA genes (in fasta format)

Details
Description

Title: Euglena gracilis PacBio sequencing reads corresponding to the partially spliced transcripts of the gapC and tubA genes (in fasta format)
Authors: Gumińska, Natalia
Płecha, Magdalena
Zakryś, Bożena
Milanowski, Rafał
Publisher: RepOD
Subjects: Mathematical Sciences
Euglena gracilis
PacBio
sequencing reads
gapC transcripts
tubA transcripts
Content provider: Repozytorium Otwartych Danych
: Other


The deposited files contain analyzed sequencing data for the amplicons obtained by RT-PCR of the partially spliced gapC and tubA transcripts of Euglena gracilis. The PCR products were sequenced commercially on a PacBio RS II (Pacific Biosciences) instrument. The sequencing institution performed the initial quality control and trimming of the reads. The analyzed reads for the three amplicons (gapC, i6-tubA and i5-tubA) were assembled into contigs (c) and assigned to one of two categories: accepted (reads judged acceptable and retained for further analysis) or discarded (reads judged unacceptable and rejected from further analysis). They are organized accordingly in the deposited folders (f). For the i6-tubA and i5-tubA amplicons only, some of the discarded reads were shared by both amplicons as a result of the preliminary quality examination (mapping of the sequences to the reference based on their length). Those reads are provided in a separate folder (f: tubA-i6i5_discarded_reads).

The folder structure is as follows:
f: gapC - accepted_reads (29 files in fasta format), discarded_reads (29 files in fasta format)
f: i5-tubA - accepted_reads (21 files in fasta format), discarded_reads (10 files in fasta format)
f: i6-tubA - accepted_reads (20 files in fasta format), discarded_reads (10 files in fasta format)
f: i6i5-tubA_discarded_reads (23 files in fasta format)

The files are named according to the following example scheme:
c1_gapC_accepted#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the accepted category in analysis #1
c1_gapC_discarded#1: this file stores the sequencing reads assembled into contig c1 of the gapC amplicon and assigned to the discarded category in analysis #1
For the gapC amplicon we performed three equivalent analyses (#1, #2, #3) in order to examine the sequencing reads effectively and reliably, whereas for the i6-tubA and i5-tubA amplicons we performed two (#1, #2).
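
The naming scheme described above can be split into its parts programmatically. The following is a minimal Python sketch, not part of the deposited data: the names `NAME_RE`, `ReadFile` and `parse_name` are hypothetical, and the pattern assumes that every file follows the `c<contig>_<amplicon>_<category>#<analysis>` form of the two examples, optionally followed by a file extension (the description does not state one).

```python
import re
from typing import NamedTuple

# Assumed pattern, derived only from the two example names in the description
# (e.g. "c1_gapC_accepted#1"). An optional extension such as ".fasta" is
# tolerated because the description does not say whether one is present.
NAME_RE = re.compile(
    r"^c(?P<contig>\d+)"
    r"_(?P<amplicon>[A-Za-z0-9-]+)"
    r"_(?P<category>accepted|discarded)"
    r"#(?P<analysis>\d+)"
    r"(?:\.\w+)?$"
)


class ReadFile(NamedTuple):
    """Parts of a file name (hypothetical helper type)."""
    contig: int      # contig number, e.g. 1 for "c1"
    amplicon: str    # e.g. "gapC", "i6-tubA", "i5-tubA"
    category: str    # "accepted" or "discarded"
    analysis: int    # analysis number, e.g. 1 for "#1"


def parse_name(filename: str) -> ReadFile:
    """Split a file name into contig, amplicon, category and analysis number."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"name does not follow the assumed scheme: {filename!r}")
    return ReadFile(
        contig=int(match["contig"]),
        amplicon=match["amplicon"],
        category=match["category"],
        analysis=int(match["analysis"]),
    )


if __name__ == "__main__":
    print(parse_name("c1_gapC_accepted#1"))
```

Files in the shared discarded-reads folder may be named differently; the description gives no example for them, so `parse_name` raises `ValueError` for names it does not recognize instead of guessing.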
