Performing a de novo assembly on the external calculation engine

This tutorial shows how to import FASTQ file links into a BioNumerics database and then perform a de novo assembly on the external calculation engine.

Sequence read sets

A sequence read set is designed to hold large sets of short reads generated by next-generation sequencing (NGS). Base sequences and their associated quality scores are stored for single-end and paired-end reads originating from high-throughput sequencing platforms such as Illumina, Ion Torrent, PacBio, and Oxford Nanopore.

Download PDF file:

Denovo_cloud.pdf

Download sample data:

Sequence read set data

This data set contains two gzipped FASTQ files that together form one paired-end read set from Staphylococcus aureus. The data were generated by Illumina MiSeq whole genome sequencing and downloaded from NCBI.
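Before importing the file links, it can be useful to confirm that the two gzipped FASTQ files really form a matching pair. The following Python sketch is independent of BioNumerics and is only an illustration: it assumes plain four-line FASTQ records, Phred+33 quality encoding, and optional "/1" and "/2" suffixes on read names. The function names and the file names in the usage note are hypothetical.

```python
import gzip
from itertools import islice, zip_longest


def read_fastq(path):
    """Yield (read_id, sequence, quality) from a gzipped four-line FASTQ file."""
    with gzip.open(path, "rt") as handle:
        while True:
            record = [line.rstrip("\n") for line in islice(handle, 4)]
            if not record:
                break
            if len(record) < 4:
                raise ValueError(f"truncated record in {path}")
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed record in {path}: {header!r}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch: {header!r}")
            # Keep only the first token of the header as the read name.
            yield header[1:].split()[0], seq, qual


def phred_scores(quality, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def pair_key(read_id):
    """Drop a trailing /1 or /2 so that mates share the same key."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


def check_pair(forward_path, reverse_path):
    """Return the number of read pairs, or raise ValueError if the files disagree."""
    count = 0
    pairs = zip_longest(read_fastq(forward_path), read_fastq(reverse_path))
    for forward, reverse in pairs:
        if forward is None or reverse is None:
            raise ValueError("the two files contain different numbers of reads")
        if pair_key(forward[0]) != pair_key(reverse[0]):
            raise ValueError(f"read names differ: {forward[0]} vs {reverse[0]}")
        count += 1
    return count
```

With hypothetical file names, `check_pair("sample_1.fastq.gz", "sample_2.fastq.gz")` returns the number of read pairs when both files list the same reads in the same order, and raises a `ValueError` otherwise.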

Download demonstration database:

WGS_demo_database_for_Staphylococcus_aureus

Demonstration database containing data for a set of 97 Staphylococcus aureus isolates originating from three published studies. This database uses publicly available next-generation sequence reads from the Sequence Read Archive (SRA). For each isolate, NGS reads were de novo assembled into genome sequences. wgMLST alleles were called using both the assembly-based and the assembly-free methods.

Note that the downloaded database backup file (.bnbk) can be restored via the Restore database... functionality in the BIONUMERICS startup screen.
