UGENE Bulletin Board
How to view large data? (Read 3779 times)
Sep 7th, 2016 at 8:53am

elginong   Offline
YaBB Newbies

Posts: 3

I tried to download my FASTQ file, but it failed. Is there any software that can view very large data files?


Attachment: Screenshot__41_.png (120 KB)
Reply #1 - Sep 12th, 2016 at 4:03pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
A FASTQ file received from a sequencing facility contains many separate short reads. Each read consists of a nucleotide sequence plus quality information for every base in that read.
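
To make the layout concrete, here is a minimal sketch of how a single read could be decoded. It assumes the common four-line record layout (header, sequence, separator, quality string) and Phred+33 quality encoding. The function name `parse_fastq_record` and the sample read are made up for illustration and are not part of UGENE.

```python
def parse_fastq_record(lines):
    """Decode one four-line FASTQ record into (read_id, sequence, scores)."""
    header, sequence, separator, quality = [line.rstrip("\n") for line in lines]
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a well-formed FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    # Assumption: Phred+33 encoding, so score = ASCII code of the character - 33.
    scores = [ord(ch) - 33 for ch in quality]
    return header[1:], sequence, scores


# Hypothetical record, invented for this example.
record = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
read_id, sequence, scores = parse_fastq_record(record)
print(read_id, sequence, scores)  # read1 ACGT [40, 40, 20, 0]
```

Each base gets exactly one quality character, which is why the sequence and quality lines must have the same length.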

Usually you don't inspect a FASTQ file read by read, because it contains a very large number of reads (thousands, and often far more).

However, you can get an overview of the general statistics for the data. In UGENE, you can do this with the popular FastQC tool, available from the "Tools > NGS data analysis > Reads quality control" item in the main menu.
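
If you only want a quick look at basic numbers outside UGENE, a very large file can also be streamed record by record instead of being loaded into memory. The sketch below is not a replacement for FastQC. It assumes four-line FASTQ records and Phred+33 quality encoding, and the names `open_fastq`, `fastq_summary` and the demo file are hypothetical.

```python
import gzip
import os
import tempfile
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fastq_summary(path):
    """Stream through a FASTQ file and return basic statistics."""
    reads = 0
    bases = 0
    quality_sum = 0
    with open_fastq(path) as handle:
        while True:
            # Read exactly one record (four lines) at a time.
            record = list(islice(handle, 4))
            if len(record) < 4:
                break
            sequence = record[1].strip()
            quality = record[3].strip()
            reads += 1
            bases += len(sequence)
            # Assumption: Phred+33, one quality character per base.
            quality_sum += sum(ord(ch) - 33 for ch in quality)
    return {
        "reads": reads,
        "bases": bases,
        "mean_read_length": bases / reads if reads else 0.0,
        "mean_quality": quality_sum / bases if bases else 0.0,
    }


# Tiny made-up file, just to show the call; use your own path instead.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.fastq")
with open(demo_path, "w") as out:
    out.write("@r1\nACGT\n+\nIIII\n@r2\nAC\n+\n55\n")

summary = fastq_summary(demo_path)
print(summary)
```

Because only one record is held in memory at a time, the memory use stays roughly constant no matter how large the file is.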

The next step is to assemble the short reads into longer genome sequences. There are two strategies for doing that:

1) If you have an appropriate reference sequence, you can map all the reads to it. UGENE provides several tools for the mapping (BWA, Bowtie2, etc.). There is also a more comprehensive workflow, "Raw DNA-Seq data processing", which adds pre- and post-processing steps that improve the data quality and make the mapping more efficient.
The assembled reads can be viewed in the UGENE Assembly Browser.

2) The alternative is to assemble the reads de novo. This task is more complex, and for large data sets such as a human genome it requires a lot of computing resources. For small genomes (bacterial or viral, for example), you can use the SPAdes tool in UGENE.