UGENE Forum | |
https://forum.ugene.net/forum/YaBB.pl
General Category >> Help and How-to >> Next Gen Sequencing https://forum.ugene.net/forum/YaBB.pl?num=1300429237 Message started by vijay on Mar 18th, 2011 at 1:20pm |
Title: Next Gen Sequencing Post by vijay on Mar 18th, 2011 at 1:20pm
Can UGENE handle NGS data?
|
Title: Re: Next Gen Sequencing Post by Konstantin Okonechnikov on Mar 18th, 2011 at 4:35pm
Support for Next Generation Sequencing is a high-priority direction for the UGENE project. We have already implemented several steps towards NGS data analysis and visualization.
Do you have any particular tasks in mind? |
Title: Re: Next Gen Sequencing Post by Hieu Cao on Apr 30th, 2011 at 4:55am
I have Solexa/Illumina reads in fastq format and contigs in fasta format. I tried the following.
1. I want to create a local DB from the reads and then blastn the contigs against this DB. It seems that the FormatDB tool does not support fastq sequences: an empty DB was created.
2. I would like to annotate around 40k contigs by Blastn against RefRNA or some other DB. I designed a workflow for this: Read sequence => Remote Blastn => Write sequences. Is it possible to transfer the blast results into annotations attached directly to the contig sequences? Of course I can do this by selecting a single sequence for blasting, but not in a workflow.
Do you have any solution for these tasks? Thanks in advance,
|
Title: Re: Next Gen Sequencing Post by Ivan Efremov on May 3rd, 2011 at 10:47am
Hi Hieu,
here are the answers:
1. You can convert fastq to fasta using the "Save a copy" button in the project view, or using a simple workflow "Read Sequence -> Write Fasta". After converting fastq to fasta you will be able to use the FormatDB tool.
2. You have to select that the annotations from remote blast should be used in the genbank writer. See attached screenshot.
Feel free to ask more questions if further help is needed.
|
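For very large read files, the same fastq-to-fasta conversion can also be done outside UGENE by streaming the file record by record. The following is only a minimal sketch in Python, not UGENE code: the function name `fastq_to_fasta` is hypothetical, and it assumes plain four-line fastq records (sequence and quality each on a single line), which is typical for Illumina output.

```python
def fastq_to_fasta(fastq_handle, fasta_handle):
    """Stream four-line FASTQ records to FASTA; return the number of records written.

    Assumption: each record is exactly four lines (header, sequence, '+', quality).
    Only one record is held in memory at a time.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = fastq_handle.readline().rstrip("\r\n")
        plus = fastq_handle.readline().rstrip("\r\n")
        qual = fastq_handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record number {count + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in record {count + 1}")
        # Drop the leading '@' and the quality line; keep the read name and sequence.
        fasta_handle.write(f">{header[1:]}\n{seq}\n")
        count += 1
    return count
```

It could be called with two open text files, for example `with open("reads.fastq") as src, open("reads.fasta", "w") as dst: fastq_to_fasta(src, dst)`, where both file names are placeholders.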
Title: Re: Next Gen Sequencing Post by Hieu Cao on May 4th, 2011 at 5:35am
Thanks for your quick answers. However, for the 1st task, I used a simple workflow "Read Sequence -> Write Fasta" to convert fastq to fasta format. I got the following error message.
Code:
|
Title: Re: Next Gen Sequencing Post by Ivan Efremov on May 4th, 2011 at 1:09pm
Many thanks for the report. We will try to fix the bug as soon as possible.
It would be very helpful if you could share with us the workflow scheme file you are running and the fastq file. If you cannot share the fastq file, please tell us its size and the number of sequences. Thanks! |
Title: Re: Next Gen Sequencing Post by Hieu Cao on May 4th, 2011 at 11:56pm
The fastq file is 5.8 GB in size and contains about 20 million reads/sequences.
I attached the schema below. https://forum.ugene.net/forum/YaBB.pl?action=downloadfile;file=remoteBLAST.uwl (3 KB)
|
Title: Re: Next Gen Sequencing Post by Ivan Efremov on May 5th, 2011 at 10:20am
Ok, thanks. We will re-check and fix the handling of big data files in UGENE.
|
Title: Re: Next Gen Sequencing Post by Hieu Cao on May 8th, 2011 at 5:38am
For the 2nd task, I created a workflow to 1) read sequences from the contigs => 2) remote BLASTn => 3) write to a Genbank format file with annotations from remote BLAST. It looks like your screenshot.
My question is: do we need to create a blank file named 1.gb in the target location before running the scheme? In my scheme the remote BLASTn task runs well, but somehow it could not write to the output file. What is the difference between the overwrite and append modes? BR, |
Title: Re: Next Gen Sequencing Post by Konstantin Okonechnikov on May 11th, 2011 at 1:26am
You don't have to create an empty file; it will be created automatically. If this doesn't happen, something is wrong. What was the error message after the schema execution?
The difference between the "append" and "overwrite" modes is the following: in "overwrite" mode the output Genbank file will be overwritten on every schema launch, while in "append" mode the data will be added to the end of the file. |
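The two modes behave much like the standard "w" and "a" file modes in most programming languages. As a rough illustration only (this is plain Python, not how UGENE implements its writer, and `write_output` and `simulate_runs` are hypothetical names), two consecutive "launches" writing to the same 1.gb file give:

```python
import os
import tempfile


def write_output(path, text, mode):
    """Write text to path; mode is 'overwrite' or 'append' (names taken from the post above)."""
    if mode not in ("overwrite", "append"):
        raise ValueError(f"unknown mode: {mode}")
    flag = "w" if mode == "overwrite" else "a"
    # In both modes the file is created automatically if it does not exist yet.
    with open(path, flag) as out:
        out.write(text)


def simulate_runs(mode, runs=("run1\n", "run2\n")):
    """Simulate several launches writing to the same file; return the final file content."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "1.gb")
        for text in runs:
            write_output(path, text, mode)
        with open(path) as handle:
            return handle.read()
```

With "overwrite" only the data from the last launch remains in the file; with "append" the data from both launches is kept in order.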
Title: Re: Next Gen Sequencing Post by Hieu Cao on May 15th, 2011 at 9:02pm
Thank you.
Is there any function in UGENE that can help us summarize the information in a contig file? For instance, calculating N50, or counting how many contigs/sequences are longer than 1 kb. |
Title: Re: Next Gen Sequencing Post by Ivan Efremov on May 17th, 2011 at 12:55pm
Right now there is no such functionality in UGENE, so we will think over how to add this. For now, it looks like the easiest way would be to create a simple script worker for Workflow Designer.
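Until such a feature exists, the requested statistics can be computed with a short standalone script. The sketch below is plain Python rather than a Workflow Designer script worker, and all function names are hypothetical. It uses the common definition of N50: the length L such that contigs of length L or longer contain at least half of all bases.

```python
def read_fasta_lengths(handle):
    """Return the sequence length of every record in a FASTA file handle."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return N50 of the given lengths, or 0 for an empty list."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths, min_length=1000):
    """Summarize contig lengths; min_length=1000 matches the 1 kb threshold asked about."""
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "n50": n50(lengths),
        "longer_than_min": sum(1 for n in lengths if n > min_length),
    }
```

A typical call would be `with open("contigs.fasta") as handle: print(summarize(read_fasta_lengths(handle)))`, where the file name is a placeholder.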
|
Title: Re: Next Gen Sequencing Post by Mikhail Fursov on May 17th, 2011 at 8:23pm
Hieu Cao wrote on May 15th, 2011 at 9:02pm:
Please check this issue for the progress: https://ugene.unipro.ru/tracker/browse/UGENE-313 |