UGENE Forum
https://forum.ugene.net/forum/YaBB.pl
https://forum.ugene.net/forum/YaBB.pl?num=1425320889

Message started by Leiga on Mar 3rd, 2015 at 1:28am

Title: Report generation
Post by Leiga on Mar 3rd, 2015 at 1:28am
Dear All,
I would like to prepare a statistical report from my FASTQ data.
I was responsible for a project with a bacterial genome:
1. 1st step - de novo assembly (after that I chose a possible reference sequence, and...)
2. 2nd step - resequencing (I chose bowtie2 alignment)
After that I also used the BASys platform for automated annotation.

Now, I have some questions:
A. Is it possible to prepare statistics about a sequencing project with the UGENE platform and its included scripts? I mean length distribution, GC content (I think I saw that), average Phred score, and base-quality distribution.


B. Non-matched sequences - as I wrote, I used the chosen sequence as a reference. After de novo assembly I saw some contigs in which I recognised plasmid sequences, but after the resequencing step I think I lost this data (or I do not know where to find these reads...). Can you suggest something in that case? I mean de novo assembly using the reads which do not match the "refseq".


Sorry for so many questions, but I really enjoy your platform, as it is easy to use for people without much knowledge of unix...

Regards
L.

Title: Re: Report generation
Post by Olga Golosova on Mar 3rd, 2015 at 9:09pm
Hi Leiga!

A:
UGENE integrates the FastQC tool for getting statistics like those shown in the image. You can watch this video to learn how to interpret FastQC reports: http://www.youtube.com/watch?v=bz93ReOv87Y

To run it in UGENE 1.16, select "Tools > NGS data analysis > Reads quality control" in the main menu.
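
For anyone who wants the same kind of numbers outside the GUI, here is a minimal Python sketch of the statistics asked about in question A (length distribution, GC content, average Phred score, mean quality per position). It is not part of UGENE or FastQC. It assumes plain 4-line FASTQ records and Phred+33 quality encoding, and the function names are made up for illustration.

```python
from collections import Counter
import io


def read_fastq(handle):
    """Yield (sequence, quality) pairs from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return                      # end of file
        if not header.strip():
            continue                    # skip stray blank lines
        seq = handle.readline().strip()
        handle.readline()               # the '+' separator line
        qual = handle.readline().strip()
        yield seq, qual


def fastq_stats(handle, offset=33):
    """Collect basic read statistics; offset=33 assumes Phred+33 encoding."""
    length_hist = Counter()
    gc = total_bases = 0
    pos_sum, pos_n = [], []             # per-position quality sum and count
    for seq, qual in read_fastq(handle):
        length_hist[len(seq)] += 1
        total_bases += len(seq)
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        for i, ch in enumerate(qual):
            if i == len(pos_sum):       # first read that reaches this position
                pos_sum.append(0)
                pos_n.append(0)
            pos_sum[i] += ord(ch) - offset
            pos_n[i] += 1
    n_quals = sum(pos_n)
    return {
        "reads": sum(length_hist.values()),
        "length_hist": dict(length_hist),
        "gc_percent": 100.0 * gc / total_bases if total_bases else 0.0,
        "mean_phred": sum(pos_sum) / n_quals if n_quals else 0.0,
        "mean_phred_by_pos": [s / n for s, n in zip(pos_sum, pos_n)],
    }


# Tiny in-memory example; for a real file use: with open(path) as fh: fastq_stats(fh)
demo = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\n!!!!\n")
print(fastq_stats(demo))
```

FastQC reports far more than this (per-base boxplots, duplication levels, adapter content), so the sketch is only useful for a quick check or for feeding numbers into your own report.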

B:
Could you please give more details? What is the goal? Wouldn't it help to map the reads to the plasmid sequences?

By the way, have you succeeded in using SPAdes in UGENE (http://ugene.unipro.ru/forum/YaBB.pl?num=1422521083), or did you use another de novo assembler?
[Attached image: fastqc.png]

Title: Re: Report generation
Post by Leiga on Mar 4th, 2015 at 2:06am
Dear Olga,

my goal?
A) main task - to identify and obtain the sequence of the bacterial DNA
I decided to use de novo assembly as a first step (for free on the Illumina BaseSpace platform, but I have too many reads to use the free SPAdes tool there, which is why I was looking for a system like yours). It gives me only partial information, but it was helpful in looking for a possible reference sequence (I checked it manually using BLAST).
After that I chose a "refseq" and did resequencing... (very nice coverage - nearly 1000x, no gaps..., but I also do not know how to sort out the reads which do not match my refseq).
B) minor task - plasmids... or other add-ons

For resequencing I used your platform and bowtie2, and I uploaded the obtained consensus to the BASys platform (now I have an annotated sequence).
Am I doing something wrong? What is a typical pipeline for this kind of task?
Now I am using the CLC tool. I have a similar but different consensus sequence obtained from the same reads and the same refseq. Using the unmatched reads I was able to perform de novo assembly, and I have some huge fragments of DNA (a different set of plasmids, I am still analysing it).

Another project - sequencing of unknown bacteriophages - I think I need SPAdes for that...
but I still have a problem with my data... I used the new release of UGENE, but it does not change anything. Tomorrow my IT specialist will be here; I am not familiar with unix systems, so he will help me with the command line (I know, you have written me almost everything :)). Tomorrow I will also try the Windows version of UGENE, and I will see what I get from the CLC tool...

Sorry for my silence, but I had a long break from work; today is my second day.

Regards
L.


Title: Re: Report generation
Post by Olga Golosova on Mar 4th, 2015 at 9:42pm
Dear Leiga,

Thank you for the description!


Quote:
but I also do not know how to sort out the reads which do not match my refseq



Quote:
Using the unmatched reads I was able to perform de novo assembly, and I have some huge fragments of DNA (a different set of plasmids, I am still analysing it)


We're going to add a new feature to UGENE to simplify the extraction of unmapped reads (see UGENE-4123).
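
Until then, the idea can be sketched by hand: in a SAM file, bit 0x4 of the FLAG column marks a read as unmapped, so those records can be written back out as FASTQ and passed to a de novo assembler. The Python below is only an illustration, not the planned UGENE feature. It assumes an uncompressed text SAM file that stores base qualities, and the function name is hypothetical.

```python
import io

UNMAPPED = 0x4  # SAM FLAG bit: "segment unmapped"


def unmapped_to_fastq(sam_in, fastq_out):
    """Copy unmapped reads from a text SAM stream to FASTQ; return their count."""
    count = 0
    for line in sam_in:
        if line.startswith("@"):
            continue                    # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue                    # not a complete alignment record
        name, flag = fields[0], int(fields[1])
        seq, qual = fields[9], fields[10]
        if flag & UNMAPPED:
            # Note: a QUAL of '*' means no qualities were stored, and the
            # resulting record would not be valid FASTQ.
            fastq_out.write(f"@{name}\n{seq}\n+\n{qual}\n")
            count += 1
    return count


# Tiny in-memory example: one mapped read (flag 0) and one unmapped read (flag 4).
sam_text = (
    "@HD\tVN:1.0\n"
    "r1\t0\tref\t1\t42\t4M\t*\t0\t0\tACGT\tIIII\n"
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\t!!!!\n"
)
out = io.StringIO()
print(unmapped_to_fastq(io.StringIO(sam_text), out), "unmapped read(s)")
print(out.getvalue(), end="")
```

For paired-end data both mates share one read name, so a real script would also need to keep the pairs together. On the command line the same result is available directly through bowtie2's `--un` / `--un-conc` options or `samtools view -f 4`.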


Quote:
Another project - sequencing of unknown bacteriophages - I think I need SPAdes for that... but I still have a problem with my data...

Thanks again for testing! As I wrote to you in the other forum thread, it would help a lot if you could share the data with us, so that we are able to reproduce the issue. Of course, only if that is possible.


Quote:
Sorry for my silence, but I had a long break from work; today is my second day.

No worries! I'm glad that you're back :)
