UGENE Bulletin Board
BLAST CDS from annotated genome file
Jul 13th, 2018 at 10:43am

felince tokyo   Offline
YaBB Newbies

Posts: 11
*
 
Hi!

Maybe this is a very basic question, but I really don't know how UGENE handles a BLAST search of a complete annotated bacterial genome sequence against an amino acid database. I want to BLAST my genome, but only the annotated CDS, so that when I search the amino acid database the results are annotated on the genome as annotations overlapping the CDS. What I'm getting instead are hits annotated on intergenic regions as well. How do I restrict the BLAST search to only the annotated CDS of the entire sequence?

Thank you in advance :)

Felipe
 
 
Reply #1 - Jul 13th, 2018 at 2:41pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Felipe!

I didn't get which BLAST you use, as we have NCBI BLAST, local BLAST and BLAST+. However, I see two general approaches:
1) Do BLAST for the whole sequence, then filter out annotations that do not intersect the CDS annotations.
2) Extract the CDS regions and then do BLAST for them.

For the first approach you can try the "Intersect annotations" sample workflow. Feel free to ask additional questions if you need help with that.
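The idea behind the first approach can be sketched outside UGENE in a few lines of Python. This is only an illustration of the interval logic, not what the "Intersect annotations" workflow does internally; the function names and the coordinates are hypothetical.

```python
from typing import List, Tuple

# 1-based, inclusive (start, end) coordinates, as used in GenBank files.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def hits_on_cds(hits: List[Interval], cds: List[Interval]) -> List[Interval]:
    """Keep only the BLAST hits that intersect at least one CDS interval."""
    return [h for h in hits if any(overlaps(h, c) for c in cds)]


# Hypothetical coordinates, for illustration only.
cds_regions = [(100, 400), (900, 1500)]
blast_hits = [(120, 380), (450, 600), (1400, 1600)]

kept = hits_on_cds(blast_hits, cds_regions)
print(kept)
```

The hit at 450..600 lies between the two CDS regions, so it is the one that gets dropped. The nested loop is fine for a bacterial genome; for much larger inputs a sorted sweep or an interval tree would scale better.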

You're welcome! :)
 
 
Reply #2 - Sep 25th, 2018 at 6:07pm

JeraldHon   Offline
YaBB Newbies

Posts: 1
*
 
Hi Olga, which of those two approaches would you suggest for a beginner like me? Which one is simpler to execute?
 

 
Reply #3 - Sep 25th, 2018 at 6:25pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Jerald!

I recommend starting with the basic features, for example local BLAST+.

First, create a BLAST+ database or download one from NCBI. To create it from a set of genomes, use "Tools > BLAST > BLAST+ search" in the UGENE main menu. In the dialog that appears, make sure to select the correct sequence type (protein or nucleotide) and adjust the other settings as needed.

Second, run a BLAST+ search for a single sequence. To do that, open the query sequence in UGENE (you should see the Sequence View window) and select "Actions > Analyze > Query with local BLAST+". Set the created or downloaded database in the dialog, then examine the results.

Third, try a workflow in the Workflow Designer. To use local BLAST+ you will need to create a custom workflow that consists of the "Read Sequence" and "Local BLAST+ Search" elements. It does the same as the search from the Sequence View (the second step), but in a workflow you can select as many query sequences as you need. You set them as input in the "Read Sequence" element and set the database in the "Local BLAST+ Search" element.

Fourth, try the "Intersect annotations" workflow, or even build a custom workflow that combines it with the workflow from the third step.
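For comparison, the first two steps have a rough command-line counterpart in the NCBI BLAST+ tools. The sketch below only builds the argument lists and prints them; it is an assumed equivalent, not a description of what the UGENE dialogs run, and the file and database names are hypothetical.

```python
import shlex
from typing import List


def makeblastdb_cmd(protein_fasta: str, db_name: str) -> List[str]:
    """Arguments for building a protein BLAST+ database from a FASTA file."""
    return ["makeblastdb", "-in", protein_fasta, "-dbtype", "prot", "-out", db_name]


def blastx_cmd(query_fasta: str, db_name: str, out_tsv: str,
               evalue: float = 1e-5) -> List[str]:
    """Arguments for a translated search (nucleotide query, protein database)
    with tabular output (-outfmt 6)."""
    return ["blastx", "-query", query_fasta, "-db", db_name,
            "-evalue", str(evalue), "-outfmt", "6", "-out", out_tsv]


# Hypothetical file names, for illustration only.
commands = [
    makeblastdb_cmd("my_proteins.fasta", "my_proteins_db"),
    blastx_cmd("genome.fasta", "my_proteins_db", "hits.tsv"),
]

for cmd in commands:
    # Print the commands instead of running them, so the sketch works
    # even when BLAST+ is not installed.
    print(shlex.join(cmd))
```

To actually run them, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is on the PATH.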

You're also welcome to ask additional questions, if any!
 
 
Reply #4 - Nov 8th, 2018 at 12:18pm

felince tokyo   Offline
YaBB Newbies

Posts: 11
*
 
Hi Olga, thank you very much for your response. I apologize for answering after such a long time. I thought I had activated mail notifications for answers to this thread, but I hadn't.

I have used BLAST+, but I haven't managed to BLAST my complete annotated genome sequence against an amino acid database, as the blastx option is not available.

Basically what I want to do is to locate some gene candidates that are in my annotated genomes that code for homologs of the sequences that I have in the database.

Actually, even if I can't get rid of the BLAST results that don't intersect annotations, I can still get the information I need as long as those that do intersect them are displayed. I got results using tblastx, but when only amino acid sequences are available this approach can't be used.

Hence my need to use blastx against my amino acid sequence database.

EDIT: I actually managed to do it by using "Actions > Analyze > Query with local BLAST+". For some reason the blastx option is not available under "Tools > BLAST > BLAST+ search". There is one disadvantage to the former approach, though: if you want to annotate the BLAST results on a .gb file that contains several annotated replicons, only the last replicon in the list displays the BLAST results. Is this a bug, maybe? Also, the blastx results always appear annotated on the + strand, even if the gene that overlaps the result (because of high homology) is on the - strand, so the result arrow points in the opposite direction to the gene.


Also, it would be very useful to have color coding of the BLAST results that are annotated back to the genome. As of now all these results appear as blue arrows by default. Is it possible to program UGENE to change the color according to the identity percentage of the alignment?

Thank you for your help.
« Last Edit: Nov 8th, 2018 at 1:37pm by felince tokyo »  
 
Reply #5 - Nov 8th, 2018 at 5:43pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Felince,

We've reproduced the issues with "blastx":
  • The item is not available in the dialog opened from the "Tools" menu.
  • The strand of the results is always direct.
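Until the strand is fixed, it can be recovered from a command-line blastx run. The sketch below assumes the default 12-column tabular output (-outfmt 6) and assumes that hits on reverse reading frames are reported with the query start greater than the query end; the `Hit` type and the sample line are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject: str
    identity: float
    start: int   # leftmost genome coordinate
    end: int     # rightmost genome coordinate
    strand: str  # "+" or "-"


def parse_outfmt6_line(line: str) -> Hit:
    """Parse one line of default -outfmt 6 output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    fields = line.rstrip("\n").split("\t")
    subject = fields[1]
    identity = float(fields[2])
    qstart, qend = int(fields[6]), int(fields[7])
    # Assumption: qstart > qend means the hit is on the reverse strand.
    strand = "+" if qstart <= qend else "-"
    return Hit(subject, identity, min(qstart, qend), max(qstart, qend), strand)


# Hypothetical hit on the reverse strand, for illustration only.
sample = "contig1\tprotA\t87.5\t120\t15\t0\t2360\t2001\t1\t120\t1e-60\t210\n"
hit = parse_outfmt6_line(sample)
print(hit)
```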

Thank you very much for reporting these issues!

Quote:
Also, it would be very useful to have color coding of the BLAST results that are annotated back to the genome. As of now all these results appear as blue arrows by default. Is it possible to program UGENE to change the color according to the identity percentage of the alignment?

I like this idea! Maybe we'll do something like this in the future.
In the current UGENE version the color of an annotation is determined by the annotation name. You can configure the color on the "Annotations Highlighting" tab of the Options Panel. On the same tab you can specify a qualifier that you would like to see in the Zoom View.
Also, you can add columns with certain qualifier values and sort these values by clicking a column header.
See the screenshots.
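Since the color follows the annotation name, one possible workaround is to give hits a name that encodes their identity range and then assign a color to each name on the "Annotations Highlighting" tab. A minimal sketch of such a naming scheme follows; the bin boundaries and names are hypothetical, and the renaming itself would still have to be done when the annotations are created or edited.

```python
def identity_bin_name(identity: float) -> str:
    """Map a percent identity (0-100) to an annotation name for its range."""
    if identity >= 90:
        return "blast_id_90_to_100"
    if identity >= 70:
        return "blast_id_70_to_90"
    if identity >= 50:
        return "blast_id_50_to_70"
    return "blast_id_below_50"


# Hypothetical identity values, for illustration only.
for value in (95.2, 72.0, 49.9):
    print(value, identity_bin_name(value))
```

With four names there are only four colors to configure, and the annotations also group by identity range in the Annotations Editor.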

Regarding your analysis workflow, could you please describe it in more detail, with some sample data?
 
 
Reply #6 - Nov 9th, 2018 at 9:43am

felince tokyo   Offline
YaBB Newbies

Posts: 11
*
 
Hi Olga, thank you so much!

The screenshots are really explanatory; I didn't know how to do this :)

Basically, I have sequenced several bacterial genomes and I'm looking for candidate genes that have homology with the amino acid sequences of about 180 proteins that I have in a local database.

Of course I want to take the results with significant identity and ignore those with poor alignment.
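If the hits are exported as a table, this kind of filtering takes only a few lines of Python. The sketch below is an illustration only: the row layout, the thresholds (40% identity, e-value 1e-5) and the sample values are hypothetical and should be chosen to suit the data.

```python
from typing import Iterable, List, Tuple

# (subject id, percent identity, e-value)
Row = Tuple[str, float, float]


def significant(rows: Iterable[Row], min_identity: float = 40.0,
                max_evalue: float = 1e-5) -> List[Row]:
    """Keep rows with identity at or above the cutoff and e-value at or below it."""
    return [r for r in rows if r[1] >= min_identity and r[2] <= max_evalue]


# Hypothetical hits, for illustration only.
rows = [
    ("protA", 87.5, 1e-60),
    ("protB", 31.0, 1e-8),   # identity too low
    ("protC", 62.0, 0.5),    # e-value too high
]

good = significant(rows)
print(good)
```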

What you explained is enough for what I want to do, so I really appreciate your help.

Also, thanks a lot for developing UGENE. It is indeed marvelous software.

EDIT: I forgot to report one inconsistency regarding annotation editing in the zoom view. Whenever I want to edit an annotation and double click on its arrow, the option to edit the annotation (F2) is not available.

To be able to do this, I first have to click on the annotation arrow in the zoom view, then click on the annotation in the features window, and then go back to the zoom view and double click the annotation arrow again.

Even so, there are times when this does not work and I am unable to edit my annotation. And even when it works, if I want to edit several annotations quickly I have to repeat this cumbersome procedure for each one.



Felipe
« Last Edit: Nov 9th, 2018 at 11:52am by felince tokyo »  

Attachment: annotation_bug_1.jpg (178 KB)
 
Reply #7 - Nov 9th, 2018 at 2:03pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Felince!

Quote:
What you explained is enough for what I want to do, so I really appreciate your help.
Also, thanks a lot for developing UGENE. It is indeed marvelous software.

OK, I'm glad that it helped!
And thank you for your kind words about UGENE!

Quote:
I forgot to report one inconsistency regarding annotation editing in the zoom view. Whenever I want to edit an annotation and double click on its arrow, the option to edit the annotation (F2) is not available.

To be able to do this, I first have to click on the annotation arrow in the zoom view, then click on the annotation in the features window, and then go back to the zoom view and double click the annotation arrow again.

Even by doing this there are times when this will not work, and I will be unable to edit my annotation. And even if it works, if I want to edit several annotations quickly I have to do this cumbersome procedure.

Thank you very much for reporting this! We'll fix the issue as soon as possible. The fix will be included in UGENE 1.32.
 
 
Reply #8 - Nov 15th, 2018 at 9:27am

felince tokyo   Offline
YaBB Newbies

Posts: 11
*
 
Thank you Olga!

I don't know if it's possible, but it would be really useful to have an option to copy the value of the highlighted qualifier in the features window. For example, sometimes I want to copy the product name or the coordinates of a gene and paste them into a Word or Excel document for later use.

Another thing that I think would be very useful is a column with the length of the CDS. Of course it appears in the zoom view when you double click on it, but it would be nice to have this information in the pop-up box that appears when you hover over the CDS and in the features window.

I don't know how difficult it is to add these features to UGENE, but it would be cool if they could be added :D
 
 
Reply #9 - Nov 19th, 2018 at 5:04pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Felince!

Quote:
I don't know if it's possible, but it would be really useful to have an option to copy the value of the highlighted qualifier in the features window.

It is already possible to do this in the current UGENE version. Select a qualifier in the Annotations Editor and then select "Copy/Paste > Copy qualifier '...' value".

By the way, I noticed the item is not present in the corresponding "Actions" main menu. We'll fix that.

Quote:
For example, sometimes I want to copy the product name or the coordinates of a gene and paste them into a Word or Excel document for later use.

Another option to do this is to edit an annotation or a qualifier. For example, when you open the "Edit Annotation" dialog, you can switch to "GenBank/EMBL format" and copy the region.

Maybe we'll add a separate option to the "Copy/Paste" menu for copying a region.

Quote:
Another thing that I think would be very useful is a column with the length of the CDS. Of course it appears in the zoom view when you double click on it, but it would be nice to have this information in the pop-up box that appears when you hover over the CDS and in the features window.

OK, we should think it over.
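In the meantime, the length can be computed from a location copied in GenBank/EMBL format. The sketch below is a minimal illustration and not part of UGENE: it sums all start..end spans in the string, so it handles complement(...) and join(...), but it ignores single-base positions such as "join(5,10..20)". The function name and the sample locations are hypothetical.

```python
import re


def location_length(location: str) -> int:
    """Sum the lengths of all start..end spans in a GenBank location string."""
    total = 0
    # "<" and ">" mark partial features; they do not change the arithmetic.
    for start, end in re.findall(r"<?(\d+)\.\.>?(\d+)", location):
        total += int(end) - int(start) + 1
    return total


# Hypothetical locations, for illustration only.
print(location_length("1..900"))
print(location_length("complement(join(100..250,300..449))"))
```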

Actually, you might consider using our commercial support services to get such small improvements added to UGENE. We'll try to take your feedback into account anyway; however, with commercial support you get a faster and guaranteed result. ;)

Quote:
I don't know how difficult it is to add these features to UGENE, but it would be cool if they could be added :D

Thank you again for your feedback!!
 
 