UGENE Bulletin Board
Problem mapping to reference (Sanger)
Nov 8th, 2018 at 10:10pm

Julio   Offline
YaBB Newbies
Ecuador

Gender: male
Posts: 6
*
 
Hi,

I'm new to UGENE and the forum. I searched before posting, but I haven't found a working solution. I'm trying to map some reads to a reference sequence using the "Map Sanger reads to reference" tool.

My reference sequence is a FASTA file made from the consensus sequence of a previous alignment. The sequences I'm trying to map are in a multi-FASTA file. When I run the tool, the following error appears:

Task report [Map Sanger reads to reference]

Status       Failed
Error:       Subtask {Compose alignment} is failed: The related chromatogram not found
Time        0h 00m 32.595s

I don't have chromatograms for my reads because they come from databases (NCBI, BOLD, etc.).

In case this information is needed: I'm running Ubuntu 18.04.1 LTS in a virtual machine hosted on Windows 10. The Unipro UGENE version is v1.31.1.

Thanks!
 
 
Reply #1 - Nov 9th, 2018 at 3:07pm

Olga Golosova   Offline
YaBB Administrator

Posts: 338
*****
 
Hi Julio.

Quote:
I have looked before posting, but I haven't found any working solution

Thank you for your feedback! Apparently, we should simplify this feature in UGENE and improve the documentation.

First of all, note that the "Map Sanger reads to reference" tool is not the right one for this task. It expects the Sanger reads in *.ab1 or a similar chromatogram format (see a video about this feature).

For your task there are a couple of workarounds. Please try them and let us know whether they work for you.
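For reference, a quick way to check what kind of file you have before choosing a tool is to look at its first bytes. The sketch below is only an illustration and is not part of UGENE: the function names are made up, and it relies on the facts that ABIF (*.ab1) trace files start with the bytes "ABIF", SCF trace files start with ".scf", and FASTA text starts with ">".

```python
from pathlib import Path


def sniff_bytes(head: bytes) -> str:
    """Guess the format from the first bytes of a file (hypothetical helper)."""
    if head[:4] == b"ABIF":
        return "abif"      # Sanger chromatogram (*.ab1)
    if head[:4] == b".scf":
        return "scf"       # Sanger chromatogram (*.scf)
    if head.lstrip()[:1] == b">":
        return "fasta"     # plain sequences, no trace data
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 64 bytes of a file and guess its format."""
    with Path(path).open("rb") as handle:
        return sniff_bytes(handle.read(64))
```

A file reported as "fasta" carries no chromatogram, which is the situation that produces the "related chromatogram not found" error above.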

1) If you have a few sequences you would like to align to the reference sequence:
  • Export the reference sequence to an alignment format so that it opens in the Alignment Editor: select "Export/Import > Export sequences as alignment" in the context menu of the sequence object in the Project View (see "option1_open_reference_as_alignment.png").
  • Then make sure the Alignment Editor window is open and click the "Align sequence to this alignment" button on the toolbar (see "option1_align_reads_to_reference.png" and the documentation).

Note that in this case you have to convert a read to its reverse complement manually, if required: in the Sequence View, select "Edit > Reverse-complement sequence".
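If many reads need flipping, the same operation can be done outside the GUI. A minimal Python sketch follows; the function name is an assumption, and the table covers the standard IUPAC nucleotide codes while leaving gaps and other characters unchanged.

```python
# IUPAC nucleotide codes and their complements, in matching order.
_SRC = "ACGTRYKMSWBDHVN"
_DST = "TGCAYRMKSWVHDBN"
_COMPLEMENT = str.maketrans(_SRC + _SRC.lower(), _DST + _DST.lower())


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`.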

2) When you need to map many reads/contigs to a reference sequence, BWA-MEM sometimes helps, even though it was originally designed for NGS reads:
  • Select "Tools > NGS data analysis > Map reads to reference" in the main UGENE menu.
  • Set "Mapping tool" to BWA-MEM, specify the reference sequence and the reads, and click "Start" (see "option2_bwa-mem_settings.png").

BWA-MEM automatically reverse-complements a read, if required.

The result is opened in the Assembly Browser.
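For comparison, roughly the same mapping can be run with the standalone bwa program. The sketch below only assembles the two commands (`bwa index` and `bwa mem`); it assumes bwa is installed and on the PATH, and the file names and helper names are placeholders. It does not claim to reproduce the exact settings UGENE passes to BWA-MEM.

```python
import subprocess


def bwa_mem_commands(reference, reads, threads=1):
    """Build the command lines for indexing the reference and mapping the reads."""
    index_cmd = ["bwa", "index", reference]
    mem_cmd = ["bwa", "mem", "-t", str(threads), reference, reads]
    return index_cmd, mem_cmd


def run_mapping(reference, reads, sam_path, threads=1):
    """Run both steps; bwa mem writes SAM to stdout, so redirect it to a file."""
    index_cmd, mem_cmd = bwa_mem_commands(reference, reads, threads)
    subprocess.run(index_cmd, check=True)
    with open(sam_path, "w") as sam:
        subprocess.run(mem_cmd, stdout=sam, check=True)
```

Calling `run_mapping("consensus.fasta", "reads.fasta", "mapped.sam")` would write the alignments to a SAM text file.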
 
 
Reply #2 - Nov 9th, 2018 at 11:51pm

Julio   Offline
 
Thanks a lot for the help! So, in my case what I needed was the 2nd option.

Quote:
2) Sometimes when you map a lot of reads/contigs to a reference sequence, BWA-MEM helps, even if originally it is used for NGS reads:

    Select "Tools > NGS data analysis > Map reads to reference" in the main UGENE menu.
    Set "Mapping tool" to BWA-MEM, input the reference sequence and the reads, click "Start"


Just to put things into context: I'm trying to design universal primers (search for this video on YouTube: "PrimerMiner tutorial: Making alignments with Geneious"). I'm using the PrimerMiner software for this. The tutorial uses Geneious, but I don't have access to that program, so UGENE is the best option (thumbs up for being freeware).

I tried to attach the link to the video where the process is shown, but I'm not yet authorized to do so.

I am having some limitations to do what I need.

I have correctly mapped the reads to the reference. Now I need to extract a specific region from the aligned reads and save it to a FASTA file. Is there a way to do this? I tried to convert the SAM (ugenedb format) into a FASTA file, but an error appeared. I guess I might be doing something wrong. Could you guide me?

Thanks a lot for the help!

JB
« Last Edit: Nov 10th, 2018 at 5:12am by Julio »  
 
Reply #3 - Nov 12th, 2018 at 4:33pm

Olga Golosova   Offline
 
Hi!

I'm glad that the second option helped you!

Quote:
I tried to attach the link to the video where the process is shown, but I'm not yet authorized to do so.
I am having some limitations to do what I need.

Sorry for these limitations! You need to have several posts to be able to add links.

Quote:
Just to put things into context: I'm trying to design universal primers (search for this video on YouTube: "PrimerMiner tutorial: Making alignments with Geneious"). I'm using the PrimerMiner software for this. The tutorial uses Geneious, but I don't have access to that program, so UGENE is the best option (thumbs up for being freeware).

I've watched the video.

Actually, note that, as in the video, the "Align sequence to this alignment" feature in UGENE also uses the MAFFT tool internally (if the tool is available).

What happens when you try to use the first option? What is the size of the reference sequence?

Quote:
Now, I need to extract a specific region from the aligned reads, and to a fasta file. Is there a way to extract a region from the mapped reads into a fasta file?

You may export the reads in a particular region, but as sequences, not as an alignment (see the documentation).

It seems that you might need to export the reads in some particular small region, export the corresponding region of the reference (or consensus) sequence in the Assembly Browser, and then try the first option again.
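Outside UGENE, a region can also be cut out of the mapped reads with a short script. The Python sketch below is an illustration under stated assumptions: it expects the assembly as plain SAM text, all function names are made up, and bases inserted relative to the reference are simply dropped so that every read fits the reference coordinates. Reverse-strand reads need no special handling because SAM already stores their sequence on the forward reference strand.

```python
import re

_CIGAR = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_row(pos, cigar, seq):
    """Project a read onto reference coordinates.

    Returns (start, row): start is 0-based, row has one character per
    reference position covered ('-' for deletions); insertions are dropped.
    """
    row, i = [], 0
    for length, op in _CIGAR.findall(cigar):
        n = int(length)
        if op in "M=X":            # aligned bases: consume read and reference
            row.append(seq[i:i + n])
            i += n
        elif op in "IS":           # insertion / soft clip: consume read only
            i += n
        elif op in "DN":           # deletion / skip: consume reference only
            row.append("-" * n)
        # H and P consume neither read nor reference
    return pos - 1, "".join(row)


def reads_in_region(sam_lines, start, end):
    """Yield (name, fragment) for mapped reads overlapping [start, end), 0-based."""
    width = end - start
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue               # header or empty line
        fields = line.rstrip("\n").split("\t")
        name, flag, pos = fields[0], int(fields[1]), int(fields[3])
        cigar, seq = fields[5], fields[9]
        if flag & 4 or cigar == "*" or seq == "*":
            continue               # unmapped read or missing data
        row_start, row = aligned_row(pos, cigar, seq)
        if row_start + len(row) <= start or row_start >= end:
            continue               # no overlap with the region
        fragment = row[max(start - row_start, 0):end - row_start]
        fragment = "-" * max(row_start - start, 0) + fragment
        yield name, fragment.ljust(width, "-")


def to_fasta(pairs):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(">%s\n%s\n" % (name, seq) for name, seq in pairs)
```

Every fragment has the same width, so the output of `to_fasta(reads_in_region(lines, start, end))` is a gapped FASTA file in which the reads keep their mapped positions.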

If the data you use are not private, you may share them and it will be easier to guide you.

Finally, a small hint: when your data are in the Alignment Editor, you can export them to FASTA format, which is suitable as input for PrimerMiner. Use the "Export/Import > Export object" option in the context menu of the alignment object in the Project View, then select "FASTA" format in the export dialog.
 
 
Reply #4 - Nov 12th, 2018 at 11:27pm

Julio   Offline
 
Hi,

Quote:
Sorry for these limitations! You need to have several posts to be able to add links.


Sorry for the misunderstanding: the limitations I meant concern my knowledge of the UGENE software, not the forum.

Quote:
Actually, note that as in the video, when you use "Align sequence to this alignment" feature in UGENE, internally you also use MAFFT tool (if the tool is available).

What happens when you try to use the first option? What is the size of the reference sequence?


In the video they use the MAFFT tool to align the mitochondrial sequences and generate a consensus sequence, to which the reads are then mapped. I did this quite easily in UGENE; I just had to use the Levitsky 50% consensus mode. Then I used this consensus as the reference for mapping the reads.
When I use this consensus sequence with option 1, I get a very gapped alignment. So I guess option 2 is the best.
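To illustrate what a 50% threshold means for a consensus, here is a simplified Python sketch. It is not the Levitsky algorithm used by UGENE: it keeps the most frequent base of a column only if it occurs in at least the given fraction of all rows (gaps count toward the total) and writes "N" otherwise. The function name and the "N" fallback are assumptions made for the example.

```python
from collections import Counter


def threshold_consensus(rows, threshold=0.5, ambiguous="N"):
    """Simple per-column majority consensus of equally long aligned rows."""
    if not rows:
        return ""
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same length")
    consensus = []
    for column in zip(*rows):
        counts = Counter(c.upper() for c in column if c != "-")
        if not counts:
            consensus.append("-")          # column contains only gaps
            continue
        base, count = counts.most_common(1)[0]
        consensus.append(base if count / len(column) >= threshold else ambiguous)
    return "".join(consensus)
```

With the rows "ACGT", "ACGA", "AC-C" and "TCGG", the result is "ACGN": the last column has no base reaching 50%.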

Quote:
You may export reads in a particular region, but as sequences, not as an alignment (see documentation).

It seems that you might need to export reads in some particular small region, export the corresponding  region of the reference (or consensus) sequence in the Assembly Browser, and then try the first option again.


When I export the region as sequences, the exported reads do not keep their aligned/mapped layout. I also tried to export the particular region of the reference (consensus) and use option 1, but again I get a very gapped alignment, very different from the shape and order of the reads mapped with BWA-MEM.

I have attached mito.fasta, which contains the sequences used to create the consensus/reference. Majority.fasta is too large to attach, so I am linking it (https: //espolec-my.sharepoint.com/:u:/g/personal/jabonill_espol_edu_ec/EatxQZataZtGjRlF4BrMhM8ByNOqKVnMnW9X0pX5KD473A?e=2SGqfu) (please remove the space after "https:"). This file contains all the sequences that need to be mapped/aligned to the consensus sequence.

I look forward to finding a way to do this analysis in UGENE.
 
 
Reply #5 - Nov 13th, 2018 at 1:54pm

Olga Golosova   Offline
 
Could you please check the link to the big file? I removed the space character, but an error occurs when I open the link: "Este vínculo se ha quitado." ("This link has been removed.")
 
 
Reply #6 - Nov 13th, 2018 at 6:51pm

Julio   Offline
 
Hi,

I tried the posted link on my phone and it worked. Please try this one again:

https: //espolec-mysharepoint.com/:u:/g/personal/jabonill_espol_edu_ec/EatxQZataZtGjRlF4BrMhM8ByNOqKVnMnW9X0pX5KD473A?e=2SGqfu

Remove the space in "https: //".

Let me know if this works.
 
 
Reply #7 - Nov 13th, 2018 at 6:57pm

Olga Golosova   Offline
 
Hi Julio.

The second link didn't work for me for some reason, but my colleague was able to download the file using the first link.
 
 
Reply #8 - Nov 13th, 2018 at 7:24pm

Julio   Offline
 
That's weird, but I'm glad your colleague was able to download the file.
 
 
Reply #9 - Nov 14th, 2018 at 3:28pm

Olga Golosova   Offline
 
Hi Julio.

Thanks for sharing the data!

I've "played" with them a little bit in UGENE.

Indeed, there is currently no option to export data from the Assembly Browser into an alignment format. I've created an issue about that. Note, however, that the data produced by BWA-MEM are filtered (there are fewer sequences, the sequences may be trimmed, etc.) and the order of the sequences is changed. So, depending on your needs, this may be a problem.

I've found a somewhat hacky workaround you may try. As I already mentioned, if the MAFFT external tool is available, we use it to "align sequence to an alignment". However, if the tool is not available (which may happen with the UGENE Standard Package), we provide a simple algorithm for this feature. It appears to work better for the data you sent.

So, try to do the following:

Also, to see difference from your reference sequence you may:
  • Set the reference sequence.
  • Select "Disagreements" highlighting on the Options Panel.
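The idea behind "Disagreements" highlighting can be sketched in a few lines of Python: every character that matches the reference row is replaced with a dot, so only the differences stay visible. This is an illustration of the concept, not UGENE's implementation; the function name and the dot character are assumptions.

```python
def highlight_disagreements(reference, rows, match_char="."):
    """Replace characters equal to the reference with match_char.

    `rows` maps read names to aligned rows of the same length as `reference`.
    """
    highlighted = {}
    for name, row in rows.items():
        if len(row) != len(reference):
            raise ValueError("row %r differs in length from the reference" % name)
        highlighted[name] = "".join(
            match_char if ref_char.upper() == char.upper() else char
            for ref_char, char in zip(reference, row)
        )
    return highlighted
```

With the reference "ACGT-A", the row "ACCTTA" becomes "..C.T.".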

Some sequences were aligned quite well, but some were not. You may try to align those additionally using, for example, Smith-Waterman pairwise alignment in the same window. However, even that didn't help much with the sequences I experimented with, nor did reverse-complementing a sequence first. So I think some sequences are just not similar enough to the reference.

Note that we plan to simplify and improve aligning sequences to a reference sequence in future UGENE versions (unfortunately not in the next few releases, at least in terms of free support). We'll take into account the scenario and the sample data you shared (thanks again!).
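As a rough way to judge whether a read is similar enough to the reference, one can compute a Smith-Waterman local alignment score. The Python sketch below uses a linear gap penalty and made-up scoring values (match 2, mismatch -1, gap -2); UGENE's Smith-Waterman plugin uses its own scoring matrices and gap settings, so the numbers are not comparable with what the GUI reports.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)      # row of the DP matrix for the previous character of a
    best = 0
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best
```

A low score relative to `match * len(read)` suggests that the read shares only short stretches with the reference.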
 
 
Reply #10 - Nov 15th, 2018 at 2:51am

Julio   Offline
 
Thank you for all the help. I tried aligning the reads to the consensus sequence without MAFFT, and it worked as you said. Not everything aligned perfectly, but most of the sequences did. I then exported the alignment and am currently trimming it to the desired length. I look forward to the day these features are included in UGENE, which is a very powerful, must-have tool. Congrats!
 
 
Reply #11 - Nov 19th, 2018 at 4:32pm

Olga Golosova   Offline
 
Thank you, Julio! We'll do our best to make UGENE even better!
 
 