UGENE Forum
https://forum.ugene.net/forum/YaBB.pl
General Category >> Help and How-to >> Problem mapping to reference (Sanger)
https://forum.ugene.net/forum/YaBB.pl?num=1541689817
Message started by Julio on Nov 8th, 2018 at 10:10pm

Title: Problem mapping to reference (Sanger)
Post by Julio on Nov 8th, 2018 at 10:10pm

Hi,

I'm new to UGENE and the forum. I have looked before posting, but I haven't found any working solution :(. I'm trying to map some reads to a reference sequence using the "Map Sanger reads to reference" tool.

My reference sequence is a FASTA file made from the consensus sequence of a previous alignment. The sequences I'm trying to map are in a multi-FASTA file. When I run the tool, the following error appears:

Task report [Map Sanger reads to reference]

Status    Failed
Error:    Subtask {Compose alignment} is failed: The related chromatogram not found
Time    0h 00m 32.595s

I don't have chromatograms for my reads because they come from databases (NCBI, BOLD, etc.).

In case this information is useful: I'm running Ubuntu 18.04.1 LTS in a virtual machine hosted on Windows 10. The Unipro UGENE version is v1.31.1.

Thanks! :)

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 9th, 2018 at 3:07pm

Hi Julio.

Quote:

I have looked before posting, but I haven't found any working solution

Thank you for your feedback! Apparently, we should simplify this feature in UGENE and improve documentation. :)

First of all, note that you shouldn't use the "Map Sanger reads to reference" tool here. It expects the Sanger reads to be provided in *.ab1 or a similar chromatogram format (see a video about this feature).
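To check up front whether a set of read files carries chromatogram data at all, a quick look at the first bytes of each file is usually enough. The sketch below is an illustration outside UGENE, not part of it: the function name `guess_read_format` is hypothetical, and it assumes the usual file signatures (`ABIF` for ABI *.ab1 traces, `.scf` for SCF traces, `>` for FASTA).

```python
def guess_read_format(path):
    """Guess the format of a read file from its first bytes.

    Assumed signatures: b"ABIF" (ABI trace), b".scf" (SCF trace),
    b">" (FASTA header). Anything else is reported as "unknown".
    """
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head == b"ABIF":
        return "abif"    # chromatogram, suitable for Sanger-specific tools
    if head == b".scf":
        return "scf"     # chromatogram, suitable for Sanger-specific tools
    if head[:1] == b">":
        return "fasta"   # plain sequences, no trace data
    return "unknown"
```

Files reported as "fasta" have no trace data, which matches the "chromatogram not found" failure described in the first post.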

For your task, there are a couple of workarounds you may try first. Please try them and let us know whether they work for you.

1) If you have a few sequences you would like to align to the reference sequence:

Export the reference sequence to an alignment format so that it opens in the Alignment Editor. To do this, select "Export/Import > Export sequences as alignment" in the context menu of the sequence object in the Project View (see "option1_open_reference_as_alignment.png").
Then make sure the Alignment Editor window is open and click the "Align sequence to this alignment" button on the toolbar (see "option1_align_reads_to_reference.png" and documentation).

Note that in this case you have to convert a read to its reverse complement manually, if required. In the Sequence View, select "Edit > Reverse-complement sequence".
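For many reads, doing the reverse-complement step by hand gets tedious. A minimal sketch of the same operation in Python, outside UGENE, is shown here; the helper name `reverse_complement` is hypothetical, and the table covers the standard IUPAC nucleotide codes (gaps and other characters pass through unchanged).

```python
# Complement table for IUPAC nucleotide codes, upper and lower case.
# S, W, N and gap characters are their own complements.
_COMPLEMENT = str.maketrans(
    "ACGTURYKMBDHVNacgturykmbdhvn",
    "TGCAAYRMKVHDBNtgcaayrmkvhdbn",
)


def reverse_complement(sequence):
    """Return the reverse complement of a nucleotide sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
```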

2) Sometimes when you map a lot of reads/contigs to a reference sequence, BWA-MEM helps, even though it was originally designed for NGS reads:

Select "Tools > NGS data analysis > Map reads to reference" in the main UGENE menu.
Set "Mapping tool" to BWA-MEM, input the reference sequence and the reads, click "Start" ("option2_bwa-mem_settings.png").

BWA-MEM automatically reverse-complements a read, if required.

The result is opened in the Assembly Browser.
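For reference, roughly the same mapping can be run outside UGENE with the standalone `bwa` program (`bwa index` followed by `bwa mem`, which writes SAM to standard output). The sketch below only assembles and runs those two commands; the function names and the file names in the usage note are hypothetical, and it assumes `bwa` is installed and on the PATH. The settings UGENE passes internally may differ.

```python
import shutil
import subprocess


def bwa_mem_commands(reference, reads):
    """Build the two command lines: index the reference, then map the reads."""
    return [
        ["bwa", "index", reference],
        ["bwa", "mem", reference, reads],
    ]


def run_bwa_mem(reference, reads, sam_out):
    """Run both commands and write the SAM output to sam_out."""
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa was not found on PATH")
    index_cmd, mem_cmd = bwa_mem_commands(reference, reads)
    subprocess.run(index_cmd, check=True)
    with open(sam_out, "w") as out:
        # bwa mem prints SAM records to standard output.
        subprocess.run(mem_cmd, check=True, stdout=out)
```

A call such as `run_bwa_mem("consensus.fasta", "reads.fasta", "mapped.sam")` would then produce a SAM file that can be opened in the Assembly Browser or processed further.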

option1_open_reference_as_alignment.png (54 KB)

option1_align_reads_to_reference.png (55 KB)

option2_bwa-mem_settings.png (105 KB)

Title: Re: Problem mapping to reference (Sanger)
Post by Julio on Nov 9th, 2018 at 11:51pm

Thanks a lot for the help! So, in my case what I needed was the 2nd option.

Quote:

2) Sometimes when you map a lot of reads/contigs to a reference sequence, BWA-MEM helps, even though it was originally designed for NGS reads:

Select "Tools > NGS data analysis > Map reads to reference" in the main UGENE menu.
Set "Mapping tool" to BWA-MEM, input the reference sequence and the reads, click "Start"

Just to put things into context, I'm basically trying to design universal primers (search for this video on YouTube: "PrimerMiner tutorial: Making alignments with Geneious"). I'm using the PrimerMiner software for this. They use Geneious in the tutorial, but I don't have access to that program, so UGENE is the best option (hands up for being freeware 8-) ).

I tried to attach the link to the video where the process is shown, but I'm not yet authorized to do this.

I am having some limitations to do what I need.

I have correctly mapped the reads to the reference. Now I need to extract a specific region from the aligned reads and save it to a FASTA file. Is there a way to extract a region from the mapped reads into a FASTA file? I tried to convert the SAM (ugenedb format) file into a FASTA file, but an error appeared. I guess I might be doing something wrong. Could you guide me?

Thanks a lot for the help!

JB

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 12th, 2018 at 4:33pm

Hi!

I'm glad that the second option helped you!

Quote:

I tried to attach the link to the video where the process is shown, but I'm not yet authorized to do this.
I am having some limitations to do what I need.

Sorry for these limitations! You need to have several posts to be able to add links.

I've watched the video.

Note that, as in the video, the "Align sequence to this alignment" feature in UGENE also uses the MAFFT tool internally (if the tool is available).

What happens when you try to use the first option? What is the size of the reference sequence?

Quote:

Now I need to extract a specific region from the aligned reads and save it to a FASTA file. Is there a way to extract a region from the mapped reads into a FASTA file?

You may export reads in a particular region, but as sequences, not as an alignment (see documentation).

It seems that you need to export the reads in some particular small region, export the corresponding region of the reference (or consensus) sequence in the Assembly Browser, and then try the first option again with these two exports.
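If the mapping is also available as a SAM file, the layout of the mapped reads in a region can be reconstructed outside UGENE from the POS and CIGAR fields. The sketch below is a rough illustration under stated assumptions, not a UGENE feature: the function names are hypothetical, insertions relative to the reference are dropped so that every row has one character per reference column, and region coordinates are 1-based and inclusive.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def project_read(cigar, seq):
    """Lay a read out in reference columns: one character per reference base."""
    out, i = [], 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes read and reference
            out.append(seq[i:i + n])
            i += n
        elif op in "DN":     # consumes reference only: show as gaps
            out.append("-" * n)
        elif op in "IS":     # consumes read only: skipped in this sketch
            i += n
        # H and P consume neither read nor reference
    return "".join(out)


def region_rows(sam_lines, start, end):
    """Return (name, row) pairs for reads overlapping reference start..end."""
    rows, width = [], end - start + 1
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        name, flag, pos = fields[0], int(fields[1]), int(fields[3])
        cigar, seq = fields[5], fields[9]
        if flag & 4 or cigar == "*" or seq == "*":
            continue  # unmapped read or missing data
        aligned = project_read(cigar, seq)
        ref0, lo = pos - 1, start - 1  # 0-based starts of read and region
        if ref0 + len(aligned) <= lo or ref0 >= end:
            continue  # no overlap with the region
        piece = aligned[max(0, lo - ref0):end - ref0]
        row = "-" * max(0, ref0 - lo) + piece
        rows.append((name, row.ljust(width, "-")))
    return rows


def to_fasta(rows):
    """Format (name, row) pairs as aligned FASTA text."""
    return "".join(f">{name}\n{row}\n" for name, row in rows)
```

SAM stores SEQ on the reference strand, so reads that BWA-MEM reverse-complemented need no extra handling here. Because insertions are dropped, the rows are a simplification of the reads rather than an exact copy.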

If the data you use are not private, you may share them and it will be easier to guide you.

Finally, a small hint: once your data are in the Alignment Editor, you can export them to FASTA format, which is suitable as input for PrimerMiner. In the Project View, select "Export/Import > Export object" in the context menu of the alignment object, then select "FASTA" format in the export dialog.

Title: Re: Problem mapping to reference (Sanger)
Post by Julio on Nov 12th, 2018 at 11:27pm

Hi,

Quote:

Sorry for these limitations! You need to have several posts to be able to add links.

Sorry for the misunderstanding. The limitations I meant concerned my knowledge of how to use UGENE, not the forum.

Quote:

Note that, as in the video, the "Align sequence to this alignment" feature in UGENE also uses the MAFFT tool internally (if the tool is available).

What happens when you try to use the first option? What is the size of the reference sequence?

In the video, they use the MAFFT tool to align the mitochondrial sequences and generate a consensus sequence, to which the reads are then mapped. I did this quite easily in UGENE; I just had to use the Levitsky consensus mode with a 50% threshold. I then use this consensus as the reference for mapping the reads.
When I use this consensus sequence with option 1, I get a very gapped alignment. So I guess option 2 is the best.
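To make the idea of a thresholded consensus concrete, here is a minimal sketch of a plain majority-threshold consensus over aligned rows. It is not UGENE's Levitsky algorithm, which works differently; the function name `threshold_consensus` and the choice of "N" for columns without a sufficiently frequent base are assumptions made for illustration.

```python
from collections import Counter


def threshold_consensus(rows, threshold=0.5, ambiguous="N"):
    """Build a consensus from equal-length aligned rows.

    A column gets its most frequent base if that base occurs in at least
    `threshold` of all rows; otherwise it gets `ambiguous`. Columns that
    contain only gaps stay gaps.
    """
    if not rows:
        return ""
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all rows must have the same length")
    consensus = []
    for column in zip(*rows):
        counts = Counter(char.upper() for char in column if char != "-")
        if not counts:
            consensus.append("-")
            continue
        base, count = counts.most_common(1)[0]
        consensus.append(base if count / len(column) >= threshold else ambiguous)
    return "".join(consensus)
```

A consensus built this way contains ambiguous positions wherever the input sequences disagree strongly, which is one reason divergent reads can align poorly against it.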

When I export the region as sequences, the export does not preserve the layout of the aligned/mapped reads. I also tried to export the particular region of the reference (consensus) and use option 1, but again I get a very gapped alignment, very different from the shape and order of the reads mapped with BWA-MEM.

I have attached mito.fasta, which contains the sequences used to create the consensus/reference. Majority.fasta is too large to attach, so I am linking it (https: //espolec-my.sharepoint.com/:u:/g/personal/jabonill_espol_edu_ec/EatxQZataZtGjRlF4BrMhM8ByNOqKVnMnW9X0pX5KD473A?e=2SGqfu) (please remove the space after "https:"). This file contains all the sequences that need to be mapped/aligned to the consensus sequence.

I look forward to finding a way to do this analysis in UGENE.

https://forum.ugene.net/forum/YaBB.pl?action=downloadfile;file=Characiformes_mito.fasta (31 KB)

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 13th, 2018 at 1:54pm

Could you please check the link to the big file? I removed the space character, but an error occurs when I open this link: "Este vínculo se ha quitado." ("This link has been removed.").

Title: Re: Problem mapping to reference (Sanger)
Post by Julio on Nov 13th, 2018 at 6:51pm

Hi,

I have tried the posted link on my phone and it worked. Please try this one again:

https: //espolec-mysharepoint.com/:u:/g/personal/jabonill_espol_edu_ec/EatxQZataZtGjRlF4BrMhM8ByNOqKVnMnW9X0pX5KD473A?e=2SGqfu

Remove the space between "https: //"

Let me know if this works

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 13th, 2018 at 6:57pm

Hi Julio.

The second link didn't work for me for some reason, but my colleague was able to download the file using the first link.

Title: Re: Problem mapping to reference (Sanger)
Post by Julio on Nov 13th, 2018 at 7:24pm

That's weird, but I'm glad your colleague was able to download the file. ;)

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 14th, 2018 at 3:28pm

Hi Julio.

Thanks for sharing the data!

I've "played" with them a little bit in UGENE.

Indeed, there is currently no option to export data from the Assembly Browser into an alignment format. I've created an issue about that. Anyway, note that the data produced by BWA-MEM are filtered (there are fewer sequences, the sequences may be trimmed, etc.) and the order of the sequences is changed. So, depending on your needs, this may be a problem.

I've found a somewhat hacky workaround you may try. As I already mentioned, if the MAFFT external tool is available, we use it to "align sequence to an alignment". However, if the tool is not available (which may happen with the UGENE Standard Package), we fall back to a simple built-in algorithm for this feature. It appears that this simpler algorithm works better for the data you sent.

So, try to do the following:

Unset MAFFT in the UGENE Application Settings.
Try to use "Align sequence to this alignment" as before.

Also, to see the differences from your reference sequence, you may:

Set the reference sequence.
Select "Disagreements" highlighting on the Options Panel.
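The idea behind this kind of highlighting can be sketched in a few lines: compare each aligned row with the reference row column by column and show only the characters that differ. The helper names below are hypothetical and the output format (a dot for every matching column) is an assumption chosen for readability; it is not a description of how UGENE renders the view.

```python
def disagreements(reference, row):
    """Return row with '.' in every column that matches the reference.

    Both strings must be rows of the same alignment (equal length).
    The comparison ignores case.
    """
    if len(reference) != len(row):
        raise ValueError("rows must have the same length")
    return "".join(
        "." if ref_char.upper() == char.upper() else char
        for ref_char, char in zip(reference, row)
    )


def count_disagreements(reference, row):
    """Count the columns in which row differs from the reference."""
    return sum(1 for char in disagreements(reference, row) if char != ".")
```

Rows with a high disagreement count relative to their length are the ones worth inspecting or re-aligning by hand.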

Some sequences were aligned quite well, but some were not. You may try to align the latter additionally using, for example, the Smith-Waterman pairwise alignment in the same window. However, even that didn't help much with the sequences I experimented with, and neither did reverse-complementing a sequence first. So I think some sequences are simply not similar enough to the reference.
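One way to judge whether a read is similar enough to the reference at all is to compute a Smith-Waterman local alignment score for both orientations of the read. The sketch below is a textbook version with a linear gap penalty, written for illustration only; the scoring values and function names are assumptions and are unrelated to UGENE's own Smith-Waterman settings. It runs in time proportional to the product of the two lengths, so it suits short sequences.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


def best_orientation(reference, read):
    """Return ('+' or '-', score) for the better-scoring read orientation."""
    forward = smith_waterman_score(reference, read)
    reverse = smith_waterman_score(reference, read.translate(_COMPLEMENT)[::-1])
    return ("+", forward) if forward >= reverse else ("-", reverse)
```

If the score stays low in both orientations compared with `match * len(read)`, the read probably has no good placement on the reference.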

Note that we plan to simplify and improve aligning sequences to a reference sequence in future UGENE versions (unfortunately not in the next few releases, at least in terms of free support). We'll take into account the scenario and the sample data you shared (thanks again!).

aligning_without_mafft.png (118 KB)

Title: Re: Problem mapping to reference (Sanger)
Post by Julio on Nov 15th, 2018 at 2:51am

Thank you for all the help. I tried aligning the reads to the consensus sequence without MAFFT, and it worked as you said. Not everything aligned perfectly, but most of the sequences did. I then exported the alignment and am currently trimming it to the desired length. I look forward to the day these features are included in UGENE, which is a very powerful, must-have tool! Congrats! ;D
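The trimming step mentioned here can also be scripted once the alignment is exported as FASTA. The following is a minimal sketch under stated assumptions: the function names are hypothetical, column coordinates are 1-based and inclusive, and rows that contain only gaps after trimming are dropped by default.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def trim_columns(records, start, end, drop_empty=True):
    """Keep alignment columns start..end (1-based, inclusive) of every row."""
    trimmed = []
    for name, sequence in records:
        piece = sequence[start - 1:end]
        if drop_empty and not piece.replace("-", ""):
            continue  # nothing but gaps left in this row
        trimmed.append((name, piece))
    return trimmed
```

The trimmed records can be written back out as FASTA with one `>name` line followed by one sequence line per record.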

Title: Re: Problem mapping to reference (Sanger)
Post by Olga Golosova on Nov 19th, 2018 at 4:32pm

Thank you Julio! We'll do our best to make UGENE even better! :)