Message started by JoseG on Feb 29th, 2016 at 5:58pm

Title: Using HMMER3
Post by JoseG on Feb 29th, 2016 at 5:58pm

I recently found UGENE while looking for alternatives to the command-line programs available. So, first of all, thank you for making this little gem available ;)

OK, now the subject. I'm trying to use the HMMER tools to find HMM signatures in a file of amino acid sequences (FASTA), with both HMMER3 search and phmmer.

The problem I have found is this: it loads all the sequences (more than 35,000) in the Objects tab (upper left window), but only the first 14 in the viewer (upper right window). So it only performs the search on those, right?
Anyway, I have to add any further sequences manually to run the analysis, which is quite inconvenient. Am I doing it the right way? Searching with HMMER on single sequences works like a charm, but I have had no luck with "batch" searches.
Moreover, if I could finish a search on all the sequences, how would I filter them for positive matches?

I also tried the Workflow Designer. It works for single-sequence files but returns errors for the full sequence file.

So, is there any way to run this kind of batch HMMER search on a long sequence list, and how can I filter the results?

Thank you in advance

Title: Re: Using HMMER3
Post by Olga Golosova on Feb 29th, 2016 at 7:44pm
Hello, JoseG!

You were right: to analyze a batch of sequences, you need to use the Workflow Designer. What error did you encounter when you tried to use it? Could you please describe it in detail (which workflow you used, what the error was, etc.)?
Ideally, it would be great if you could send us a screenshot of the error and the input data that you used.

As for the second question, about filtering the results, we also need more details. What are the main criteria for filtering? Could you please provide an example?

Title: Re: Using HMMER3
Post by JoseG on Mar 1st, 2016 at 2:01am

I have attached 3 screenshots: the first is the workflow (and its errors), which I built following the web documentation on "search sequences with profile hmm"; screenshots 1 and 2 show, respectively, the FASTA file with more than 35,000 sequences loaded (only 14 appear in the viewer) and the view after I manually added one of them (a positive) to run the HMMER search with a Pfam profile.
I cannot attach the FASTA file (it is very big) or the HMM profile file here, but I can provide download links if needed.
My question about filtering results is: supposing it can search the whole list (screenshot 1), how do I get only the signal-positives (if that is possible)?

Any advice on how to do the task correctly?

Thank you

Title: Re: Using HMMER3
Post by Olga Golosova on Mar 1st, 2016 at 1:11pm
In the screenshot of the workflow you can see that the connection arrows are red. This means that there are some errors:

  • It seems that the ports are connected incorrectly: the HMM3 port is connected to the Read Sequence element, and the sequence port is connected to the Read HMM3 Profile element.
  • Also, you need to write the result into GenBank or another format that can store annotations (the HMM signals in this case). The FASTA format does not allow that.
  • And finally, you possibly modified the slots of the Write FASTA (i.e. Write Sequence) element, which may have caused an additional error.
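
To make the first two points concrete, here is a minimal Python sketch of the intended wiring. It is only a toy model, not UGENE's API: the element names come from the workflow above, while the type labels ("sequence", "hmm3-profile") and the `check_link` / `check_workflow` helpers are assumptions made up for illustration.

```python
# Toy model of the workflow wiring; this is not UGENE code.
# The type labels and helper names are hypothetical.

# What each reader element produces.
PRODUCES = {
    "Read Sequence": "sequence",
    "Read HMM3 Profile": "hmm3-profile",
}

# What each input port of the HMM3 search element expects.
SEARCH_PORTS = {
    "sequence port": "sequence",
    "HMM3 port": "hmm3-profile",
}

# Output formats and whether they can hold annotations.
STORES_ANNOTATIONS = {"FASTA": False, "GenBank": True}


def check_link(source: str, port: str) -> bool:
    """Return True if the source element's output type matches the port."""
    return PRODUCES[source] == SEARCH_PORTS[port]


def check_workflow(links: dict, output_format: str) -> list:
    """Collect human-readable problems for a set of port -> source links."""
    problems = []
    for port, source in links.items():
        if not check_link(source, port):
            problems.append(f"{source} is connected to the {port}")
    if not STORES_ANNOTATIONS[output_format]:
        problems.append(f"{output_format} cannot store the HMM signals")
    return problems


# The wiring from the screenshot: ports swapped, FASTA output.
broken = {"HMM3 port": "Read Sequence", "sequence port": "Read HMM3 Profile"}
# The intended wiring: each reader feeds the matching port, GenBank output.
fixed = {"sequence port": "Read Sequence", "HMM3 port": "Read HMM3 Profile"}

print(check_workflow(broken, "FASTA"))
print(check_workflow(fixed, "GenBank"))
```

The swapped wiring reports one problem per wrong link plus the output format problem, while the intended wiring reports none.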

So, to fix the errors, you may use the attached corrected workflow.

"how to get only the signal-positives?"

What does this mean technically? Do you want to group all the results into two categories:
1) Sequences with found signals;
2) Sequences without signals?

Attachment: hmm3.uwl (2 KB)
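
If those two groups are indeed the goal, one possible way to get them outside UGENE is to scan the GenBank file that the workflow writes. The Python sketch below is only an illustration under stated assumptions: it relies on the standard GenBank flat-file layout (records end with `//`, feature keys start in column 6), and it assumes that a hit appears as any feature other than `source`. Which feature key UGENE actually writes is not confirmed here, and `split_by_features` and the sample records are hypothetical.

```python
def split_by_features(genbank_text):
    """Return (with_features, without_features) lists of LOCUS names."""
    with_features, without_features = [], []
    name, in_features, n_features = None, False, 0
    for line in genbank_text.splitlines():
        if line.startswith("LOCUS"):
            # Start of a new record: the name is the second field.
            parts = line.split()
            name = parts[1] if len(parts) > 1 else "(unnamed)"
            in_features, n_features = False, 0
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith("//"):
            # End of record: file it into one of the two groups.
            if name is not None:
                (with_features if n_features else without_features).append(name)
            name, in_features = None, False
        elif in_features:
            if line[:1] not in (" ", ""):
                # Any other top-level keyword (e.g. ORIGIN) ends the table.
                in_features = False
            elif line.startswith("     ") and len(line) > 5 and line[5] != " ":
                # Feature key line: five spaces, then the key in column 6.
                if line.split()[0] != "source":
                    n_features += 1
    return with_features, without_features


# Made-up sample data, only to exercise the function.
SAMPLE = """\
LOCUS       seq1                      12 aa
FEATURES             Location/Qualifiers
     misc_feature    3..9
                     /note="hypothetical HMM hit"
ORIGIN
        1 mkvlaaglll aa
//
LOCUS       seq2                      10 aa
ORIGIN
        1 mstnpkpqrk
//
"""

hits, misses = split_by_features(SAMPLE)
print("with signals:", hits)
print("without signals:", misses)
```

The function reads the whole file as text, which should be fine for a few tens of thousands of protein records; for much larger files it could be adapted to iterate over the file line by line.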

Title: Re: Using HMMER3
Post by JoseG on Mar 2nd, 2016 at 12:00am

Thank you very much, your workflow file did the trick. I was misunderstanding how the program works. Also, I need more practice with it.

Thank you for the help and great support!

