UGENE Bulletin Board
Use dedicated graphics to find repeats
Sep 10th, 2019 at 1:58pm

felince tokyo
 
Hi!

I normally use UGENE to locate genome-wide direct and inverted repeats in bacterial genomes. I can obtain the results without any problem, but the calculation takes an inconveniently long time (several hours, depending on the genome size). My laptop has an Intel Core i7-8550U with 16 GB of RAM, and I was wondering if I could speed up this calculation by using the dedicated graphics (NVIDIA GeForce MX150). Looking at the options, CUDA seems to work with this graphics chip, but I think it is not being used for this specific calculation. Could you please tell me how to use it for this calculation?

Thank you :)
 
 
Reply #1 - Sep 10th, 2019 at 2:32pm

Olga Golosova
 
Hi Felince!

Thank you very much for the feedback! It was useful.

There is currently no special optimization for a single input sequence except multi-threading. Note that video card (GPU) calculations are currently supported for two algorithms only: Smith-Waterman and UGENE Genome Aligner.

Could you please send us some data for which calculation takes several hours?
 
 
Reply #2 - Sep 11th, 2019 at 7:09am

felince tokyo
 
Hi Olga, thank you for your response.

This is one of the sequences that I'm analyzing (GenBank file):

https://www.ncbi.nlm.nih.gov/nuccore/AP013068

I attached the parameters and the log. As you can see, it took more than 4 hours with the processor at 100% usage, although memory usage was about 3 GB.
 

Attachments: repeats.jpg (84 KB), log.jpg (22 KB)
 
Reply #3 - Sep 12th, 2019 at 4:20pm

Olga Golosova
 
Thank you, Felince!

On my computer this task also took 3 hours to compute.

I think this task is a good candidate for parallelization on a video card. We'll think about this and maybe add some optimization in the future. However, if you'd like the future to come as soon as possible :), please consider using our commercial support services.
 
 
Reply #4 - Sep 13th, 2019 at 3:17pm

felince tokyo
 
Dear Olga,

Thank you very much for taking this into consideration. It would be great to receive direct support from the UGENE developers. Depending on how this analysis develops, we might consider acquiring paid support.
 
 
Reply #5 - Sep 13th, 2019 at 3:30pm

Olga Golosova
 
Dear Felince,

Okay, thank you again for the feedback!
 
 
Reply #6 - Dec 30th, 2019 at 4:01pm

felince tokyo
 
Hi, sorry for reviving this old post. I just wanted to ask which algorithm is used for the detection of repeats.

Thank you!
 
 
Reply #7 - Dec 31st, 2019 at 12:57pm

Olga Golosova
 
Hi Felince!

UGENE has its own implementation of the repeat-finding algorithm. Here is the source code: https://github.com/ugeneunipro/ugene/tree/master/src/plugins/repeat_finder. If you need a more step-by-step description, please write.

P.S. Happy New Year! :)
 
 
Reply #8 - Jan 4th, 2020 at 12:51pm

felince tokyo
 
Happy New Year to you too, Olga!

Thank you for the link. I would really appreciate a brief explanation of how UGENE finds the repeats.
 
 
Reply #9 - Jan 9th, 2020 at 3:47pm

Olga Golosova
 
Hi Felince! Sorry for the delayed reply!

We have two algorithms to search for repeats:
  1. one uses a "suffix array" algorithm (https://en.wikipedia.org/wiki/Suffix_array) to compare different regions of a sequence,
  2. the other one, "diagonals", uses a simple brute-force algorithm to do the same task.
    These tasks are performed in parallel in multiple threads to improve performance.
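
To make the difference between the two approaches more concrete, here is a rough Python sketch. It is only an illustration and not UGENE's actual code: the function names, the (start1, start2, length) result format and the toy sequence are assumptions made for this example, and it only handles exact direct repeats on a single thread.

```python
def diagonal_repeats(seq, min_len):
    """Brute force: for every offset d (a "diagonal" of the self-comparison
    matrix), scan for runs of positions where seq[i] == seq[i + d]."""
    n = len(seq)
    hits = []
    for d in range(1, n):
        run = 0
        for i in range(n - d):
            if seq[i] == seq[i + d]:
                run += 1
                continue
            if run >= min_len:
                hits.append((i - run, i - run + d, run))
            run = 0
        if run >= min_len:  # a run that reaches the end of the diagonal
            end = n - d
            hits.append((end - run, end - run + d, run))
    return sorted(hits)


def suffix_array_repeats(seq, min_len):
    """Suffix array: sort all suffixes, then compare neighbours in sorted
    order. Neighbours sharing a prefix of at least min_len are repeats."""
    n = len(seq)
    # Naive construction by sorting suffix strings; fine for a toy example only.
    sa = sorted(range(n), key=lambda i: seq[i:])
    hits = []
    for a, b in zip(sa, sa[1:]):
        lcp = 0  # length of the longest common prefix of the two suffixes
        while a + lcp < n and b + lcp < n and seq[a + lcp] == seq[b + lcp]:
            lcp += 1
        if lcp >= min_len:
            first, second = sorted((a, b))
            hits.append((first, second, lcp))
    return sorted(hits)


toy = "ACGTTTACGTA"  # "ACGT" occurs at positions 0 and 6
print(diagonal_repeats(toy, 4))      # [(0, 6, 4)]
print(suffix_array_repeats(toy, 4))  # [(0, 6, 4)]
```

The two functions agree on this toy input but not in general: the suffix array sketch only looks at neighbouring suffixes and also reports shifted sub-matches of a longer repeat, so a real implementation needs extra steps to merge and extend hits.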

No matter which algorithm you use, there is:
  • a pre-processing step that, depending on the settings, may exclude tandem repeats and search for inverted repeats,
  • a post-processing step that may exclude nested repeats, for example, repeat regions that are contained in bigger repeat regions.
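
The post-processing idea can be sketched as follows. Again, this is a simplified Python illustration under assumptions, not UGENE's code: repeats are represented as hypothetical (start1, start2, length) tuples, and a repeat counts as nested only when both of its copies lie inside the corresponding copies of another repeat.

```python
def drop_nested(repeats):
    """Return the repeats that are not fully contained in another repeat.

    Each repeat is a (start1, start2, length) tuple (an assumed format)."""

    def contains(big, small):
        b1, b2, b_len = big
        s1, s2, s_len = small
        return (
            big != small
            and b1 <= s1 and s1 + s_len <= b1 + b_len  # first copy inside
            and b2 <= s2 and s2 + s_len <= b2 + b_len  # second copy inside
        )

    return [r for r in repeats if not any(contains(o, r) for o in repeats)]


found = [(0, 6, 4), (1, 7, 3), (20, 40, 5)]
print(drop_nested(found))  # [(0, 6, 4), (20, 40, 5)]
```

This pairwise check is quadratic in the number of repeats, which is acceptable for a sketch but would likely need an interval index for genome-scale result sets.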

See "Advanced" tab of the "Find Repeats" dialog.
 
 
Reply #10 - Jan 11th, 2020 at 3:58pm

felince tokyo
 
Wow, thank you so much for the explanation. When you select "auto" in the algorithm selection menu, which of the two does it use? Or do both run in parallel?

Best regards,

Felipe
 
 
Reply #11 - Jan 11th, 2020 at 4:40pm

Olga Golosova
 
The "suffix array" algorithm is used by default, i.e. when "auto" is selected.
 
 
Reply #12 - Jan 21st, 2020 at 8:49am

felince tokyo
 
I see! Thank you for your response, Olga!
 
 