UGENE Forum - Use dedicated graphics to find repeats

Sep 10th, 2019 at 1:58pm

felince tokyo Offline
YaBB Newbies

Posts: 11

Hi!

I normally use UGENE to locate genome-wide direct and inverted repeats in bacterial genomes. I can obtain the results without any problem, but the calculation takes inconveniently long (several hours, depending on the genome size). My laptop has an Intel Core i7-8550U with 16 GB of RAM, and I was wondering whether I could speed up this calculation by using the dedicated graphics chip (Nvidia GeForce MX150). Judging by the options, CUDA seems to work with this chip, but I think it is not being used for this specific calculation. Could you please tell me how to enable it?

Thank you

Reply #1 - Sep 10th, 2019 at 2:32pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

Hi Felince!

Thank you very much for the feedback! It was useful.

There is currently no special optimization for a single input sequence other than multi-threading. Note that video card calculations are currently supported for only two algorithms: Smith-Waterman and UGENE Genome Aligner.

Could you please send us some data for which the calculation takes several hours?

Reply #2 - Sep 11th, 2019 at 7:09am

felince tokyo Offline
YaBB Newbies

Posts: 11

Hi Olga, thank you for your response.

This is one of the sequences that I'm analyzing (genbank file):

https://www.ncbi.nlm.nih.gov/nuccore/AP013068

I attached the parameters and the log. As you can see, it took more than 4 hours with the processor at 100% usage, although memory usage was only about 3 GB.

repeats.jpg (84 KB | 318 )

log.jpg (22 KB | 309 )

Reply #3 - Sep 12th, 2019 at 4:20pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

Thank you, Felince!

On my computer this task also took 3 hours.

I think this task is a good candidate for parallelization on a video card. We'll think about this and maybe add some optimization in the future. However, if you'd like the future to come as soon as possible, please consider using our commercial support services.

Reply #4 - Sep 13th, 2019 at 3:17pm

felince tokyo Offline
YaBB Newbies

Posts: 11

Dear Olga,

Thank you very much for taking this into consideration. It would be great to receive direct support from the UGENE developers. Depending on how this analysis develops, we might consider acquiring paid support.

Reply #5 - Sep 13th, 2019 at 3:30pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

Dear Felince,

Okay, thank you again for the feedback!

Reply #6 - Dec 30th, 2019 at 4:01pm

felince tokyo Offline
YaBB Newbies

Posts: 11

Hi, sorry for reviving this old post. I just wanted to ask which algorithm is used for the detection of repeats.

Thank you!

Reply #7 - Dec 31st, 2019 at 12:57pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

Hi Felince!

UGENE has its own implementation of the repeat-finding algorithm. Here is the source code: https://github.com/ugeneunipro/ugene/tree/master/src/plugins/repeat_finder. If you need a more detailed step-by-step description, please write.

P.S. Happy New Year!

Reply #8 - Jan 4th, 2020 at 12:51pm

felince tokyo Offline
YaBB Newbies

Posts: 11

Happy new year to you too Olga!

Thank you for the link. If you could give me a brief explanation of how UGENE finds the repeats, I would really appreciate it.

Reply #9 - Jan 9th, 2020 at 3:47pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

Hi Felince! Sorry for the delayed reply!

We have two algorithms to search for repeats:

- one uses a "suffix array" algorithm (https://en.wikipedia.org/wiki/Suffix_array) to compare different regions of a sequence,
- the other, "diagonals", uses a simple brute-force algorithm to do the same task.

These tasks are performed in parallel in multiple threads to improve performance.
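
To make the two strategies concrete, here is a minimal Python sketch of exact direct-repeat search done both ways. It is not UGENE's implementation (that is C++ in the repeat_finder plugin and is multi-threaded); the function names, the `(start1, start2, length)` result format and the naive suffix sorting are all assumptions made for illustration only.

```python
def extend_pair(seq, i, j):
    """Grow a matching pair of positions (i < j) into a maximal exact repeat."""
    while i > 0 and seq[i - 1] == seq[j - 1]:
        i, j = i - 1, j - 1
    length = 0
    while j + length < len(seq) and seq[i + length] == seq[j + length]:
        length += 1
    return (i, j, length)


def repeats_by_diagonals(seq, min_len):
    """Brute force: walk every diagonal of the self-comparison matrix.

    Diagonal d pairs position i with position i + d; a run of matches of
    length >= min_len along a diagonal is a direct repeat. O(n^2) time.
    """
    n = len(seq)
    hits = []
    for d in range(1, n):
        run = 0
        for i in range(n - d):
            if seq[i] == seq[i + d]:
                run += 1
                continue
            if run >= min_len:
                hits.append((i - run, i - run + d, run))
            run = 0
        if run >= min_len:  # run that reaches the end of the diagonal
            end = n - d
            hits.append((end - run, end - run + d, run))
    return sorted(hits)


def repeats_by_suffix_array(seq, min_len):
    """Suffix-array idea: sorted suffixes put equal words next to each other.

    Suffixes sharing their first min_len letters form one contiguous group;
    every pair inside a group is a seed that is extended to a maximal repeat.
    The naive sort below is for demonstration only and would not scale to a
    genome.
    """
    n = len(seq)
    sa = sorted(range(n - min_len + 1), key=lambda i: seq[i:])
    hits = set()
    group = []  # positions whose suffixes share the current min_len prefix
    for pos in sa:
        if group and seq[pos:pos + min_len] != seq[group[0]:group[0] + min_len]:
            group = []
        for other in group:
            hits.add(extend_pair(seq, min(pos, other), max(pos, other)))
        group.append(pos)
    return sorted(hits)


print(repeats_by_diagonals("ACGTTTACGTAA", 4))     # [(0, 6, 4)]
print(repeats_by_suffix_array("ACGTTTACGTAA", 4))  # [(0, 6, 4)]
```

Both functions report the same maximal repeats; they differ only in how candidate pairs are found. The outer loop over diagonals is independent per diagonal, which is the kind of work that can be split across threads.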

No matter which algorithm you use, there is:

- a pre-processing step that, depending on the settings, may exclude tandem repeats and search for inverted repeats,
- a post-processing step that may exclude nested repeats, for example, repeat regions contained in bigger repeat regions.

See the "Advanced" tab of the "Find Repeats" dialog.
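
The pre- and post-processing ideas can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not UGENE's code: the helper names are hypothetical, a repeat is assumed to be a `(start1, start2, length)` tuple, "tandem" is assumed to mean that the two copies overlap or touch, and "nested" is assumed to mean that both copies lie inside the corresponding copies of a longer repeat.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Inverted repeats can be searched by comparing seq with this string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_tandem(hit):
    """Assumed definition: the second copy starts before the first one ends,
    or immediately after it."""
    start1, start2, length = hit
    return start2 - start1 <= length


def drop_nested(hits):
    """Keep only repeats that are not contained in a longer repeat."""
    def inside(a, b):
        (a1, a2, a_len), (b1, b2, b_len) = a, b
        return (a_len < b_len
                and b1 <= a1 and a1 + a_len <= b1 + b_len
                and b2 <= a2 and a2 + a_len <= b2 + b_len)

    return [h for h in hits if not any(inside(h, other) for other in hits)]


hits = [(0, 100, 50), (10, 110, 20), (200, 300, 30), (400, 404, 4)]
hits = [h for h in hits if not is_tandem(h)]  # pre-filter: drops (400, 404, 4)
print(drop_nested(hits))                      # [(0, 100, 50), (200, 300, 30)]
print(reverse_complement("AACG"))             # CGTT
```

The pairwise check in `drop_nested` is quadratic in the number of hits, which is fine for a sketch but would need an interval index for large result sets.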

Reply #10 - Jan 11th, 2020 at 3:58pm

felince tokyo Offline
YaBB Newbies

Posts: 11

Wow, thank you so much for the explanation. When you select "auto" in the algorithm selection menu, which of the two does "auto" use? Or do both run in parallel?

Best regards,

Felipe

Reply #11 - Jan 11th, 2020 at 4:40pm

Olga Golosova Offline
YaBB Administrator

Posts: 338

"Suffix array algorithm" is used by default, i.e. in case of "auto".

Reply #12 - Jan 21st, 2020 at 8:49am

felince tokyo Offline
YaBB Newbies

Posts: 11

I see! Thank you for your response Olga!
