UGENE Forum
https://forum.ugene.net/forum/YaBB.pl
General Category >> Help and How-to >> Use dedicated graphics to find repeats
https://forum.ugene.net/forum/YaBB.pl?num=1568098726

Message started by felince tokyo on Sep 10th, 2019 at 1:58pm

Title: Use dedicated graphics to find repeats
Post by felince tokyo on Sep 10th, 2019 at 1:58pm
Hi!

I normally use UGENE to locate genome-wide direct and inverted repeats in bacterial genomes. I can obtain the results without any problem, but the calculation takes an inconveniently long time (several hours, depending on the genome size). My laptop has an Intel Core i7-8550U with 16 GB of RAM, and I was wondering whether I could speed up this calculation by using the dedicated graphics chip (Nvidia GeForce MX150). Judging by the options, CUDA seems to be able to work with this graphics chip, but I think it is not being used for this specific calculation. Could you please tell me how to use it for this?

Thank you  :)

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Sep 10th, 2019 at 2:32pm
Hi Felince!

Thank you very much for the feedback! It was useful.

There is currently no special optimization for a single input sequence except multi-threading. Note that video card calculations are currently supported for two algorithms only: Smith-Waterman and UGENE Genome Aligner.

Could you please send us some data for which calculation takes several hours?

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Sep 11th, 2019 at 7:09am
Hi Olga, thank you for your response.

This is one of the sequences that I'm analyzing (genbank file):

https://www.ncbi.nlm.nih.gov/nuccore/AP013068

I attached the parameters and the log. As you can see, it took more than 4 hours with the processor at 100% usage, although memory usage was about 3 GB.
Attachments: repeats.jpg (84 KB), log.jpg (22 KB)

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Sep 12th, 2019 at 4:20pm
Thank you, Felince!

On my computer, this task also took 3 hours.

I think this task is a good candidate for parallelization on a video card. We'll think about this and maybe add some optimization in the future. However, if you'd like the future to come as soon as possible  :), please consider using our commercial support services.

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Sep 13th, 2019 at 3:17pm
Dear Olga,

Thank you very much for taking this into consideration. It would be great to receive direct support from the UGENE developers. Depending on how this analysis develops further, we might consider acquiring paid support.

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Sep 13th, 2019 at 3:30pm
Dear Felince,

Okay, thank you again for the feedback!

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Dec 30th, 2019 at 4:01pm
Hi, sorry for reviving this old post. I just wanted to ask which algorithm is used for the detection of repeats.

Thank you!

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Dec 31st, 2019 at 12:57pm
Hi Felince!

UGENE has its own implementation of the repeat-finding algorithm. Here is the source code: https://github.com/ugeneunipro/ugene/tree/master/src/plugins/repeat_finder. If you need a more detailed, step-by-step description, please write.

P.S. Happy New Year!  :)

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Jan 4th, 2020 at 12:51pm
Happy new year to you too Olga!

Thank you for the link. If you could please give me a brief explanation of how UGENE finds the repeats, I would really appreciate it.

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Jan 9th, 2020 at 3:47pm
Hi Felince! Sorry for the delayed reply!

We have two algorithms to search for repeats:

  • one uses a suffix array algorithm (https://en.wikipedia.org/wiki/Suffix_array) to compare different regions of a sequence,
  • the other one, "diagonals", uses a simple brute-force algorithm to do the same task.

These tasks are performed in parallel in multiple threads to improve performance.

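To illustrate the difference between the two approaches, here is a minimal Python sketch. It is not UGENE's implementation (see the linked source for that); the function names `seeds_suffix_array` and `seeds_diagonal` and the seed length `k` are made up for the example. Both functions return every pair of positions whose `k`-letter words are identical, which is the starting point for finding exact direct repeats.

```python
from itertools import combinations


def seeds_suffix_array(seq, k):
    """Pairs (i, j), i < j, whose k-mers are identical, found via a suffix array."""
    # Naive suffix array: start positions sorted by the suffix they begin.
    sa = sorted(range(len(seq)), key=lambda i: seq[i:])
    pairs, group = set(), []
    for pos in sa:
        if len(seq) - pos < k:
            continue  # too close to the end to start a k-mer
        # Suffixes that share their first k letters are adjacent in sorted order.
        if group and seq[group[0]:group[0] + k] != seq[pos:pos + k]:
            pairs.update(combinations(sorted(group), 2))
            group = []
        group.append(pos)
    pairs.update(combinations(sorted(group), 2))
    return pairs


def seeds_diagonal(seq, k):
    """Same result by brute force: walk every diagonal of the self-comparison matrix."""
    n = len(seq)
    pairs = set()
    for d in range(1, n):  # d = j - i identifies one diagonal
        run = 0  # number of consecutive matching letters ending at position i
        for i in range(n - d):
            run = run + 1 if seq[i] == seq[i + d] else 0
            if run >= k:
                pairs.add((i - k + 1, i - k + 1 + d))
    return pairs


example = "ACGTACGTTTACG"
print(sorted(seeds_suffix_array(example, 3)))
print(sorted(seeds_diagonal(example, 3)))
```

The diagonal version compares every pair of positions, so its cost grows with the square of the sequence length, while the suffix array version only compares neighbours after sorting. The naive sort used here is itself slow on real genomes; a production implementation would build the suffix array with a dedicated algorithm.
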
No matter which algorithm you use, there is:

  • a pre-processing step that, depending on the settings, may exclude tandem repeats and search for inverted repeats,
  • a post-processing step that may exclude nested repeats, for example, repeat regions that lie inside bigger repeat regions that contain them.

See the "Advanced" tab of the "Find Repeats" dialog.
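
As a rough illustration of these two steps, the Python sketch below shows one way they could look. It is only a sketch under assumptions, not UGENE's code: the `(start1, start2, length)` tuple format and the names `reverse_complement`, `is_nested` and `drop_nested` are invented for the example, and "nested" is taken to mean that both copies of a repeat lie inside the corresponding copies of a longer repeat.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A<->T, C<->G, then reversed)."""
    # Searching for direct matches against this string is one way to look
    # for inverted repeats in the original sequence.
    return seq.translate(COMPLEMENT)[::-1]


def is_nested(inner, outer):
    """True if both copies of `inner` lie inside the matching copies of `outer`."""
    (a1, a2, alen), (b1, b2, blen) = inner, outer
    return (alen < blen
            and b1 <= a1 and a1 + alen <= b1 + blen
            and b2 <= a2 and a2 + alen <= b2 + blen)


def drop_nested(repeats):
    """Keep only repeats that are not nested inside a longer repeat."""
    return [r for r in repeats if not any(is_nested(r, o) for o in repeats)]


# Assumed format: (start of first copy, start of second copy, length).
found = [(0, 10, 8), (2, 12, 4), (20, 40, 5)]
print(reverse_complement("AACG"))
print(drop_nested(found))
```

In this example the repeat `(2, 12, 4)` is dropped because both of its copies fall inside the copies of `(0, 10, 8)`.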

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Jan 11th, 2020 at 3:58pm
Wow, thank you so much for the explanation. When you select "auto" in the algorithm selection menu, which of the two algorithms is used? Or do both of them run in parallel?

Best regards,

Felipe

Title: Re: Use dedicated graphics to find repeats
Post by Olga Golosova on Jan 11th, 2020 at 4:40pm
The suffix array algorithm is used by default, i.e. when "auto" is selected.

Title: Re: Use dedicated graphics to find repeats
Post by felince tokyo on Jan 21st, 2020 at 8:49am
I see! Thank you for your response, Olga!
