UGENE Forum
https://forum.ugene.net/forum/YaBB.pl
https://forum.ugene.net/forum/YaBB.pl?num=1422521083

Message started by Leiga on Jan 29th, 2015 at 3:44pm

Title: SPAdes - Mac, no tool
Post by Leiga on Jan 29th, 2015 at 3:44pm
Hi everyone.
I am a new user of your software and have a problem with genome assembly using SPAdes.
As the topic title says, I am using the application on a MacBook Pro. Following the instructions, I downloaded the external tools package (with SPAdes for 64-bit Mac), set the location of that folder in the settings, and tried to run a genome assembly. At the end I got this notification:

Quote:
Task {GenomeAssemblyMultiTask} finished with error: Subtask {GenomeAssemblyTask} is failed: Subtask {SPAdestool} is failed: Undefined tool: 'SPAdes'

The SPAdes folder is in that folder, but the Preferences/External tools list in UGENE still does not show any entry named "SPAdes".
Can you help me?
Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Jan 29th, 2015 at 7:17pm
Hi Leiga!

There was an issue in the release version of UGENE that has already been fixed.

Could you please download and use the snapshot UGENE version to run SPAdes? The snapshot package can be downloaded here: http://ugene.unipro.ru/snapshot.html

You will need to specify "SPAdes" and "python" external tools in the Application Settings.

Sorry for this inconvenience and thanks for your question!

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Jan 30th, 2015 at 5:53am
Dear Olga,
I think I have made some progress: I switched to the version you mentioned and changed the location of the external tools package in the preferences. Then I tried to assemble a genome:

Quote:
[ERROR][23:45] Spades: == Error == exception caught while parsing YAML file (/Biofizyka/UGENE/Bakteriofagi/JD/R15011/datasets.yaml):
[ERROR][23:45] Task {GenomeAssemblyMultiTask} finished with error: Subtask {GenomeAssemblyTask} is failed: Subtask {SPAdes tool} is failed: SPAdes tool exited with code 1

I am not sure where the mistake is now. I chose SPAdes with paired-end reads, left the properties at their defaults, and added an equal number of files for left and right reads (dataset: multi-cell; running mode: error correction and assembly; k-mer size: auto).
These are reads from an unknown bacteriophage (dsDNA), no larger than 200 kbp.

Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Jan 30th, 2015 at 5:53pm
Dear Leiga,

I was not able to reproduce the issue on our test data on the snapshot UGENE version (on Mac OS X 64-bit). I used, for example, the following sample data:
https://dl.dropboxusercontent.com/u/59362392/SRR1617092_1.fastq
https://dl.dropboxusercontent.com/u/59362392/SRR1617092_2.fastq

Could you please download the data and try to run SPAdes with them? The run may take ~0.5 hour.

Note that I used the following versions of the external tools:
- python 2.7
- SPAdes 3.1.1 (bin/spades.py file was specified as the tool)

Also, could you please send the YAML file (/Biofizyka/UGENE/Bakteriofagi/JD/R15011/datasets.yaml)?
And, if it is possible, it would be great if you could send some test files on which the issue is reproduced.

By the way, what was the sequencing platform?

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Jan 30th, 2015 at 7:16pm
I think it is just a bad day :)

I ran the app using your data:


Quote:
[ERROR][13:05] Mode: read error correction and assembling
[ERROR][13:05] Read error correction parameters:
[ERROR][13:05] ===== Read error correction started.
[ERROR][13:05] == Running read error correction tool: /Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer /Biofizyka/UGENE/Testowe/corrected/configs/config.info


Sequencing platform: NextSeq 500

yaml file:

Quote:
[
{
orientation: "fr",
type: "paired-end",
single reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq",
]
}
{
orientation: "fr",
type: "paired-end",
single reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq",
]
}
{
orientation: "fr",
type: "paired-end",
single reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq",
]
}
{
orientation: "fr",
type: "paired-end",
single reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq",
]
}
]

I hope the solution is easy to find and that it is just my mistake.
Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Jan 30th, 2015 at 8:44pm
Yep, this is definitely not the best day :)


Quote:
[ERROR][13:05] Mode: read error correction and assembling
[ERROR][13:05] Read error correction parameters:
[ERROR][13:05] ===== Read error correction started.
[ERROR][13:05] == Running read error correction tool: /Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer /Biofizyka/UGENE/Testowe/corrected/configs/config.info

These "errors" are not actually errors, so you can ignore this UGENE status. It happens because UGENE parses the log lines incorrectly. Notice that each of the log lines you posted contains the word "error".
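
To illustrate the misclassification described above, here is a minimal Python 3 sketch. It is not UGENE's actual parser, which this thread does not show; the naive substring rule is only an assumption about the kind of check that would produce this behaviour. The sample lines are copied from logs in this thread, and the stricter rule keys on the "== Error ==" marker that SPAdes prints for real failures.

```python
import re

# Sample lines copied from the SPAdes logs quoted in this thread.
INFO_LINES = [
    "Mode: read error correction and assembling",
    "Read error correction parameters:",
    "===== Read error correction started.",
]
REAL_ERROR = (
    "== Error ==  system call for: "
    "\"['/Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer', "
    "'/Biofizyka/SPADES/Bakteriofagi/R15011/corrected/configs/config.info']\" "
    "finished abnormally, err code: -9"
)

def naive_is_error(line):
    # ASSUMPTION: a substring check like this would explain the false alarms.
    return "error" in line.lower()

# Stricter rule: only lines that start with the "== Error ==" marker count.
ERROR_MARKER = re.compile(r"^\s*==\s*Error\s*==")

def marker_is_error(line):
    return ERROR_MARKER.search(line) is not None

naive_flags = [naive_is_error(line) for line in INFO_LINES + [REAL_ERROR]]
marker_flags = [marker_is_error(line) for line in INFO_LINES + [REAL_ERROR]]
print(naive_flags)   # every line is flagged, including the harmless ones
print(marker_flags)  # only the real failure is flagged
```

The informational lines mention "read error correction", which is why a plain substring test flags all of them.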

Has the pipeline with the sample data finished successfully anyway?

Thanks for the YAML file. We will investigate this problem.

Links to the issues in the bug-tracker:
* SPAdes error: exception caught while parsing YAML file:
   https://local.ugene.unipro.ru/tracker/browse/UGENE-3967

* SPAdes log is parsed incorrectly:
   https://local.ugene.unipro.ru/tracker/browse/UGENE-3966

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Jan 31st, 2015 at 12:54am
Yes, the assembly ran to completion and I finally got several contigs from your data.
After that I tried the same with my data and got the same log as before, but a different YAML file:


Quote:
[
{
orientation: "fr",
type: "paired-end",
left reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq",
],
right reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R2_001.fastq",
],
}
{
orientation: "fr",
type: "paired-end",
left reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq",
],
right reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R2_001.fastq",
],
}
{
orientation: "fr",
type: "paired-end",
left reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq",
],
right reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R2_001.fastq",
],
}
{
orientation: "fr",
type: "paired-end",
left reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq",
],
right reads: [
"/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R2_001.fastq",
],
}
]

Maybe there is a problem with the file names?
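
One thing that stands out in the quoted file is that the library entries are not separated by commas, which a YAML flow sequence requires. Whether this was the actual cause of the parsing error is not confirmed in this thread. As a hedged sketch, a dataset description with the same keys can be generated in Python 3 and serialized as JSON, which for simple data like this is also valid YAML flow syntax. The helper name `build_dataset` is hypothetical, and whether the parser bundled with SPAdes 3.1.1 accepts this exact output is an assumption worth testing.

```python
import json

def build_dataset(lanes, base_dir, sample_prefix):
    """Return one paired-end library entry per sequencing lane."""
    libraries = []
    for lane in lanes:
        stem = "{}/{}_L{:03d}".format(base_dir, sample_prefix, lane)
        libraries.append({
            "orientation": "fr",
            "type": "paired-end",
            "left reads": [stem + "_R1_001.fastq"],
            "right reads": [stem + "_R2_001.fastq"],
        })
    return libraries

# Paths follow the naming used in the quoted file.
libraries = build_dataset([3, 4, 2, 1], "/Biofizyka/BaseSpace/R15011", "R15011_S11")

# json.dumps always separates entries with commas and never emits trailing ones.
yaml_text = json.dumps(libraries, indent=2)
print(yaml_text)
```

Writing `yaml_text` to `datasets.yaml` would give a file in which every left-reads file is paired with the right-reads file from the same lane.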

Title: Re: SPAdes - Mac, no tool
Post by Yuriy Vaskin on Feb 3rd, 2015 at 7:05pm
Dear Leiga,

It looks like there is a bug with interlaced paired-read processing. It will be fixed in the next UGENE release thanks to your help.

For now you could try using the paired-end mode with demultiplexed left and right reads.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Feb 3rd, 2015 at 7:34pm
Dear Leiga,

Alternatively, you can use a new snapshot that includes the fix. I will write to you as soon as it is ready, probably tomorrow.
Sorry again for these problems!

Olga

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Feb 4th, 2015 at 10:17pm
Dear Leiga,

Could you please try the new snapshot (the revision must be >= 9049)?
It can be downloaded from: http://ugene.unipro.ru/snapshot.html

Thank you very much!

Kind regards,
Olga

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Feb 6th, 2015 at 1:49am
Dear Olga,
below you can find the log from a new run with the new snapshot:

Quote:
[INFO][15:32] UGENE started
[INFO][15:32] UGENE version: 1.16.0-dev 64-bit
[INFO][15:32] UGENE distribution: portable
[INFO][15:32] Starting {Check for updates} task
[INFO][15:32] Task {Check for updates} finished
[ERROR][19:40] Spades: == Error == system call for: "['/Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer', '/Biofizyka/UGENE/Bakteriofagi/JD/R15011/corrected/configs/config.info']" finished abnormally, err code: -9
[ERROR][19:40] == Error == system call for: "['/Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer', '/Biofizyka/UGENE/Bakteriofagi/JD/R15011/corrected/configs/config.info']" finished abnormally, err code: -9
[ERROR][19:40] Task {GenomeAssemblyMultiTask} finished with error: Subtask {GenomeAssemblyTask} is failed: Subtask {SPAdes tool} is failed: SPAdes tool exited with code 1

As you can see, it took some time, and the computer was very unstable. I had not seen that before: during alignment with Bowtie the CPU usage of the UGENE app stayed at 90-98%, but this time it was strange, going up and down like sea waves, between 2% and 90% for the hammer process.
Regards,
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Feb 6th, 2015 at 8:58pm
Dear Leiga,

We are not able to reproduce the error on our test data.

It seems that the problem is now in the SPAdes tool itself (http://bioinf.spbau.ru/spades), which is integrated into UGENE for assembling sequencing data, and that the error occurs only on particular data. As can be seen from the log, the error happened several hours after you launched the tool.
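
A note on reading "err code: -9" in the log: on POSIX systems, Python's subprocess module reports a negative return code -N when the child process was terminated by signal N. Assuming the SPAdes wrapper script passes that value through unchanged (the thread does not show its source), -9 would mean that hammer was killed by signal 9. The Python 3 sketch below decodes such codes; the helper name `describe_exit` is hypothetical.

```python
import signal

def describe_exit(returncode):
    """Translate a subprocess-style return code into a readable message."""
    if returncode < 0:
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return "terminated by signal {} ({})".format(signum, name)
    if returncode == 0:
        return "exited normally"
    return "exited with status {}".format(returncode)

print(describe_exit(-9))
print(describe_exit(1))
```

Signal 9 is SIGKILL, which a process cannot catch. It is often, though not always, sent by the operating system when memory runs out, so it would be worth watching memory usage during the run.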

Could you please try to launch SPAdes from the command line to make sure that this error is indeed inside the SPAdes algorithm?

Here are the instructions:
1) Create a new folder.
2) Put into the folder:
    2.1) the input files ("R15011_S11_L003_R1_001.fastq", etc.).
    2.2) the "/Biofizyka/UGENE/Bakteriofagi/JD/R15011/datasets.yaml" file.
3) Launch "Terminal" on Mac OS.
4) In the terminal, change directory to the created folder. To do so, enter:
   
Code:
cd /path/to/new/folder

5) Launch SPAdes from the terminal. To do so, enter:
   
Code:
python /path/to/spades/spades.py --dataset datasets.yaml -t 2 -o /path/to/output
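
The same command can also be assembled programmatically, which makes it easier to vary the thread count or to set an explicit memory limit. The sketch below (Python 3) only builds and prints the command; it does not run it. The `-m` option appears in the SPAdes command lines logged later in this thread, where it is set to 250. To my understanding of the SPAdes manual, it is a limit in Gb at which SPAdes stops, not a guarantee that the run fits in memory. The helper name `spades_command` and the value 16 are hypothetical.

```python
import shlex

def spades_command(spades_py, dataset_yaml, out_dir, threads, memory_gb):
    """Build the argument list for a SPAdes run from a dataset YAML file."""
    return [
        "python", spades_py,
        "--dataset", dataset_yaml,
        "-t", str(threads),
        "-m", str(memory_gb),
        "-o", out_dir,
    ]

argv = spades_command(
    "/path/to/spades/spades.py",  # placeholder path, as in the command above
    "datasets.yaml",
    "/path/to/output",
    threads=2,
    memory_gb=16,                 # ASSUMPTION: set this to the machine's real RAM
)
# Shell-quoted form, safe to paste into a terminal.
command_line = " ".join(shlex.quote(arg) for arg in argv)
print(command_line)
```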


Kind regards,
Olga

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 4th, 2015 at 7:16pm
The log from the newest UGENE release and the log from the command line are the same...

UGENE 1.16.0


Quote:
Command line: /Volumes/Unipro UGENE 1.16.0/Unipro UGENE.app/Contents/MacOS/tools/SPAdes-3.1.1/bin/spades.py --dataset /Biofizyka/UGENE/Bakteriofagi/JD/R15011/datasets.yaml -t 4 -m 250 -o /Biofizyka/UGENE/Bakteriofagi/JD/R15011

System information:
  SPAdes version: 3.1.1
  Python version: 2.7.5
  OS: Darwin-14.0.0-x86_64-i386-64bit

Output dir: /Biofizyka/UGENE/Bakteriofagi/JD/R15011
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Multi-cell mode (you should set '--sc' flag if input data was obtained with MDA (single-cell) technology
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 2, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 3, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 4, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed (with gzip)
Assembly parameters:
  k: automatic selection based on read length
  Mismatch careful mode is turned OFF
  Repeat resolution is enabled
  MismatchCorrector will be SKIPPED
Other parameters:
  Dir for temp files: /Biofizyka/UGENE/Bakteriofagi/JD/R15011/tmp
  Threads: 4
  Memory limit (in Gb): 250


======= SPAdes pipeline started. Log can be found here: /Biofizyka/UGENE/Bakteriofagi/JD/R15011/spades.log


===== Read error correction started.


== Running read error correction tool: /Volumes/Unipro UGENE 1.16.0/Unipro UGENE.app/Contents/MacOS/tools/SPAdes-3.1.1/bin/hammer /Biofizyka/UGENE/Bakteriofagi/JD/R15011/corrected/configs/config.info

   0:00:00.000    4M /    4M   INFO  General                 (main.cpp                  :  82)   Loading config from /Biofizyka/UGENE/Bakteriofagi/JD/R15011/corrected/configs/config.info
   0:00:00.001    4M /    4M   INFO  General                 (memory_limit.hpp          :  42)   Memory limit set to 250 Gb
   0:00:00.001    4M /    4M   INFO  General                 (main.cpp                  :  91)   Trying to determine PHRED offset
   0:00:00.001    4M /    4M   INFO  General                 (main.cpp                  :  97)   Determined value is 33
   0:00:00.001    4M /    4M   INFO  General                 (hammer_tools.cpp          :  36)   Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
     === ITERATION 0 begins ===
   0:00:00.001    4M /    4M   INFO K-mer Index Building     (kmer_index.hpp            : 467)   Building kmer index
   0:00:00.001    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 127)   Splitting kmer instances into 64 buckets. This might take a while.
   0:00:00.001    4M /    4M   INFO  General                 (file_limit.hpp            :  29)   Open file limit set to 7168
   0:00:00.001    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 145)   Memory available for splitting buffers: 20.833 Gb
   0:00:00.001    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 153)   Using cell size of 1048576
   0:00:00.002    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq
   0:00:15.702    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 960047 reads
   0:00:29.729    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 1932674 reads
   0:00:39.973    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 2686738 reads
   0:00:39.973    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L004_R2_001.fastq
   0:00:53.462    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 3639774 reads
   0:01:07.661    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 4605240 reads
   0:01:19.003    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 5398992 reads
   0:01:19.003    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq
   0:01:31.349    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 6357087 reads
   0:01:43.646    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 7328546 reads
   0:01:53.383    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L003_R2_001.fastq
   0:02:06.050    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 9072191 reads
   0:02:29.045    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq
   0:03:05.231    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L002_R2_001.fastq
   0:03:39.098    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq
   0:03:51.803    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 16827403 reads
   0:04:11.342    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L001_R2_001.fastq
   0:04:45.876    3G /    3G   INFO K-mer Splitting          (kmer_data.cpp             : 181)   Processed 20718382 reads
   0:04:45.951   16M /    3G   INFO  General                 (kmer_index.hpp            : 345)   Starting k-mer counting.
   0:10:19.740   16M /    3G   INFO  General                 (kmer_index.hpp            : 351)   K-mer counting done. There are 398806740 kmers in total.
   0:10:19.740   16M /    3G   INFO  General                 (kmer_index.hpp            : 353)   Merging temporary buckets.
   0:11:56.680   16M /    3G   INFO K-mer Index Building     (kmer_index.hpp            : 476)   Building perfect hash indices
   0:14:57.270  148M /    3G   INFO  General                 (kmer_index.hpp            : 371)   Merging final buckets.
   0:15:34.227  148M /    3G   INFO K-mer Index Building     (kmer_index.hpp            : 515)   Index built. Total 137962246 bytes occupied (2.7675 bits per kmer).
   0:15:34.464  148M /    3G   INFO K-mer Counting           (kmer_data.cpp             : 266)   Arranging kmers in hash map order
   0:18:22.253  148M /    3G   INFO K-mer Counting           (kmer_data.cpp             : 279)   Done. Total swaps: 398806495
   0:18:27.338    3G /    3G   INFO  General                 (main.cpp                  : 151)   Clustering Hamming graph.
   0:18:27.356    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 115)   Serializing sub-kmers.
   0:18:27.356    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 120)   Serializing: [0, 10)
   0:20:21.479    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 120)   Serializing: [10, 21)
   0:22:02.104    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 129)   Splitting sub-kmers, pass 1.
   0:32:11.949    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 134)   Splitting done. Processed 2 blocks. Produced 5242822 blocks.
   0:32:11.952    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 145)   Merge sub-kmers, pass 1
   0:52:47.891    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 170)   Merge done, total 3618766 new blocks generated.
   0:52:50.299    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 175)   Spliting sub-kmers, pass 2.
   1:08:02.025    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 180)   Splitting done. Processed 7237532 blocks. Produced 850399815 blocks.
   1:08:02.025    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 187)   Merge sub-kmers, pass 2
   1:42:50.713    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 205)   Merge done, saw 440263 big blocks out of 850399815 processed.


== Error ==  system call for: "['/Volumes/Unipro UGENE 1.16.0/Unipro UGENE.app/Contents/MacOS/tools/SPAdes-3.1.1/bin/hammer', '/Biofizyka/UGENE/Bakteriofagi/JD/R15011/corrected/configs/config.info']" finished abnormally, err code: -9

In case you have troubles running SPAdes, you can write to spades.support@bioinf.spbau.ru
Please provide us with params.txt and spades.log files from the output directory.



and directly from the command line:


Quote:
MacBook-Pro-Dominik:R15011 Dominik$ python /Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/spades.py --dataset datasets.yaml -t 2 -o /Biofizyka/SPADES/Bakteriofagi/R15011
Command line: /Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/spades.py --dataset datasets.yaml -t 2 -o /Biofizyka/SPADES/Bakteriofagi/R15011

System information:
  SPAdes version: 3.1.1
  Python version: 2.7.6
  OS: Darwin-14.0.0-x86_64-i386-64bit

Output dir: /Biofizyka/SPADES/Bakteriofagi/R15011
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Multi-cell mode (you should set '--sc' flag if input data was obtained with MDA (single-cell) technology
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 2, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L003_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 3, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L002_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
    Library number: 4, library type: paired-end
      orientation: fr
      left reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq']
      right reads: ['/Biofizyka/BaseSpace/R15011/R15011_S11_L001_R2_001.fastq']
      interlaced reads: not specified
      single reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed (with gzip)
Assembly parameters:
  k: automatic selection based on read length
  Mismatch careful mode is turned OFF
  Repeat resolution is enabled
  MismatchCorrector will be SKIPPED
Other parameters:
  Dir for temp files: /Biofizyka/SPADES/Bakteriofagi/R15011/tmp
  Threads: 2
  Memory limit (in Gb): 250


======= SPAdes pipeline started. Log can be found here: /Biofizyka/SPADES/Bakteriofagi/R15011/spades.log


===== Read error correction started.


== Running read error correction tool: /Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer /Biofizyka/SPADES/Bakteriofagi/R15011/corrected/configs/config.info

   0:00:00.000    4M /    4M   INFO  General                 (main.cpp                  :  82)   Loading config from /Biofizyka/SPADES/Bakteriofagi/R15011/corrected/configs/config.info
   0:00:00.017    4M /    4M   INFO  General                 (memory_limit.hpp          :  42)   Memory limit set to 250 Gb
   0:00:00.017    4M /    4M   INFO  General                 (main.cpp                  :  91)   Trying to determine PHRED offset
   0:00:00.033    4M /    4M   INFO  General                 (main.cpp                  :  97)   Determined value is 33
   0:00:00.033    4M /    4M   INFO  General                 (hammer_tools.cpp          :  36)   Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
     === ITERATION 0 begins ===
   0:00:00.034    4M /    4M   INFO K-mer Index Building     (kmer_index.hpp            : 467)   Building kmer index
   0:00:00.034    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 127)   Splitting kmer instances into 32 buckets. This might take a while.
   0:00:00.034    4M /    4M   INFO  General                 (file_limit.hpp            :  29)   Open file limit set to 2560
   0:00:00.034    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 145)   Memory available for splitting buffers: 41.666 Gb
   0:00:00.034    4M /    4M   INFO K-mer Splitting          (kmer_data.cpp             : 153)   Using cell size of 2097152
   0:00:00.034    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L003_R1_001.fastq
   0:00:09.269    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 328870 reads
   0:00:18.315    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 656668 reads
   0:00:27.492    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 983813 reads
   0:00:36.299    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 1314708 reads
   0:00:45.175    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 1650229 reads
   0:00:54.090    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 1983860 reads
   0:01:02.950    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 2321472 reads
   0:01:12.558    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L004_R2_001.fastq
   0:01:57.690    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 4323327 reads
   0:02:26.490    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L004_R1_001.fastq
   0:03:40.680    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L003_R2_001.fastq
   0:03:49.700    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 8438882 reads
   0:04:53.694    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L002_R1_001.fastq
   0:06:05.238    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L002_R2_001.fastq
   0:07:16.456    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L001_R1_001.fastq
   0:07:44.341    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 176)   Processed 16849411 reads
   0:08:24.308    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 167)   Processing /Biofizyka/BaseSpace/R15011/R15011_S11_L001_R2_001.fastq
   0:09:31.820    1G /    1G   INFO K-mer Splitting          (kmer_data.cpp             : 181)   Processed 20718382 reads
   0:09:31.848    8M /    1G   INFO  General                 (kmer_index.hpp            : 345)   Starting k-mer counting.
   0:14:34.247    8M /    1G   INFO  General                 (kmer_index.hpp            : 351)   K-mer counting done. There are 398806740 kmers in total.
   0:14:34.247    8M /    1G   INFO  General                 (kmer_index.hpp            : 353)   Merging temporary buckets.
   0:19:29.229    8M /    1G   INFO K-mer Index Building     (kmer_index.hpp            : 476)   Building perfect hash indices
   0:24:23.249  144M /    1G   INFO  General                 (kmer_index.hpp            : 371)   Merging final buckets.
   0:25:12.327  144M /    1G   INFO K-mer Index Building     (kmer_index.hpp            : 515)   Index built. Total 137962246 bytes occupied (2.7675 bits per kmer).
   0:25:12.583  144M /    1G   INFO K-mer Counting           (kmer_data.cpp             : 266)   Arranging kmers in hash map order
   0:30:51.874  144M /    1G   INFO K-mer Counting           (kmer_data.cpp             : 279)   Done. Total swaps: 398806450
   0:30:54.280    3G /    3G   INFO  General                 (main.cpp                  : 151)   Clustering Hamming graph.
   0:30:54.312    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 115)   Serializing sub-kmers.
   0:30:54.312    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 120)   Serializing: [0, 10)
   0:32:52.293    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 120)   Serializing: [10, 21)
   0:34:31.253    3G /    3G   INFO Hamming Clustering       (hamcluster.cpp            : 129)   Splitting sub-kmers, pass 1.
   0:43:42.177    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 134)   Splitting done. Processed 2 blocks. Produced 5242822 blocks.
   0:43:42.190    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 145)   Merge sub-kmers, pass 1
   1:01:29.652    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 170)   Merge done, total 3618766 new blocks generated.
   1:01:31.586    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 175)   Spliting sub-kmers, pass 2.
   1:15:03.447    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 180)   Splitting done. Processed 7237532 blocks. Produced 850399815 blocks.
   1:15:03.474    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 187)   Merge sub-kmers, pass 2
   1:54:34.969    3G /    9G   INFO Hamming Clustering       (hamcluster.cpp            : 205)   Merge done, saw 440263 big blocks out of 850399815 processed.


== Error ==  system call for: "['/Biofizyka/UGENE/ext_tools_mac_64-bit/SPAdes-3.1.1/bin/hammer', '/Biofizyka/SPADES/Bakteriofagi/R15011/corrected/configs/config.info']" finished abnormally, err code: -9

In case you have troubles running SPAdes, you can write to spades.support@bioinf.spbau.ru
Please provide us with params.txt and spades.log files from the output directory.
MacBook-Pro-Dominik:R15011 Dominik$

So... the problem still exists. Maybe it is time to switch to another dataset.
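
One detail in both logs above: library 1 pairs the L003 left reads with the L004 right reads, and library 2 pairs L004 with L003, while libraries 3 and 4 pair files from the same lane. Whether this contributed to the failure is not established in this thread. The Python 3 sketch below checks such pairings from the file names; the helper names are hypothetical and the regular expression assumes the `_L###_R#_` naming seen in these logs.

```python
import re

# Matches the lane and read tags, for example "_L003_R1_".
LANE_TAG = re.compile(r"_L(\d{3})_R([12])_")

def lane_and_read(path):
    match = LANE_TAG.search(path)
    if match is None:
        raise ValueError("no lane/read tag in " + path)
    return match.group(1), match.group(2)

def mismatched_libraries(libraries):
    """Return the 1-based numbers of libraries whose files do not form a lane pair."""
    bad = []
    for number, (left, right) in enumerate(libraries, start=1):
        left_lane, left_read = lane_and_read(left)
        right_lane, right_read = lane_and_read(right)
        if left_lane != right_lane or (left_read, right_read) != ("1", "2"):
            bad.append(number)
    return bad

BASE = "/Biofizyka/BaseSpace/R15011/R15011_S11_"
# Left/right pairs exactly as listed under "Dataset parameters" in the logs.
LIBRARIES = [
    (BASE + "L003_R1_001.fastq", BASE + "L004_R2_001.fastq"),
    (BASE + "L004_R1_001.fastq", BASE + "L003_R2_001.fastq"),
    (BASE + "L002_R1_001.fastq", BASE + "L002_R2_001.fastq"),
    (BASE + "L001_R1_001.fastq", BASE + "L001_R2_001.fastq"),
]
print(mismatched_libraries(LIBRARIES))  # [1, 2]
```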

Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Mar 4th, 2015 at 7:41pm
Leiga, thanks for testing!

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 4th, 2015 at 8:24pm
So, I will switch to FASTQ files that give me a nice contig in the CLC software and test them with UGENE/SPAdes.
I think I cannot test it on Windows because there is no SPAdes tool for it. Am I correct?
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Mar 4th, 2015 at 9:25pm

Quote:
I think I cannot test it on Windows because there is no SPAdes tool for it. Am I correct?

Yes, there is currently no SPAdes for Windows.


Quote:
So, I will switch to FASTQ files that give me a nice contig in the CLC software and test them with UGENE/SPAdes.

Let me sum it up:
* It is an error that you can't run SPAdes on your data (R15011_S11_L003_R1_001.fastq and the other files).
* If possible, please share the data with us. It would help a lot in investigating the problem. Currently we are not able to reproduce the error on our test data.
* We will contact the SPAdes developers about the error and, I hope, it will be fixed in the future.
* SPAdes can be run successfully on other FASTQ files, so sure, you may try it with other input data.

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 5th, 2015 at 6:31pm
So,
I have changed the source FASTQ files (a different phage) and, after a successful assembly with CLC, I tried to do the same with SPAdes.
The log is below:

Quote:
[ERROR][11:11] Spades: == Error == system call for: "['/Volumes/Unipro UGENE 1.16.0/Unipro UGENE.app/Contents/MacOS/tools/SPAdes-3.1.1/bin/hammer', '/Biofizyka/UGENE/Bakteriofagi/JD/ZK1_01/corrected/configs/config.info']" finished abnormally, err code: -6
[ERROR][11:11] == Error == system call for: "['/Volumes/Unipro UGENE 1.16.0/Unipro UGENE.app/Contents/MacOS/tools/SPAdes-3.1.1/bin/hammer', '/Biofizyka/UGENE/Bakteriofagi/JD/ZK1_01/corrected/configs/config.info']" finished abnormally, err code: -6
[ERROR][11:12] Task {GenomeAssemblyMultiTask} finished with error: Subtask {GenomeAssemblyTask} is failed: Subtask {SPAdes tool} is failed: SPAdes tool exited with code 1


If you can register on BaseSpace (Illumina), I think I will be able to share the whole project with you, and you will get access to the raw data from one of the sequenced phages.

Regards,
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Mar 10th, 2015 at 2:21pm
Hi, we have just registered on BaseSpace. The account is ugene@unipro.ru.

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 10th, 2015 at 9:25pm
Hey,
can you check it? I have tried to share the project with you. Can you try to download the data from BaseSpace?
There is another option (transferring ownership), but that is not what I want to do :)
Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 20th, 2015 at 7:17pm
Hi Olga,
Were you able to download the data and look into the problem?
Regards
L.

Title: Re: SPAdes - Mac, no tool
Post by Olga Golosova on Mar 24th, 2015 at 5:35pm
Hi Leiga,

Yes, we downloaded the data, thank you!

We are currently trying to reproduce the issue. We launched SPAdes on the test data about 1.5 days ago and it is still running. I will write to you as soon as it finishes.

Could you please confirm that all the FASTQ files are paired-end reads from the same experiment?

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Mar 27th, 2015 at 6:05pm
Yes,
I hope you were able to download the proper data. What are the names of the files you used?
Regards,
L.

Title: Re: SPAdes - Mac, no tool
Post by Ivan Protsyuk on Mar 31st, 2015 at 12:54pm
Hi Leiga,
I'm a member of the UGENE team, and I worked on your issue along with Olga. We did manage to download your data from the BaseSpace platform. An example file name is "ZK1-01_S10_L001_R1_001.fastq.gz"; the other file names differ only in a couple of numbers.

As for SPAdes, we ran it via UGENE with your data on several desktop computers: two iMacs and one Linux machine. In two cases it finished with errors similar to the ones you posted, and on one of the iMacs it had not finished after 4 days. Finally, we launched SPAdes on a server with two 6-core CPUs and 64 GB of RAM, and it finished successfully in 18 hours. So we concluded that a desktop computer is not powerful enough to run SPAdes on data like yours.

If you want, we can share the SPAdes output using BaseSpace and the account name strapag@biol.uni.lodz.pl.

Title: Re: SPAdes - Mac, no tool
Post by Leiga on Apr 1st, 2015 at 1:59am
So,
do you think it is because of:
1. something specific to the NextSeq data (I mean the "new" instrument);
2. oversequencing (as you have probably seen, this phage genome is heavily oversequenced :));
3. the PC specification only?


I will try the tool again. I now have more optimised conditions for sequencing small genomes on that instrument (0.5-0.8 Gb of data).
I will tell you about the results later...


And yes, please send me your results using the Illumina account.
Regards,
L.

Title: Re: SPAdes - Mac, no tool
Post by Ivan Protsyuk on Apr 1st, 2015 at 7:04pm
I suppose that the issues with SPAdes are caused by the hardware configuration, since in the end it succeeded in assembling your data on a powerful computer. Probably there would be no problem if the genome were not oversequenced to this extent.
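
To put a rough number on the oversequencing, the depth of the R15011 run earlier in this thread can be estimated from figures in its log. This is a back-of-the-envelope Python 3 sketch under stated assumptions: the read count and the genome size bound come from the thread, while the read length of 150 bp and the target depth of 100x are illustrative assumptions that nobody in the thread states.

```python
def coverage(total_reads, read_length_bp, genome_size_bp):
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return total_reads * read_length_bp / float(genome_size_bp)

def keep_fraction(current_coverage, target_coverage):
    """Fraction of reads to keep to reach the target depth (capped at 1.0)."""
    return min(1.0, target_coverage / current_coverage)

TOTAL_READS = 20718382    # "Processed 20718382 reads" in the hammer log
GENOME_SIZE_BP = 200000   # upper bound given for the phage genome
READ_LENGTH_BP = 150      # ASSUMPTION: not stated in the thread
TARGET_COVERAGE = 100     # ASSUMPTION: illustrative target only

depth = coverage(TOTAL_READS, READ_LENGTH_BP, GENOME_SIZE_BP)
fraction = keep_fraction(depth, TARGET_COVERAGE)
print("estimated depth: about {:.0f}x".format(depth))
print("fraction of reads for the target depth: {:.4f}".format(fraction))
```

Under these assumptions the depth comes out in the tens of thousands, so a small fraction of the reads would already cover the genome many times over.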

Unfortunately, BaseSpace doesn't allow uploading arbitrary files, so I put the data produced by SPAdes on Google Drive. You can access it here: http://bit.ly/1C78Eft. Please let me know if you have any questions about it.
