UGENE Forum - UGENE-produced gb files cannot be parsed in biopython or bioperl


Sep 20^th, 2009 at 2:32pm

vigil Offline
YaBB Newbies

Posts: 9

Thank you again for a great program, soon to be cited in my paper!

UGENE writes the hits it finds to gb files, for instance when searching for repeats or pattern matches. I've been trying to process these results further by parsing them with BioPerl and Biopython, but the parsers in both libraries are incompatible with the GenBank output I get from UGENE.

I notice I'm using 1.4.1 (on Windows, 1.4 GHz single processor, tc4200 laptop), not the most recent version, but I can't see anything about this problem in the later bug fixes.

The perl code that produced the latest error looked like this:

Code:

use Bio::SeqIO;

# This file comes from UGENE's find pattern tool:

my $filename = "C:/Documents and Settings/Carl/My Documents/NewBioInf/myGB/hitsOniS3.gb";

# Open the file as GenBank and read the first record.
my $seqio_obj = Bio::SeqIO->new(-file => $filename, -format => "genbank");
my $seq_obj = $seqio_obj->next_seq;

And here's the error:
Code:

C:\Documents and Settings\Carl>perl readGB.pl

--------------------- WARNING ---------------------
MSG: Unknown alphabet:
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Unexpected error in feature table for  Skipping feature, attempting to recover
---------------------------------------------------

------------- EXCEPTION -------------
MSG: Alphabet '1' is not a valid alphabet ('dna','protein','rna') lowercase
STACK Bio::PrimarySeq::alphabet C:/Perl/site/lib/Bio/PrimarySeq.pm:571
STACK Bio::PrimarySeq::new C:/Perl/site/lib/Bio/PrimarySeq.pm:208
STACK Bio::Seq::new C:/Perl/site/lib/Bio/Seq.pm:484
STACK Bio::Seq::RichSeq::new C:/Perl/site/lib/Bio/Seq/RichSeq.pm:110
STACK Bio::Seq::SeqFactory::create C:/Perl/site/lib/Bio/Seq/SeqFactory.pm:116
STACK Bio::Factory::ObjectFactoryI::create_object C:/Perl/site/lib/Bio/Factory/ObjectFactoryI.pm:102

STACK Bio::Seq::SeqBuilder::make_object C:/Perl/site/lib/Bio/Seq/SeqBuilder.pm:337
STACK Bio::SeqIO::genbank::next_seq C:/Perl/site/lib/Bio\SeqIO\genbank.pm:717
STACK toplevel readGB.pl:5
-------------------------------------

Best regards,

Theo Vigil


Reply #1 - Sep 20^th, 2009 at 6:48pm

Ivan Efremov Offline
YaBB Administrator
Novosibirsk

Posts: 46

Hi Theo,
please post the file - we will try to investigate the problem.

UGENE team


Reply #2 - Oct 19^th, 2009 at 2:29pm

vigil Offline
YaBB Newbies

Posts: 9

I've forgotten which file this was, but I think I fixed the problem by adjusting the number of spaces surrounding the nucleotide position in the GenBank file: from 9 to 10, or 11 to 10, or something similar.
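For anyone hitting the same thing, here is a rough Python sketch (standard library only) that flags feature-table lines that are off-column. It assumes the usual GenBank flat-file layout, where feature keys start in column 6 and locations and qualifiers start in column 22. The function name and the sample lines are made up for illustration, and since the exact spacing that was changed above is not certain, this only checks the FEATURES table, not the numbering in the ORIGIN block.

```python
KEY_COLUMN = 5        # 0-based index, i.e. column 6 (assumed GenBank layout)
LOCATION_COLUMN = 21  # 0-based index, i.e. column 22 (assumed GenBank layout)


def misaligned_feature_lines(genbank_text):
    """Return (line_number, line) pairs for feature-table lines that are off-column."""
    problems = []
    in_features = False
    for number, line in enumerate(genbank_text.splitlines(), start=1):
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if in_features and line and not line[0].isspace():
            in_features = False  # reached the next top-level keyword or "//"
        if not in_features or not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        if indent == KEY_COLUMN:
            parts = line.split(None, 1)
            # The location is the remainder of the line after the feature key.
            location_start = len(line) - len(parts[1]) if len(parts) == 2 else -1
            if location_start != LOCATION_COLUMN:
                problems.append((number, line))
        elif indent != LOCATION_COLUMN:
            # Qualifier and continuation lines are expected to start in column 22.
            problems.append((number, line))
    return problems


# Hand-built lines for illustration only (not real UGENE output).
header = "FEATURES             Location/Qualifiers"
good = " " * 5 + "misc_feature".ljust(16) + "1..10"
bad = " " * 5 + "misc_feature".ljust(15) + "1..10"  # location one column early
print(misaligned_feature_lines("\n".join([header, good, bad, "ORIGIN"])))
```

Running it over the text of a problem file lists the line numbers to look at before handing the file to a parser.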


Reply #3 - Oct 19^th, 2009 at 3:05pm

Ivan Efremov Offline
YaBB Administrator
Novosibirsk

Posts: 46

OK, we will try to investigate the problem.
Thanks for the report!

UGENE team


Reply #4 - Oct 19^th, 2009 at 5:26pm

Ivan Efremov Offline
YaBB Administrator
Novosibirsk

Posts: 46

Theo,
I've tried to reproduce the error and found the following.

The exception is thrown only when I open a .gb file that does not contain a sequence (see no_seq.gb). I don't think this behavior is really incorrect: you are trying to access a sequence object in a file that has none, so BioPerl has to report some error.

When I open files that contain sequences (bioperl_ok.gb), everything is OK.

Attachments: no_seq.gb, bioperl_ok.gb
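A quick way to tell the two cases apart before calling a parser is to check whether the file has any residues after an ORIGIN line. The following is only a sketch in plain Python (no BioPerl or Biopython needed); the function name and the two tiny sample records are hypothetical and are not the attached files.

```python
import re


def has_sequence(genbank_text: str) -> bool:
    """Return True if any record in the text has residues after an ORIGIN line."""
    in_origin = False
    for line in genbank_text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False  # end of the current record
        elif in_origin and re.search(r"[A-Za-z]", line):
            return True
    return False


# Tiny hand-written records for illustration only (not real UGENE output).
ANNOTATIONS_ONLY = "LOCUS       hits\nFEATURES             Location/Qualifiers\n//\n"
WITH_SEQUENCE = "LOCUS       demo\nORIGIN\n        1 acgtacgtac\n//\n"

print(has_sequence(ANNOTATIONS_ONLY))  # False: no ORIGIN block at all
print(has_sequence(WITH_SEQUENCE))     # True: residues follow ORIGIN
```

If this returns False for a file, the file holds annotations only, and a script that asks the parser for a sequence object should expect an error or handle that case itself.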

UGENE team
