UGENE Bulletin Board
UGENE-produced gb files cannot be parsed in biopython or bioperl
Sep 20th, 2009 at 2:32pm

vigil

Thank you again for a great program, soon to be cited in my paper!

UGENE produces gb files of the hits it finds, for instance when searching for repeats or pattern matches. I've been trying to process these results further by parsing them with BioPerl and Biopython, but neither parser accepts the kind of GenBank output I get from UGENE.

I notice now that I'm using 1.4.1 (on Windows, 1.4 GHz single processor on a tc4200 laptop), not the most recent version, but I can't see anything about this problem in the later bug fixes.

The perl code that produced the latest error looked like this:

Code:
use Bio::SeqIO;

# This is a file from UGENE's find pattern tool:

my $filename = "C:/Documents and Settings/Carl/My Documents/NewBioInf/myGB/hitsOniS3.gb";

my $seqio_obj = Bio::SeqIO->new(-file => $filename, -format => "genbank");
my $seq_obj = $seqio_obj->next_seq;



And here's the error:
Code:
C:\Documents and Settings\Carl>perl readGB.pl

--------------------- WARNING ---------------------
MSG: Unknown alphabet:
---------------------------------------------------

--------------------- WARNING ---------------------
MSG: Unexpected error in feature table for  Skipping feature, attempting to recover
---------------------------------------------------

------------- EXCEPTION -------------
MSG: Alphabet '1' is not a valid alphabet ('dna','protein','rna') lowercase
STACK Bio::PrimarySeq::alphabet C:/Perl/site/lib/Bio/PrimarySeq.pm:571
STACK Bio::PrimarySeq::new C:/Perl/site/lib/Bio/PrimarySeq.pm:208
STACK Bio::Seq::new C:/Perl/site/lib/Bio/Seq.pm:484
STACK Bio::Seq::RichSeq::new C:/Perl/site/lib/Bio/Seq/RichSeq.pm:110
STACK Bio::Seq::SeqFactory::create C:/Perl/site/lib/Bio/Seq/SeqFactory.pm:116
STACK Bio::Factory::ObjectFactoryI::create_object C:/Perl/site/lib/Bio/Factory/ObjectFactoryI.pm:102

STACK Bio::Seq::SeqBuilder::make_object C:/Perl/site/lib/Bio/Seq/SeqBuilder.pm:337
STACK Bio::SeqIO::genbank::next_seq C:/Perl/site/lib/Bio\SeqIO\genbank.pm:717
STACK toplevel readGB.pl:5
------------------------------------- 



Best regards,

Theo Vigil
 
 
Reply #1 - Sep 20th, 2009 at 6:48pm

Ivan Efremov

Hi Theo,
Please post the file and we will try to investigate the problem.
 

UGENE team
 
Reply #2 - Oct 19th, 2009 at 2:29pm

vigil

I've forgotten which file this was, but I think I fixed the problem by adjusting the number of spaces around the nucleotide position in the GenBank file: from 9 to 10, or 11 to 10, or something similar.
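
The post does not say which line the spacing fix applied to. Assuming it concerned the position number at the start of each sequence line under ORIGIN, here is a rough sketch of how such a misalignment could be spotted. In NCBI's sample records that number is right-justified in the first nine columns, so the first base sits at column 11 (0-based offset 10). The function names and the `expected_offset` parameter are made up for illustration, and this is plain Python, not part of BioPerl, Biopython or UGENE.

```python
import re


def origin_line_offsets(text):
    """Return (line_number, offset) pairs for sequence lines in ORIGIN blocks.

    offset is the 0-based index at which the first base starts,
    i.e. the width of the leading position field plus its padding.
    """
    results = []
    in_origin = False
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            in_origin = False
            continue
        if in_origin:
            # Leading spaces, the position number, then the separating spaces.
            match = re.match(r"\s*\d+\s+", line)
            if match:
                results.append((lineno, match.end()))
    return results


def misaligned_lines(text, expected_offset=10):
    """Return the sequence lines whose first base is not at expected_offset.

    expected_offset=10 is an assumption taken from NCBI sample records.
    """
    return [(n, off) for n, off in origin_line_offsets(text)
            if off != expected_offset]
```

Running `misaligned_lines(open(path).read())` on a suspect file would list the line numbers worth inspecting by hand; an empty list only means this one column looks conventional, not that the whole file is valid.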
 
 
Reply #3 - Oct 19th, 2009 at 3:05pm

Ivan Efremov

OK, we will try to investigate the problem.
Thanks for the report!
 

UGENE team
 
Reply #4 - Oct 19th, 2009 at 5:26pm

Ivan Efremov

Theo,
I've tried to reproduce the error, and here is what I found.

The exception is thrown only when I open a .gb file that does not contain a sequence (see no_seq.gb). I don't think this behavior is really incorrect: you are asking for a sequence object from a file that has none, so BioPerl has to report some error.

When I open files that do contain sequences (bioperl_ok.gb), everything is OK.
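
Given that diagnosis, one cheap pre-check before handing a file to a parser is to look for an ORIGIN block that actually contains residues. The following is a minimal sketch in plain Python (standard library only, no Biopython). The name `genbank_has_sequence` is made up for illustration, and the check is a heuristic, not a full GenBank validator.

```python
import re


def genbank_has_sequence(text):
    """Return True if the GenBank text has an ORIGIN block with residues.

    Heuristic only: it looks for at least one letter on a line between
    an ORIGIN line and the closing // line.
    """
    in_origin = False
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            in_origin = False
            continue
        if in_origin and re.search(r"[A-Za-z]", line):
            return True
    return False
```

A file for which this returns False holds annotations only, so a script that asks for a sequence object can skip it or report a clearer message than the alphabet exception above.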
 

Attachments: no_seq.gb (0 KB), bioperl_ok.gb (0 KB)

UGENE team
 