Hello,
In general, a gap means that the read has no sequenced base at this position. If you point to any read, you will see a line called "Cigar" (see attachment "cigar.png"). This line describes how the bases of the read are laid out. "M" means that the given number of bases were sequenced and aligned; "N" means that the given number of positions were skipped, i.e. not sequenced in this read, so we have a gap (shown as "-" in UGENE) in this spot. For example, here we have "10M32N38M", which means:
- 10 sequenced bases
- 32 gaps
- 38 sequenced bases
You can see this read in attachment 2 (read.png).
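If it helps, the same breakdown can be reproduced with a few lines of Python. This is only a rough sketch to illustrate the idea, not how UGENE does it internally: the function names `parse_cigar` and `gapped_row` are made up for this example, the read is a placeholder string of "A" characters, and only the "M" and "N" operations from the example above are handled.

```python
import re

# One CIGAR element: a length followed by a single operation letter.
CIGAR_ELEMENT = re.compile(r"(\d+)([A-Z=])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(length), op) for length, op in CIGAR_ELEMENT.findall(cigar)]


def gapped_row(read_bases, cigar):
    """Lay the read out with '-' written for every position covered by 'N'.

    Assumption for this sketch: only 'M' and 'N' are supported,
    any other operation raises ValueError.
    """
    row = []
    pos = 0  # index of the next unused base in read_bases
    for length, op in parse_cigar(cigar):
        if op == "M":
            row.append(read_bases[pos:pos + length])
            pos += length
        elif op == "N":
            row.append("-" * length)
        else:
            raise ValueError("operation %r is not handled in this sketch" % op)
    return "".join(row)


cigar = "10M32N38M"
read = "A" * 48  # placeholder read: 10 + 38 sequenced bases

for length, op in parse_cigar(cigar):
    print(length, "sequenced bases" if op == "M" else "gaps")

print(gapped_row(read, cigar))
```

The printed row is 80 characters wide: 10 bases, then 32 "-" characters, then 38 bases, which is the layout described above.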
UGENE parses the SAM file correctly and takes the gaps into account. I opened the file in IGV and could not find any reads with gaps there either - I do not know how IGV parses SAM files, but it probably just skips reads with gaps. As you can see in your picture, the gaps are inserted correctly: they shift the characters within a read, so that matching characters in different reads line up in the same column.
Have I answered your question? If not, please describe it in a bit more detail.
Best regards, Dmitrii Sukhomlinov, The UGENE team.