Quote: As for the second case - cutting off a fixed number of bases - could you please explain what the use case for this is? When do we need to do so?
This is needed if the library has been constructed or amplified using primers/adapters that, together with the standard Illumina sequences, carry additional elements, e.g. for custom sample barcoding.
In such a case, library amplicons may have the following structure:
5'-[Illumina sequence, constant]-[custom sequence, constant]-[sequence to be aligned, variable]-[custom sequence, constant]-[Illumina sequence, constant]-3'
As I mentioned in my previous reply, it should be possible to remove [custom sequence, constant] using the Cut Adapter module by specifying the corresponding custom sequence.
Trimming .fastq by length would be useful when the custom sequence is either highly complex or potentially unknown, e.g. molecular barcoding with N12 (12 random bases), where there is no fixed sequence to search for. Additionally, if sequencing across [custom sequence, constant] comes back with low quality or errors, the Cut Adapter module will not recognize it, whereas trimming by position still works.
I am happy to share Python code for length-based trimming, should it be possible to integrate it into UGENE, ideally under external tools within the Workflow plugin.
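To make the idea concrete, here is a minimal sketch of what length-based trimming could look like. It is only an illustration, not the code offered above: the function name trim_fastq and the parameters trim_5p and trim_3p are hypothetical, and it assumes plain 4-line FASTQ records (no wrapped sequence lines).

```python
import io


def trim_fastq(src, dst, trim_5p=0, trim_3p=0):
    """Remove a fixed number of bases from the 5' and 3' ends of every read.

    src and dst are text file objects. Assumes 4-line FASTQ records
    (header, sequence, '+', quality); names here are illustrative only.
    """
    if trim_5p < 0 or trim_3p < 0:
        raise ValueError("trim lengths must be non-negative")

    while True:
        header = src.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # skip stray blank lines
        seq = src.readline().rstrip("\r\n")
        src.readline()  # the '+' separator line, rewritten below
        qual = src.readline().rstrip("\r\n")

        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch in " + header.strip())

        # Clamp so that reads shorter than the requested trim become empty
        # instead of being sliced with a negative index.
        end = max(len(seq) - trim_3p, trim_5p)

        # The quality string is trimmed at the same positions as the bases.
        dst.write(header.rstrip("\r\n") + "\n")
        dst.write(seq[trim_5p:end] + "\n")
        dst.write("+\n")
        dst.write(qual[trim_5p:end] + "\n")
```

For the amplicon structure above, trim_5p would be the length of the 5' custom sequence (e.g. 12 for an N12 barcode) and trim_3p the length of the 3' one, if the read extends that far. With real files it could be called as trim_fastq(open("in.fastq"), open("out.fastq", "w"), trim_5p=12), where the file names are placeholders. Reads that end up empty are still written out, so a downstream length filter may be needed.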