We had a FASTA file with unwanted redundancy.
Reads were mapped to a reference, and each match was then extracted together with its ID and the sequence to which that ID mapped. As a result, the same sequence appeared under several IDs. Our goal was to keep only one ID for each sequence.
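A minimal sketch of the idea in Python, not the script from the full post. It assumes that "one ID for each sequence" means keeping the first ID encountered for every distinct sequence and that the output should be sorted by ID; the function names `parse_fasta` and `dedupe_by_sequence` are made up for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_by_sequence(
    records: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Keep the first ID seen for each distinct sequence, sorted by ID.

    Assumption: which ID survives does not matter, so the first one wins.
    """
    first_id: Dict[str, str] = {}
    for seq_id, seq in records:
        first_id.setdefault(seq, seq_id)
    return sorted((seq_id, seq) for seq, seq_id in first_id.items())
```

To use it on a file, pass an open file handle to `parse_fasta` and write each remaining record back out as `>ID` followed by the sequence. Sequences are compared as exact strings here, so records that differ only in letter case would count as different.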
Continue reading “sort and reduce redundancy in FASTA files”
I often have to extract specific sequences from large FASTA files. I found this nice script here, and with some slight modifications it can also be used for other file types.
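The general approach can be sketched in a few lines of Python. This is not the script referred to above, only an illustration; it assumes that a record's ID is the first whitespace-separated token after `>` in the header, and the function name `extract_by_id` is hypothetical.

```python
from typing import Iterable, Iterator


def extract_by_id(
    fasta_lines: Iterable[str], wanted_ids: Iterable[str]
) -> Iterator[str]:
    """Yield the lines of every FASTA record whose ID is in wanted_ids.

    Assumption: the ID is the first token of the header line after '>'.
    """
    wanted = set(wanted_ids)  # set lookup keeps each header check cheap
    keep = False
    for raw in fasta_lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            keep = bool(tokens) and tokens[0] in wanted
        if keep and line.strip():
            yield line
```

Because the function streams the input line by line, it does not need to hold a large FASTA file in memory. The ID list can come from a plain text file with one ID per line, stripped of whitespace before being passed in.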
Continue reading “Extracting Sequences from a Fasta using ID list”