sort and reduce redundancy in FASTA files

We had a FASTA file with unwanted redundancy.

Reads were mapped to a reference, and each match was then extracted together with its ID and the sequence to which that ID mapped. This resulted in redundant IDs: the same sequence could appear under several of them. Our goal was to keep only one ID for each sequence.
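To make the idea concrete, here is a minimal sketch in plain Python. It is not the script from the full post. It assumes that "redundant" means identical sequence strings, that keeping the first ID seen for each sequence is acceptable, and that sorting by ID is the desired order. The function names `parse_fasta` and `dedupe_by_sequence` and the example records are made up for illustration.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_by_sequence(records):
    """Keep the first ID seen for each distinct sequence, sorted by ID.

    Assumption: which ID survives does not matter, so the first wins.
    """
    first_id = {}
    for seq_id, seq in records:
        first_id.setdefault(seq, seq_id)
    return sorted((seq_id, seq) for seq, seq_id in first_id.items())


# Hypothetical input: read3 and read1 carry the same sequence.
example = """>read3
ACGT
>read1
ACGT
>read2
GGCC
""".splitlines()

for seq_id, seq in dedupe_by_sequence(parse_fasta(example)):
    print(f">{seq_id}\n{seq}")
```

For a real file, `parse_fasta` can be fed an open file handle instead of the example lines. Holding every distinct sequence in a dictionary may need a lot of memory for very large files.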



Extracting Sequences from a Fasta using ID list

I often have to extract specific sequences from a large FASTA file. I found this nice script here, and with some slight modifications you can also use it for other file types.
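The general approach can be sketched in a few lines of plain Python. This is not the script mentioned above, only an illustration under stated assumptions: a record's ID is taken to be the first whitespace-delimited word of its header line, and the names `read_fasta`, `record_id`, and `extract_by_id`, as well as the example data, are hypothetical.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def record_id(header):
    """Return the first word of a header (assumed to be the ID)."""
    parts = header.split(maxsplit=1)
    return parts[0] if parts else ""


def extract_by_id(records, wanted_ids):
    """Return the records whose ID is in wanted_ids, in file order."""
    wanted = set(wanted_ids)  # set lookup keeps large ID lists fast
    return [(h, s) for h, s in records if record_id(h) in wanted]


# Hypothetical FASTA content and ID list.
fasta_lines = """>seqA sample 1
ACGTAC
>seqB sample 2
GGCC
TTAA
>seqC sample 3
TTTT
""".splitlines()
id_list = ["seqB", "seqC"]

for header, seq in extract_by_id(read_fasta(fasta_lines), id_list):
    print(f">{header}\n{seq}")
```

In practice the ID list would usually come from a text file with one ID per line, and `read_fasta` would read from an open file handle.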


master's thesis – publication

The larger project that my master’s thesis was part of has been submitted and can be found in pre-press here: Foerster et al. 2017

I’m very thankful to have been part of such an interesting project, which provides new insights into the use and development of SNP panels and into applying this method to genetic wildlife monitoring.