Mejoras a los algoritmos de bioinformática Needleman-Wunsch y K-Band mediante la mejora de la matriz de similitud
Abstract
Bioinformatics algorithms are limited by memory and disc management processes due to their
large amount of data manipulation, like sequence alignment methods. For example, NeedlemanWunsch
algorithm aligns two sequences optimally; it creates a matrix of N rows and M columns,
where N and M are the amount of characters each sequence had. Based on the final size of the
matrix these problems become restrictive.
This thesis’s objective is to set and prove a Needleman-Wunsch algorithm improvement in order
to enhance its efficiency. With the fusion of two dynamic programming algorithms and the
minimization on the similarity matrix usage the amount of disc and / or memory used by the
algorithm will be reduced.
The implemented solution is a modification of the K-Band algorithm, which gets the optimal
solution parsing the matrix once, differing from its original version that parses it many times.
The main feature of the new algorithm is how the band is expanded when the alignment cannot
be optimal, that will happen when the parsing is in the border of the band. Some experiments
were developed and ran in order to validate the solution created. With the results obtained, it was
proved that the proposed improvement of the Needleman-Wunsch algorithm uses the similarity
matrix less and improves the memory and disc performance on big sequences
Description
Proyecto de Graduación (Maestría en Computación) Instituto Tecnológico de Costa Rica, Escuela de Ingeniería en Computación, 2015.
Share
Metrics
Collections
- Maestría en Computación [107]