## A Space-Efficient Algorithm for the Constrained Pairwise Sequence Alignment Problem

Dan He Abdullah N. Arslan (aarslan@cs.uvm.edu)

Department of Computer Science, University of Vermont, Burlington,
VT 05405, USA

### Abstract

The *constrained pairwise sequence alignment* (*CPSA*)
problem aims to align two given sequences by aligning their
similar subsequences in the same region under the guidance of a
given pattern (constraint). Let the lengths of the sequences be
*m*, and *n* where *n ≤ m*, and let *r ≤ n* be the length of
the given pattern. The optimum constrained pairwise alignment
score can be computed using *O(rn)* space by a naive dynamic
programming solution. If an optimal alignment path is desired then
the space requirement of the naive dynamic programming algorithm
is *O(rnm)*. There is a divide-and-conquer algorithm that reduces
the memory requirement of finding an optimal alignment for the
*CPSA* problem to *O(rn)*.
In this paper, we present a space-efficient *CPSA* algorithm that
returns an optimal alignment. Our analysis on real protein
sequences suggests that our algorithm requires only *O(n)* space
in practice. This algorithm is not only space efficient but also
very fast. A generalization of the *CPSA* problem for multiple
sequences is called the constrained multiple sequence alignment
(*CMSA*) problem. Our *CPSA* algorithm also improves the space
requirement of progressive *CMSA* algorithms that use solutions of
*CPSA* problems.

*Japanese Society for Bioinformatics*