Kybernetika 38 no. 1, 45-66, 2002

A new practical linear space algorithm for the longest common subsequence problem

Heiko Goeman and Michael Clausen

Abstract:

This paper deals with a new practical method for solving the longest common subsequence (LCS) problem. Given two strings of lengths $m$ and $n$, $n\ge m$, on an alphabet of size $s$, we first present an algorithm which determines the length $p$ of an LCS in $O(ns+\min\{mp,p(n-p)\})$ time and $O(ns)$ space. This result has been achieved before [29,30](C. Rick: New Algorithms for the Longest Common Subsequence Problem. Research Report No.85123-CS, Department of Computer Science, University of Bonn 1994. and C. Rick: A new flexible algorithm for the longest common subsequence problem. In: Proceedings, 6th Annual Symp. on Combinatorial Pattern Matching (Lecture Notes in Computer Science 937), Springer Verlag, Berlin 1995, pp. 340-351.), but our algorithm is significantly faster than previous methods. We also provide a second algorithm which generates an LCS in $O(ns+\min\{mp,m\log m+p(n-p)\})$ time while preserving the linear space bound, thus solving the problem posed in [29,30]. Experimental results confirm the efficiency of our method.