[Algorithms] Longest Common Subsequence

The Longest Common Subsequence (LCS) problem is as follows:

Given two sequences s and t, find the length of the longest sequence r, which is a subsequence of both s and t.

Do you know the difference between a substring and a subsequence? A substring is a contiguous run of characters, while a subsequence keeps the original order but need not be contiguous. For example, "abc" is both a substring and a subsequence of "abcde", while "ade" is only a subsequence.
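
To make the distinction concrete, here is a minimal sketch of the two checks. The helper names isSubsequence and isSubstring are made up for this illustration and are not part of the solutions in this post.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the characters of r appear in s in the same
// order, not necessarily next to each other.
bool isSubsequence(const std::string& r, const std::string& s) {
    std::size_t k = 0;  // number of characters of r matched so far
    for (std::size_t i = 0; i < s.size() && k < r.size(); i++)
        if (s[i] == r[k]) k++;
    return k == r.size();
}

// Hypothetical helper: true if r occurs in s as one contiguous block.
bool isSubstring(const std::string& r, const std::string& s) {
    return s.find(r) != std::string::npos;
}
```

With these, isSubsequence("ade", "abcde") is true while isSubstring("ade", "abcde") is false, which matches the example above.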

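This problem is a classic application of Dynamic Programming. Let's define the sub-problem (state) P[i][j] to be the length of the longest common subsequence of the first i characters of s and the first j characters of t (characters are indexed from 1 here), with P[i][0] = P[0][j] = 0. Then the state equations are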

  1. P[i][j] = max(P[i][j - 1], P[i - 1][j]) if s[i] != t[j];
  2. P[i][j] = P[i - 1][j - 1] + 1 if s[i] == t[j].

This algorithm gives the length of the longest common subsequence.  The code is as follows.

#include <algorithm>
#include <string>
#include <vector>
using namespace std;

int longestCommonSubsequence(string s, string t) {
    int m = s.length(), n = t.length();
    // dp[i][j]: LCS length of the first i characters of s and the first j of t
    vector<vector<int> > dp(m + 1, vector<int> (n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1] ? dp[i - 1][j - 1] + 1 : max(dp[i - 1][j], dp[i][j - 1]));
    return dp[m][n];
}
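
It can help to look at the whole table for a small input. The sketch below uses the same recurrence but returns the table instead of only its last cell; the name lcsTable is an assumption made for this example.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the full (m + 1) x (n + 1) table, where
// dp[i][j] is the LCS length of the first i characters of s and the first j
// characters of t. Row 0 and column 0 stand for the empty prefix.
std::vector<std::vector<int>> lcsTable(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.length()), n = static_cast<int>(t.length());
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1])
                           ? dp[i - 1][j - 1] + 1
                           : std::max(dp[i - 1][j], dp[i][j - 1]);
    return dp;
}
```

For s = "abcde" and t = "ace", the rows for i = 0..5 are 0 0 0 0, 0 1 1 1, 0 1 1 1, 0 1 2 2, 0 1 2 2 and 0 1 2 3, so the answer in the bottom-right cell is 3.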

The table-based code above takes O(m*n) time and O(m*n) space. Note that updating dp[i][j] only needs dp[i - 1][j - 1], dp[i - 1][j] and dp[i][j - 1], so it is enough to keep two columns of the table (the previous one and the current one), which brings the space down to O(m). The code is as follows.

int longestCommonSubsequenceSpaceEfficient(string s, string t) {
    int m = s.length(), n = t.length();
    // Guard: the code below reads s[0] and t[0], so empty inputs must return early
    if (m == 0 || n == 0) return 0;
    int maxlen = 0;
    vector<int> pre(m, 0);  // column for t[j - 1]
    vector<int> cur(m, 0);  // column for t[j]
    // First column: LCS of s[0..i] and the single character t[0]
    pre[0] = (s[0] == t[0]);
    maxlen = max(maxlen, pre[0]);
    for (int i = 1; i < m; i++) {
        if (s[i] == t[0] || pre[i - 1] == 1) pre[i] = 1;
        maxlen = max(maxlen, pre[i]);
    }
    for (int j = 1; j < n; j++) {
        // First row: LCS of the single character s[0] and t[0..j]
        if (s[0] == t[j] || pre[0] == 1) cur[0] = 1;
        maxlen = max(maxlen, cur[0]);
        for (int i = 1; i < m; i++) {
            if (s[i] == t[j]) cur[i] = pre[i - 1] + 1;
            else cur[i] = max(cur[i - 1], pre[i]);
            maxlen = max(maxlen, cur[i]);
        }
        // The finished column becomes the previous one; reset the other
        swap(pre, cur);
        fill(cur.begin(), cur.end(), 0);
    }
    return maxlen;
}
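
As a possible variation (a sketch, not the version above), the two columns can be padded with an extra entry at index 0 for the empty prefix, as in the 2-D table. The first row and first column then need no special handling. The name lcsTwoColumns is an assumption made for this example.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant: two columns of length m + 1, where index 0 stands for
// the empty prefix of s and always stays 0.
int lcsTwoColumns(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.length()), n = static_cast<int>(t.length());
    std::vector<int> pre(m + 1, 0);  // column j - 1
    std::vector<int> cur(m + 1, 0);  // column j
    for (int j = 1; j <= n; j++) {
        for (int i = 1; i <= m; i++) {
            if (s[i - 1] == t[j - 1]) cur[i] = pre[i - 1] + 1;
            else cur[i] = std::max(cur[i - 1], pre[i]);
        }
        std::swap(pre, cur);  // the finished column becomes the previous one
    }
    return pre[m];  // after the last swap, pre holds column n
}
```

Entries 1..m of cur are overwritten on every pass, so no fill is needed after the swap, and empty inputs simply return 0.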

Well, the second column is only there so that we can read pre[i - 1]. We can keep that value in a single variable instead and maintain only one column. The code becomes shorter and uses less memory. However, you may need to run a few examples to see how it reproduces what the two-column version does.

int longestCommonSubsequenceSpaceMoreEfficient(string s, string t) {
    int m = s.length(), n = t.length();
    vector<int> cur(m + 1, 0);
    for (int j = 1; j <= n; j++) {
        int pre = 0;  // dp[i - 1][j - 1], i.e. cur[i - 1] before it was overwritten
        for (int i = 1; i <= m; i++) {
            int temp = cur[i];  // still holds dp[i][j - 1]
            cur[i] = (s[i - 1] == t[j - 1] ? pre + 1 : max(cur[i], cur[i - 1]));
            pre = temp;
        }
    }
    return cur[m];
}
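
One way to run such examples is to record the column after each character of t. The sketch below repeats the one-column update and stores a snapshot per pass; traceOneColumn is a name invented for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical tracing helper: performs the one-column update and returns a
// copy of the column after each character of t has been processed.
std::vector<std::vector<int>> traceOneColumn(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.length()), n = static_cast<int>(t.length());
    std::vector<int> cur(m + 1, 0);
    std::vector<std::vector<int>> snapshots;
    for (int j = 1; j <= n; j++) {
        int pre = 0;  // dp[i - 1][j - 1]: cur[i - 1] before this pass overwrote it
        for (int i = 1; i <= m; i++) {
            int temp = cur[i];  // still dp[i][j - 1]
            cur[i] = (s[i - 1] == t[j - 1]) ? pre + 1 : std::max(cur[i], cur[i - 1]);
            pre = temp;
        }
        snapshots.push_back(cur);  // column j of the 2-D table
    }
    return snapshots;
}
```

For s = "abc" and t = "ac", the column is 0 1 1 1 after 'a' and 0 1 1 2 after 'c'. Each snapshot equals the matching column of the 2-D table.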

Now you may try this problem on UVa Online Judge and get Accepted:)

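Of course, the above code only returns the length of the longest common subsequence. If you want to print the LCS itself, you need to walk the 2-D table from the bottom-right corner back to the top-left: when the two current characters match, that character belongs to the LCS and you move diagonally; otherwise you move toward the neighboring cell with the larger value. The code is as follows.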

int longestCommonSubsequence(string s, string t) {
    int m = s.length(), n = t.length();
    vector<vector<int> > dp(m + 1, vector<int> (n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1] ? dp[i - 1][j - 1] + 1 : max(dp[i - 1][j], dp[i][j - 1]));
    int len = dp[m][n];
    // Rebuild the longest common subsequence, filling lcs from its last character
    string lcs(len, ' ');
    for (int i = m, j = n, index = len - 1; i > 0 && j > 0;) {
        if (s[i - 1] == t[j - 1]) {
            // Matching characters belong to the LCS; move diagonally
            lcs[index--] = s[i - 1];
            i--;
            j--;
        }
        else if (dp[i - 1][j] > dp[i][j - 1]) i--;  // the larger value is above
        else j--;                                   // otherwise move left
    }
    printf("%s\n", lcs.c_str());  // requires <cstdio>
    return len;
}
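
If the caller needs the subsequence as a value rather than printed output, the same traceback can return a string. This is a sketch under that assumption; lcsString is a made-up name. When several longest common subsequences exist, which one comes back depends on how ties are broken.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical variant: returns one longest common subsequence of s and t.
std::string lcsString(const std::string& s, const std::string& t) {
    int m = static_cast<int>(s.length()), n = static_cast<int>(t.length());
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            dp[i][j] = (s[i - 1] == t[j - 1])
                           ? dp[i - 1][j - 1] + 1
                           : std::max(dp[i - 1][j], dp[i][j - 1]);

    // Walk back from the bottom-right cell, collecting matches in reverse order.
    std::string lcs;
    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (s[i - 1] == t[j - 1]) {
            lcs.push_back(s[i - 1]);
            i--;
            j--;
        } else if (dp[i - 1][j] >= dp[i][j - 1]) {
            i--;  // ties go upward here; this is an arbitrary choice
        } else {
            j--;
        }
    }
    std::reverse(lcs.begin(), lcs.end());
    return lcs;
}
```

For example, lcsString("abcde", "ace") returns "ace", the only common subsequence of length 3 for that pair.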

 
