This is a classic problem of Dynamic Programming. We define the state dp[i][j]
to be the minimum number of operations to convert word1[0..i - 1]
to word2[0..j - 1]
. The state equations have two cases: the boundary case and the general case. Note that in the above notations, both i
and j
take values starting from 1
.
For the boundary case, that is, to convert a string to an empty string, it is easy to see that the mininum number of operations to convert word1[0..i - 1]
to ""
requires at least i
operations (deletions). In fact, the boundary case is simply:
-
dp[i][0] = i
; -
dp[0][j] = j
.
Now let's move on to the general case, that is, convert a non-empty word1[0..i - 1]
to another non-empty word2[0..j - 1]
. Well, let's try to break this problem down into smaller problems (sub-problems). Suppose we have already known how to convert word1[0..i - 2]
to word2[0..j - 2]
, which is dp[i - 1][j - 1]
. Now let's consider word[i - 1]
and word2[j - 1]
. If they are euqal, then no more operation is needed and dp[i][j] = dp[i - 1][j - 1]
. Well, what if they are not equal?
If they are not equal, we need to consider three cases:
- Replace
word1[i - 1]
byword2[j - 1]
(dp[i][j] = dp[i - 1][j - 1] + 1 (for replacement)
); - Delete
word1[i - 1]
andword1[0..i - 2] = word2[0..j - 1]
(dp[i][j] = dp[i - 1][j] + 1 (for deletion)
); - Insert
word2[j - 1]
toword1[0..i - 1]
andword1[0..i - 1] + word2[j - 1] = word2[0..j - 1]
(dp[i][j] = dp[i][j - 1] + 1 (for insertion)
).
Make sure you understand the subtle differences between the equations for deletion and insertion. For deletion, we are actually converting word1[0..i - 2]
to word2[0..j - 1]
, which costs dp[i - 1][j]
, and then deleting the word1[i - 1]
, which costs 1
. The case is similar for insertion.
Putting these together, we now have:
-
dp[i][0] = i
; -
dp[0][j] = j
; -
dp[i][j] = dp[i - 1][j - 1]
, ifword1[i - 1] = word2[j - 1]
; -
dp[i][j] = min(dp[i - 1][j - 1] + 1, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
, otherwise.
The above state equations can be turned into the following code directly.
1 class Solution { 2 public: 3 int minDistance(string word1, string word2) { 4 int m = word1.length(), n = word2.length(); 5 vector<vector<int> > dp(m + 1, vector<int> (n + 1, 0)); 6 for (int i = 1; i <= m; i++) 7 dp[i][0] = i; 8 for (int j = 1; j <= n; j++) 9 dp[0][j] = j; 10 for (int i = 1; i <= m; i++) { 11 for (int j = 1; j <= n; j++) { 12 if (word1[i - 1] == word2[j - 1]) 13 dp[i][j] = dp[i - 1][j - 1]; 14 else dp[i][j] = min(dp[i - 1][j - 1] + 1, min(dp[i][j - 1] + 1, dp[i - 1][j] + 1)); 15 } 16 } 17 return dp[m][n]; 18 } 19 };
Well, you may have noticed that each time when we update dp[i][j]
, we only need dp[i - 1][j - 1], dp[i][j - 1], dp[i - 1][j]
. In fact, we need not maintain the full m*n
matrix. Instead, maintaing one column is enough. The code can be optimized to O(m)
or O(n)
space, depending on whether you maintain a row or a column of the original matrix.
The optimized code is as follows.
1 class Solution { 2 public: 3 int minDistance(string word1, string word2) { 4 int m = word1.length(), n = word2.length(); 5 vector<int> cur(m + 1, 0); 6 for (int i = 1; i <= m; i++) 7 cur[i] = i; 8 for (int j = 1; j <= n; j++) { 9 int pre = cur[0]; 10 cur[0] = j; 11 for (int i = 1; i <= m; i++) { 12 int temp = cur[i]; 13 if (word1[i - 1] == word2[j - 1]) 14 cur[i] = pre; 15 else cur[i] = min(pre + 1, min(cur[i] + 1, cur[i - 1] + 1)); 16 pre = temp; 17 } 18 } 19 return cur[m]; 20 } 21 };
Well, if you find the above code hard to understand, you may first try to write a two-column version that explicitly maintains two columns (the previous column and the current column) and then simplify the two-column version into the one-column version like the above code :-)