Solution: Longest Common Subsequence

Let's solve the Longest Common Subsequence problem using Dynamic Programming.

Statement

Suppose you are given two strings. You need to find the length of the longest common subsequence between these two strings.

A subsequence is a string formed by removing some characters from the original string while maintaining the relative position of the remaining characters. For example, “abd” is a subsequence of “abcd”, where the removed character is “c”.

If there is no common subsequence, then return 0.

Constraints:

  • 11 \leq str1.length 500\leq 500
  • 11 \leq str2.length 500\leq 500
  • str1 and str2 contain only lowercase English characters.

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

A naive approach would be to compare the characters of both strings based on the following rules:

  • If the current characters of both strings match, we move one position ahead in both strings.

  • If the current characters of both strings do not match, we recursively calculate the maximum length of moving one character forward in any one of the two strings i.e., we check if moving a character forward in either the first string or the second will give us a longer subsequence.

  • If we reach the end of either of the two strings, we return 00.

The time complexity of the naive approach is O(2m+n)O(2^{m+n}), where mm and nn are the lengths of the two strings, respectively. The space complexity of this approach is O(n+m)O(n + m).

Optimized solution using dynamic programming

We are going to solve this problem with the help of the top-down approach of dynamic programming. The top-down solution, commonly known as the memoization technique, is an enhancement of the recursive solution. It overcomes the problem of calculating redundant solutions over and over again by storing them in an array. In the recursive approach, the following two variables kept changing:

  • The index, i, used to keep track of the current character in str1.

  • The index, j, used to keep track of the current character in str2.

We will use a 2D table, dp, with nn rows and mm columns to store the result at any given state. nn represents the length of str1 and and mm represents the length of str2. At any later time, if we encounter the same subproblem, we can return the stored result from the table with an O(1)O(1) lookup instead of recalculating that subproblem.

Let’s look at the following illustration to get a better understanding of the solution:

Let’s implement the algorithm as discussed above:

import java.util.*;
class LongestCommonSubsequence {
// Helper function with updated signature: i is current index in str1, j is current index in str2
public static int longestCommonSubsequenceHelper(String str1, String str2, int i, int j, int[][] dp) {
if (i == str1.length() || j == str2.length()) {
return 0;
} else if (dp[i][j] == -1) {
if (str1.charAt(i) == str2.charAt(j)) {
dp[i][j] = 1 + longestCommonSubsequenceHelper(str1, str2, i + 1, j + 1, dp);
} else {
dp[i][j] = Math.max(longestCommonSubsequenceHelper(str1, str2, i + 1, j, dp),
longestCommonSubsequenceHelper(str1, str2, i, j + 1, dp));
}
}
return dp[i][j];
}
public static int longestCommonSubsequence(String str1, String str2) {
int n = str1.length();
int m = str2.length();
int[][] dp = new int[n][m];
for (int[] row : dp) {
Arrays.fill(row, -1);
}
return longestCommonSubsequenceHelper(str1, str2, 0, 0, dp);
}
// Driver code
public static void main(String[] args) {
String[] firstStrings = { "qstw", "setter", "abcde", "partner", "freedom" };
String[] secondStrings = { "gofvn", "bat", "apple", "park", "redeem" };
for (int i = 0; i < firstStrings.length; i++) {
System.out.println((i + 1) + ".\tstr1: " + firstStrings[i] + "\n\t" + "str2: " + secondStrings[i] + "\n\n\t"
+ "The length of the longest common subsequence is: "
+ longestCommonSubsequence(firstStrings[i], secondStrings[i]));
System.out.println(new String(new char[100]).replace('\0', '-'));
}
}
}
Longest Common Subsequence

Solution summary

In this case, the top-down approach is used to store the solutions to subproblems encountered in the recursive function. The use of a 2D table (dp) allows for the storage of solutions to subproblems, and the index variables allow for the function to iterate through the characters in the two input strings and compare them, determining whether they are part of the longest common subsequence. The table allows for a quick lookup of previously computed solutions, reducing the number of function calls and improving the performance of the algorithm.

Time complexity

Let’s think of this in terms of the keyspace mapped by the tuple of i and j. For strings of length mm and nn, i can go from 00 to mm, while j can go from 00 to nn. Therefore, the total number of unique subproblems to evaluate and store are (m×n)(m \times n). So, the time complexity of this approach is reduced to O(m×n)O(m \times n) because we avoid redundant calculations by storing all the intermediate results in a 2D array.

Space complexity

We now need O(n×m)O(n \times m) space in memory to store intermediate values in the table.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.