Solution: Longest Common Subsequence
Let's solve the Longest Common Subsequence problem using Dynamic Programming.
Statement
Suppose you are given two strings. You need to find the length of the longest common subsequence between these two strings.
A subsequence is a string formed by removing some characters from the original string while maintaining the relative position of the remaining characters. For example, “abd” is a subsequence of “abcd”, where the removed character is “c”.
If there is no common subsequence, then return 0.
Constraints:
-
str1.length
-
str2.length
str1
andstr2
contain only lowercase English characters.
Solution
So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.
Naive approach
A naive approach would be to compare the characters of both strings based on the following rules:
-
If the current characters of both strings match, we move one position ahead in both strings.
-
If the current characters of both strings do not match, we recursively calculate the maximum length of moving one character forward in any one of the two strings i.e., we check if moving a character forward in either the first string or the second will give us a longer subsequence.
-
If we reach the end of either of the two strings, we return .
The time complexity of the naive approach is , where and are the lengths of the two strings, respectively. The space complexity of this approach is .
Optimized solution using dynamic programming
We are going to solve this problem with the help of the top-down approach of dynamic programming. The top-down solution, commonly known as the memoization technique, is an enhancement of the recursive solution. It overcomes the problem of calculating redundant solutions over and over again by storing them in an array. In the recursive approach, the following two variables kept changing:
-
The index,
i
, used to keep track of the current character instr1
. -
The index,
j
, used to keep track of the current character instr2
.
We will use a 2D table, dp
, with rows and columns to store the result at any given state. represents the length of str1
and and represents the length of str2
. At any later time, if we encounter the same subproblem, we can return the stored result from the table with an lookup instead of recalculating that subproblem.
Let’s look at the following illustration to get a better understanding of the solution:
Let’s implement the algorithm as discussed above:
import java.util.*;class LongestCommonSubsequence {// Helper function with updated signature: i is current index in str1, j is current index in str2public static int longestCommonSubsequenceHelper(String str1, String str2, int i, int j, int[][] dp) {if (i == str1.length() || j == str2.length()) {return 0;} else if (dp[i][j] == -1) {if (str1.charAt(i) == str2.charAt(j)) {dp[i][j] = 1 + longestCommonSubsequenceHelper(str1, str2, i + 1, j + 1, dp);} else {dp[i][j] = Math.max(longestCommonSubsequenceHelper(str1, str2, i + 1, j, dp),longestCommonSubsequenceHelper(str1, str2, i, j + 1, dp));}}return dp[i][j];}public static int longestCommonSubsequence(String str1, String str2) {int n = str1.length();int m = str2.length();int[][] dp = new int[n][m];for (int[] row : dp) {Arrays.fill(row, -1);}return longestCommonSubsequenceHelper(str1, str2, 0, 0, dp);}// Driver codepublic static void main(String[] args) {String[] firstStrings = { "qstw", "setter", "abcde", "partner", "freedom" };String[] secondStrings = { "gofvn", "bat", "apple", "park", "redeem" };for (int i = 0; i < firstStrings.length; i++) {System.out.println((i + 1) + ".\tstr1: " + firstStrings[i] + "\n\t" + "str2: " + secondStrings[i] + "\n\n\t"+ "The length of the longest common subsequence is: "+ longestCommonSubsequence(firstStrings[i], secondStrings[i]));System.out.println(new String(new char[100]).replace('\0', '-'));}}}
Solution summary
In this case, the top-down approach is used to store the solutions to subproblems encountered in the recursive function. The use of a 2D table (dp
) allows for the storage of solutions to subproblems, and the index variables allow for the function to iterate through the characters in the two input strings and compare them, determining whether they are part of the longest common subsequence. The table allows for a quick lookup of previously computed solutions, reducing the number of function calls and improving the performance of the algorithm.
Time complexity
Let’s think of this in terms of the keyspace mapped by the tuple of i
and j
. For strings of length and , i
can go from to , while j
can go from to . Therefore, the total number of unique subproblems to evaluate and store are . So, the time complexity of this approach is reduced to because we avoid redundant calculations by storing all the intermediate results in a 2D array.
Space complexity
We now need space in memory to store intermediate values in the table.
Level up your interview prep. Join Educative to access 70+ hands-on prep courses.