Solution: Word Break

Let's solve the Word Break problem using the Dynamic Programming pattern.

Statement

Given a string, s, and a dictionary of strings, wordDict, check if s can be segmented into a space-separated sequence of one or more dictionary words. If yes, return TRUE; else, return FALSE.

Note: The same word in the dictionary may be used multiple times.

Constraints:

  • 11 \leq s.length 250\leq 250

  • 11 \leq wordDict.length 1000\leq 1000

  • 11 \leq wordDict[i].length 20\leq 20

  • s and wordDict[i] consist of only lowercase English letters.

  • All the strings of wordDict are unique.

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

The naive approach to solve this problem is to use a basic recursive strategy, which considers all possible prefixes of a string and checks if they are present in the dictionary. If a prefix is contained in the dictionary, the rest of the string is checked in the same manner.

Although this approach can get us the desired output, it has a time complexity of O(2n)O(2^n), where nn is the length of the string. This is because at each step, we split the string in two, check the prefix in the dictionary, and repeat the process with the rest of the string. The space complexity of this approach is O(n)O(n) because the depth of the recursion tree can go up to nn.

Optimized approach using dynamic programming

In the recursive approach, we notice that many prefixes are computed again even though they were checked in the earlier computations. For example, a prefix of two characters contains a prefix of one character that has already been checked. These redundant computations consume a lot of time that can be avoided using dynamic programming. The idea is to use a lookup table, dp, of size n+1n+1, where nn is the length of the input string and 11 is added to cater the empty string. This table will store the results of the shorter prefixes that can be used, in O(1)O(1) lookup time, for solving the longer prefixes.

The dp is initialized with FALSE except for dp[0], which is set TRUE because an empty string is assumed to be a valid word in the dictionary. Then, using two pointers, i and j, we check every possible prefix s[j..i] and whether they’re contained in the dictionary. If the substring s[j..i] is found in the dictionary, we don’t check further smaller substrings. We also check whether dp[j] is TRUE, which tells us that the prefix s[0..j] was found in the dictionary. If both conditions are fulfilled, the corresponding dp index, i.e., dp[i], is set to TRUE. We continue this process until the whole string has been processed. Finally, we return the value of dp[n], which tells us that the input string could be segmented into a space-separated sequence of dictionary words.

Let’s look at the following illustration to get a better understanding of the solution:

Let's implement the above algorithm:

import java.util.*;
class Main {
// Function to check if a string can be broken down into words from the dictionary
public static boolean wordBreak(String s, List<String> wordDict) {
int n = s.length();
Set<String> wordSet = new HashSet<>(wordDict);
boolean[] dp = new boolean[n + 1];
dp[0] = true;
for (int i = 1; i <= n; i++) {
for (int j = 0; j < i; j++) {
if (dp[j] && wordSet.contains(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[n];
}
// Driver code
public static void main(String[] args) {
List<String> s = new ArrayList<>(Arrays.asList(
"vegancookbook", "catsanddog", "highwaycrash",
"pineapplepenapple", "screamicecream", "educativecourse"
));
List<String> wordDict = new ArrayList<>(Arrays.asList(
"ncoo", "kboo", "inea", "icec", "ghway", "and", "anco", "hi", "way", "wa",
"amic", "ed", "cecre", "ena", "tsa", "ami", "lepen", "highway", "ples",
"ookb", "epe", "nea", "cra", "lepe", "ycras", "dog", "nddo", "hway",
"ecrea", "apple", "shp", "kbo", "yc", "cat", "tsan", "ganco", "lescr",
"ep", "penapple", "pine", "book", "cats", "andd", "vegan", "cookbook"
));
System.out.println("The list of words we can use to break down the strings are:\n\n"+ wordDict+"\n");
System.out.println(new String(new char[100]).replace("\0", "-"));
for (int i = 0; i < s.size(); ++i) {
System.out.printf("Test Case #%d\n\n", i + 1);
System.out.printf("Input: '%s'\n", s.get(i));
Print.possibleCombinations(s.get(i), wordDict);
System.out.printf("Output: %b\n", wordBreak(s.get(i), wordDict));
System.out.println(new String(new char[100]).replace("\0", "-"));
}
}
}
Word Break

Solution summary

  1. We initialize a lookup table, dp, and store TRUE at dp[0] because an empty string is always assumed to be part of the dictionary.
  2. Using two pointers, i and j, we find all possible prefixes of the given input string.
  3. For each prefix s[j..i], we check whether it’s contained in the dictionary and if dp[j] is TRUE. If both conditions are fulfilled, the corresponding index in the lookup table, i.e., dp[i], is set to TRUE.
  4. After the whole string has been processed, we return the value of dp[n], which holds the final result.

Time complexity

The time complexity of converting wordDict into a set is O(wl)O(w* l), where ww is the size of wordDict and ll is the average length of the words in wordDict. The nested loops take O(n3)O(n^3), where nn is the length of the input string s. This is because the nested loops run O(n2)O(n^2) times and in each iteration, the substring operation consumes O(n)O(n) time.

Therefore, the total time complexity of this algorithm is O(n3+wl)O(n^3+w* l).

Space complexity

The space complexity of this algorithm is O(n)O(n), where nn is the length of the input string. This is the space occupied by the lookup table.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.