Solution: Verifying an Alien Dictionary
Let's solve the Verifying an Alien Dictionary problem using the Topological Sort pattern.
Statement
You’re given a list of words with lowercase English letters in a different order, written in an alien language. The order of the alphabet is some permutation of lowercase letters of the English language.
We have to return TRUE if the given list of words is sorted lexicographically in this alien language.
Constraints:
-
words.length
-
words[i].length
order.length
- All the characters in
words[i]
andorder
are lowercase English letters.
Solution
So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which to follow based on considerations such as time complexity and implementation constraints.
Naive approach
The naive approach for this problem is to iterate over the order list and words simultaneously. Start from the first word, then compare the order with all other words present in the list.
The time complexity for the naive approach is , since we’re iterating the order list and the complete list two times in nested loops. The space complexity of this naive approach is constant because we didn’t use any extra memory.
Optimized approach using topological sort
We can solve this problem using the topological sort pattern. Topological sort is used to find a linear ordering of elements that depend on or prioritize each other. For example, if A depends on B or if B has priority over A, B is listed before A in topological order.
For this problem, we’re given the list of words and the order of the alphabet. Using the order of the alphabet, we must check if the list of words is sorted lexicographically.
We can check adjacent words to see if they are in the correct order. For each word, the word on its right should be lexicographically larger, and the one on its left should be lexicographically smaller.
One thing to notice here is that we don’t need to compare all the words. We can just compare each pair of adjacent words instead. If all the pairs of adjacent words are in order, we can safely assume that the integrity is intact. Conversely, if any pair of adjacent words isn’t in order, we can assume that our order is not correct.
Let’s review the illustration to better understand the intuition described above with the help of a sample list of three words, :
We need an efficient way of storing the order of each letter and its ranking provided in the order list. Once we do that, we move on to the comparison part. To compare two adjacent words, words[i]
and words[i + 1]
, we iterate over the letters one by one and find the first index where the letter in both the words is different. If words[i]
has a smaller letter than the corresponding one in words[i + 1]
, we break this iteration because we know these two words are in the right order.
If words[i]
has a lexicographically larger letter, we immediately return FALSE because we’ve found a pair of consecutive words that aren’t in the correct order.
We also need to consider the boundaries. When we iterate over a word, we need to ensure that the other word hasn’t ended, that is, all the letters in the two words match up to the point where one of the strings ends. For example, in “educated” and “educate,” we can’t iterate over all the letters of “educated” because the word “educate” is shorter. In this case, we must examine the length of each word. If the words are the same length or the former word is shorter, then the words list is sorted. However, if the latter word is shorter, then the words list is not sorted.
Now that we have an overview of the solution, we can move on to the actual implementation of it.
Let’s discuss the algorithm for the above approach:
-
We initialize a hash map to record the relations between each letter and its ranking in the order list.
-
We iterate over the words and compare each pair of adjacent words.
-
We find the first index in two consecutive words (
words[i]
andwords[i + 1]
) where the letter in the two words is different.-
If
words[i + 1]
ends beforewords[i]
and no different letters are found, then we need to return FALSE becausewords[i + 1]
should come beforewords[i]
. -
If we find the first different letter and the two letters are in the correct order, then we exit from the current iteration and proceed to the next pair of words.
-
If we find the first different letter and the two letters are in the wrong order, then we safely return FALSE.
-
-
-
By the time we reach the end of the outer loop, we have examined all the pairs of adjacent words and ensured that they are all sorted. Therefore, we return TRUE.
Let’s visualize the algorithm with a simplified example with just three alien words and an order that consists of only eight characters:
words = [app, apple, alpha]
order = "abcdehlp"
Let’s look at the code for this solution below:
import java.util.*;class VerifyDictionary {public static boolean verifyAlienDictionary(String[] words, String order) {if (words.length == 1)return true;Map<Character, Integer> orderMap = new HashMap<>();for (int i = 0; i < order.length(); i++) {orderMap.put(order.charAt(i), i);}for (int i = 0; i < words.length - 1; i++) {for (int j = 0; j < words[i].length(); j++) {if (j >= words[i + 1].length())return false;if (words[i].charAt(j) != words[i + 1].charAt(j)) {if (orderMap.get(words[i].charAt(j)) > orderMap.get(words[i + 1].charAt(j))) {return false;} else {break;}}}}return true;}// Driver codepublic static void main(String[] args) {String[][] words = {{"alpha", "bravo", "charlie", "delta"},{"apple", "app"},{"martian"},{"jupyter", "ascending"},{"passengers", "to", "the", "unknown"}};String[] order = {"abcdefghijklmnopqrstuvwxyz","abcdefghijklmnopqrstuvwxyz","mabcdefghijklnopqrstuvwxyz","jabcdefghiklmnopqrstuvwxyz","ptuhabcdefghijklmnoqrsvwxyz"};for (int i = 0; i < order.length; i++) {System.out.print(i + 1);System.out.print(".\tWords : " + Arrays.toString(words[i]));System.out.print("\n\tOrder : " + order[i]);System.out.println("\n\tAlien Dictionary verified: " + verifyAlienDictionary(words[i], order[i]));System.out.println(new String(new char[100]).replace('\0', '-'));}}}
Solution summary
To recap, the solution to this problem can be divided into the following parts:
-
Fill the hash map with each character of the order string and its adjacent index.
-
Iterate a word list.
-
Iterate
words[i]
string and compare each character with the character of an adjacent word from the list,words[i + 1]
.-
If there is no different character in the selected words, and the
words[i + 1]
end before words[i], return FALSE. -
If there is a different character and the two selected words are in the correct order, we move to the next two adjacent words.
-
If there is a different character and the two words are not in the correct order, we return FALSE.
-
If the loop ends after iterating all of the words, we return TRUE.
-
Time complexity
Because the maximum number of letters could be , storing the letter-order relation of each letter takes constant time, i.e., time.
Next, for each word in words
, we iterate over all of its letters. Therefore, the total time complexity of the nested loops comes out to be , where is the length of words, and is the maximum number of letters in any word in words
.
Hence, the total time complexity of this algorithm is .
Space complexity
We have used a hash map to store the letter-order relations for each word in order
. Due to the fixed length of letters in order
, this approach takes space in memory.
Level up your interview prep. Join Educative to access 70+ hands-on prep courses.