Solution: Find Longest Self-Contained Substring
Let’s solve the Find Longest Self-Contained Substring problem using the Hash Maps pattern.
We'll cover the following
Statement
You are given a string, s
, consisting of lowercase English letters. Your task is to find the length of the longest self-contained substring of s
.
A substring t
of s
is called self-contained if:
t
is not equal to the entire strings
.Every character in
t
does not appear anywhere else ins
(outside oft
).
In other words, all characters in t
are completely unique to that substring within the string s
.
Return the length of the longest self-contained substring. If no such substring exists, return -1.
Constraints:
s.length
s
consists only of lowercase English letters.
Solution
We iterate through the string once to record where each character first appears and where it last appears. Two separate hash maps store each character’s first and last occurrence indices. Each unique character serves as a potential starting point, defining an initial window from its first to last occurrence. The window is adjusted based on the following conditions:
As we iterate within this window, if we encounter a character whose last occurrence is further to the right than the current window’s end, we expand the window to include it. This ensures that the substring remains self-contained, meaning all occurrences of each character within it are included. The process continues until no more characters extend the window’s boundary.
As we expand the window, every character inside it must have its first occurrence within the current window’s start boundary. If we encounter a character whose first occurrence index is before the current window’s starting position, it means that an earlier part of the string contains an instance of that character, violating the self-contained property. When this happens, the current substring is invalid, and we discard it from consideration.
The maximum valid window length is tracked and updated accordingly, ensuring the longest valid substring is returned.
The steps of the algorithm are as follows:
Create two hashmaps,
first
, andlast
, to store the first and last occurrence index of each character in the string.Iterate through the string once to populate these hash maps with each character’s first and last occurrence.
Initialize
maxLen = -1
to keep track of the maximum length of a valid self-contained substring found.For each unique character
c1
(processed once), start from its first occurrence index:Initialize the
start
andend
of the window by setting the starting point to the character’s first occurrence and the ending point to its last occurrence in the string.Iterate through the string from the starting position
start
, extending the endpointend
whenever a character’s last occurrence is beyond the current endpointend
.If a character
c2
inside the window has its first occurrence before the window'sstart
, the window is invalid.
Validate the substring:
When the current index
j
reaches theend
, check if the window is valid and its length is less than the total string length.If the window is valid, update
maxLen
with the maximum of its current value and the window’s length (end - start + 1
).
After checking all potential starting characters, return the maximum valid length found. If no valid substring exists, return
-1
.
Let’s look at the illustration below to better understand the solution.
Level up your interview prep. Join Educative to access 70+ hands-on prep courses.