76. Minimum Window Substring#

Difficulty: Hard

Link to Problem: To see the Minimum Window Substring problem on LeetCode, click here!


Given two strings s and t of lengths m and n respectively, return the minimum window substring of s such that every character in t (including duplicates) is included in the window. If there is no such substring, return the empty string "".

The testcases will be generated such that the answer is unique.

Constraints:

  • m == s.length

  • n == t.length

  • 1 <= m, n <= \(10^5\)

  • s and t consist of uppercase and lowercase English letters.

Follow up: Could you find an algorithm that runs in \(O(m + n)\) time?

def min_window_substring(s, t):
    if not s or not t:
        return ""

    # Initialize dictionaries to keep track of character counts for t and the current window in s.
    t_dict = {}
    current_window_dict = {}
    
    # Populate t_dict with character counts for string t.
    for char in t:
        t_dict[char] = t_dict.get(char, 0) + 1

    # Initialize pointers for the sliding window.
    left = 0
    min_len = float('inf')
    min_window = ""
    required_chars = len(t_dict)

    # Initialize a variable to keep track of how many required characters have been found in the current window.
    found_chars = 0

    # Iterate over the string s using the right pointer.
    for right in range(len(s)):
        char = s[right]

        # Update the current_window_dict.
        current_window_dict[char] = current_window_dict.get(char, 0) + 1

        # Check if the current character is a required character and if its count matches the required count.
        if char in t_dict and current_window_dict[char] == t_dict[char]:
            found_chars += 1

        # Try to minimize the window by moving the left pointer.
        while left <= right and found_chars == required_chars:
            # Calculate the current window size.
            window_size = right - left + 1

            # If the current window is smaller than the minimum found so far, update min_len and min_window.
            if window_size < min_len:
                min_len = window_size
                min_window = s[left:right+1]

            # Move the left pointer to the right to shrink the window.
            left_char = s[left]
            current_window_dict[left_char] -= 1

            # Check if the character removed from the window was a required character.
            if left_char in t_dict and current_window_dict[left_char] < t_dict[left_char]:
                found_chars -= 1

            # Move the left pointer further to continue shrinking the window.
            left += 1

    return min_window

Explanation:#

  1. We start by checking if either of the input strings s or t is empty. If either of them is empty, we return an empty string since there can’t be any valid substring in this case.

  2. We initialize two dictionaries: t_dict and current_window_dict. These dictionaries will be used to keep track of character counts in the string t and the current window in string s, respectively.

  3. We populate the t_dict dictionary by iterating through string t. For each character, we increment its count in the dictionary using t_dict.get(char, 0) + 1. This allows us to count the occurrences of each character in t.

  4. We initialize two pointers: left and right. The left pointer will represent the start of the current window, and the right pointer will represent the end of the current window. We also initialize min_len to store the length of the minimum window found so far, and min_window to store the actual minimum window substring.

  5. We determine the number of required characters in t (i.e., the number of distinct characters in t) and store it in the variable required_chars.

  6. We initialize a variable found_chars to keep track of how many required characters have been found in the current window. Initially, it is set to 0.

  7. We iterate over the string s using the right pointer. In each iteration, we do the following:

    • Update the current_window_dict by incrementing the count of the current character.

    • Check if the current character is a required character (present in t_dict) and if its count in the current_window_dict matches the required count from t_dict. If so, we increment found_chars.

  8. After updating the window, we attempt to minimize the window size by moving the left pointer to the right. In this step, we:

    • Calculate the size of the current window.

    • If the current window size is smaller than the minimum found so far (min_len), we update min_len and min_window to store the current window substring.

    • Move the left pointer to the right to shrink the window.

    • Check if the character being removed from the window was a required character. If it was, we decrement found_chars.

    • Continue moving the left pointer further to continue shrinking the window if the window still contains all required characters.

  9. Finally, we return min_window, which will contain the minimum window substring that contains all characters from t.

Test cases#

# Example 1:
print(min_window_substring("ADOBECODEBANC", "ABC"))
BANC
# Example 2:
print(min_window_substring("a", "a")) 
a
# Example 3:
print(min_window_substring("a", "aa")) 

Let’s analyze the time and space complexity of the min_window_substring function:

Time Complexity:

  • The main loop iterates through the string s from left to right using the right pointer. This loop runs in O(m) time, where ‘m’ is the length of string s.

  • Inside the loop, we have a while loop that moves the left pointer to the right to shrink the window as needed. In the worst case, this while loop can also run in O(m) time because in the worst case, we may have to move the left pointer all the way to the end of the string.

  • Within each iteration of the while loop, we perform constant time operations, such as updating dictionaries and comparing character counts.

  • The initialization of the t_dict dictionary takes O(n) time, where ‘n’ is the length of string t.

Therefore, the overall time complexity of the function is O(m + n) because the dominant factor is the length of string s, and the length of string t has a smaller impact.

Space Complexity:

  • The space complexity is determined by the space used by the t_dict dictionary, the current_window_dict dictionary, and a few variables.

  • The t_dict dictionary stores character counts for string t. In the worst case, when all characters in t are distinct, this dictionary can have a maximum of ‘n’ key-value pairs. So, the space complexity for t_dict is O(n).

  • The current_window_dict dictionary stores character counts for the current window. In the worst case, it can have a maximum of ‘m’ key-value pairs. So, the space complexity for current_window_dict is also O(m).

  • Other variables used in the function, such as left, right, min_len, min_window, required_chars, and found_chars, are all of constant size and do not depend on the input sizes.

In summary, the overall space complexity of the function is O(max(m, n)), which means it is determined by the larger of the two input strings’ lengths. This is because the space used for t_dict and current_window_dict dominates the space complexity.

Challenging Exercises:#

  1. Smallest Distinct Substring: Instead of finding the minimum window substring containing all characters, find the smallest distinct (unique) substring of s that contains all characters from t. This variation adds complexity because you must find a substring with distinct characters.

  2. No Extra Space: Solve the problem without using any extra space, such as dictionaries or arrays, other than a constant amount of space. This is a significant optimization challenge.