3. Longest Substring Without Repeating Characters

3. Longest Substring Without Repeating Characters#

Difficulty: Medium

Link to Problem: To see the Longest Substring Without Repeating Characters problem on LeetCode, click here!

Given a string s, find the length of the longest substring without repeating characters.

Constraints:

0 <= s.length <= \(5 * 10^4\)
s consists of English letters, digits, symbols and spaces.

Solution:#

Here’s a Python function to solve this problem:

def length_of_longest_substring(s):
    # Create a dictionary to store the index of each character's last occurrence.
    char_index = {}
    
    # Initialize variables to keep track of the start and end of the current substring.
    start = 0
    max_length = 0
    
    for end in range(len(s)):
        # If the character is in the dictionary and its last occurrence is after the start of the current substring,
        # update the start of the substring to the next character after its last occurrence.
        if s[end] in char_index and char_index[s[end]] >= start:
            start = char_index[s[end]] + 1
        
        # Update the last occurrence index of the current character.
        char_index[s[end]] = end
        
        # Calculate the length of the current substring and update the maximum length if necessary.
        current_length = end - start + 1
        max_length = max(max_length, current_length)
    
    return max_length

Explanation:#

We start by defining a function called length_of_longest_substring that takes a single argument s, which is the input string for which we want to find the length of the longest substring without repeating characters.
We create a dictionary called char_index to store the index of the last occurrence of each character in the input string. This dictionary will help us keep track of where each character was last seen.
We initialize two variables, start and max_length.
- start represents the start index of the current substring without repeating characters. It starts at 0.
- max_length is used to keep track of the maximum length found so far and is initially set to 0.
We then iterate through the input string s using a for loop, where the loop variable end represents the current end index of the substring we are considering.
Inside the loop:
- We check if the character s[end] is already in the char_index dictionary and if its last occurrence is within or after the current substring. If so, it means that we’ve encountered a repeating character, and we need to update the start of the substring to the next character after the last occurrence of s[end]. This ensures that we have a new valid substring without repeating characters.
- We then update the char_index dictionary by storing the current index end as the last occurrence index of the character s[end].
Next, we calculate the length of the current substring without repeating characters, which is current_length = end - start + 1. We add 1 to account for the fact that the indices are zero-based.
We update the max_length with the maximum of its current value and the current_length. This step ensures that we keep track of the longest valid substring we have encountered so far.
The loop continues to iterate through the input string, and at the end, we return the max_length, which represents the length of the longest substring without repeating characters.

Test cases#

# Example 1:
print(length_of_longest_substring("abcabcbb"))

# Example 2:
print(length_of_longest_substring("bbbbb"))

# Example 3:
print(length_of_longest_substring("pwwkew"))

Time and Space Complexity Analysis#

Let’s analyze the time and space complexity of the length_of_longest_substring function:

Time Complexity:

The function iterates through the input string s using a single for loop. The loop runs from the beginning of the string to the end once.
Inside the loop, we perform constant-time operations such as dictionary lookups and updates, comparisons, and arithmetic operations.
Therefore, the time complexity of the function is O(n), where n is the length of the input string s.

Space Complexity:

The primary data structure that consumes space in this function is the char_index dictionary.
In the worst case, if there are no repeating characters in the input string, the dictionary can store all unique characters in the string.
Therefore, the space complexity is O(min(n, m)), where n is the length of the input string s, and m is the number of unique characters in the string. In the worst case, when all characters are unique, m is equal to n, so the space complexity is O(n).
Additionally, there are a few integer variables used for indices and lengths, which consume constant space.
Overall, the space complexity of the function is O(min(n, m)) or simply O(n) in the worst case.

In summary, the time complexity of the length_of_longest_substring function is O(n), and the space complexity is O(n) in the worst case due to the char_index dictionary. This algorithm provides an efficient way to find the length of the longest substring without repeating characters in linear time.

Challenging Exercises:#

Longest Substring with K Distinct Characters: Modify the problem to find the length of the longest substring with exactly K distinct characters. For example, given the input string “abcabcbb” and K = 2, the answer would be 4 because the longest substring with two distinct characters is “abca.”
Longest Substring with Unique Characters: Write a function to find the length of the longest substring in a given string where all characters are unique. For example, given the input string “abcabcbb,” the answer would be 4 because “abcd” is the longest substring with unique characters.