208. Implement Trie (Prefix Tree)#

Difficulty: Medium

Link to Problem: To see the Implement Trie problem on LeetCode, click here!


A trie (pronounced as “try”) or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

Implement the Trie class:

  • Trie() Initializes the trie object.

  • void insert(String word) Inserts the string word into the trie.

  • boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.

  • boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.

Constraints:

  • 1 <= word.length, prefix.length <= 2000

  • word and prefix consist only of lowercase English letters.

  • At most \(3 * 10^4\) calls in total will be made to insert, search, and startsWith.

Probelm Explanation:#

The problem at hand is to implement a data structure known as a Trie (pronounced “try”) or Prefix Tree. A Trie is a tree-like data structure that is used to efficiently store and retrieve a set of strings or words. It’s particularly useful for tasks involving string manipulation, such as autocomplete and spell-checking. The problem statement defines the following operations for implementing the Trie class:

  1. Trie(): This is the constructor that initializes the Trie object.

  2. void insert(String word): This method inserts the given string word into the Trie. It effectively adds the characters of the word to the Trie’s structure, creating nodes for each character if they don’t already exist. At the end of the word, a special flag is set to indicate that this node represents the end of a valid word.

  3. boolean search(String word): This method checks if the given string word is present in the Trie. It starts from the root of the Trie and traverses the Trie by following the characters in the word. If it successfully traverses the Trie and reaches the end of the word, it returns true, indicating that the word exists in the Trie; otherwise, it returns false.

  4. boolean startsWith(String prefix): This method checks if there is any previously inserted word in the Trie that has the given prefix. It’s similar to the search method but does not require that the prefix be a complete word. If there is any word in the Trie that starts with the given prefix, it returns true; otherwise, it returns false.

The problem also provides an example scenario where these operations are called, demonstrating the expected output for each operation.

In summary, the goal is to create a data structure (Trie) that efficiently stores a set of strings and provides methods to insert new strings, search for complete words, and check if a given prefix exists in the stored set of strings.

Solution:#

Here’s a Python function to implement this algorithm:

class TrieNode:
    def __init__(self):
        self.children = {}           # A dictionary to store child nodes
        self.is_end_of_word = False  # Indicates if a word ends at this node

class Trie:
    def __init__(self):
        self.root = TrieNode()       # Initialize the Trie with a root node

    def insert(self, word):
        node = self.root             # Start from the root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()  # Create a new node if the character is not present
            node = node.children[char]           # Move to the child node
        node.is_end_of_word = True                # Mark the end of the inserted word

    def search(self, word):
        node = self.root             # Start from the root
        for char in word:
            if char not in node.children:
                return False         # If the character is not found, the word is not in the Trie
            node = node.children[char]  # Move to the child node
        return node.is_end_of_word    # Check if the node represents the end of a valid word

    def startsWith(self, prefix):
        node = self.root             # Start from the root
        for char in prefix:
            if char not in node.children:
                return False         # If the character is not found, no word starts with the prefix
            node = node.children[char]  # Move to the child node
        return True                  # Prefix found in the Trie

Explanation:#

Let’s explain how the code works step by step:

  1. TrieNode Class:

    • TrieNode is a class that represents nodes in the Trie. Each node has two attributes:

      • children: This is a dictionary that stores child nodes. Each key in the dictionary represents a character, and the corresponding value is the child node for that character.

      • is_end_of_word: This boolean flag indicates whether the node represents the end of a complete word. Initially, it’s set to False for all nodes.

  2. Trie Class:

    • Trie is the main class that implements the Trie data structure. It has the following methods:

    • __init__(self): The constructor initializes the Trie object by creating a root node, which serves as the starting point for all Trie operations.

    • insert(self, word): This method inserts a string word into the Trie. It starts from the root node and iterates through each character in the word.

      • If a character is not present as a child node, it creates a new node for that character.

      • It then moves to the child node and continues the process until the entire word is inserted.

      • Finally, it sets the is_end_of_word flag to True for the last node to mark the end of the inserted word.

    • search(self, word): This method checks if a complete word (string) exists in the Trie. It starts from the root node and iterates through each character in the word.

      • If a character is not found as a child node, it immediately returns False because the word is not in the Trie.

      • It continues to move to the child node for each character.

      • After reaching the end of the word, it checks if the is_end_of_word flag is True for the last node to confirm the presence of the word.

    • startsWith(self, prefix): This method checks if there is any previously inserted word in the Trie that starts with a given prefix. It follows the same logic as the search method but does not require the entire word to be present.

      • If the prefix is found in the Trie, it returns True; otherwise, it returns False.

  3. Example Usage:

    • The code demonstrates how to create a Trie object, insert words (“apple” and “app”), and perform operations on the Trie:

      • It inserts the word “apple” into the Trie.

      • It checks if “apple” is in the Trie, which returns True.

      • It checks if “app” is in the Trie, which returns False.

      • It checks if there is any word in the Trie that starts with “app,” which returns True.

      • It inserts the word “app” into the Trie.

      • It checks if “app” is in the Trie again, which now returns True.

In summary, the code efficiently implements a Trie data structure with the ability to insert words, search for complete words, and check for the existence of words with a given prefix. The Trie is organized as a tree of nodes, with each node representing a character in the words being stored.

Test cases:#

Here’s how you can use this solution:

# Example 1: 

# Example usage:

trie = Trie()
trie.insert("apple")
print(trie.search("apple"))   # Output: True
print(trie.search("app"))     # Output: False
print(trie.startsWith("app")) # Output: True
trie.insert("app")
print(trie.search("app"))     # Output: True
True
False
True
True

Time and Space Complexity Analysis#

Let’s analyze the time and space complexity of the Trie implementation:

  1. Time Complexity:

    • Insertion (insert method): The time complexity of inserting a word into the Trie is O(m), where m is the length of the word. In the worst case, you may need to traverse the entire word to insert it into the Trie.

    • Search (search method): The time complexity of searching for a word in the Trie is also O(m), where m is the length of the word. In the worst case, you may need to traverse the entire word to determine if it exists in the Trie.

    • Starts with (startsWith method): The time complexity of checking if a prefix exists in the Trie is O(p), where p is the length of the prefix. In the worst case, you may need to traverse the entire prefix to determine its presence.

    Therefore, for each operation (insertion, search, or starts with), the time complexity is O(m) or O(p), where m is the length of the word being operated on, and p is the length of the prefix.

  2. Space Complexity:

    • The space complexity of the Trie is determined by the number of nodes and characters stored in it. In the worst case, where none of the inserted words share common prefixes, the space complexity is O(N), where N is the total number of characters across all inserted words.

    • Each node in the Trie represents a character, and the number of nodes is directly related to the total number of characters.

    In practice, the space complexity can be less than O(N) because common prefixes are shared among words in the Trie, leading to space optimization.

In summary:

  • Time complexity for each operation (insertion, search, starts with) is O(m) or O(p), where m is the length of the word and p is the length of the prefix.

  • Space complexity is O(N) in the worst case, where N is the total number of characters across all inserted words. However, space optimization occurs due to common prefix sharing in practice, potentially reducing the actual space used.

Challenging Exercises:#

  1. Count Prefixes: Implement a method to count the number of words in the Trie that have a specific prefix. This exercise requires you to maintain additional data in the Trie nodes to keep track of the count.

  2. Auto-Complete Suggestions: Implement an auto-complete feature using the Trie. Given a prefix, the program should return a list of suggested words based on the words previously inserted into the Trie.