347. Top K Frequent Elements#

Difficulty: Medium

Link to Problem: To see the Top K Frequent Elements problem on LeetCode, click here!


Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Constraints:

  • 1 <= nums.length <= \(10^5\)

  • \(-10^4\) <= nums[i] <= \(10^4\)

  • k is in the range [1, the number of unique elements in the array].

  • It is guaranteed that the answer is unique.

Follow-up: Your algorithm’s time complexity must be better than \(O(n\ log\ n)\), where n is the array’s size.

Solution:#

Here’s a Python function to solve this problem:

def topKFrequent(nums, k):
    # Create a dictionary to store the frequency of each element
    count = {}
    # Create a list of lists to store elements with the same frequency
    frequency = [[] for i in range(len(nums) + 1)]

    # Count the frequency of each element in nums
    for n in nums:
        count[n] = 1 + count.get(n, 0)

    # Place elements in the freq list according to their frequency
    for n, c in count.items():
        frequency[c].append(n)

    res = []
    # Traverse freq list from the end (higher frequencies)
    for i in range(len(frequency) - 1, 0, -1):
        for n in frequency[i]:
            res.append(n)
            if len(res) == k:
                return res

Explanation:#

  1. def topKFrequent(nums, k):: This is a function that takes two arguments: nums, which is the input array of integers, and k, which is the number of most frequent elements to return.

  2. count = {}: This dictionary count will be used to store the frequency of each unique element in the nums array. The keys are elements from the input array, and the values are their corresponding frequencies.

  3. frequency = [[] for i in range(len(nums) + 1)]: This creates a list of empty lists called frequency. It’s used to store elements based on their frequencies, similar to the freq list in the previous code. The size of this list is set to be one greater than the length of the input nums.

  4. for n in nums:: This loop iterates through each element n in the input nums array.

  5. count[n] = 1 + count.get(n, 0): This line counts the frequency of each element n in the nums array. It uses the count.get(n, 0) method to retrieve the current count of n from the count dictionary. If n is not in the dictionary, it defaults to 0. It then increments the count by 1.

  6. After the above loop, the count dictionary will contain counts of each unique element in the nums array.

  7. for n, c in count.items():: This loop iterates through the items (key-value pairs) of the count dictionary. n represents the element, and c represents its frequency.

  8. frequency[c].append(n): This line places the element n into the bucket corresponding to its frequency c. Buckets are represented by the frequency list. For example, if an element n has a frequency of 3, it will be added to frequency[3].

  9. After this loop, the frequency list will contain buckets of elements grouped by their frequencies.

  10. res = []: This list res will be used to store the k most frequent elements.

  11. for i in range(len(freq) - 1, 0, -1):: This loop iterates in reverse order through the frequency list, starting from the highest frequency and going down to 1. Note that there is a typo here; it should be frequency instead of freq.

  12. for n in frequency[i]:: For each element n in the current bucket (i.e., elements with frequency i), it appends n to the result list res.

  13. if len(res) == k:: Once res contains k elements, the code exits and returns the result.

The code efficiently finds the k most frequent elements in the input array without using sorting, similar to the previous explanation, with the only difference being the variable names (e.g., frequency instead of freq).

Test cases#

# Example 1: 
nums = [1,1,1,2,2,3]
k = 2
result1 = topKFrequent(nums, k)
print(result1)
[1, 2]
# Example 2:
nums = [1]
k = 1
result2 = topKFrequent(nums, k)
print(result2)
[1]

Time and Space Complexity Analysis#

Let’s analyze the time and space complexity of the provided code:

Time Complexity:

  1. The first loop that counts the frequency of elements by iterating through nums has a time complexity of O(n), where n is the number of elements in the input array nums.

  2. The second loop iterates through the keys in the count dictionary, which has at most k unique elements (as per the constraints). Therefore, the second loop also has a time complexity of O(k).

  3. The third loop iterates through the frequency list, which has a length equal to the maximum frequency of elements in nums. In the worst case, this loop can have a time complexity of O(n), where n is the number of elements in nums.

Overall, the dominant factor in terms of time complexity is the loop that iterates through the count dictionary. So, the total time complexity of the code is O(n + k).

Space Complexity:

  1. The count dictionary stores the frequency of each unique element in nums. In the worst case, it can have at most k unique elements, so the space complexity for count is O(k).

  2. The frequency list is used to store elements grouped by their frequencies. It has a length of len(nums) + 1, which can be at most 105 based on the constraints. Therefore, the space complexity for frequency is O(105).

  3. The res list stores the k most frequent elements, which can be at most k elements, so the space complexity for res is O(k).

In summary, the space complexity is dominated by the count dictionary and the frequency list, both of which have a space complexity of O(k + 105).

Challenging Exercises:#

  1. Optimized k Most Frequent Elements: Modify the code to find the k most frequent elements in an array while ensuring that the time complexity is O(n + klogk). You can use a priority queue (heap) to achieve this.

  2. Handling Duplicate Frequencies: Extend the code to handle cases where multiple elements have the same frequency and are among the k most frequent elements. Ensure that the output contains exactly k elements.