347. Top K Frequent Elements#
Difficulty: Medium
Link to Problem: To see the Top K Frequent Elements problem on LeetCode, click here!
Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.
Constraints:
1 <=
nums.length<= \(10^5\)\(-10^4\) <=
nums[i]<= \(10^4\)kis in the range[1, the number of unique elements in the array].It is guaranteed that the answer is unique.
Follow-up: Your algorithm’s time complexity must be better than \(O(n\ log\ n)\), where n is the array’s size.
Solution:#
Here’s a Python function to solve this problem:
def topKFrequent(nums, k):
# Create a dictionary to store the frequency of each element
count = {}
# Create a list of lists to store elements with the same frequency
frequency = [[] for i in range(len(nums) + 1)]
# Count the frequency of each element in nums
for n in nums:
count[n] = 1 + count.get(n, 0)
# Place elements in the freq list according to their frequency
for n, c in count.items():
frequency[c].append(n)
res = []
# Traverse freq list from the end (higher frequencies)
for i in range(len(frequency) - 1, 0, -1):
for n in frequency[i]:
res.append(n)
if len(res) == k:
return res
Explanation:#
def topKFrequent(nums, k):: This is a function that takes two arguments:nums, which is the input array of integers, andk, which is the number of most frequent elements to return.count = {}: This dictionarycountwill be used to store the frequency of each unique element in thenumsarray. The keys are elements from the input array, and the values are their corresponding frequencies.frequency = [[] for i in range(len(nums) + 1)]: This creates a list of empty lists calledfrequency. It’s used to store elements based on their frequencies, similar to thefreqlist in the previous code. The size of this list is set to be one greater than the length of the inputnums.for n in nums:: This loop iterates through each elementnin the inputnumsarray.count[n] = 1 + count.get(n, 0): This line counts the frequency of each elementnin thenumsarray. It uses thecount.get(n, 0)method to retrieve the current count ofnfrom thecountdictionary. Ifnis not in the dictionary, it defaults to 0. It then increments the count by 1.After the above loop, the
countdictionary will contain counts of each unique element in thenumsarray.for n, c in count.items():: This loop iterates through the items (key-value pairs) of thecountdictionary.nrepresents the element, andcrepresents its frequency.frequency[c].append(n): This line places the elementninto the bucket corresponding to its frequencyc. Buckets are represented by thefrequencylist. For example, if an elementnhas a frequency of 3, it will be added tofrequency[3].After this loop, the
frequencylist will contain buckets of elements grouped by their frequencies.res = []: This listreswill be used to store the k most frequent elements.for i in range(len(freq) - 1, 0, -1):: This loop iterates in reverse order through thefrequencylist, starting from the highest frequency and going down to 1. Note that there is a typo here; it should befrequencyinstead offreq.for n in frequency[i]:: For each elementnin the current bucket (i.e., elements with frequencyi), it appendsnto the result listres.if len(res) == k:: Oncerescontains k elements, the code exits and returns the result.
The code efficiently finds the k most frequent elements in the input array without using sorting, similar to the previous explanation, with the only difference being the variable names (e.g., frequency instead of freq).
Test cases#
# Example 1:
nums = [1,1,1,2,2,3]
k = 2
result1 = topKFrequent(nums, k)
print(result1)
[1, 2]
# Example 2:
nums = [1]
k = 1
result2 = topKFrequent(nums, k)
print(result2)
[1]
Time and Space Complexity Analysis#
Let’s analyze the time and space complexity of the provided code:
Time Complexity:
The first loop that counts the frequency of elements by iterating through
numshas a time complexity of O(n), where n is the number of elements in the input arraynums.The second loop iterates through the keys in the
countdictionary, which has at mostkunique elements (as per the constraints). Therefore, the second loop also has a time complexity of O(k).The third loop iterates through the
frequencylist, which has a length equal to the maximum frequency of elements innums. In the worst case, this loop can have a time complexity of O(n), where n is the number of elements innums.
Overall, the dominant factor in terms of time complexity is the loop that iterates through the count dictionary. So, the total time complexity of the code is O(n + k).
Space Complexity:
The
countdictionary stores the frequency of each unique element innums. In the worst case, it can have at mostkunique elements, so the space complexity forcountis O(k).The
frequencylist is used to store elements grouped by their frequencies. It has a length oflen(nums) + 1, which can be at most 105 based on the constraints. Therefore, the space complexity forfrequencyis O(105).The
reslist stores the k most frequent elements, which can be at mostkelements, so the space complexity forresis O(k).
In summary, the space complexity is dominated by the count dictionary and the frequency list, both of which have a space complexity of O(k + 105).
Challenging Exercises:#
Optimized k Most Frequent Elements: Modify the code to find the k most frequent elements in an array while ensuring that the time complexity is O(n + klogk). You can use a priority queue (heap) to achieve this.
Handling Duplicate Frequencies: Extend the code to handle cases where multiple elements have the same frequency and are among the k most frequent elements. Ensure that the output contains exactly k elements.