347. Top K Frequent Elements#
Difficulty: Medium
Link to Problem: To see the Top K Frequent Elements problem on LeetCode, click here!
Given an integer array nums
and an integer k
, return the k
most frequent elements. You may return the answer in any order.
Constraints:
1 <=
nums.length
<= \(10^5\)\(-10^4\) <=
nums[i]
<= \(10^4\)k
is in the range[1, the number of unique elements in the array]
.It is guaranteed that the answer is unique.
Follow-up: Your algorithm’s time complexity must be better than \(O(n\ log\ n)\), where n is the array’s size.
Solution:#
Here’s a Python function to solve this problem:
def topKFrequent(nums, k):
# Create a dictionary to store the frequency of each element
count = {}
# Create a list of lists to store elements with the same frequency
frequency = [[] for i in range(len(nums) + 1)]
# Count the frequency of each element in nums
for n in nums:
count[n] = 1 + count.get(n, 0)
# Place elements in the freq list according to their frequency
for n, c in count.items():
frequency[c].append(n)
res = []
# Traverse freq list from the end (higher frequencies)
for i in range(len(frequency) - 1, 0, -1):
for n in frequency[i]:
res.append(n)
if len(res) == k:
return res
Explanation:#
def topKFrequent(nums, k):
: This is a function that takes two arguments:nums
, which is the input array of integers, andk
, which is the number of most frequent elements to return.count = {}
: This dictionarycount
will be used to store the frequency of each unique element in thenums
array. The keys are elements from the input array, and the values are their corresponding frequencies.frequency = [[] for i in range(len(nums) + 1)]
: This creates a list of empty lists calledfrequency
. It’s used to store elements based on their frequencies, similar to thefreq
list in the previous code. The size of this list is set to be one greater than the length of the inputnums
.for n in nums:
: This loop iterates through each elementn
in the inputnums
array.count[n] = 1 + count.get(n, 0)
: This line counts the frequency of each elementn
in thenums
array. It uses thecount.get(n, 0)
method to retrieve the current count ofn
from thecount
dictionary. Ifn
is not in the dictionary, it defaults to 0. It then increments the count by 1.After the above loop, the
count
dictionary will contain counts of each unique element in thenums
array.for n, c in count.items():
: This loop iterates through the items (key-value pairs) of thecount
dictionary.n
represents the element, andc
represents its frequency.frequency[c].append(n)
: This line places the elementn
into the bucket corresponding to its frequencyc
. Buckets are represented by thefrequency
list. For example, if an elementn
has a frequency of 3, it will be added tofrequency[3]
.After this loop, the
frequency
list will contain buckets of elements grouped by their frequencies.res = []
: This listres
will be used to store the k most frequent elements.for i in range(len(freq) - 1, 0, -1):
: This loop iterates in reverse order through thefrequency
list, starting from the highest frequency and going down to 1. Note that there is a typo here; it should befrequency
instead offreq
.for n in frequency[i]:
: For each elementn
in the current bucket (i.e., elements with frequencyi
), it appendsn
to the result listres
.if len(res) == k:
: Onceres
contains k elements, the code exits and returns the result.
The code efficiently finds the k most frequent elements in the input array without using sorting, similar to the previous explanation, with the only difference being the variable names (e.g., frequency
instead of freq
).
Test cases#
# Example 1:
nums = [1,1,1,2,2,3]
k = 2
result1 = topKFrequent(nums, k)
print(result1)
[1, 2]
# Example 2:
nums = [1]
k = 1
result2 = topKFrequent(nums, k)
print(result2)
[1]
Time and Space Complexity Analysis#
Let’s analyze the time and space complexity of the provided code:
Time Complexity:
The first loop that counts the frequency of elements by iterating through
nums
has a time complexity of O(n), where n is the number of elements in the input arraynums
.The second loop iterates through the keys in the
count
dictionary, which has at mostk
unique elements (as per the constraints). Therefore, the second loop also has a time complexity of O(k).The third loop iterates through the
frequency
list, which has a length equal to the maximum frequency of elements innums
. In the worst case, this loop can have a time complexity of O(n), where n is the number of elements innums
.
Overall, the dominant factor in terms of time complexity is the loop that iterates through the count
dictionary. So, the total time complexity of the code is O(n + k).
Space Complexity:
The
count
dictionary stores the frequency of each unique element innums
. In the worst case, it can have at mostk
unique elements, so the space complexity forcount
is O(k).The
frequency
list is used to store elements grouped by their frequencies. It has a length oflen(nums) + 1
, which can be at most 105 based on the constraints. Therefore, the space complexity forfrequency
is O(105).The
res
list stores the k most frequent elements, which can be at mostk
elements, so the space complexity forres
is O(k).
In summary, the space complexity is dominated by the count
dictionary and the frequency
list, both of which have a space complexity of O(k + 105).
Challenging Exercises:#
Optimized k Most Frequent Elements: Modify the code to find the k most frequent elements in an array while ensuring that the time complexity is O(n + klogk). You can use a priority queue (heap) to achieve this.
Handling Duplicate Frequencies: Extend the code to handle cases where multiple elements have the same frequency and are among the k most frequent elements. Ensure that the output contains exactly k elements.