Blind 75Medium
Top K Frequent Elements — Bucket or Heap
Count element frequencies with a hash map and extract the top k using either a min-heap or frequency buckets.
HeapHash Map
Problem Statement
Return the k most frequent elements in the array.
Why This Problem Matters
- Tests ability to trade space for time using buckets or heaps.
- Common question to evaluate knowledge of frequency counting patterns.
- Variants appear in streaming analytics and search ranking problems.
Thought Process
Build frequency map
Use HashMap<Integer, Integer> to count occurrences.
Choose extraction strategy
Bucket sort for O(n) or min-heap for O(n log k). Buckets are simplest when k small to medium.
Collect top k
Iterate buckets from highest frequency down until k elements gathered.
Step-by-Step Reasoning
- Count frequencies using a map.
- Create buckets array where index = frequency, storing numbers with that count.
- Iterate from buckets.length - 1 down collecting numbers until k reached.
- Return result list.
Dry Run
nums = [1,1,1,2,2,3], k = 2
Frequencies {1:3,2:2,3:1}; buckets yield [1,2].
nums = [4,1,-1,2,-1,2,3], k = 2
Top two frequencies correspond to -1 and 2.
Complexity Analysis
Time
O(n)
Space
O(n)
Why
Bucket approach touches each element a constant number of times.
Annotated Solution
JAVA
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class TopKFrequentElements {
public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Integer> freq = new HashMap<>();
for (int num : nums) {
freq.merge(num, 1, Integer::sum);
}
List<List<Integer>> buckets = new ArrayList<>(nums.length + 1);
for (int i = 0; i <= nums.length; i += 1) {
buckets.add(new ArrayList<>());
}
for (Map.Entry<Integer, Integer> entry : freq.entrySet()) {
buckets.get(entry.getValue()).add(entry.getKey());
}
int[] result = new int[k];
int idx = 0;
for (int i = buckets.size() - 1; i >= 0 && idx < k; i -= 1) {
for (int num : buckets.get(i)) {
if (idx < k) {
result[idx++] = num;
}
}
}
return result;
}
}Frequency buckets group numbers by occurrence count, enabling linear-time selection of highest frequencies.
Common Pitfalls
- Forgetting to size buckets to nums.length + 1 results in index errors.
- Min-heap approach must evict when size exceeds k; missing this returns wrong results.
- Order of returned elements can be arbitrary unless additional sorting applied.
Follow-Up Questions
- How do you handle streaming data where n is massive? (Use size-k heap or reservoir sampling.)
- Can you solve the top-k frequent words variant requiring lexicographic ordering?
Key Takeaways
Counting plus bucket sort is a powerful combination for top-k tasks.
Discuss heap alternative to show flexibility.
Clarify that output order is unspecified unless required otherwise.