For Top K Frequent Words, the core move is to scan the array once, count occurrences with a hash map, then rank words by higher frequency and smaller lexicographical order. The interview trap is not the counting step but the comparator: ties must place alphabetically smaller words first. You can solve it cleanly with full sorting or use a heap when you want to cap work around the top k results.
Problem Statement
You are given a list of lowercase words and an integer k. Your job is to return the k words that appear most often in the array. The result cannot be in arbitrary order: words with larger counts must come first, and when two words have the same count, the alphabetically smaller word must appear earlier.
A quick check is words = ["i","love","leetcode","i","love","coding"] with k = 2. Both "i" and "love" appear twice, while the other words appear once, so the answer is ["i","love"] because the tie is broken by lexicographical order. That tie rule is the part that usually breaks otherwise correct Top K Frequent Words solutions.
Examples
Example 1
Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
"i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
"the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively.
Constraints
- 1 <= words.length <= 500
- 1 <= words[i].length <= 10
- words[i] consists of lowercase English letters.
- k is in the range [1, The number of unique words[i]]
Solution Approach
Count frequencies with one array pass
Start with the problem's simplest pattern: scan the words array once and store each word's frequency in a hash map. This gives direct lookup for every repeated word and turns the rest of the problem into ranking unique words instead of repeatedly rescanning the full array.
Sort by frequency first, lexicographical order second
Take the unique words and sort them with a comparator that puts larger counts first. When two counts match, compare the strings directly so the alphabetically smaller word comes first. In Top K Frequent Words, this comparator is the real logic, because a reversed tie-break silently returns the wrong answer even when the counts are correct.
Use a heap when you want to focus on top k
A heap approach is also valid after counting frequencies. Maintain ordering so lower-priority entries are removed first, which means you must be extra careful because the heap comparator often looks inverted compared with the final output order. This version helps when discussing top-k extraction, but for LeetCode 692 many candidates choose sorting because the tie rule is easier to express correctly.
Complexity Analysis
| Metric | Value |
|---|---|
| Time | Depends on the final approach |
| Space | Depends on the final approach |
Let u be the number of unique words. Counting takes O(n) time with O(u) space. A full sort of unique words costs O(u log u), so the total is O(n + u log u) time and O(u) space. With a heap, a common version runs in O(n + u log k) time with O(u) counting space plus O(k) heap space, but the comparator is easier to get wrong because lexicographical ties must be reversed correctly inside removal order.
What Interviewers Usually Probe
- They want to see whether you separate counting from ranking instead of trying to compare words during the initial scan.
- They are checking whether you implement the tie-break exactly: same frequency means lexicographically smaller word first.
- They may ask for a heap follow-up to test whether you understand why heap ordering for Top K Frequent Words can look opposite from final output order.
Common Pitfalls or Variants
Common pitfalls
- Sorting tied words in descending alphabetical order, which breaks examples like "i" versus "love" immediately.
- Building a min-heap with the same comparator as the final answer, then popping the wrong word when frequencies tie.
- Forgetting that only unique words should be ranked after counting, which leads to unnecessary repeated comparisons across duplicates.
Follow-up variants
- Return the top k frequent numbers instead of words, which removes the lexicographical tie-break but keeps counting plus ranking.
- Return all words grouped by frequency, which shifts the task toward bucket organization after the hash count.
- Stream words one by one and keep the current top k, which turns Top K Frequent Words into an incremental heap maintenance problem.
How GhostInterview Helps
- GhostInterview surfaces the exact comparator you need for Top K Frequent Words so the frequency rule and alphabetical tie-break stay aligned.
- GhostInterview catches reversed heap tie logic, a common failure mode where the final ranking looks almost right but fails on equal counts.
- GhostInterview keeps the solution focused on array scanning plus hash lookup first, then adds the right ranking step instead of overcomplicating the count phase.
Topic Pages
Related GhostInterview Pages
- LeetCode Interview Copilot - Use GhostInterview as a live solver when you want direct help with LeetCode-style coding questions.
- Coding Interview Assistant - See how GhostInterview supports array, string, linked list, graph, and tree interview workflows.
- How GhostInterview Works - Review the screenshot, reasoning, and answer flow before using the solver in a live interview.
FAQ
What is the best approach for Top K Frequent Words?
The most direct approach is to count each word with a hash map, then sort the unique words by descending frequency and ascending lexicographical order. It is easy to explain, easy to verify on examples, and makes the tie-break rule explicit.
Why does lexicographical order matter in this problem?
Because Top K Frequent Words does not allow arbitrary ordering when counts match. If two words appear the same number of times, the alphabetically smaller one must come first, so your comparator must encode that rule exactly.
Can I use a heap instead of sorting?
Yes. After counting, you can keep a heap of size k to track the best candidates. The tricky part is that the heap comparator often needs to treat lexicographically larger tied words as lower priority so they get removed first.
What pattern should I recognize here?
Recognize array scanning plus hash lookup, followed by ranking of unique keys. The scan builds frequencies in O(n), and the second phase decides the top k with either sorting or a heap.
Why do correct counts still lead to wrong answers sometimes?
Because the failure is often in the ordering step, not the counting step. A solution can count every word perfectly and still fail if the tie-break comparator is reversed for sorting or heap removal.
Need direct help with Top K Frequent Words instead of spending more time grinding it?
Download GhostInterview when you want a LeetCode solver, not another long practice loop. Capture Top K Frequent Words from a screenshot, get the answer path and complexity, and use supported stealth workflows that stay outside captured layers.
Capture the prompt fast instead of rewriting the problem by hand.
Get the solution path, trade-offs, and complexity summary in one pass.
Stay outside captured layers on supported screen-share workflows.
Stay in the same pattern family
Sort Characters By Frequency requires counting characters efficiently and rearranging a string in descending frequency order for clarity.
Open problem page#347 Top K Frequent ElementsFind the k most frequent elements from an array using efficient algorithms like hashing and sorting.
Open problem page#720 Longest Word in DictionaryFind the longest word in a dictionary that can be built one character at a time from other words in the dictionary.
Open problem page