Alien Dictionary — Topological Sort on Characters
Compare adjacent words to derive ordering constraints between letters, then perform Kahn’s algorithm.
Problem Statement
Given a sorted list of words in an unknown language, derive one valid ordering of the alphabet or report impossibility.
Why This Problem Matters
- Demonstrates building a dependency graph from lexical observations—a common interview pattern.
- Exercises careful edge-case handling (prefix conflicts) before topological sorting.
- Lets you show mastery of Kahn’s algorithm and reasoning about cycles.
Thought Process
Seed the graph
Every unique character must appear as a vertex even if it has no edges.
Extract precedence rules
Scan consecutive word pairs and connect the first differing characters.
Detect invalid prefix cases
If word1 is longer and word2 is its prefix, the input is impossible.
Run topological sort
Use indegree counts to seed a queue and process characters with zero remaining prerequisites.
Step-by-Step Reasoning
- Initialise adjacency lists and indegree map covering all characters in words.
- For each adjacent pair, find the first differing character and add an edge from word1[j] to word2[j].
- If no difference exists but word1 is longer, return "" immediately.
- Run Kahn’s algorithm; if the resulting ordering length is less than the number of unique letters, a cycle exists.
Dry Run
words = ["wrt","wrf"]
First difference at column 2 yields edge t → f.
Process queue [w, e, r, t]
Kahn’s algorithm emits characters as indegrees drop, ultimately producing "wertf".
Complexity Analysis
Time
O(C + E)
Space
O(C + E)
Why
C distinct characters, E precedence edges; each is processed once.
Annotated Solution
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
public class AlienDictionary {
public String alienOrder(String[] words) {
Map<Character, List<Character>> adj = new HashMap<>();
Map<Character, Integer> indegree = new HashMap<>();
for (String word : words) {
for (char c : word.toCharArray()) {
adj.putIfAbsent(c, new ArrayList<>());
indegree.putIfAbsent(c, 0);
}
}
for (int i = 0; i < words.length - 1; i += 1) {
String first = words[i];
String second = words[i + 1];
if (first.length() > second.length() && first.startsWith(second)) {
return "";
}
int min = Math.min(first.length(), second.length());
for (int j = 0; j < min; j += 1) {
char a = first.charAt(j);
char b = second.charAt(j);
if (a != b) {
adj.get(a).add(b);
indegree.put(b, indegree.get(b) + 1);
break;
}
}
}
Deque<Character> queue = new ArrayDeque<>();
for (Map.Entry<Character, Integer> entry : indegree.entrySet()) {
if (entry.getValue() == 0) {
queue.offer(entry.getKey());
}
}
StringBuilder order = new StringBuilder();
while (!queue.isEmpty()) {
char c = queue.poll();
order.append(c);
for (char neighbor : adj.get(c)) {
indegree.put(neighbor, indegree.get(neighbor) - 1);
if (indegree.get(neighbor) == 0) {
queue.offer(neighbor);
}
}
}
return order.length() == indegree.size() ? order.toString() : "";
}
}Edges encode lexical precedence; Kahn’s algorithm emits characters whose prerequisites are satisfied. A length mismatch after the BFS signals a cycle.
Common Pitfalls
- Missing the prefix case causes invalid inputs to produce bogus orderings.
- Failing to initialise zero-indegree characters removes them from the ordering entirely.
- Re-adding duplicate edges inflates indegrees; guard against repeated relations if necessary.
Follow-Up Questions
- How would you detect if multiple valid orderings exist?
- Can you output the lexicographically smallest ordering when many are possible?