LeetCode 937 - Reorder Data in Log Files
In this problem, we are given a list of log strings. Each log contains an identifier followed by one or more words separated by spaces. The first token is always the identifier, and everything after it represents the content of the log.
Difficulty: 🟡 Medium
Topics: Array, String, Sorting
Solution
Problem Understanding
In this problem, we are given a list of log strings. Each log contains an identifier followed by one or more words separated by spaces. The first token is always the identifier, and everything after it represents the content of the log.
The logs are divided into two categories:
- Letter-logs, where every word after the identifier contains only lowercase English letters.
- Digit-logs, where every word after the identifier contains only digits.
The goal is to reorder the logs according to three rules:
- All letter-logs must appear before all digit-logs.
- Letter-logs must be sorted lexicographically by their content.
- If two letter-logs have identical content, then they must be sorted by identifier.
- Digit-logs must preserve their original relative order.
The input is an array of strings, and the output should be a reordered array of strings that satisfies all these requirements.
The constraints are fairly small:
- At most 100 logs
- Each log length is at most 100 characters
These limits mean performance is not extremely critical, but we still want a clean and efficient solution. Since sorting is involved, an O(n log n) solution is ideal and completely sufficient.
There are several important edge cases to keep in mind:
- Multiple letter-logs can have identical content, requiring identifier comparison.
- All logs might already be digit-logs.
- All logs might already be letter-logs.
- Digit-logs must remain in original order, so the sorting process must preserve stability for them.
- Content comparison should ignore identifiers initially, because ordering is based primarily on content.
Approaches
Brute Force Approach
A brute-force approach would manually separate letter-logs and digit-logs, then repeatedly compare and reorder letter-logs using a custom sorting routine such as bubble sort.
The algorithm would:
- Traverse all logs.
- Split them into two arrays:
- letterLogs
- digitLogs
- Use nested loops to compare every pair of letter-logs.
- Compare contents first, then identifiers if needed.
- Append digit-logs afterward.
This works because it directly follows the problem rules. However, manually sorting with nested loops leads to O(n²) time complexity, which is unnecessary because modern languages already provide efficient sorting algorithms.
Optimal Approach
The key observation is that the ordering rules can be represented cleanly using a custom sorting key.
For each log:
-
If it is a letter-log, we sort it using:
-
content
-
identifier
-
If it is a digit-log, we place it after all letter-logs while preserving original order.
Stable sorting becomes important here. Python’s built-in sort is stable, and Go’s sort.SliceStable also preserves order for equal elements.
Instead of manually rearranging logs, we define a comparison rule that naturally enforces all requirements.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Manual comparison and swapping |
| Optimal | O(n log n) | O(n) | Uses stable sorting with custom ordering |
Algorithm Walkthrough
- Iterate through each log and split it into two parts:
- identifier
- content
- Determine whether the log is a letter-log or digit-log.
This can be done by checking the first character of the content:
- If it is a digit, the log is a digit-log.
- Otherwise, it is a letter-log.
- Define a sorting key for each log.
For letter-logs:
- Primary key: content
- Secondary key: identifier
For digit-logs:
- They should appear after all letter-logs.
- Their relative order must remain unchanged.
- Use a stable sorting algorithm.
Stability matters because digit-logs should not be rearranged relative to each other. 5. Return the sorted logs array.
Why it works
The algorithm works because the custom ordering exactly matches the problem specification.
- Letter-logs are sorted lexicographically by content.
- Identifier comparison resolves ties.
- Digit-logs are always placed after letter-logs.
- Stable sorting guarantees digit-logs keep their original ordering.
Since every rule is encoded directly into the sorting logic, the final ordering is guaranteed to be correct.
Python Solution
from typing import List
class Solution:
def reorderLogFiles(self, logs: List[str]) -> List[str]:
def sort_key(log: str):
identifier, content = log.split(" ", 1)
# Letter-log
if content[0].isalpha():
return (0, content, identifier)
# Digit-log
return (1,)
return sorted(logs, key=sort_key)
The implementation defines a helper function called sort_key.
Each log is split once into:
identifiercontent
Using split(" ", 1) is important because we only want the first space to separate the identifier from the rest of the log.
If the content begins with a letter, we return a tuple:
(0, content, identifier)
The leading 0 ensures all letter-logs come before digit-logs.
For digit-logs, we return:
(1,)
Since Python sorting is stable, digit-logs maintain their original relative ordering automatically.
Finally, we return the sorted result directly.
Go Solution
package main
import (
"sort"
"strings"
)
func reorderLogFiles(logs []string) []string {
sort.SliceStable(logs, func(i, j int) bool {
id1, content1 := splitLog(logs[i])
id2, content2 := splitLog(logs[j])
isDigit1 := isDigit(content1[0])
isDigit2 := isDigit(content2[0])
// Both are letter-logs
if !isDigit1 && !isDigit2 {
if content1 == content2 {
return id1 < id2
}
return content1 < content2
}
// One is letter-log, one is digit-log
if !isDigit1 && isDigit2 {
return true
}
if isDigit1 && !isDigit2 {
return false
}
// Both are digit-logs, preserve order
return false
})
return logs
}
func splitLog(log string) (string, string) {
parts := strings.SplitN(log, " ", 2)
return parts[0], parts[1]
}
func isDigit(ch byte) bool {
return ch >= '0' && ch <= '9'
}
The Go solution uses sort.SliceStable, which is critical because digit-logs must preserve their relative order.
Unlike Python, Go requires an explicit comparator function instead of a sorting key tuple.
The comparator handles four cases:
- Both logs are letter-logs
- First is letter-log, second is digit-log
- First is digit-log, second is letter-log
- Both are digit-logs
Returning false when both are digit-logs preserves their original ordering because the sort is stable.
Worked Examples
Example 1
Input:
["dig1 8 1 5 1",
"let1 art can",
"dig2 3 6",
"let2 own kit dig",
"let3 art zero"]
Step 1: Classify Logs
| Log | Type | Identifier | Content |
|---|---|---|---|
| dig1 8 1 5 1 | Digit | dig1 | 8 1 5 1 |
| let1 art can | Letter | let1 | art can |
| dig2 3 6 | Digit | dig2 | 3 6 |
| let2 own kit dig | Letter | let2 | own kit dig |
| let3 art zero | Letter | let3 | art zero |
Step 2: Sort Letter-Logs
Sorted by content:
| Content | Identifier |
|---|---|
| art can | let1 |
| art zero | let3 |
| own kit dig | let2 |
Result:
["let1 art can",
"let3 art zero",
"let2 own kit dig"]
Step 3: Append Digit-Logs
Digit-logs remain:
["dig1 8 1 5 1",
"dig2 3 6"]
Final result:
["let1 art can",
"let3 art zero",
"let2 own kit dig",
"dig1 8 1 5 1",
"dig2 3 6"]
Worked Example 2
Input:
["a1 9 2 3 1",
"g1 act car",
"zo4 4 7",
"ab1 off key dog",
"a8 act zoo"]
Step 1: Classify Logs
| Log | Type | Content |
|---|---|---|
| a1 9 2 3 1 | Digit | 9 2 3 1 |
| g1 act car | Letter | act car |
| zo4 4 7 | Digit | 4 7 |
| ab1 off key dog | Letter | off key dog |
| a8 act zoo | Letter | act zoo |
Step 2: Sort Letter-Logs
| Content | Identifier |
|---|---|
| act car | g1 |
| act zoo | a8 |
| off key dog | ab1 |
Sorted letter-logs:
["g1 act car",
"a8 act zoo",
"ab1 off key dog"]
Step 3: Append Digit-Logs
["a1 9 2 3 1",
"zo4 4 7"]
Final result:
["g1 act car",
"a8 act zoo",
"ab1 off key dog",
"a1 9 2 3 1",
"zo4 4 7"]
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n log n) | Sorting dominates runtime |
| Space | O(n) | Sorting and tuple storage require extra space |
The dominant operation is sorting the logs. Since there are n logs, sorting requires O(n log n) comparisons.
Each comparison may inspect parts of the strings, but log lengths are bounded by 100, so this remains efficient.
The extra space comes from sorting internals and temporary structures used during sorting.
Test Cases
from typing import List
class Solution:
def reorderLogFiles(self, logs: List[str]) -> List[str]:
def sort_key(log: str):
identifier, content = log.split(" ", 1)
if content[0].isalpha():
return (0, content, identifier)
return (1,)
return sorted(logs, key=sort_key)
sol = Solution()
# Example 1
assert sol.reorderLogFiles([
"dig1 8 1 5 1",
"let1 art can",
"dig2 3 6",
"let2 own kit dig",
"let3 art zero"
]) == [
"let1 art can",
"let3 art zero",
"let2 own kit dig",
"dig1 8 1 5 1",
"dig2 3 6"
]
# Example 2
assert sol.reorderLogFiles([
"a1 9 2 3 1",
"g1 act car",
"zo4 4 7",
"ab1 off key dog",
"a8 act zoo"
]) == [
"g1 act car",
"a8 act zoo",
"ab1 off key dog",
"a1 9 2 3 1",
"zo4 4 7"
]
# All digit-logs, order should remain unchanged
assert sol.reorderLogFiles([
"dig1 3 4",
"dig2 1 2",
"dig3 5 6"
]) == [
"dig1 3 4",
"dig2 1 2",
"dig3 5 6"
]
# All letter-logs
assert sol.reorderLogFiles([
"let2 own kit",
"let1 art can",
"let3 zoo"
]) == [
"let1 art can",
"let2 own kit",
"let3 zoo"
]
# Same content, tie resolved by identifier
assert sol.reorderLogFiles([
"b1 act car",
"a1 act car"
]) == [
"a1 act car",
"b1 act car"
]
# Single log
assert sol.reorderLogFiles([
"let1 abc"
]) == [
"let1 abc"
]
# Mixed logs with repeated digit-logs
assert sol.reorderLogFiles([
"dig1 1 2",
"let1 abc",
"dig2 3 4",
"let2 aaa"
]) == [
"let2 aaa",
"let1 abc",
"dig1 1 2",
"dig2 3 4"
]
print("All tests passed.")
| Test | Why |
|---|---|
| Example 1 | Validates standard mixed ordering |
| Example 2 | Validates lexicographic sorting |
| All digit-logs | Ensures stable ordering |
| All letter-logs | Ensures normal sorting works |
| Same content | Verifies identifier tie-breaking |
| Single log | Smallest valid input |
| Mixed repeated digit-logs | Confirms digit-log stability |
Edge Cases
Multiple Letter-Logs With Identical Content
A common bug occurs when two letter-logs share the same content. If the implementation sorts only by content, the final ordering becomes incorrect.
For example:
["b1 act car", "a1 act car"]
Both contents are identical, so the identifiers must determine ordering. The implementation handles this by using (content, identifier) as the sorting key.
All Logs Are Digit-Logs
If every log is a digit-log, no reordering should happen.
A non-stable sort could accidentally rearrange the logs even though the problem explicitly requires preserving their relative order.
The implementation avoids this issue because:
- Python sorting is stable.
- Go uses
sort.SliceStable.
Logs With Very Similar Prefixes
Logs can share similar starting words but differ later:
["let1 art can",
"let2 art zero"]
A naive implementation might compare only the first word after the identifier, which would produce incorrect ordering.
The solution compares the entire content string lexicographically, ensuring complete and correct ordering across all words.