LeetCode 3163 - String Compression III
The problem asks us to implement a specialized string compression algorithm. Given an input string word, we are required to build a compressed version by repeatedly taking prefixes consisting of consecutive repeating characters, limited to a maximum length of 9, and appending…
Difficulty: 🟡 Medium
Topics: String
Solution
Problem Understanding
The problem asks us to implement a specialized string compression algorithm. Given an input string word, we are required to build a compressed version by repeatedly taking prefixes consisting of consecutive repeating characters, limited to a maximum length of 9, and appending their length followed by the character to the result string. The process continues until the input string is fully processed.
For example, the string "aaaaaaaaaaaaaabb" contains a long sequence of as. Since the prefix length cannot exceed 9, the first 9 as are taken, resulting in "9a", then the next 5 as form "5a", and the final "bb" becomes "2b".
Key points about the input and expected output:
- The input is a non-empty string with a maximum length of 200,000 characters, which implies we need an algorithm that runs in linear time.
- The input only contains lowercase English letters, so we do not need to handle digits or other characters.
- The output compresses consecutive repeated characters but respects the maximum prefix length of 9.
Important edge cases to note:
- Strings where no character repeats more than once, e.g.,
"abcde": each character should be represented as"1<character>". - Strings with very long sequences of a single character, e.g.,
"aaaaaaaaaa": the compression should split sequences into chunks of at most 9. - The input string can be very long, so any solution must avoid repeated string concatenation that scales poorly. Using a list or buffer for efficient accumulation is critical.
Approaches
The most straightforward way to solve the problem is a brute-force approach: scan the string, find the maximal prefix of the current character, append its length and character to the result, and repeat. This works correctly but may be inefficient if implemented naively using string slicing and repeated concatenation in Python, which could cause O(n^2) performance for very long strings.
The optimal approach is to iterate through the string once using a pointer and a counter to track consecutive characters. Whenever the count reaches 9 or the current character changes, append the current run length and character to the result. This approach is linear in time and uses a list or buffer to efficiently build the output string.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^2) | O(n) | Uses repeated slicing and string concatenation; correct but inefficient for long strings |
| Optimal | O(n) | O(n) | Single-pass iteration with a counter; uses list buffer to build result efficiently |
Algorithm Walkthrough
- Initialize an empty list
resultto accumulate compressed segments. - Initialize a counter
countto 1 to track consecutive characters and start from the first character ofword. - Iterate through the string from the second character to the end.
- For each character, check if it is the same as the previous one. If it is, increment
count. - If
countreaches 9 or the current character is different from the previous one, append the string"{count}{previous_char}"toresultand resetcountto 1. - After finishing the iteration, append the final run of characters to
result. - Join the
resultlist into a single string and return it.
Why it works: The algorithm maintains the invariant that each run of consecutive characters is correctly segmented into chunks of at most 9 characters. The linear scan ensures all characters are processed exactly once, guaranteeing correctness and efficiency.
Python Solution
class Solution:
def compressedString(self, word: str) -> str:
if not word:
return ""
result = []
count = 1
for i in range(1, len(word)):
if word[i] == word[i-1] and count < 9:
count += 1
else:
result.append(f"{count}{word[i-1]}")
count = 1
# Append the last run
result.append(f"{count}{word[-1]}")
return "".join(result)
The Python implementation uses a list result to efficiently accumulate compressed segments. The count variable tracks the length of the current run, and we append to result whenever the run ends or reaches the maximum allowed length of 9. Finally, join produces the compressed string.
Go Solution
func compressedString(word string) string {
if len(word) == 0 {
return ""
}
result := make([]byte, 0, len(word)*2)
count := 1
for i := 1; i < len(word); i++ {
if word[i] == word[i-1] && count < 9 {
count++
} else {
result = append(result, byte('0'+count))
result = append(result, word[i-1])
count = 1
}
}
result = append(result, byte('0'+count))
result = append(result, word[len(word)-1])
return string(result)
}
The Go implementation is similar, but uses a byte slice for efficient accumulation. Conversion from integer count to character uses '0'+count. The slice is preallocated to roughly twice the length of the input for efficiency.
Worked Examples
Example 1: word = "abcde"
| Step | Current Char | Count | Result |
|---|---|---|---|
| 1 | a | 1 | [] |
| 2 | b | 1 (append 1a) | ["1a"] |
| 3 | c | 1 (append 1b) | ["1a", "1b"] |
| 4 | d | 1 (append 1c) | ["1a", "1b", "1c"] |
| 5 | e | 1 (append 1d) | ["1a", "1b", "1c", "1d"] |
| End | e | append 1e | ["1a", "1b", "1c", "1d", "1e"] |
Output: "1a1b1c1d1e"
Example 2: word = "aaaaaaaaaaaaaabb"
| Step | Current Char | Count | Result |
|---|---|---|---|
| 1-9 | a | 9 | [] |
| Append 9a | - | - | ["9a"] |
| 10-14 | a | 5 | ["9a"] |
| Append 5a | - | - | ["9a", "5a"] |
| 15-16 | b | 2 | ["9a", "5a"] |
| Append 2b | - | - | ["9a", "5a", "2b"] |
Output: "9a5a2b"
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Single pass through the string; each character is visited once |
| Space | O(n) | The result string can be at most twice the length of the input in worst case |
The linear time complexity is achieved because we never perform nested loops, and the space complexity is proportional to the output, which is required.
Test Cases
# Provided examples
assert Solution().compressedString("abcde") == "1a1b1c1d1e" # No repeated characters
assert Solution().compressedString("aaaaaaaaaaaaaabb") == "9a5a2b" # Long repeated sequence
# Edge cases
assert Solution().compressedString("a") == "1a" # Single character
assert Solution().compressedString("aa") == "2a" # Two same characters
assert Solution().compressedString("aaaaaaaaaa") == "9a1a" # Sequence exactly 10, split by 9 limit
assert Solution().compressedString("abababab") == "1a1b1a1b1a1b1a1b" # Alternating characters
assert Solution().compressedString("zzzzzzzzzzzzzz") == "9z5z" # Long repeated sequence
| Test | Why |
|---|---|
"abcde" |
Validates handling of non-repeating characters |
"aaaaaaaaaaaaaabb" |
Validates splitting long sequences into 9-length chunks |
"a" |
Validates minimal input |
"aa" |
Validates small repeated sequences |
"aaaaaaaaaa" |
Validates exact split over 9 limit |
"abababab" |
Validates alternating characters |
"zzzzzzzzzzzzzz" |
Validates long repeated sequence handling |
Edge Cases
- Single Character Input: For
word = "a", the result should be"1a". This tests the algorithm’s handling of minimal input without accessing out-of-bounds indices. The implementation correctly appends the final run after iteration. - Sequence Longer Than 9: For
word = "aaaaaaaaaa", we must split the sequence into"9a1a". This tests that the maximum prefix rule is correctly enforced. The counter resets after reaching 9, which the implementation handles in the loop. - Alternating Characters: For
word = "abababab", no character repeats consecutively beyond 1.