LeetCode 408 - Valid Word Abbreviation
In this problem, we are given two strings: - word, the original full word - abbr, a supposed abbreviation of that word We must determine whether abbr is a valid abbreviation of word. An abbreviation works by replacing one or more non-empty substrings with their lengths.
Difficulty: 🟢 Easy
Topics: Two Pointers, String
Solution
Problem Understanding
In this problem, we are given two strings:
word, the original full wordabbr, a supposed abbreviation of that word
We must determine whether abbr is a valid abbreviation of word.
An abbreviation works by replacing one or more non-empty substrings with their lengths. For example, "internationalization" can become "i12iz4n" because:
"i"stays the same"12"skips twelve characters"iz"stays the same"4"skips four characters"n"stays the same
The key rules are:
- Numbers represent how many characters to skip in
word. - Replaced substrings must be non-empty.
- Adjacent abbreviated sections are not allowed, because they would merge into a larger number automatically.
- Numbers cannot contain leading zeros.
The task is essentially a synchronized scan of both strings. While reading the abbreviation:
- If we see a letter, it must exactly match the corresponding letter in
word. - If we see a number, we skip that many characters in
word.
At the end, both strings must be completely consumed at the same time.
The constraints are small:
word.length <= 20abbr.length <= 10
This means performance is not a major issue, but we should still write a clean and efficient solution.
Several edge cases are important:
- Numbers with leading zeros such as
"01"are invalid. - A skip count may move past the end of
word. - The abbreviation may end before the word does, or vice versa.
- Consecutive digits must be parsed as one complete number.
- Single-digit skips like
"1"are valid, but"0"is not.
Approaches
Brute Force Approach
A brute-force strategy would attempt to reconstruct every possible expansion of the abbreviation and compare it with the original word.
For example, given "i12iz4n", we could interpret each number as a sequence of wildcard characters, generate all possible replacements, and check whether one of them equals the target word.
This works conceptually because an abbreviation defines exactly which characters are fixed and which are skipped. However, generating all possible expansions quickly becomes inefficient and unnecessarily complicated. Even though the constraints are small, constructing strings repeatedly and exploring combinations introduces avoidable overhead.
Another brute-force variation would explicitly build the expanded version character by character, inserting placeholder symbols for skipped sections. This is simpler but still requires extra space and string construction.
Optimal Approach
The better approach uses two pointers:
- One pointer traverses
word - One pointer traverses
abbr
The key observation is that we never need to reconstruct the abbreviated string. We only need to verify whether the abbreviation correctly maps onto the word.
While scanning abbr:
- If the current character is a letter, it must match the current character in
word. - If the current character is a digit, we parse the entire number and advance the
wordpointer by that amount.
This works because the abbreviation already tells us exactly how many characters to skip. There is no ambiguity once the number is parsed.
This solution is efficient, simple, and uses constant extra space.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n²) | O(n) | Builds or reconstructs expanded forms |
| Optimal | O(n + m) | O(1) | Two-pointer scan without reconstruction |
Here:
n = len(word)m = len(abbr)
Algorithm Walkthrough
- Initialize two pointers:
word_indexfor traversingwordabbr_indexfor traversingabbr
- Continue processing while both pointers are within bounds.
- If the current character in
abbris a letter:
- Compare it with the current character in
word. - If they differ, return
False. - Otherwise, advance both pointers by one.
- If the current character in
abbris a digit:
- First check whether it is
'0'. - If it is
'0', returnFalsebecause leading zeros are invalid. - Parse the complete number by consuming consecutive digits.
- Convert that digit sequence into an integer.
- Advance
word_indexby that number.
- After processing finishes, verify:
word_index == len(word)abbr_index == len(abbr)
- Return
Trueonly if both strings were fully consumed together.
Why it works
The algorithm maintains the invariant that:
word_indexalways points to the next unmatched character inwordabbr_indexalways points to the next unprocessed character inabbr
Whenever we encounter a number, we correctly skip exactly that many characters in word. Whenever we encounter a letter, we ensure it matches directly. Since every character in abbr is processed exactly once and every skipped section is accounted for precisely, the algorithm correctly determines whether the abbreviation is valid.
Python Solution
class Solution:
def validWordAbbreviation(self, word: str, abbr: str) -> bool:
word_index = 0
abbr_index = 0
while word_index < len(word) and abbr_index < len(abbr):
if abbr[abbr_index].isalpha():
if word[word_index] != abbr[abbr_index]:
return False
word_index += 1
abbr_index += 1
else:
if abbr[abbr_index] == '0':
return False
number = 0
while abbr_index < len(abbr) and abbr[abbr_index].isdigit():
number = number * 10 + int(abbr[abbr_index])
abbr_index += 1
word_index += number
return word_index == len(word) and abbr_index == len(abbr)
The implementation directly follows the two-pointer algorithm.
The variables word_index and abbr_index track the current positions in the two strings. The main loop continues while neither pointer has reached the end.
When the current abbreviation character is alphabetic, the code performs a direct character comparison. If the characters do not match, the abbreviation cannot be valid.
When the current abbreviation character is numeric, the code first rejects leading zeros immediately. Then it parses the entire number using a loop, which is important because multi-digit values such as "12" represent a single skip amount rather than two separate skips.
After parsing the number, the code advances word_index by the skip amount.
Finally, the solution verifies that both strings were completely processed. This prevents cases where one string finishes earlier than the other.
Go Solution
func validWordAbbreviation(word string, abbr string) bool {
wordIndex := 0
abbrIndex := 0
for wordIndex < len(word) && abbrIndex < len(abbr) {
char := abbr[abbrIndex]
if char >= 'a' && char <= 'z' {
if word[wordIndex] != char {
return false
}
wordIndex++
abbrIndex++
} else {
if char == '0' {
return false
}
number := 0
for abbrIndex < len(abbr) &&
abbr[abbrIndex] >= '0' &&
abbr[abbrIndex] <= '9' {
number = number*10 + int(abbr[abbrIndex]-'0')
abbrIndex++
}
wordIndex += number
}
}
return wordIndex == len(word) && abbrIndex == len(abbr)
}
The Go implementation follows the same logic as the Python version, but there are some language-specific details.
Go strings are indexed as byte slices, so character checks are performed using ASCII comparisons such as:
char >= 'a' && char <= 'z'
Digit parsing is done manually by subtracting '0' from the byte value.
Go does not provide Python-style isdigit() or isalpha() methods directly on characters, so explicit comparisons are used instead.
The problem guarantees that all integers fit within a 32-bit integer, so standard int handling is sufficient.
Worked Examples
Example 1
Input:
word = "internationalization"
abbr = "i12iz4n"
| Step | word_index | abbr_index | Current abbr char | Action |
|---|---|---|---|---|
| 1 | 0 | 0 | i | Match i with i |
| 2 | 1 | 1 | 1 | Parse number 12 |
| 3 | 13 | 3 | i | Match i |
| 4 | 14 | 4 | z | Match z |
| 5 | 15 | 5 | 4 | Parse number 4 |
| 6 | 19 | 6 | n | Match n |
| 7 | 20 | 7 | end | Both strings consumed |
Result:
True
Example 2
Input:
word = "apple"
abbr = "a2e"
| Step | word_index | abbr_index | Current abbr char | Action |
|---|---|---|---|---|
| 1 | 0 | 0 | a | Match a |
| 2 | 1 | 1 | 2 | Skip 2 characters |
| 3 | 3 | 2 | e | Compare l with e, mismatch |
Result:
False
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n + m) | Each character in both strings is processed at most once |
| Space | O(1) | Only a few integer variables are used |
The algorithm is linear because both pointers move strictly forward. Digits are parsed once, and no backtracking occurs. The solution also uses constant extra memory since it does not build auxiliary data structures or reconstructed strings.
Test Cases
solution = Solution()
assert solution.validWordAbbreviation(
"internationalization",
"i12iz4n"
) is True # provided valid example
assert solution.validWordAbbreviation(
"apple",
"a2e"
) is False # provided invalid example
assert solution.validWordAbbreviation(
"substitution",
"s10n"
) is True # large middle skip
assert solution.validWordAbbreviation(
"substitution",
"sub4u4"
) is True # multiple skips
assert solution.validWordAbbreviation(
"substitution",
"12"
) is True # whole word abbreviated
assert solution.validWordAbbreviation(
"substitution",
"s55n"
) is False # adjacent abbreviations invalid logically
assert solution.validWordAbbreviation(
"substitution",
"s010n"
) is False # leading zero invalid
assert solution.validWordAbbreviation(
"substitution",
"s0ubstitution"
) is False # zero-length abbreviation invalid
assert solution.validWordAbbreviation(
"word",
"4"
) is True # skip entire word
assert solution.validWordAbbreviation(
"word",
"5"
) is False # skip exceeds word length
assert solution.validWordAbbreviation(
"word",
"w1r1"
) is False # incorrect final alignment
assert solution.validWordAbbreviation(
"abcdef",
"a2d2"
) is False # mismatch after skipping
assert solution.validWordAbbreviation(
"abcdef",
"a3ef"
) is True # valid middle abbreviation
assert solution.validWordAbbreviation(
"a",
"1"
) is True # smallest valid full abbreviation
assert solution.validWordAbbreviation(
"a",
"0"
) is False # zero not allowed
| Test | Why |
|---|---|
"internationalization", "i12iz4n" |
Standard valid abbreviation |
"apple", "a2e" |
Character mismatch after skipping |
"substitution", "s10n" |
Large numeric skip |
"substitution", "sub4u4" |
Multiple abbreviated regions |
"substitution", "12" |
Entire word abbreviated |
"substitution", "s55n" |
Invalid adjacent abbreviation effect |
"substitution", "s010n" |
Leading zero handling |
"substitution", "s0ubstitution" |
Zero-length replacement invalid |
"word", "4" |
Exact full-word skip |
"word", "5" |
Skip beyond string length |
"word", "w1r1" |
Incorrect alignment at end |
"abcdef", "a2d2" |
Mismatch after intermediate skip |
"abcdef", "a3ef" |
Correct mixed abbreviation |
"a", "1" |
Smallest successful case |
"a", "0" |
Smallest invalid numeric case |
Edge Cases
One important edge case is leading zeros. An abbreviation like "a01b" might accidentally be treated as skipping one character if the implementation simply parses integers normally. However, the problem explicitly forbids leading zeros. The solution handles this by immediately returning False whenever a numeric segment starts with '0'.
Another tricky case occurs when the skip amount moves beyond the end of the word. For example, "word" with abbreviation "5" attempts to skip five characters even though the word only contains four. Some incorrect implementations might stop early and still return True. This implementation avoids that bug because the final validation requires both pointers to finish exactly at their string lengths.
A third important case is incomplete alignment after processing. Consider "word" and "w1r1". Even though the abbreviation structure looks plausible, the skipped positions do not line up correctly with the remaining letters. The algorithm catches this because every literal character comparison is checked immediately during traversal.
A final subtle case involves parsing multi-digit numbers correctly. For example, "12" must mean skipping twelve characters, not skipping one character twice. The implementation correctly consumes consecutive digits together and constructs the complete integer before advancing the pointer in word.