LeetCode 408 - Valid Word Abbreviation

In this problem, we are given two strings: - word, the original full word - abbr, a supposed abbreviation of that word We must determine whether abbr is a valid abbreviation of word. An abbreviation works by replacing one or more non-empty substrings with their lengths.

LeetCode Problem 408

Difficulty: 🟢 Easy
Topics: Two Pointers, String

Solution

Problem Understanding

In this problem, we are given two strings:

  • word, the original full word
  • abbr, a supposed abbreviation of that word

We must determine whether abbr is a valid abbreviation of word.

An abbreviation works by replacing one or more non-empty substrings with their lengths. For example, "internationalization" can become "i12iz4n" because:

  • "i" stays the same
  • "12" skips twelve characters
  • "iz" stays the same
  • "4" skips four characters
  • "n" stays the same

The key rules are:

  1. Numbers represent how many characters to skip in word.
  2. Replaced substrings must be non-empty.
  3. Adjacent abbreviated sections are not allowed, because they would merge into a larger number automatically.
  4. Numbers cannot contain leading zeros.

The task is essentially a synchronized scan of both strings. While reading the abbreviation:

  • If we see a letter, it must exactly match the corresponding letter in word.
  • If we see a number, we skip that many characters in word.

At the end, both strings must be completely consumed at the same time.

The constraints are small:

  • word.length <= 20
  • abbr.length <= 10

This means performance is not a major issue, but we should still write a clean and efficient solution.

Several edge cases are important:

  • Numbers with leading zeros such as "01" are invalid.
  • A skip count may move past the end of word.
  • The abbreviation may end before the word does, or vice versa.
  • Consecutive digits must be parsed as one complete number.
  • Single-digit skips like "1" are valid, but "0" is not.

Approaches

Brute Force Approach

A brute-force strategy would attempt to reconstruct every possible expansion of the abbreviation and compare it with the original word.

For example, given "i12iz4n", we could interpret each number as a sequence of wildcard characters, generate all possible replacements, and check whether one of them equals the target word.

This works conceptually because an abbreviation defines exactly which characters are fixed and which are skipped. However, generating all possible expansions quickly becomes inefficient and unnecessarily complicated. Even though the constraints are small, constructing strings repeatedly and exploring combinations introduces avoidable overhead.

Another brute-force variation would explicitly build the expanded version character by character, inserting placeholder symbols for skipped sections. This is simpler but still requires extra space and string construction.

Optimal Approach

The better approach uses two pointers:

  • One pointer traverses word
  • One pointer traverses abbr

The key observation is that we never need to reconstruct the abbreviated string. We only need to verify whether the abbreviation correctly maps onto the word.

While scanning abbr:

  • If the current character is a letter, it must match the current character in word.
  • If the current character is a digit, we parse the entire number and advance the word pointer by that amount.

This works because the abbreviation already tells us exactly how many characters to skip. There is no ambiguity once the number is parsed.

This solution is efficient, simple, and uses constant extra space.

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Builds or reconstructs expanded forms
Optimal O(n + m) O(1) Two-pointer scan without reconstruction

Here:

  • n = len(word)
  • m = len(abbr)

Algorithm Walkthrough

  1. Initialize two pointers:
  • word_index for traversing word
  • abbr_index for traversing abbr
  1. Continue processing while both pointers are within bounds.
  2. If the current character in abbr is a letter:
  • Compare it with the current character in word.
  • If they differ, return False.
  • Otherwise, advance both pointers by one.
  1. If the current character in abbr is a digit:
  • First check whether it is '0'.
  • If it is '0', return False because leading zeros are invalid.
  • Parse the complete number by consuming consecutive digits.
  • Convert that digit sequence into an integer.
  • Advance word_index by that number.
  1. After processing finishes, verify:
  • word_index == len(word)
  • abbr_index == len(abbr)
  1. Return True only if both strings were fully consumed together.

Why it works

The algorithm maintains the invariant that:

  • word_index always points to the next unmatched character in word
  • abbr_index always points to the next unprocessed character in abbr

Whenever we encounter a number, we correctly skip exactly that many characters in word. Whenever we encounter a letter, we ensure it matches directly. Since every character in abbr is processed exactly once and every skipped section is accounted for precisely, the algorithm correctly determines whether the abbreviation is valid.

Python Solution

class Solution:
    def validWordAbbreviation(self, word: str, abbr: str) -> bool:
        word_index = 0
        abbr_index = 0

        while word_index < len(word) and abbr_index < len(abbr):
            if abbr[abbr_index].isalpha():
                if word[word_index] != abbr[abbr_index]:
                    return False

                word_index += 1
                abbr_index += 1
            else:
                if abbr[abbr_index] == '0':
                    return False

                number = 0

                while abbr_index < len(abbr) and abbr[abbr_index].isdigit():
                    number = number * 10 + int(abbr[abbr_index])
                    abbr_index += 1

                word_index += number

        return word_index == len(word) and abbr_index == len(abbr)

The implementation directly follows the two-pointer algorithm.

The variables word_index and abbr_index track the current positions in the two strings. The main loop continues while neither pointer has reached the end.

When the current abbreviation character is alphabetic, the code performs a direct character comparison. If the characters do not match, the abbreviation cannot be valid.

When the current abbreviation character is numeric, the code first rejects leading zeros immediately. Then it parses the entire number using a loop, which is important because multi-digit values such as "12" represent a single skip amount rather than two separate skips.

After parsing the number, the code advances word_index by the skip amount.

Finally, the solution verifies that both strings were completely processed. This prevents cases where one string finishes earlier than the other.

Go Solution

func validWordAbbreviation(word string, abbr string) bool {
	wordIndex := 0
	abbrIndex := 0

	for wordIndex < len(word) && abbrIndex < len(abbr) {
		char := abbr[abbrIndex]

		if char >= 'a' && char <= 'z' {
			if word[wordIndex] != char {
				return false
			}

			wordIndex++
			abbrIndex++
		} else {
			if char == '0' {
				return false
			}

			number := 0

			for abbrIndex < len(abbr) &&
				abbr[abbrIndex] >= '0' &&
				abbr[abbrIndex] <= '9' {

				number = number*10 + int(abbr[abbrIndex]-'0')
				abbrIndex++
			}

			wordIndex += number
		}
	}

	return wordIndex == len(word) && abbrIndex == len(abbr)
}

The Go implementation follows the same logic as the Python version, but there are some language-specific details.

Go strings are indexed as byte slices, so character checks are performed using ASCII comparisons such as:

char >= 'a' && char <= 'z'

Digit parsing is done manually by subtracting '0' from the byte value.

Go does not provide Python-style isdigit() or isalpha() methods directly on characters, so explicit comparisons are used instead.

The problem guarantees that all integers fit within a 32-bit integer, so standard int handling is sufficient.

Worked Examples

Example 1

Input:

word = "internationalization"
abbr = "i12iz4n"
Step word_index abbr_index Current abbr char Action
1 0 0 i Match i with i
2 1 1 1 Parse number 12
3 13 3 i Match i
4 14 4 z Match z
5 15 5 4 Parse number 4
6 19 6 n Match n
7 20 7 end Both strings consumed

Result:

True

Example 2

Input:

word = "apple"
abbr = "a2e"
Step word_index abbr_index Current abbr char Action
1 0 0 a Match a
2 1 1 2 Skip 2 characters
3 3 2 e Compare l with e, mismatch

Result:

False

Complexity Analysis

Measure Complexity Explanation
Time O(n + m) Each character in both strings is processed at most once
Space O(1) Only a few integer variables are used

The algorithm is linear because both pointers move strictly forward. Digits are parsed once, and no backtracking occurs. The solution also uses constant extra memory since it does not build auxiliary data structures or reconstructed strings.

Test Cases

solution = Solution()

assert solution.validWordAbbreviation(
    "internationalization",
    "i12iz4n"
) is True  # provided valid example

assert solution.validWordAbbreviation(
    "apple",
    "a2e"
) is False  # provided invalid example

assert solution.validWordAbbreviation(
    "substitution",
    "s10n"
) is True  # large middle skip

assert solution.validWordAbbreviation(
    "substitution",
    "sub4u4"
) is True  # multiple skips

assert solution.validWordAbbreviation(
    "substitution",
    "12"
) is True  # whole word abbreviated

assert solution.validWordAbbreviation(
    "substitution",
    "s55n"
) is False  # adjacent abbreviations invalid logically

assert solution.validWordAbbreviation(
    "substitution",
    "s010n"
) is False  # leading zero invalid

assert solution.validWordAbbreviation(
    "substitution",
    "s0ubstitution"
) is False  # zero-length abbreviation invalid

assert solution.validWordAbbreviation(
    "word",
    "4"
) is True  # skip entire word

assert solution.validWordAbbreviation(
    "word",
    "5"
) is False  # skip exceeds word length

assert solution.validWordAbbreviation(
    "word",
    "w1r1"
) is False  # incorrect final alignment

assert solution.validWordAbbreviation(
    "abcdef",
    "a2d2"
) is False  # mismatch after skipping

assert solution.validWordAbbreviation(
    "abcdef",
    "a3ef"
) is True  # valid middle abbreviation

assert solution.validWordAbbreviation(
    "a",
    "1"
) is True  # smallest valid full abbreviation

assert solution.validWordAbbreviation(
    "a",
    "0"
) is False  # zero not allowed
Test Why
"internationalization", "i12iz4n" Standard valid abbreviation
"apple", "a2e" Character mismatch after skipping
"substitution", "s10n" Large numeric skip
"substitution", "sub4u4" Multiple abbreviated regions
"substitution", "12" Entire word abbreviated
"substitution", "s55n" Invalid adjacent abbreviation effect
"substitution", "s010n" Leading zero handling
"substitution", "s0ubstitution" Zero-length replacement invalid
"word", "4" Exact full-word skip
"word", "5" Skip beyond string length
"word", "w1r1" Incorrect alignment at end
"abcdef", "a2d2" Mismatch after intermediate skip
"abcdef", "a3ef" Correct mixed abbreviation
"a", "1" Smallest successful case
"a", "0" Smallest invalid numeric case

Edge Cases

One important edge case is leading zeros. An abbreviation like "a01b" might accidentally be treated as skipping one character if the implementation simply parses integers normally. However, the problem explicitly forbids leading zeros. The solution handles this by immediately returning False whenever a numeric segment starts with '0'.

Another tricky case occurs when the skip amount moves beyond the end of the word. For example, "word" with abbreviation "5" attempts to skip five characters even though the word only contains four. Some incorrect implementations might stop early and still return True. This implementation avoids that bug because the final validation requires both pointers to finish exactly at their string lengths.

A third important case is incomplete alignment after processing. Consider "word" and "w1r1". Even though the abbreviation structure looks plausible, the skipped positions do not line up correctly with the remaining letters. The algorithm catches this because every literal character comparison is checked immediately during traversal.

A final subtle case involves parsing multi-digit numbers correctly. For example, "12" must mean skipping twelve characters, not skipping one character twice. The implementation correctly consumes consecutive digits together and constructs the complete integer before advancing the pointer in word.