LeetCode 423: Reconstruct Original Digits from English

A clear explanation of reconstructing digits from shuffled English words using character frequency counts and unique identifying letters.

Problem Restatement

We are given a string s.

The string contains shuffled letters from English digit words:

"zero", "one", "two", "three", "four",
"five", "six", "seven", "eight", "nine"

The letters are out of order.

We must reconstruct the original digits and return them in ascending order.

The input is guaranteed to be valid. The constraints allow s.length up to 10^5. The source examples include "owoztneoer" -> "012" and "fviefuro" -> "45".

Input and Output

Item Meaning
Input A shuffled string of letters from digit words
Output Digits in ascending order
Digits 0 through 9
Guarantee The input can be reconstructed into valid digit words
Return type String

Example function shape:

def originalDigits(s: str) -> str:
    ...

Examples

Example 1:

s = "owoztneoer"

The answer is:

"012"

The letters can be rearranged into:

zero one two

So the digits are:

0 1 2

Returned in ascending order:

"012"

Example 2:

s = "fviefuro"

The answer is:

"45"

The letters can be rearranged into:

four five

So the digits are:

4 5

First Thought: Try All Digit Combinations

A brute force approach would try to decide how many times each digit appears, then check whether the digit words produce exactly the letters in s.

But there can be many letters, up to 10^5.

Trying combinations is unnecessary.

The digit words have useful identifying letters.

Key Insight

Some letters appear in only one digit word.

Unique letter Digit word Digit
z "zero" 0
w "two" 2
u "four" 4
x "six" 6
g "eight" 8

So if the string contains z three times, then digit 0 appears three times.

After identifying these digits, other digits become identifiable.

For example:

Letter Digit word Reason
h "three" after removing "eight"
f "five" after removing "four"
s "seven" after removing "six"
o "one" after removing "zero", "two", "four"
i "nine" after removing "five", "six", "eight"

This gives a fixed order for reconstruction.

Algorithm

Count all characters in s.

Then compute digit counts:

count[0] = freq["z"]
count[2] = freq["w"]
count[4] = freq["u"]
count[6] = freq["x"]
count[8] = freq["g"]

Then derive the remaining digits:

count[3] = freq["h"] - count[8]
count[5] = freq["f"] - count[4]
count[7] = freq["s"] - count[6]
count[1] = freq["o"] - count[0] - count[2] - count[4]
count[9] = freq["i"] - count[5] - count[6] - count[8]

Finally, build the answer by appending each digit count[d] times from 0 to 9.

Correctness

The letters z, w, u, x, and g appear uniquely in the words for digits 0, 2, 4, 6, and 8. Therefore, their frequencies exactly determine the counts of those digits.

After those digits are known, the remaining formulas subtract already-accounted words.

For digit 3, the letter h appears in "three" and "eight". Since the count of 8 is already known, freq["h"] - count[8] gives the count of 3.

For digit 5, the letter f appears in "five" and "four". Since 4 is already known, freq["f"] - count[4] gives the count of 5.

For digit 7, the letter s appears in "seven" and "six". Since 6 is already known, freq["s"] - count[6] gives the count of 7.

For digit 1, the letter o appears in "zero", "one", "two", and "four". After subtracting 0, 2, and 4, the remaining o letters belong to 1.

For digit 9, the letter i appears in "five", "six", "eight", and "nine". After subtracting 5, 6, and 8, the remaining i letters belong to 9.

Thus every digit count is computed exactly. Building the output from 0 to 9 returns the digits in ascending order.

Complexity

Metric Value Why
Time O(n) We count each character and build the output
Space O(1) There are only 26 lowercase letters and 10 digit counts

Here n is the length of s.

Implementation

from collections import Counter

class Solution:
    def originalDigits(self, s: str) -> str:
        freq = Counter(s)
        count = [0] * 10

        count[0] = freq["z"]
        count[2] = freq["w"]
        count[4] = freq["u"]
        count[6] = freq["x"]
        count[8] = freq["g"]

        count[3] = freq["h"] - count[8]
        count[5] = freq["f"] - count[4]
        count[7] = freq["s"] - count[6]

        count[1] = freq["o"] - count[0] - count[2] - count[4]
        count[9] = freq["i"] - count[5] - count[6] - count[8]

        answer = []

        for digit in range(10):
            answer.append(str(digit) * count[digit])

        return "".join(answer)

Code Explanation

We first count letters:

freq = Counter(s)

Then we store how many times each digit appears:

count = [0] * 10

The first group uses letters unique to one digit word:

count[0] = freq["z"]
count[2] = freq["w"]
count[4] = freq["u"]
count[6] = freq["x"]
count[8] = freq["g"]

Then we compute digits that become identifiable after subtracting known digits:

count[3] = freq["h"] - count[8]
count[5] = freq["f"] - count[4]
count[7] = freq["s"] - count[6]

Finally:

count[1] = freq["o"] - count[0] - count[2] - count[4]
count[9] = freq["i"] - count[5] - count[6] - count[8]

constructs the remaining two digits.

To return ascending order, we append digits from 0 through 9:

for digit in range(10):
    answer.append(str(digit) * count[digit])

If count[2] == 3, this appends:

"222"

Then we join all pieces:

return "".join(answer)

Testing

def test_original_digits():
    s = Solution()

    assert s.originalDigits("owoztneoer") == "012"
    assert s.originalDigits("fviefuro") == "45"
    assert s.originalDigits("zero") == "0"
    assert s.originalDigits("nnei") == "9"
    assert s.originalDigits("zerozero") == "00"
    assert s.originalDigits("zeroonetwothreefourfivesixseveneightnine") == "0123456789"
    assert s.originalDigits("xisowt") == "26"

    print("all tests passed")

Test Notes

Test Why
"owoztneoer" Standard example with 0, 1, 2
"fviefuro" Standard example with 4, 5
"zero" Single uniquely identified digit
"nnei" Digit 9 after deduction
"zerozero" Repeated digit
All digit words Checks every digit count
"xisowt" Shuffled six and two