LeetCode 3039 - Apply Operations to Make String Empty

The problem gives us a lowercase string s and defines a repeated operation. During one operation, we scan through all letters from 'a' to 'z'. For each letter, if that letter appears in the current string, we remove its first occurrence.

LeetCode Problem 3039

Difficulty: 🟡 Medium
Topics: Array, Hash Table, Sorting, Counting

Solution

LeetCode 3039 - Apply Operations to Make String Empty

Problem Understanding

The problem gives us a lowercase string s and defines a repeated operation.

During one operation, we scan through all letters from 'a' to 'z'. For each letter, if that letter appears in the current string, we remove its first occurrence.

This process continues until the string becomes empty.

Instead of returning the final empty string, we must return the string that exists immediately before the final operation is performed.

To understand what this means, consider the example:

s = "aabcbbca"

The first operation removes the first occurrence of every distinct character currently present:

aabcbbca -> abbca

The second operation removes the first occurrence of every remaining character:

abbca -> ba

The third operation removes the first occurrence of a and b:

ba -> ""

The string right before the final operation is "ba", so that is the answer.

The input length can be as large as:

5 * 10^5

This immediately tells us that repeatedly simulating deletions would be too expensive. We need a solution that processes the string in linear time.

Several edge cases are important:

  • A string containing only one character, such as "a".
  • A string where every character is unique, such as "abcd".
  • A string where all characters are identical, such as "aaaaa".
  • Characters appearing with different frequencies.
  • Very large inputs near the constraint limit.

Understanding what survives until the last round is the key to solving the problem efficiently.

Approaches

Brute Force

A direct simulation follows the problem statement exactly.

For each round:

  1. Find the first occurrence of every character.
  2. Remove those positions.
  3. Build the next string.
  4. Repeat until the string becomes empty.

The last non-empty string encountered is the answer.

This approach is correct because it literally performs the required operations. However, each round may scan a large portion of the string, and there can be many rounds. For a string of length n, the total work can approach O(n²).

With n = 500,000, this is far too slow.

Key Insight

Instead of simulating deletions, consider how many rounds each character survives.

Suppose character c appears freq[c] times.

During every operation, at most one occurrence of c is removed. Therefore:

  • The first occurrence disappears in round 1.
  • The second occurrence disappears in round 2.
  • The third occurrence disappears in round 3.
  • And so on.

If a character appears k times, its last occurrence survives until round k.

Let:

maxFreq = maximum frequency among all characters

The final operation occurs during round maxFreq.

Only characters whose frequency equals maxFreq can still exist immediately before that final operation.

Now consider which occurrence survives until round maxFreq.

For a character appearing maxFreq times, its surviving occurrence is its last occurrence in the original string.

Therefore:

  1. Find the maximum frequency.
  2. Identify all characters whose frequency equals that maximum.
  3. Take their last occurrence positions.
  4. Sort those positions by original order.
  5. Build the answer from the corresponding characters.

This produces exactly the string present before the final deletion round.

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(n²) O(n) Repeatedly rebuilds strings after every operation
Optimal O(n) O(1) or O(26) Uses frequency counting and last occurrence tracking

Algorithm Walkthrough

  1. Create an array of size 26 to count the frequency of every lowercase letter.
  2. Traverse the string once and increment the frequency for each character.
  3. Compute the maximum frequency among all characters.
  4. Traverse the string again and record the last position of every character.
  5. Traverse all 26 letters and select only those whose frequency equals the maximum frequency.
  6. For every selected character, collect:
  • Its last occurrence index.
  • The character itself.
  1. Sort the collected entries by their last occurrence index.
  2. Build the answer string by appending the characters in sorted order.
  3. Return the resulting string.

Why it works

A character appearing k times loses exactly one occurrence per round, so its final remaining copy disappears during round k. The last operation occurs during round maxFreq, where maxFreq is the largest frequency in the string.

Therefore, immediately before that final operation, the only surviving characters are those whose frequency equals maxFreq. For each such character, the surviving copy is its last occurrence in the original string. Ordering these surviving copies by their original positions reconstructs exactly the string that exists before the final operation.

Python Solution

class Solution:
    def lastNonEmptyString(self, s: str) -> str:
        freq = [0] * 26

        for ch in s:
            freq[ord(ch) - ord('a')] += 1

        max_freq = max(freq)

        last_pos = [-1] * 26
        for i, ch in enumerate(s):
            last_pos[ord(ch) - ord('a')] = i

        survivors = []

        for i in range(26):
            if freq[i] == max_freq:
                survivors.append((last_pos[i], chr(i + ord('a'))))

        survivors.sort()

        return "".join(ch for _, ch in survivors)

The implementation starts by counting the frequency of every character. This allows us to determine the maximum frequency, which corresponds to the round in which the final operation occurs.

Next, we record the last occurrence position of every character. These positions identify which copies survive until the final round.

After that, we examine all 26 letters and keep only those whose frequency equals the maximum frequency. For each qualifying character, we store its last position and the character itself.

Sorting by last position restores the order in which those surviving characters appear in the final non-empty string. Finally, we concatenate them and return the result.

Go Solution

func lastNonEmptyString(s string) string {
	freq := make([]int, 26)

	for _, ch := range s {
		freq[ch-'a']++
	}

	maxFreq := 0
	for _, f := range freq {
		if f > maxFreq {
			maxFreq = f
		}
	}

	lastPos := make([]int, 26)
	for i := range lastPos {
		lastPos[i] = -1
	}

	for i, ch := range s {
		lastPos[ch-'a'] = i
	}

	type pair struct {
		pos int
		ch  byte
	}

	var survivors []pair

	for i := 0; i < 26; i++ {
		if freq[i] == maxFreq {
			survivors = append(survivors, pair{
				pos: lastPos[i],
				ch:  byte('a' + i),
			})
		}
	}

	sort.Slice(survivors, func(i, j int) bool {
		return survivors[i].pos < survivors[j].pos
	})

	result := make([]byte, 0, len(survivors))
	for _, p := range survivors {
		result = append(result, p.ch)
	}

	return string(result)
}

The Go implementation follows exactly the same logic as the Python version. A fixed-size frequency array and last-position array are used because there are only 26 lowercase letters.

The only notable difference is that Go requires an explicit structure to store (position, character) pairs and uses sort.Slice to order them by position before constructing the final answer string.

Worked Examples

Example 1

s = "aabcbbca"

Frequency Count

Character Frequency
a 3
b 3
c 2
maxFreq = 3

Last Occurrences

Character Last Position
a 7
b 5
c 6

Only characters with frequency 3 survive:

Character Last Position
a 7
b 5

Sort by position:

Position Character
5 b
7 a

Result:

"ba"

Example 2

s = "abcd"

Frequency Count

Character Frequency
a 1
b 1
c 1
d 1
maxFreq = 1

Last Occurrences

Character Last Position
a 0
b 1
c 2
d 3

All characters survive because all frequencies equal the maximum.

Sorted by last occurrence:

Position Character
0 a
1 b
2 c
3 d

Result:

"abcd"

Complexity Analysis

Measure Complexity Explanation
Time O(n) One pass for frequencies, one pass for last positions, plus processing 26 letters
Space O(1) Only fixed-size arrays of length 26 are used

The algorithm performs a constant number of passes over the string. All additional storage depends only on the alphabet size, which is fixed at 26. Therefore the running time is linear in the length of the string, and the extra space is constant.

Test Cases

sol = Solution()

assert sol.lastNonEmptyString("aabcbbca") == "ba"      # provided example
assert sol.lastNonEmptyString("abcd") == "abcd"        # all unique characters

assert sol.lastNonEmptyString("a") == "a"             # single character
assert sol.lastNonEmptyString("aaaaa") == "a"         # all identical
assert sol.lastNonEmptyString("ababa") == "a"         # one character has higher frequency
assert sol.lastNonEmptyString("aabb") == "ab"         # equal maximum frequencies
assert sol.lastNonEmptyString("abcabc") == "abc"      # all tied
assert sol.lastNonEmptyString("zzzzxy") == "z"        # dominant character
assert sol.lastNonEmptyString("bbaaaccc") == "ac"     # multiple maximum-frequency chars
assert sol.lastNonEmptyString("cabcabca") == "a"      # one clear winner
assert sol.lastNonEmptyString("abcdefghijklmnopqrstuvwxyz") == \
       "abcdefghijklmnopqrstuvwxyz"                   # all letters once

Test Summary

Test Why
"aabcbbca" Official example with multiple rounds
"abcd" Every character appears once
"a" Smallest valid input
"aaaaa" Single repeated character
"ababa" One character has strictly larger frequency
"aabb" Multiple characters tied for maximum frequency
"abcabc" All characters tied
"zzzzxy" One dominant character
"bbaaaccc" Several maximum-frequency characters with ordering concerns
"cabcabca" Last occurrence positioning matters
"abcdefghijklmnopqrstuvwxyz" Maximum number of distinct letters

Edge Cases

Single Character String

A string such as "a" can be easy to mishandle if an implementation assumes multiple rounds are required. Here, the first operation is also the last operation, so the answer must be the original string itself. The algorithm handles this naturally because the maximum frequency is 1, and the last occurrence of 'a' forms the answer.

All Characters Have Equal Frequency

Consider "abcabc".

Every character appears exactly twice. All of them survive until the final round. An incorrect solution might return the characters in alphabetical order instead of their actual order in the surviving string. By sorting according to last occurrence positions, the implementation reconstructs the correct result "abc".

One Character Appears Much More Frequently Than Others

Consider "zzzzxy".

The character 'z' appears four times, while 'x' and 'y' appear once. The final operation occurs in round four, so only 'z' survives until that point. The algorithm correctly identifies that only characters with maximum frequency remain and returns "z".

Multiple Maximum-Frequency Characters

Consider "aabb".

Both 'a' and 'b' appear twice. Immediately before the final operation, both characters remain. The answer is not determined alphabetically, it is determined by the order of their last surviving occurrences. Recording and sorting by last positions ensures the correct output "ab" is produced.