LeetCode 3441 - Minimum Cost Good Caption

We are given a lowercase string caption. We may repeatedly change any character one step forward or backward in the alphabet.

LeetCode Problem 3441

Difficulty: 🔴 Hard
Topics: String, Dynamic Programming

Solution

Problem Understanding

We are given a lowercase string caption. We may repeatedly change any character one step forward or backward in the alphabet. Because we can apply this operation multiple times, converting a character x into a character y costs exactly:

$$|x - y|$$

where the difference is measured by alphabet position.

The goal is to transform the original string into a good caption. A caption is good if every character belongs to a contiguous group whose length is at least 3.

For example:

  • "aaabbb" is valid because both groups have length 3.
  • "aaaaccc" is valid because the groups have lengths 4 and 3.
  • "aabbb" is invalid because the group "aa" has length 2.
  • "ccccd" is invalid because "d" appears only once.

We want the transformed string with:

  1. Minimum total operation cost.
  2. If multiple transformed strings have the same minimum cost, return the lexicographically smallest one.
  3. If no good caption can exist, return "".

The constraints are large:

  • n <= 5 * 10^4
  • Only 26 possible characters

A solution that tries all possible partitions or all possible transformed strings is completely infeasible. We need a dynamic programming solution whose complexity is roughly linear in n, multiplied by the constant alphabet size.

One immediate edge case is when n < 3. A good caption requires every group to have length at least 3, so no valid answer exists. We must return "".

Another important observation is that the cost of changing a character is independent from all other positions. The only global constraint comes from the group-length requirement.

Approaches

Brute Force

A brute-force solution would try every possible good caption and compute its conversion cost.

A good caption can be viewed as a sequence of character blocks:

  • "aaaa"
  • "bbb"
  • "ccccc"

where every block length is at least 3.

We could theoretically enumerate:

  • every partition of the string into blocks of size at least 3
  • every character assigned to every block

and compute the total conversion cost.

This approach is correct because it explores every valid target string. However, the number of possible block partitions grows exponentially, and each block can choose among 26 letters. The search space becomes astronomically large even for moderate values of n.

Dynamic Programming

The key observation is that the only thing that matters while constructing the answer is:

  • the current position
  • the chosen character
  • how many consecutive occurrences of that character we already have

The exact length of a run larger than 3 does not matter. Once a run reaches length 3, it already satisfies the constraint and may continue indefinitely.

Therefore, we can compress the state into three categories:

  • run length = 1
  • run length = 2
  • run length >= 3

This produces only:

$$n \times 26 \times 3$$

states.

By processing the string from right to left, we can compute the minimum cost for every state and then reconstruct the lexicographically smallest optimal answer.

Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Tries all valid target captions
Optimal DP O(n * 26) O(n * 26) Tracks position, letter, and run-length category

Algorithm Walkthrough

DP State Definition

Let:

dp[i][c][k]

represent the minimum cost for transforming the suffix starting at index i.

Here:

  • i is the current position.
  • c is the character chosen at position i.
  • k describes the size of the current run beginning at i.

We use:

  • k = 0 → run length exactly 1
  • k = 1 → run length exactly 2
  • k = 2 → run length at least 3

Step 1: Initialize the Last Position

At the final character of the string, the only possible run length is 1.

So:

dp[n - 1][c][0]

is simply the cost of converting the last character into c.

The other states remain invalid.

Step 2: Process from Right to Left

For every position i and every letter c:

The conversion cost at the current position is:

abs(original[i] - c)

Step 3: Transition for Run Length 1

If the run length at position i is exactly 1, then position i + 1 must start a completely valid caption.

Therefore:

dp[i][c][0]
    = changeCost + minimum valid suffix cost

Step 4: Transition for Run Length 2

If the run length is exactly 2, the next character must still be c.

Therefore:

dp[i][c][1]
    = changeCost + dp[i + 1][c][0]

Step 5: Transition for Run Length >= 3

Once the run has reached length 3, we have two choices:

  • continue the same run
  • keep extending an already-valid run

Thus:

dp[i][c][2]
    = changeCost +
      min(
          dp[i + 1][c][1],
          dp[i + 1][c][2]
      )

Step 6: Find the Optimal Starting Letter

The answer must begin with a valid run of length at least 3.

Therefore we examine:

dp[0][c][2]

for all letters c.

The smallest cost is the optimal cost.

When costs tie, we choose the lexicographically smaller character.

Step 7: Reconstruct the String

After the DP table is complete, we rebuild the answer.

At each step we determine whether:

  • extending the current run still preserves optimality, or
  • starting a new valid run produces a lexicographically smaller optimal solution.

Following these choices reconstructs the minimum-cost caption while also enforcing lexicographic ordering.

Why it works

The DP considers every valid way to build runs while remembering only the information that affects future decisions: the current character and whether the run length is 1, 2, or already at least 3. Any longer run behaves identically from that point onward, so no additional information is necessary.

Because every transition represents a valid continuation and every state stores the minimum achievable cost, the DP computes the globally optimal solution. During reconstruction, ties are resolved by always choosing the lexicographically smaller valid continuation, guaranteeing the smallest string among all minimum-cost solutions.

Python Solution

class Solution:
    def minCostGoodCaption(self, caption: str) -> str:
        n = len(caption)

        if n < 3:
            return ""

        INF = 10**9

        dp = [[[INF] * 3 for _ in range(26)] for _ in range(n)]

        last = ord(caption[-1]) - ord("a")

        for c in range(26):
            dp[n - 1][c][0] = abs(last - c)

        min_cost = INF

        for i in range(n - 2, -1, -1):
            cur = ord(caption[i]) - ord("a")
            new_min_cost = INF

            for c in range(26):
                change_cost = abs(cur - c)

                dp[i][c][0] = change_cost + min_cost
                dp[i][c][1] = change_cost + dp[i + 1][c][0]
                dp[i][c][2] = change_cost + min(
                    dp[i + 1][c][1],
                    dp[i + 1][c][2]
                )

                new_min_cost = min(new_min_cost, dp[i][c][2])

            min_cost = new_min_cost

        answer = []

        best_cost = INF
        letter = -1

        for c in range(25, -1, -1):
            if dp[0][c][2] <= best_cost:
                best_cost = dp[0][c][2]
                letter = c

        def append_letter(idx: int, ch: int) -> int:
            answer.append(chr(ord("a") + ch))
            return abs(ord(caption[idx]) - (ord("a") + ch))

        best_cost -= append_letter(0, letter)
        best_cost -= append_letter(1, letter)
        best_cost -= append_letter(2, letter)

        i = 3

        while i < n:
            next_letter = 26

            for c in range(25, -1, -1):
                if dp[i][c][2] == best_cost:
                    next_letter = c

            if (
                next_letter < letter
                or min(dp[i][letter]) > best_cost
            ):
                letter = next_letter

                best_cost -= append_letter(i, letter)
                best_cost -= append_letter(i + 1, letter)
                best_cost -= append_letter(i + 2, letter)

                i += 3
            else:
                best_cost -= append_letter(i, letter)
                i += 1

        return "".join(answer)

The implementation begins by handling the impossible case where the string length is less than 3. It then builds a three-dimensional DP table. Each state stores the minimum cost for a suffix given a chosen character and a run-length category.

The table is filled from right to left because each state depends only on states at index i + 1.

After computing all costs, the algorithm identifies the best starting character. Reconstruction then walks through the string while preserving optimal cost. Whenever multiple optimal continuations exist, the smaller character is chosen, ensuring lexicographic minimality.

Go Solution

func minCostGoodCaption(caption string) string {
	n := len(caption)

	if n < 3 {
		return ""
	}

	const INF = int(1e9)

	dp := make([][][]int, n)

	for i := range dp {
		dp[i] = make([][]int, 26)

		for c := 0; c < 26; c++ {
			dp[i][c] = []int{INF, INF, INF}
		}
	}

	last := int(caption[n-1] - 'a')

	for c := 0; c < 26; c++ {
		if last > c {
			dp[n-1][c][0] = last - c
		} else {
			dp[n-1][c][0] = c - last
		}
	}

	minCost := INF

	for i := n - 2; i >= 0; i-- {
		cur := int(caption[i] - 'a')
		newMinCost := INF

		for c := 0; c < 26; c++ {
			changeCost := cur - c
			if changeCost < 0 {
				changeCost = -changeCost
			}

			dp[i][c][0] = changeCost + minCost
			dp[i][c][1] = changeCost + dp[i+1][c][0]

			if dp[i+1][c][1] < dp[i+1][c][2] {
				dp[i][c][2] = changeCost + dp[i+1][c][1]
			} else {
				dp[i][c][2] = changeCost + dp[i+1][c][2]
			}

			if dp[i][c][2] < newMinCost {
				newMinCost = dp[i][c][2]
			}
		}

		minCost = newMinCost
	}

	var result []byte

	bestCost := INF
	letter := -1

	for c := 25; c >= 0; c-- {
		if dp[0][c][2] <= bestCost {
			bestCost = dp[0][c][2]
			letter = c
		}
	}

	appendLetter := func(idx int, ch int) int {
		result = append(result, byte('a'+ch))

		diff := int(caption[idx]) - int('a'+byte(ch))
		if diff < 0 {
			diff = -diff
		}

		return diff
	}

	bestCost -= appendLetter(0, letter)
	bestCost -= appendLetter(1, letter)
	bestCost -= appendLetter(2, letter)

	i := 3

	for i < n {
		nextLetter := 26

		for c := 25; c >= 0; c-- {
			if dp[i][c][2] == bestCost {
				nextLetter = c
			}
		}

		curMin := dp[i][letter][0]
		if dp[i][letter][1] < curMin {
			curMin = dp[i][letter][1]
		}
		if dp[i][letter][2] < curMin {
			curMin = dp[i][letter][2]
		}

		if nextLetter < letter || curMin > bestCost {
			letter = nextLetter

			bestCost -= appendLetter(i, letter)
			bestCost -= appendLetter(i+1, letter)
			bestCost -= appendLetter(i+2, letter)

			i += 3
		} else {
			bestCost -= appendLetter(i, letter)
			i++
		}
	}

	return string(result)
}

The Go version follows the same DP structure as the Python solution. The primary differences are explicit slice allocation, manual absolute-value computation, and use of a []byte buffer for efficient string construction.

Worked Examples

Example 1

caption = "cdcd"

Character values:

Index Original
0 c
1 d
2 c
3 d

Possible optimal good captions:

Target Cost
cccc 2
dddd 2

The minimum cost is 2.

Because "cccc" is lexicographically smaller than "dddd", reconstruction chooses:

cccc

Final answer:

"cccc"

Example 2

caption = "aca"

Cost to convert to "aaa":

Position Change
a -> a 0
c -> a 2
a -> a 0

Total:

2

No other good caption achieves a smaller cost.

Result:

"aaa"

Example 3

caption = "bc"

Length is only 2.

Every valid group must have length at least 3.

Therefore:

""

is returned immediately.

Complexity Analysis

Measure Complexity Explanation
Time O(n * 26) 26 letters and 3 run states per position
Space O(n * 26) DP table stores 26 letters and 3 states per index

The alphabet size is constant, so the runtime grows linearly with the string length. Even at the maximum constraint of 5 * 10^4, the solution remains efficient. The DP table stores only three run categories per letter, keeping memory usage manageable.

Test Cases

sol = Solution()

assert sol.minCostGoodCaption("cdcd") == "cccc"  # provided example
assert sol.minCostGoodCaption("aca") == "aaa"    # provided example
assert sol.minCostGoodCaption("bc") == ""        # impossible length

assert sol.minCostGoodCaption("aaa") == "aaa"    # already valid
assert sol.minCostGoodCaption("bbbb") == "bbbb"  # valid run longer than 3

assert sol.minCostGoodCaption("abc") == "bbb"    # single valid block

assert sol.minCostGoodCaption("zzz") == "zzz"    # upper alphabet boundary
assert sol.minCostGoodCaption("aaaaz") == "aaaaa"  # extend existing run

assert sol.minCostGoodCaption("abcd") == "bbbb"  # best single block

assert sol.minCostGoodCaption("zzzaaa") == "zzzaaa"  # two valid groups

assert sol.minCostGoodCaption("ababab") == "aaaaaa"  # tie handling

assert sol.minCostGoodCaption("a") == ""  # minimum length
assert sol.minCostGoodCaption("ab") == "" # length 2 impossible

assert len(sol.minCostGoodCaption("a" * 50000)) == 50000  # stress test
Test Why
"cdcd" Cost tie requiring lexicographic comparison
"aca" Requires multiple operations on one position
"bc" Impossible case
"aaa" Already valid
"bbbb" Run longer than minimum
"abc" Entire string becomes one block
"zzz" Alphabet boundary
"aaaaz" Extending an existing valid run
"abcd" Simple conversion optimization
"zzzaaa" Multiple valid groups already present
"ababab" Lexicographic tie-breaking
"a" Smallest input
"ab" Length below required minimum
"a" * 50000 Maximum-size stress case

Edge Cases

Length Less Than Three

A good caption requires every group to have size at least 3. If the entire string has fewer than 3 characters, no valid grouping can exist.

Examples:

"a"
"ab"

The implementation checks this immediately and returns an empty string before allocating any DP structures.

Runs Longer Than Three

Once a run reaches length 3, it already satisfies the constraint. A common bug is continuing to track the exact run length, which unnecessarily increases the state space.

The implementation avoids this by collapsing all lengths greater than or equal to 3 into a single state. This keeps the DP compact and efficient.

Multiple Minimum-Cost Answers

The problem requires the lexicographically smallest answer among all optimal solutions.

For example:

cdcd

Both:

cccc
dddd

have cost 2.

A solution that only tracks cost may return either answer depending on implementation details. During reconstruction, this solution explicitly prefers smaller characters whenever the cost remains optimal, guaranteeing the lexicographically smallest valid caption.