LeetCode 2052 - Minimum Cost to Separate Sentence Into Rows

LeetCode Problem 2052

Difficulty: 🟡 Medium
Topics: String, Dynamic Programming

Solution

Problem Understanding

This problem asks us to format a sentence into multiple rows such that each row has length at most k. The sentence consists of words separated by spaces, and words cannot be split across rows. We may only insert line breaks between words.

The goal is not simply to minimize the number of rows. Instead, every row except the final row contributes a penalty based on unused space. If a row has length n, then its cost is:

$(k-n)^2$

The total cost is the sum of these penalties for all rows except the last one.

This is essentially a text justification optimization problem. We want to decide where to place line breaks so that earlier rows are packed efficiently, because large amounts of unused space produce large quadratic penalties.

The input consists of:

sentence, a string of lowercase words separated by single spaces
k, the maximum allowed row length

The output is the minimum achievable total cost.

The constraints are important:

sentence.length <= 5000
k <= 5000
Every word length is at most k

These constraints immediately suggest that exponential search is impossible. Since there can be thousands of words, we need a polynomial-time solution.

Several edge cases are important:

A sentence containing only one word should return 0, because the last row is free.
Rows that exactly match length k contribute zero cost.
The final row must never contribute to the answer.
Some groupings of words may exceed k, and those combinations must be rejected.
Since the penalty is quadratic, leaving large unused gaps is very expensive, which means greedy choices are not always optimal.

Approaches

Brute Force Approach

A brute force solution would try every possible way to place line breaks between words.

Suppose there are n words. Between each adjacent pair of words, we may either place a line break or not. That creates:

$2^{n-1}$

possible layouts.

For every layout, we would:

Construct all rows
Verify every row length is at most k
Compute the total cost
Track the minimum

This approach is correct because it examines every valid arrangement, but it is computationally infeasible. Even for a few hundred words, the number of layouts becomes enormous.

Key Insight

The important observation is that the problem has optimal substructure.

Suppose we decide that words i through j form the current row. Once that decision is fixed, the remaining problem is independent:

We only need the minimum cost for words starting at j + 1

This naturally leads to dynamic programming.

We define:

dp[i] = minimum cost to arrange words starting from index i

For each starting position i, we try extending the current row to every possible ending position j while the row length remains within k.

If j is not the last word, we add the row penalty:

$(k-\text{rowLength})^2$

If j is the final word, the additional cost is 0.

This converts the exponential search into a polynomial-time solution.

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(2^n)	O(n)	Tries every possible line break configuration
Optimal Dynamic Programming	O(n²)	O(n)	Computes minimum cost for every suffix of words

Algorithm Walkthrough

Split the sentence into individual words.

We first convert the sentence into an array of words because decisions are made at word boundaries. 2. Precompute word lengths.

Since row length calculations happen frequently, storing word lengths avoids repeated calls to len(). 3. Define the DP state.

Let dp[i] represent the minimum cost needed to arrange words starting from index i.

The final answer will be dp[0]. 4. Initialize the DP array.

We create an array of size n + 1.

dp[n] = 0
This represents the empty suffix after all words are placed.

Process words from right to left.

For each starting index i, we try every possible ending index j. 6. Build the current row incrementally.

While extending from i to j, maintain:

total character length of words
number of spaces between words

The row length becomes:

$\text{rowLength}=\text{sumOfWordLengths}+(j-i)$

because there are j - i spaces between words.

Stop when the row exceeds k.

If the row length becomes larger than k, further extensions will only increase the length, so we break early. 2. Compute the transition cost.

If j is the last word:

additional cost = 0

Otherwise:

$(k-\text{rowLength})^2$

Update the DP value.

The recurrence is:

$dp[i]=\min(dp[i],\text{cost}+dp[j+1])$

Return dp[0].

This represents the minimum total cost for the entire sentence.

Why it works

The algorithm works because every valid arrangement begins with some first row containing words i through j. Once that row is chosen, the remaining problem is exactly the same problem applied to the suffix beginning at j + 1.

Dynamic programming guarantees optimality because:

every possible valid first row is considered
each subproblem is solved optimally exactly once
the recurrence combines optimal substructure correctly

Therefore, dp[0] is the minimum possible total cost.

Python Solution

class Solution:
    def minimumCost(self, sentence: str, k: int) -> int:
        words = sentence.split()
        n = len(words)

        word_lengths = [len(word) for word in words]

        INF = float('inf')
        dp = [INF] * (n + 1)

        dp[n] = 0

        for i in range(n - 1, -1, -1):
            current_length = 0

            for j in range(i, n):
                current_length += word_lengths[j]

                row_length = current_length + (j - i)

                if row_length > k:
                    break

                if j == n - 1:
                    cost = 0
                else:
                    remaining_spaces = k - row_length
                    cost = remaining_spaces * remaining_spaces

                dp[i] = min(dp[i], cost + dp[j + 1])

        return dp[0]

The implementation begins by splitting the sentence into words and storing their lengths. This preprocessing simplifies row length calculations later.

The DP array stores the minimum cost for every suffix of the word list. dp[n] is initialized to 0 because arranging no words requires no cost.

The outer loop iterates backward through the words. For each starting index i, the inner loop tries all possible ending indices j.

As the row expands, the code incrementally updates current_length. The actual row length also includes spaces between words, which is why (j - i) is added.

If the row exceeds k, the loop terminates early because adding more words can never reduce the row length.

The final row contributes zero cost, which is handled by checking whether j == n - 1.

Otherwise, the quadratic penalty is computed and combined with the optimal solution for the remaining suffix.

Finally, dp[0] contains the minimum total cost for the entire sentence.

Go Solution

package main

import "math"

func minimumCost(sentence string, k int) int {
	words := splitWords(sentence)
	n := len(words)

	wordLengths := make([]int, n)
	for i, word := range words {
		wordLengths[i] = len(word)
	}

	dp := make([]int, n+1)

	for i := 0; i < n; i++ {
		dp[i] = math.MaxInt
	}

	dp[n] = 0

	for i := n - 1; i >= 0; i-- {
		currentLength := 0

		for j := i; j < n; j++ {
			currentLength += wordLengths[j]

			rowLength := currentLength + (j - i)

			if rowLength > k {
				break
			}

			cost := 0

			if j != n-1 {
				remainingSpaces := k - rowLength
				cost = remainingSpaces * remainingSpaces
			}

			if cost+dp[j+1] < dp[i] {
				dp[i] = cost + dp[j+1]
			}
		}
	}

	return dp[0]
}

func splitWords(sentence string) []string {
	words := []string{}
	start := 0

	for i := 0; i <= len(sentence); i++ {
		if i == len(sentence) || sentence[i] == ' ' {
			words = append(words, sentence[start:i])
			start = i + 1
		}
	}

	return words
}

The Go implementation follows the same dynamic programming strategy as the Python version.

One notable difference is that Go does not provide a built-in split() method identical to Python's string split behavior without importing strings. To keep the implementation self-contained, a small helper function manually parses the sentence into words.

The DP array uses math.MaxInt as the initial large value instead of Python's float('inf').

All calculations fit safely inside Go's int type because the maximum possible cost is well within integer limits for the given constraints.

Worked Examples

Example 1

Input:

sentence = "i love leetcode"
k = 12

Words:

["i", "love", "leetcode"]

Lengths:

[1, 4, 8]

We compute DP from right to left.

i	j	Row	Length	Cost	dp[j+1]	dp[i]
2	2	"leetcode"	8	0	0	0
1	1	"love"	4	64	0	64
1	2	"love leetcode"	13	invalid	-	64
0	0	"i"	1	121	64	185
0	1	"i love"	6	36	0	36
0	2	"i love leetcode"	15	invalid	-	36

Final answer:

Example 2

Input:

sentence = "apples and bananas taste great"
k = 7

Words:

["apples", "and", "bananas", "taste", "great"]

Lengths:

[6, 3, 7, 5, 5]

DP computation:

i	Best Row Choice	Cost
4	"great"	0
3	"taste"	4
2	"bananas"	0
1	"and"	16
0	"apples"	1 + 16 + 0 + 4

Final answer:

Example 3

Input:

sentence = "a"
k = 5

Only one row exists:

"a"

Since the last row contributes no cost:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n²)	Every starting index tries extending to later words
Space	O(n)	DP array stores one value per word position

The outer loop iterates over all word indices. The inner loop may also iterate across all remaining words before exceeding k. In the worst case, this creates quadratic complexity.

The space usage is linear because only the DP array and word lengths are stored.

Test Cases

sol = Solution()

# Provided examples
assert sol.minimumCost("i love leetcode", 12) == 36  # standard example
assert sol.minimumCost("apples and bananas taste great", 7) == 21  # multiple rows
assert sol.minimumCost("a", 5) == 0  # single word, last row free

# Exact fit
assert sol.minimumCost("abc def", 7) == 0  # entire sentence fits exactly

# Every word must be separate
assert sol.minimumCost("a bb ccc dddd", 4) == 6  # forced line breaks

# Large unused spaces
assert sol.minimumCost("a b c", 10) == 0  # all fit into final row

# Two-row optimal split
assert sol.minimumCost("hello world code", 11) == 16  # "hello world" + "code"

# Single long word equal to k
assert sol.minimumCost("abcdefghij", 10) == 0  # exact maximum length word

# Multiple possible layouts
assert sol.minimumCost("aa bb cc dd", 5) == 0  # optimal grouping avoids penalties

# Minimal k
assert sol.minimumCost("a b c", 1) == 2  # every word isolated except last row

# Stress style case
long_sentence = "a " * 100
long_sentence = long_sentence.strip()
assert sol.minimumCost(long_sentence, 3) >= 0  # many small words

Test	Why
`"i love leetcode"`	Verifies standard DP transition behavior
`"apples and bananas taste great"`	Tests many forced breaks
`"a"`	Validates single-row handling
`"abc def"`	Checks exact-fit behavior
`"a bb ccc dddd"`	Ensures invalid oversized rows are rejected
`"a b c"` with large `k`	Verifies free final row logic
`"hello world code"`	Tests choosing between multiple layouts
Single word equal to `k`	Confirms boundary word length handling
`"aa bb cc dd"`	Tests zero-cost arrangements
`"a b c"` with `k = 1`	Validates smallest possible row capacity
Long repeated sentence	Stress test for many DP transitions

Edge Cases

One important edge case is when the sentence contains only one word. Since the final row does not contribute to the cost, the answer must always be 0 regardless of how much unused space remains. A careless implementation might incorrectly apply the quadratic penalty to the only row. This implementation avoids that mistake by assigning zero cost whenever the current row reaches the final word.

Another important edge case occurs when a word length is exactly equal to k. In this case, the word perfectly fills a row and contributes zero penalty. Some implementations incorrectly reject rows whose length equals k by using >= instead of >. This solution correctly allows rows whose length is exactly k.

A third tricky case involves combinations of words that exceed the limit only after spaces are added. For example, two short words may individually fit, but once the mandatory separating space is included, the row becomes too long. The implementation correctly computes row length as the sum of word lengths plus the number of spaces between words.

A final subtle edge case is the handling of the final row. Since the last row is free, the optimal arrangement may intentionally leave extra unused space there to reduce penalties earlier. Greedy strategies that always pack rows tightly can fail because they ignore this property. The dynamic programming solution correctly evaluates all valid row endings and naturally handles this tradeoff.