LeetCode 2262 - Total Appeal of A String
The problem asks us to calculate the total appeal of all substrings of a given string s. The appeal of a string is defined as the number of distinct characters it contains.
Difficulty: 🔴 Hard
Topics: Hash Table, String, Dynamic Programming
Solution
Problem Understanding
The problem asks us to calculate the total appeal of all substrings of a given string s. The appeal of a string is defined as the number of distinct characters it contains. In other words, for every possible substring of s, we need to count how many unique letters it has and sum these counts across all substrings.
The input is a single string s consisting of lowercase English letters, and its length can be up to 10^5. The output is a single integer representing the total sum of appeals for all substrings.
Key points to understand:
- A substring is a contiguous sequence of characters. For example,
"abc"has substrings"a","b","c","ab","bc","abc". - A naive approach that generates all substrings and counts distinct characters will be too slow because there can be roughly
O(n^2)substrings for a string of lengthn, and counting distinct characters in each substring takes additional time. - Edge cases include strings with all identical characters, single-character strings, and strings where every character is unique.
The goal is to find an efficient approach that avoids explicitly generating all substrings.
Approaches
Brute Force Approach
The brute force method would generate every substring of s and compute its appeal by using a set to track distinct characters. This method is correct because it directly counts the appeal for all substrings. However, its complexity is O(n^3) in the worst case: there are O(n^2) substrings and counting unique characters in each substring takes up to O(n).
Optimal Approach
The optimal solution leverages the insight that we can calculate the contribution of each character directly to all substrings that include it. Specifically, if a character c appears at index i, then it contributes to all substrings that start after the previous occurrence of c and end at or after i. Using a hash map to track the last occurrence of each character allows us to compute the total contribution in linear time. This reduces the complexity to O(n).
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n^3) | O(n) | Generates all substrings and counts distinct characters |
| Optimal | O(n) | O(1) | Tracks last occurrence of each character to compute contributions |
Algorithm Walkthrough
- Initialize a hash map
last_occurrenceto store the last index where each character appeared. Initializetotal_appealto0andcurrent_sumto0. - Iterate over each character
cat indexiin the string. - Compute the number of new substrings that include
cas a new distinct character: this is(i - last_occurrence[c]). - Update
current_sumby adding the number of new substrings.current_sumrepresents the total appeal of all substrings ending at indexi. - Update
last_occurrence[c]toi. - Add
current_sumtototal_appeal. - After iterating through the string,
total_appealcontains the total appeal of all substrings.
Why it works
The key invariant is that current_sum always reflects the sum of appeals for all substrings ending at the current index. By adding contributions from new distinct occurrences, we account for every substring exactly once. Tracking the last occurrence ensures that repeated characters do not inflate the count incorrectly.
Python Solution
class Solution:
def appealSum(self, s: str) -> int:
last_occurrence = {}
total_appeal = 0
current_sum = 0
for i, c in enumerate(s):
prev_index = last_occurrence.get(c, -1)
current_sum += (i - prev_index)
last_occurrence[c] = i
total_appeal += current_sum
return total_appeal
Explanation
We use last_occurrence to track the last index of each character. For each character at index i, the number of substrings where it contributes as a new distinct character is (i - prev_index). We accumulate this in current_sum which represents the sum of appeals of all substrings ending at i. Adding current_sum to total_appeal at each step gives the final result.
Go Solution
func appealSum(s string) int64 {
lastOccurrence := [26]int{}
for i := range lastOccurrence {
lastOccurrence[i] = -1
}
var totalAppeal int64 = 0
var currentSum int64 = 0
for i, c := range s {
index := c - 'a'
prevIndex := lastOccurrence[index]
currentSum += int64(i - prevIndex)
lastOccurrence[index] = i
totalAppeal += currentSum
}
return totalAppeal
}
Explanation
Go uses a fixed-size array lastOccurrence for the 26 lowercase letters. We initialize all entries to -1 to handle characters that have not appeared yet. The rest of the logic mirrors the Python version, with int64 used to prevent overflow on large strings.
Worked Examples
Example 1: s = "abbca"
| i | char | prev_index | i - prev_index | current_sum | total_appeal |
|---|---|---|---|---|---|
| 0 | a | -1 | 1 | 1 | 1 |
| 1 | b | -1 | 2 | 3 | 4 |
| 2 | b | 1 | 1 | 4 | 8 |
| 3 | c | -1 | 4 | 8 | 16 |
| 4 | a | 0 | 4 | 12 | 28 |
Final output is 28.
Example 2: s = "code"
| i | char | prev_index | i - prev_index | current_sum | total_appeal |
|---|---|---|---|---|---|
| 0 | c | -1 | 1 | 1 | 1 |
| 1 | o | -1 | 2 | 3 | 4 |
| 2 | d | -1 | 3 | 6 | 10 |
| 3 | e | -1 | 4 | 10 | 20 |
Final output is 20.
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n) | Single pass through the string, with O(1) operations per character |
| Space | O(1) | Hash map of 26 characters, independent of input size |
Because we only need to track the last occurrence of each character, space usage does not scale with the string length.
Test Cases
# Provided examples
assert Solution().appealSum("abbca") == 28 # Example 1
assert Solution().appealSum("code") == 20 # Example 2
# Edge cases
assert Solution().appealSum("a") == 1 # Single character
assert Solution().appealSum("aaaa") == 10 # All characters identical
assert Solution().appealSum("abcde") == 35 # All characters distinct
assert Solution().appealSum("abac") == 17 # Repeated characters interleaved
# Stress test
assert Solution().appealSum("a"*100000) == 5000050000 # Large identical characters
| Test | Why |
|---|---|
| "abbca" | Validates standard example with repeated characters |
| "code" | Validates example with all unique letters |
| "a" | Single-character edge case |
| "aaaa" | Repeated identical characters edge case |
| "abcde" | All distinct letters edge case |
| "abac" | Interleaved repeats |
| "a"*100000 | Stress test for performance and integer handling |
Edge Cases
Single Character String
A string with one character should return an appeal of 1, because there is only one substring. The algorithm correctly handles this because the last_occurrence default ensures (i - prev_index) is computed as 1.
All Characters Identical
When the string contains only one unique character repeated, each new character contributes only once per new substring that ends at its position. The algorithm calculates contributions linearly and sums correctly without double counting.
Interleaved Repeated Characters
For a string like "abac", repeated characters are separated by other characters. Tracking the last occurrence ensures that the contribution for a repeated character only accounts for substrings where it is newly distinct, correctly adjusting current_sum.