LeetCode 616 - Add Bold Tag in String
The problem asks us to insert HTML-style bold tags into a string whenever a substring matches any word from a given dictionary of words.
Difficulty: 🟡 Medium
Topics: Array, Hash Table, String, Trie, String Matching
Solution
Problem Understanding
The problem asks us to insert HTML-style bold tags into a string whenever a substring matches any word from a given dictionary of words.
We are given:
- A string
s - An array
words, where each element is a pattern we want to highlight
For every occurrence of every word inside s, we must wrap that matching portion with <b> and </b> tags.
The tricky part is that overlapping or adjacent matches must be merged into a single bold section.
For example, if one word matches indices [0,2] and another matches [2,4], we do not create nested or separate tags. Instead, we combine them into one continuous bold region covering [0,4].
The output is the original string with the minimum number of bold tag pairs inserted.
The constraints are relatively small:
s.length <= 1000words.length <= 100words[i].length <= 1000
Because the input size is modest, we can afford solutions that examine many substrings directly. We do not necessarily need highly advanced string matching algorithms like KMP or Aho-Corasick, although those could also solve the problem.
Several edge cases are important:
- Multiple words can overlap
- Multiple words can be consecutive
- A word may appear many times
wordsmay be empty- One word may be a substring of another
- The entire string may become bold
- No words may match at all
A naive implementation that inserts tags immediately upon finding a match can easily produce incorrect nested or fragmented tags. The key challenge is correctly merging intervals before generating the final string.
Approaches
Brute Force Approach
The brute-force approach is to examine every substring of s and check whether it equals any word in words.
Whenever a match is found, we mark the indices covered by that substring as bold.
After processing all possible substrings, we build the final result by inserting <b> before entering a bold region and </b> after leaving one.
This approach is correct because every possible matching substring is checked explicitly. Any character that belongs to a matched word becomes marked bold.
However, generating all substrings is expensive. A string of length n has O(n^2) substrings, and comparing them against all words increases the cost further.
Optimal Approach
A better approach is to iterate through each position in s and directly test whether any word starts there.
Instead of generating all substrings, we only check substrings whose lengths correspond to actual words.
We maintain a boolean array bold, where:
bold[i] = Truemeans characters[i]should be boldedbold[i] = Falsemeans it should remain normal
For every index i and every word:
- If
s[i:i+len(word)] == word - Then mark all characters in that range as bold
After all matches are processed, we traverse the string once to construct the final answer.
The main insight is that the problem is fundamentally an interval merging problem. Instead of inserting tags during matching, we first determine which characters belong to bold intervals, then generate tags only when transitioning between normal and bold regions.
| Approach | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Brute Force | O(n² × m × k) | O(n) | Generates all substrings and compares against all words |
| Optimal | O(n × m × k) | O(n) | Checks only relevant word lengths and marks bold intervals |
Where:
n = len(s)m = len(words)k = average word length
Algorithm Walkthrough
- Create a boolean array called
boldof lengthlen(s).
Each position represents whether that character should appear inside a bold tag.
2. Iterate through every starting index i in the string.
At each position, test every word in words.
3. For each word, check whether the substring starting at i matches the word.
This is done using:
s[i:i+len(word)] == word
- If a match is found, mark all characters covered by that word as bold.
For example, if "abc" matches starting at index 2, then indices 2, 3, and 4 become bold.
5. After all matches are processed, traverse the string again to build the result.
6. Before appending character s[i], determine whether we are entering a bold region.
We insert <b> when:
bold[i] == True- and either
i == 0orbold[i-1] == False
- Append the current character.
- After appending the character, determine whether we are leaving a bold region.
We insert </b> when:
bold[i] == True- and either
i == len(s)-1orbold[i+1] == False
- Return the constructed string.
Why it works
The algorithm works because every matching word explicitly marks its covered characters as bold.
Overlapping matches naturally merge because overlapping ranges mark the same boolean positions as True.
Consecutive matches also merge automatically because adjacent positions remain continuously bold, so no closing and reopening tags are inserted between them.
The final traversal inserts tags exactly at transitions between normal and bold regions, guaranteeing the minimum valid number of tag pairs.
Python Solution
from typing import List
class Solution:
def addBoldTag(self, s: str, words: List[str]) -> str:
n = len(s)
bold = [False] * n
# Mark all bold positions
for i in range(n):
for word in words:
word_len = len(word)
if s[i:i + word_len] == word:
for j in range(i, i + word_len):
if j < n:
bold[j] = True
# Build final result
result = []
for i in range(n):
# Start of bold region
if bold[i] and (i == 0 or not bold[i - 1]):
result.append("<b>")
result.append(s[i])
# End of bold region
if bold[i] and (i == n - 1 or not bold[i + 1]):
result.append("</b>")
return "".join(result)
The implementation follows the algorithm directly.
The bold array is the core data structure. It stores whether each character belongs to any matched word.
The first nested loop scans every position and every word. Whenever a match occurs, the corresponding indices are marked True.
The second traversal constructs the output string.
The condition:
bold[i] and (i == 0 or not bold[i - 1])
detects the start of a bold region.
Similarly:
bold[i] and (i == n - 1 or not bold[i + 1])
detects the end of a bold region.
Because tags are inserted only at transitions, overlapping and consecutive intervals are merged automatically.
Go Solution
func addBoldTag(s string, words []string) string {
n := len(s)
bold := make([]bool, n)
// Mark bold positions
for i := 0; i < n; i++ {
for _, word := range words {
wordLen := len(word)
if i+wordLen <= n && s[i:i+wordLen] == word {
for j := i; j < i+wordLen; j++ {
bold[j] = true
}
}
}
}
// Build result
result := make([]byte, 0)
for i := 0; i < n; i++ {
// Start bold region
if bold[i] && (i == 0 || !bold[i-1]) {
result = append(result, []byte("<b>")...)
}
result = append(result, s[i])
// End bold region
if bold[i] && (i == n-1 || !bold[i+1]) {
result = append(result, []byte("</b>")...)
}
}
return string(result)
}
The Go implementation mirrors the Python logic closely.
One important difference is bounds handling. In Go, slicing beyond the string length causes a runtime panic, so we explicitly check:
i + wordLen <= n
before comparing substrings.
The result is constructed using a byte slice for efficiency, since repeated string concatenation in Go can be expensive.
Worked Examples
Example 1
Input:
s = "abcxyz123"
words = ["abc", "123"]
Step 1: Initialize bold array
| Index | Character | Bold |
|---|---|---|
| 0 | a | False |
| 1 | b | False |
| 2 | c | False |
| 3 | x | False |
| 4 | y | False |
| 5 | z | False |
| 6 | 1 | False |
| 7 | 2 | False |
| 8 | 3 | False |
Step 2: Find matches
"abc" matches at index 0
Mark indices [0,1,2]
"123" matches at index 6
Mark indices [6,7,8]
Updated array:
| Index | Character | Bold |
|---|---|---|
| 0 | a | True |
| 1 | b | True |
| 2 | c | True |
| 3 | x | False |
| 4 | y | False |
| 5 | z | False |
| 6 | 1 | True |
| 7 | 2 | True |
| 8 | 3 | True |
Step 3: Build result
| Index | Action | Result |
|---|---|---|
| 0 | Open <b> |
<b> |
| 0 | Append a |
<b>a |
| 1 | Append b |
<b>ab |
| 2 | Append c |
<b>abc |
| 2 | Close </b> |
<b>abc</b> |
| 3-5 | Append xyz | <b>abc</b>xyz |
| 6 | Open <b> |
<b>abc</b>xyz<b> |
| 6-8 | Append 123 | <b>abc</b>xyz<b>123 |
| 8 | Close </b> |
<b>abc</b>xyz<b>123</b> |
Final output:
<b>abc</b>xyz<b>123</b>
Example 2
Input:
s = "aaabbb"
words = ["aa", "b"]
Step 1: Match "aa"
Occurrences:
- indices
[0,1] - indices
[1,2]
Current bold array:
| Index | Character | Bold |
|---|---|---|
| 0 | a | True |
| 1 | a | True |
| 2 | a | True |
| 3 | b | False |
| 4 | b | False |
| 5 | b | False |
Step 2: Match "b"
Occurrences:
- index
3 - index
4 - index
5
Updated array:
| Index | Character | Bold |
|---|---|---|
| 0 | a | True |
| 1 | a | True |
| 2 | a | True |
| 3 | b | True |
| 4 | b | True |
| 5 | b | True |
Now every character is bold.
Step 3: Build result
We enter bold mode at index 0 and never leave it until the end.
Final output:
<b>aaabbb</b>
Complexity Analysis
| Measure | Complexity | Explanation |
|---|---|---|
| Time | O(n × m × k) | For each position, we compare against every word |
| Space | O(n) | The boolean array stores bold status for each character |
The dominant cost comes from substring comparisons.
For each of the n positions, we may compare against all m words, and each comparison may examine up to k characters.
The additional space usage is linear because the bold array stores one boolean per character.
Test Cases
from typing import List
class Solution:
def addBoldTag(self, s: str, words: List[str]) -> str:
n = len(s)
bold = [False] * n
for i in range(n):
for word in words:
if s[i:i + len(word)] == word:
for j in range(i, i + len(word)):
if j < n:
bold[j] = True
result = []
for i in range(n):
if bold[i] and (i == 0 or not bold[i - 1]):
result.append("<b>")
result.append(s[i])
if bold[i] and (i == n - 1 or not bold[i + 1]):
result.append("</b>")
return "".join(result)
sol = Solution()
assert sol.addBoldTag("abcxyz123", ["abc", "123"]) == "<b>abc</b>xyz<b>123</b>" # basic separate matches
assert sol.addBoldTag("aaabbb", ["aa", "b"]) == "<b>aaabbb</b>" # overlapping and consecutive merge
assert sol.addBoldTag("abcdef", ["gh"]) == "abcdef" # no matches
assert sol.addBoldTag("aaaaa", ["aa"]) == "<b>aaaaa</b>" # repeated overlapping matches
assert sol.addBoldTag("abc", ["abc"]) == "<b>abc</b>" # entire string bold
assert sol.addBoldTag("abc", []) == "abc" # empty words array
assert sol.addBoldTag("a", ["a"]) == "<b>a</b>" # single character match
assert sol.addBoldTag("a", ["b"]) == "a" # single character no match
assert sol.addBoldTag("abcxyz", ["abc", "xyz"]) == "<b>abc</b><b>xyz</b>" # consecutive regions remain merged by adjacency rule
assert sol.addBoldTag("abababcd", ["ab", "abcd"]) == "<b>abababcd</b>" # nested and overlapping intervals
| Test | Why |
|---|---|
"abcxyz123", ["abc", "123"] |
Validates separate bold regions |
"aaabbb", ["aa", "b"] |
Validates overlap and adjacency merging |
"abcdef", ["gh"] |
Validates no-match behavior |
"aaaaa", ["aa"] |
Validates repeated overlapping matches |
"abc", ["abc"] |
Validates whole-string bolding |
"abc", [] |
Validates empty dictionary handling |
"a", ["a"] |
Validates smallest matching input |
"a", ["b"] |
Validates smallest non-matching input |
"abcxyz", ["abc", "xyz"] |
Validates multiple independent regions |
"abababcd", ["ab", "abcd"] |
Validates nested overlaps |
Edge Cases
One important edge case is overlapping matches. For example, in "aaaaa" with ["aa"], matches occur at indices [0,1], [1,2], [2,3], and [3,4]. A naive implementation might generate multiple nested tags. The boolean marking approach avoids this entirely because all overlapping positions simply become True, producing one continuous bold interval.
Another critical edge case is consecutive matches. Suppose one word ends exactly where another begins. The problem requires consecutive bold sections to merge into one larger section. Because the algorithm inserts tags only when transitioning between bold and non-bold characters, adjacent matches automatically combine into a single region.
A third important case is when no words match the string at all. Some implementations accidentally insert empty tags or produce malformed output. In this solution, the bold array remains entirely False, so no tags are ever inserted, and the original string is returned unchanged.
A final subtle edge case occurs when the entire string becomes bold. Incorrect implementations sometimes forget to close the final </b> tag. This solution explicitly checks whether the current index is the last character in the string, ensuring the closing tag is inserted correctly at the end.