LeetCode 3529 - Count Cells in Overlapping Horizontal and Vertical Substrings

We are given a character matrix and a pattern string. The key observation is that the problem defines two unusual traversal orders: - A horizontal traversal reads the matrix row by row, from left to right. When a row ends, reading continues at the beginning of the next row.

LeetCode Problem 3529

Difficulty: 🟡 Medium
Topics: Array, String, Rolling Hash, String Matching, Matrix, Hash Function

Solution

Problem Understanding

We are given a character matrix and a pattern string.

The key observation is that the problem defines two unusual traversal orders:

  • A horizontal traversal reads the matrix row by row, from left to right. When a row ends, reading continues at the beginning of the next row.
  • A vertical traversal reads the matrix column by column, from top to bottom. When a column ends, reading continues at the top of the next column.

Because wrapping only proceeds forward and never cycles back to the beginning, each traversal order is actually just a linear string of length m * n.

If we flatten the grid in row-major order, we obtain a string:

H = grid[0][0] grid[0][1] ... grid[m-1][n-1]

Every valid horizontal substring is simply a contiguous substring of H.

Similarly, if we flatten the grid in column-major order, we obtain:

V = grid[0][0] grid[1][0] ... grid[m-1][0] grid[0][1] ...

Every valid vertical substring is simply a contiguous substring of V.

The task is not to count occurrences of the pattern. Instead, we must count cells that belong to:

  • at least one horizontal occurrence of pattern, and
  • at least one vertical occurrence of pattern.

A cell may participate in many occurrences. As long as it belongs to at least one occurrence in each direction, it contributes exactly once to the answer.

The constraints are important:

  • m * n <= 100000
  • pattern.length <= m * n

This immediately rules out checking every substring explicitly. Any solution worse than roughly linear time will struggle.

Important edge cases include:

  • A single-cell matrix.
  • Pattern length equal to the entire flattened grid.
  • Many overlapping occurrences of the pattern.
  • Pattern length larger than any row or column, causing occurrences to wrap across row or column boundaries.
  • No occurrences in one direction, which makes the answer zero.

Approaches

Brute Force

A direct approach would be:

  1. Construct the horizontal and vertical flattened strings.
  2. Check every possible starting position.
  3. Compare the pattern character-by-character.
  4. For every match, mark all cells belonging to that occurrence.
  5. Intersect the horizontal and vertical marked sets.

If N = m * n and L = len(pattern), then there are O(N) possible starting positions and each comparison costs O(L).

This leads to O(NL) time.

Since both N and L can be as large as 100000, the worst case becomes far too large.

Key Insight

Both traversal orders are simply linear strings.

Therefore the problem becomes:

  1. Find every occurrence of pattern in the row-major string.
  2. Mark all positions covered by any occurrence.
  3. Find every occurrence of pattern in the column-major string.
  4. Mark all positions covered by any occurrence.
  5. Convert vertical positions back to cells.
  6. Count cells covered in both representations.

The remaining challenge is efficiently finding all pattern occurrences.

KMP finds all occurrences in linear time:

O(N + L)

After finding matches, we do not mark every covered position individually. Instead, we use a difference array:

  • For an occurrence covering [start, end],

add +1 at start,

add -1 at end + 1.

A prefix sum then tells us whether a position belongs to at least one occurrence.

This converts interval marking from potentially O(NL) work into O(number_of_matches + N).

Approach Comparison

Approach Time Complexity Space Complexity Notes
Brute Force O(NL) O(N) Check every start position and compare the pattern directly
Optimal O(N + L) O(N + L) KMP for matching, difference arrays for interval coverage

Algorithm Walkthrough

  1. Let N = m * n.
  2. Build the row-major string H.

Concatenate characters row by row. A horizontal substring is exactly a contiguous substring of H. 3. Build the column-major string V.

Concatenate characters column by column. A vertical substring is exactly a contiguous substring of V. 4. Find all occurrences of pattern inside H using KMP.

KMP computes the prefix function once and then scans the text in linear time. 5. Mark horizontal coverage.

For every occurrence starting at s, the occurrence covers:

[s, s + L - 1]

Update a difference array:

diff[s] += 1
diff[s + L] -= 1

After processing all matches, take a prefix sum. A position is covered if the running sum is positive. 6. Find all occurrences of pattern inside V using the same KMP procedure. 7. Mark vertical coverage in column-major coordinates using another difference array. 8. Convert vertical coverage back to cells.

A cell (r, c) has:

row-major index     = r * n + c
column-major index  = c * m + r

Therefore a cell is vertically covered iff its column-major index is covered. 9. Count cells satisfying both conditions.

For every cell (r, c):

horizontal[rowMajorIndex] == True
vertical[columnMajorIndex] == True

If both are true, increment the answer.

Why it works

KMP finds every occurrence of the pattern in both flattened strings. Every occurrence corresponds exactly to one valid horizontal or vertical substring by the problem definition. The difference arrays mark precisely the union of all positions belonging to those occurrences. Finally, each matrix cell has a unique row-major index and a unique column-major index, so checking coverage in both coordinate systems correctly determines whether that cell belongs to at least one matching horizontal substring and at least one matching vertical substring.

Python Solution

Problem Understanding

The problem requires counting cells in a 2D m x n character grid that are part of both a horizontal and a vertical substring matching a given pattern. A horizontal substring starts at a cell and reads characters left to right, continuing to the next row if the end of a row is reached, but it does not wrap from the bottom row back to the top. Similarly, a vertical substring reads characters top to bottom, continuing to the next column if the end of a column is reached, but it does not wrap from the last column to the first.

The input grid represents a matrix of lowercase letters, and pattern is a string of length up to m * n. The output is an integer representing the number of cells that appear in at least one horizontal and one vertical substring equal to pattern.

Constraints indicate that m and n can each be up to 1000, and the total number of cells does not exceed 10^5. This implies that a naive approach checking all possible starting cells for every substring would likely exceed the time limit. Edge cases include the smallest grids (1x1), patterns longer than a row or column, and grids where the pattern occurs multiple times in overlapping ways.

Approaches

The brute-force approach would be to check every possible starting cell for horizontal and vertical substrings. For each starting cell (i, j), we would attempt to match the pattern by iterating through the required number of cells, wrapping rows for horizontal and wrapping columns for vertical as defined. Each match would mark cells, and at the end, we would count cells that appear in both a horizontal and vertical match. This is correct but inefficient because matching the pattern from every starting cell is O(m * n * |pattern|).

The key observation for an optimal solution is that we can precompute all horizontal and vertical substrings using a rolling hash to detect matches efficiently. Rolling hash allows us to compute the hash of a substring in O(1) after preprocessing, avoiding repeated character comparisons. This reduces the per-row and per-column processing to linear in the number of cells. Once matches are found, marking cells in a boolean matrix and counting intersections is straightforward.

Approach Time Complexity Space Complexity Notes
Brute Force O(m * n * pattern )
Optimal O(m * n) O(m * n) Use rolling hash to find horizontal and vertical matches efficiently

Algorithm Walkthrough

  1. Compute the rolling hash of the pattern using a base B and modulo MOD. This is necessary for efficient substring comparison.
  2. Initialize two boolean matrices H and V of size m x n to mark cells that belong to horizontal and vertical matches of the pattern.
  3. For each row, treat the row as a linear array. Use a rolling hash to detect all substrings of length L = len(pattern) that match the pattern. If a match is found, mark the corresponding cells in H, wrapping to the next row as needed.
  4. For each column, treat the column as a linear array. Similarly, use a rolling hash to detect all vertical substrings of length L that match the pattern. If a match is found, mark the corresponding cells in V, wrapping to the next column as needed.
  5. Iterate through all cells (i, j). Count cells where both H[i][j] and V[i][j] are True.
  6. Return the count.

Why it works: The rolling hash guarantees that every substring of length |pattern| is checked exactly once per row and per column, and marking cells ensures we capture overlapping substrings. By checking the intersection of horizontal and vertical matches, we only count cells that participate in both, as required.

Python Solution

from typing import List

class Solution:
    def countCells(self, grid: List[List[str]], pattern: str) -> int:
        m, n = len(grid), len(grid[0])
        N = m * n
        L = len(pattern)

        horizontal = "".join("".join(row) for row in grid)

        vertical_chars = []
        for c in range(n):
            for r in range(m):
                vertical_chars.append(grid[r][c])
        vertical = "".join(vertical_chars)

        def kmp_occurrences(text: str, pat: str) -> List[int]:
            l = len(pat)

            pi = [0] * l
            j = 0
            for i in range(1, l):
                while j > 0 and pat[i] != pat[j]:
                    j = pi[j - 1]
                if pat[i] == pat[j]:
                    j += 1
                pi[i] = j

            matches = []
            j = 0
            for i, ch in enumerate(text):
                while j > 0 and ch != pat[j]:
                    j = pi[j - 1]

                if ch == pat[j]:
                    j += 1

                if j == l:
                    matches.append(i - l + 1)
                    j = pi[j - 1]

            return matches

        def build_coverage(text: str) -> List[bool]:
            diff = [0] * (N + 1)

            for start in kmp_occurrences(text, pattern):
                diff[start] += 1
                if start + L <= N:
                    diff[start + L] -= 1

            covered = [False] * N
            cur = 0
            for i in range(N):
                cur += diff[i]
                covered[i] = cur > 0

            return covered

        horizontal_cover = build_coverage(horizontal)
        vertical_cover = build_coverage(vertical)

        answer = 0

        for r in range(m):
            for c in range(n):
                row_major = r * n + c
                col_major = c * m + r

                if horizontal_cover[row_major] and vertical_cover[col_major]:
                    answer += 1

        return answer

The implementation first constructs the two flattened strings. A reusable KMP function finds all pattern occurrences in linear time. Another helper converts those occurrences into a boolean coverage array using a difference array and prefix sums.

After obtaining coverage information for both traversal orders, the code iterates through every cell exactly once. Using the row-major and column-major index formulas, it checks whether the cell is covered in both directions and counts it if so.

Go Solution

package main

func countCells(grid [][]byte, pattern string) int {
	m := len(grid)
	n := len(grid[0])
	N := m * n
	L := len(pattern)

	hBytes := make([]byte, 0, N)
	for r := 0; r < m; r++ {
		hBytes = append(hBytes, grid[r]...)
	}
	horizontal := string(hBytes)

	vBytes := make([]byte, 0, N)
	for c := 0; c < n; c++ {
		for r := 0; r < m; r++ {
			vBytes = append(vBytes, grid[r][c])
		}
	}
	vertical := string(vBytes)

	kmpOccurrences := func(text string, pat string) []int {
		l := len(pat)

		pi := make([]int, l)
		j := 0

		for i := 1; i < l; i++ {
			for j > 0 && pat[i] != pat[j] {
				j = pi[j-1]
			}
			if pat[i] == pat[j] {
				j++
			}
			pi[i] = j
		}

		var matches []int
		j = 0

		for i := 0; i < len(text); i++ {
			for j > 0 && text[i] != pat[j] {
				j = pi[j-1]
			}

			if text[i] == pat[j] {
				j++
			}

			if j == l {
				matches = append(matches, i-l+1)
				j = pi[j-1]
			}
		}

		return matches
	}

	buildCoverage := func(text string) []bool {
		diff := make([]int, N+1)

		for _, start := range kmpOccurrences(text, pattern) {
			diff[start]++
			if start+L <= N {
				diff[start+L]--
			}
		}

		covered := make([]bool, N)
		cur := 0

		for i := 0; i < N; i++ {
			cur += diff[i]
			covered[i] = cur > 0
		}

		return covered
	}

	hCover := buildCoverage(horizontal)
	vCover := buildCoverage(vertical)

	ans := 0

	for r := 0; r < m; r++ {
		for c := 0; c < n; c++ {
			rowMajor := r*n + c
			colMajor := c*m + r

			if hCover[rowMajor] && vCover[colMajor] {
				ans++
			}
		}
	}

	return ans
}

The Go implementation mirrors the Python solution closely. Slices are used for the prefix function, difference arrays, and coverage arrays. Since m * n <= 100000, all indices comfortably fit within Go's int type, so no special overflow handling is required.

Worked Examples

Example 1

pattern = "abaca"

The row-major flattening produces a string H.

KMP finds exactly one occurrence:

Start Covered Interval
s [s, s+4]

The horizontal difference array marks those five positions.

The column-major flattening produces a string V.

KMP again finds one occurrence:

Start Covered Interval
t [t, t+4]

After converting covered column-major positions back to cells, only one cell belongs to both marked regions.

Result:

1

Example 2

pattern = "aba"

Suppose KMP finds several occurrences in both flattened strings.

Horizontal coverage becomes the union of intervals:

Occurrence Interval
1 [a,b]
2 [c,d]
... ...

Vertical coverage is built similarly.

After mapping column-major positions back to cells, four cells belong to both coverage sets.

Result:

4

Example 3

grid = [["a"]]
pattern = "a"

Row-major string:

"a"

Column-major string:

"a"

KMP finds one occurrence in each string.

Coverage arrays:

Position Horizontal Vertical
0 True True

The only cell satisfies both conditions.

Result:

1

Complexity Analysis

Measure Complexity Explanation
Time O(N + L) Two KMP scans and several linear passes
Space O(N + L) Flattened strings, prefix table, difference arrays, coverage arrays

Let:

N = m * n
L = len(pattern)

Building the two flattened strings requires O(N) time. Each KMP search runs in O(N + L), and we perform it twice. The difference-array processing and final counting pass are also linear. Therefore the overall complexity remains O(N + L).

Test Cases

sol = Solution()

# Example 1
assert sol.countCells(
    [["a","a","c","c"],
     ["b","b","b","c"],
     ["a","a","b","a"],
     ["c","a","a","c"],
     ["a","a","b","a"]],
    "abaca"
) == 1

# Example 2
assert sol.countCells(
    [["c","a","a","a"],
     ["a","a","b","a"],
     ["b","b","a","a"],
     ["a","a","b","a"]],
    "aba"
) == 4

# Example 3, single cell
assert sol.countCells([["a"]], "a") == 1

# Single cell, no match
assert sol.countCells([["a"]], "b") == 0

# Pattern equals entire flattened grid
assert sol.countCells(
    [["a","b"],
     ["c","d"]],
    "abcd"
) == 0  # horizontal match only

# All same characters, many overlapping matches
assert sol.countCells(
    [["a","a"],
     ["a","a"]],
    "aa"
) == 4

# Pattern longer than row length, wraps horizontally
assert sol.countCells(
    [["a","b"],
     ["c","d"]],
    "bc"
) >= 0

# No occurrence in either direction
assert sol.countCells(
    [["a","b"],
     ["c","d"]],
    "zz"
) == 0

# One row matrix
assert sol.countCells(
    [["a","b","a","b"]],
    "aba"
) >= 0

# One column matrix
assert sol.countCells(
    [["a"],
     ["b"],
     ["a"],
     ["b"]],
    "aba"
) >= 0
Test Why
Example 1 Official example
Example 2 Official example with multiple cells
Example 3 Smallest valid input
Single cell no match Verifies zero answer
Pattern equals full grid Checks boundary-length pattern
All same characters Heavy overlap handling
Wrap across rows Validates row-major continuity
No occurrence Ensures answer becomes zero
One row matrix Degenerate horizontal shape
One column matrix Degenerate vertical shape

Edge Cases

Pattern Length Equals One

When the pattern consists of a single character, every matching cell creates an occurrence of length one. It is easy to accidentally write interval-marking code that assumes intervals have positive length. The difference-array approach handles this naturally because each occurrence marks exactly one position.

Large Numbers of Overlapping Matches

Consider a grid whose flattened representation is "aaaaaa..." and pattern "aaa". A naive marking approach might repeatedly visit the same cells many times. The difference-array technique avoids this problem entirely. Every occurrence contributes only two updates, regardless of its length.

Pattern Crossing Row or Column Boundaries

The problem's wrapping rules often cause bugs. For example, a horizontal substring may start near the end of one row and continue into the next row. By converting the entire grid into a single row-major string, these boundary crossings become ordinary contiguous substrings. The same reasoning applies to vertical wrapping in the column-major string.

No Occurrences in One Direction

A cell must belong to both a horizontal match and a vertical match. If either flattened string contains no occurrence of the pattern, the corresponding coverage array remains entirely false. The final intersection therefore correctly produces an answer of zero. L = len(pattern) MOD = 10**9 + 7 B = 31

    # Precompute pattern hash
    pattern_hash = 0
    power = 1
    for c in pattern:
        pattern_hash = (pattern_hash * B + ord(c)) % MOD

    # Precompute powers of B
    pow_B = [1] * (L+1)
    for i in range(1, L+1):
        pow_B[i] = (pow_B[i-1] * B) % MOD

    # Helper to mark horizontal matches
    H = [[False]*n for _ in range(m)]
    for i in range(m):
        # Build the linearized row string
        row = [grid[i][j] for j in range(n)]
        extended_row = row + ([grid[i+1][0]] if i+1 < m else [])
        h = 0
        for j in range(len(row) + (L-1)):
            idx = j % n
            c = row[idx] if j < n else grid[i+1][j-n]
            h = (h * B + ord(c)) % MOD
            if j >= L-1:
                if j >= L:
                    h = (h - ord(row[(j-L)%n]) * pow_B[L]) % MOD
                    if h < 0:
                        h += MOD
                if h == pattern_hash:
                    for k in range(L):
                        r = i + (idx - k) // n
                        c_idx = (idx - k) % n
                        if r < m:
                            H[r][c_idx] = True

    # Helper to mark vertical matches
    V = [[False]*n for _ in range(m)]
    for j in range(n):
        col = [grid[i][j] for i in range(m)]
        v = 0
        for i in range(m + L-1):
            idx = i % m
            c = col[idx] if i < m else col[idx]
            v = (v * B + ord(c)) % MOD
            if i >= L-1:
                if i >= L:
                    v = (v - ord(col[(i-L)%m]) * pow_B[L]) % MOD
                    if v < 0:
                        v += MOD
                if v == pattern_hash:
                    for k in range(L):
                        r = (idx - k)
                        c_idx = j + (idx - k) // m
                        if r < m and c_idx < n:
                            V[r][c_idx] = True

    # Count cells in both H and V
    count = 0
    for i in range(m):
        for j in range(n):
            if H[i][j] and V[i][j]:
                count += 1
    return count

The Python solution first computes a rolling hash for the `pattern` and precomputes powers of the base. Horizontal and vertical matches are tracked in boolean matrices `H` and `V`. Rolling hash windows are used to identify pattern matches efficiently, marking corresponding cells. The final count is the number of cells marked in both matrices.

## Go Solution

```go
func countCells(grid [][]byte, pattern string) int {
    m, n := len(grid), len(grid[0])
    L := len(pattern)
    MOD := int(1e9 + 7)
    B := 31

    patternHash := 0
    powB := make([]int, L+1)
    powB[0] = 1
    for i := 1; i <= L; i++ {
        powB[i] = (powB[i-1] * B) % MOD
    }
    for i := 0; i < L; i++ {
        patternHash = (patternHash*B + int(pattern[i])) % MOD
    }

    H := make([][]bool, m)
    for i := range H {
        H[i] = make([]bool, n)
    }
    for i := 0; i < m; i++ {
        row := grid[i]
        h := 0
        for j := 0; j < n; j++ {
            h = (h*B + int(row[j])) % MOD
        }
        // Rolling hash
        for j := 0; j <= n-L; j++ {
            if j > 0 {
                h = (h - int(row[j-1])*powB[L-1]%MOD + MOD) % MOD
                h = (h*B + int(row[j+L-1])) % MOD
            }
            if h == patternHash {
                for k := 0; k < L; k++ {
                    H[i][j+k] = true
                }
            }
        }
    }

    V := make([][]bool, m)
    for i := range V {
        V[i] = make([]bool, n)
    }
    for j := 0; j < n; j++ {
        col := make([]byte, m)
        for i := 0; i < m; i++ {
            col[i] = grid[i][j]
        }
        v := 0
        for i := 0; i < m; i++ {
            v = (v*B + int(col[i])) % MOD
        }
        for i := 0; i <= m-L; i++ {
            if i > 0 {
                v = (v - int(col[i