LeetCode 2732 - Find a Good Subset of the Matrix

The problem asks us to find a subset of rows in a binary matrix where each column’s sum in that subset is at most half of the number of rows chosen, rounded down.

LeetCode Problem 2732

Difficulty: 🔴 Hard
Topics: Array, Hash Table, Bit Manipulation, Matrix

Solution

Problem Understanding

The problem asks us to find a subset of rows in a binary matrix where each column’s sum in that subset is at most half of the number of rows chosen, rounded down. In simpler terms, if you pick some rows, no column should have too many 1s compared to the number of rows you selected. The input is a m x n binary matrix grid, where each element is either 0 or 1. The output is a list of row indices that form a "good" subset, sorted in ascending order. If multiple subsets satisfy the condition, any valid subset can be returned, and if no subset works, the output should be an empty array.

Constraints highlight that the matrix can be large in terms of rows (m up to 10^4), but the number of columns is small (n <= 5). This suggests that any solution should leverage the small number of columns for efficiency, possibly via bit manipulation. Important edge cases include matrices with a single row, matrices full of 1s, and matrices where no subset satisfies the condition.

Approaches

The brute-force approach would involve generating all possible subsets of rows and checking each subset for the column sum condition. This is guaranteed to produce the correct answer but is computationally infeasible because there are $2^m$ possible subsets, which is far too large for $m$ up to 10^4.

The key insight is that with n <= 5, we can represent each row as a bitmask of length n. The subset condition translates into a check that the bitwise OR (or combination) of chosen row masks does not exceed half of the subset length in each column. Using a map to store masks seen so far, we can efficiently check for combinations of 1 or 2 rows that satisfy the condition. Specifically, a row with all zeros is trivially a good subset by itself, and any two rows that do not overlap in 1s in the same column can form a valid subset of length 2.

Approach Time Complexity Space Complexity Notes
Brute Force O(2^m * n) O(2^m) Generate all subsets of rows and check column sums
Optimal O(m * 2^n) O(2^n) Use bitmask representation of rows and hash map to find valid 1- or 2-row subsets efficiently

Algorithm Walkthrough

  1. Convert each row in the matrix to a bitmask integer representation, where the j-th bit represents the j-th column in the row. This allows fast bitwise operations.
  2. Create a dictionary mask_to_index mapping each bitmask to its row index. This helps to quickly retrieve rows that satisfy subset conditions.
  3. Check if any row’s mask is all zeros. If found, return that row index as a single-row good subset.
  4. Iterate over pairs of unique row masks. For each pair (mask1, mask2), check if their bitwise OR results in a mask that satisfies the condition mask | mask <= floor(k / 2) for a subset of length 2. With n <= 5, this is equivalent to checking if mask1 & mask2 == 0.
  5. If such a pair exists, return their row indices.
  6. If no valid subset is found, return an empty array.

Why it works: By representing each row as a bitmask, the subset condition can be checked efficiently using bitwise operations. With at most 5 columns, there are only 32 possible masks. This allows us to quickly determine if a single row or a pair of rows satisfies the condition without enumerating all subsets of rows.

Python Solution

from typing import List

class Solution:
    def goodSubsetofBinaryMatrix(self, grid: List[List[int]]) -> List[int]:
        mask_to_index = {}
        n = len(grid[0])
        
        for i, row in enumerate(grid):
            mask = 0
            for j, val in enumerate(row):
                mask |= val << j
            if mask == 0:
                return [i]
            mask_to_index[mask] = i
        
        masks = list(mask_to_index.keys())
        for i in range(len(masks)):
            for j in range(i+1, len(masks)):
                if masks[i] & masks[j] == 0:
                    return [mask_to_index[masks[i]], mask_to_index[masks[j]]]
        
        return []

The Python code first converts each row into a bitmask and checks for a row with all zeros. If none exist, it then checks all pairs of masks for disjoint 1s using bitwise AND. If a valid pair exists, it returns the corresponding indices. Otherwise, it returns an empty list.

Go Solution

func goodSubsetofBinaryMatrix(grid [][]int) []int {
    maskToIndex := make(map[int]int)
    n := len(grid[0])
    
    for i, row := range grid {
        mask := 0
        for j, val := range row {
            if val == 1 {
                mask |= 1 << j
            }
        }
        if mask == 0 {
            return []int{i}
        }
        maskToIndex[mask] = i
    }
    
    masks := make([]int, 0, len(maskToIndex))
    for mask := range maskToIndex {
        masks = append(masks, mask)
    }
    
    for i := 0; i < len(masks); i++ {
        for j := i+1; j < len(masks); j++ {
            if masks[i]&masks[j] == 0 {
                return []int{maskToIndex[masks[i]], maskToIndex[masks[j]]}
            }
        }
    }
    
    return []int{}
}

The Go implementation mirrors the Python version. It uses a map to store mask-to-index mappings and iterates over all pairs of masks. Bitwise operations ensure the column sum conditions are satisfied. Go’s int type is sufficient since n <= 5 and masks fit within a standard integer.

Worked Examples

Example: grid = [[0,1,1,0],[0,0,0,1],[1,1,1,1]]

Row Index Row Mask
0 [0,1,1,0] 6
1 [0,0,0,1] 8
2 [1,1,1,1] 15

Check single-row good subset: no mask is 0.

Check pairs:

Mask 6 & 8: 6 & 8 = 0 → valid subset → return [0,1]

Complexity Analysis

Measure Complexity Explanation
Time O(m * 2^n) Convert rows to masks takes O(m*n), checking pairs of masks takes O(2^n * 2^n) but n <= 5, so constant
Space O(2^n) Stores mapping of mask to index, max 32 entries for n=5

The small number of columns allows the algorithm to scale linearly with the number of rows, making it efficient even for large matrices.

Test Cases

# Provided examples
assert Solution().goodSubsetofBinaryMatrix([[0,1,1,0],[0,0,0,1],[1,1,1,1]]) == [0,1]  # multiple valid subsets
assert Solution().goodSubsetofBinaryMatrix([[0]]) == [0]  # single-row matrix
assert Solution().goodSubsetofBinaryMatrix([[1,1,1],[1,1,1]]) == []  # no valid subset

# Additional tests
assert Solution().goodSubsetofBinaryMatrix([[0,0,0],[0,1,0]]) == [0]  # row with all zeros exists
assert Solution().goodSubsetofBinaryMatrix([[1,0],[0,1]]) == [0,1]  # disjoint masks
assert Solution().goodSubsetofBinaryMatrix([[1,1],[1,1]]) == []  # all ones, no subset
Test Why
Single-row all zeros validates immediate good subset detection
Disjoint masks validates correct pair detection
All ones validates empty result for impossible subsets

Edge Cases

One important edge case is a matrix with a single row, which is trivially a good subset if it contains all zeros or no 1 exceeds the threshold. Another edge case is a matrix where all rows are ones; this cannot form a good subset, so the implementation must return an empty list. A third edge case is the presence of multiple rows with the same bitmask; the algorithm must correctly handle duplicates by storing only the last index but still return a valid subset if any two distinct indices are valid. The bitmask approach ensures that these edge cases are handled efficiently without iterating over all subsets.