LeetCode 770 - Basic Calculator IV

The problem requires parsing and evaluating a mathematical expression that includes integers, variables, addition, subtraction, multiplication, and parentheses, while also applying a substitution map for certain variables.

LeetCode Problem 770

Difficulty: 🔴 Hard
Topics: Hash Table, Math, String, Stack, Recursion

Solution

Problem Understanding

The problem requires parsing and evaluating a mathematical expression that includes integers, variables, addition, subtraction, multiplication, and parentheses, while also applying a substitution map for certain variables. Essentially, we are building a symbolic algebra engine that simplifies expressions according to standard arithmetic rules.

The input consists of a string expression that may contain integers, variables (alphabetic strings), operators (+, -, *), and parentheses. Additionally, two arrays evalvars and evalints provide values for specific variables; any variable not in this map remains symbolic.

The output is a list of terms in a standardized form where each term consists of a coefficient followed by zero or more variables multiplied together in lexicographical order. Terms are ordered first by decreasing degree (number of variables) and then lexicographically by variable names. Coefficients of zero are omitted.

Key constraints include:

  • Expression length up to 250 characters, which is modest.
  • Up to 100 variables in the evaluation map.
  • Valid arithmetic expression guaranteed.
  • Intermediate results fit in a 32-bit integer.

Edge cases that could trip up a naive implementation include:

  • Nested parentheses, which require recursive or stack-based evaluation.
  • Multiplying symbolic expressions, which produces cross terms.
  • Zero coefficients, which must be omitted.
  • Ordering of variables and terms, which affects output formatting.

Approaches

A brute-force approach would be to evaluate the expression by converting it into a standard infix tree and then recursively computing all possible combinations while substituting known variables. This method would be correct but slow due to repeated symbolic multiplications and the need to manage multiple terms manually.

The optimal approach leverages a map (dictionary) to represent terms. Each term is stored as a key-value pair, where the key is a sorted tuple of variables and the value is the coefficient. Addition and subtraction merge dictionaries by summing coefficients, while multiplication performs a cross-product of terms. Using recursion with a tokenizer handles parentheses correctly. After evaluating, terms are sorted by degree and lexicographical order to produce the final output.

Approach Time Complexity Space Complexity Notes
Brute Force Exponential Exponential Recursively expands all symbolic terms without aggregation, slow for nested expressions
Optimal O(N * M^2) O(M) Uses dictionary for term aggregation; N is the number of tokens, M is the number of unique terms generated during evaluation

Algorithm Walkthrough

  1. Tokenization: Split the expression into tokens separated by spaces, distinguishing integers, variables, operators, and parentheses.
  2. Evaluation Map Construction: Build a dictionary mapping evalvars[i] to evalints[i] for constant substitution.
  3. Recursive Parsing Function: Implement a recursive parse() function that evaluates tokens from left to right, respecting parentheses.
  4. Term Representation: Represent each term as a pair (coefficient, tuple_of_sorted_variables). For constants, the tuple is empty.
  5. Addition and Subtraction: Merge two term maps by summing coefficients for identical variable tuples.
  6. Multiplication: Compute a cross-product of terms from the two operands, concatenate and sort the variable tuples, and multiply coefficients.
  7. Parentheses Handling: When encountering '(', recursively evaluate the subexpression until the matching ')'.
  8. Formatting Output: After evaluation, filter out zero-coefficient terms. Sort terms first by degree descending, then lexicographically by variables. Convert each term into the required string format.
  9. Return Result: Return the final list of strings representing simplified terms.

Why it works: This approach correctly models algebraic operations at the term level, ensuring that like terms are combined and multiplication generates the correct cross terms. Recursion and tokenization handle operator precedence and parentheses naturally, guaranteeing correct simplification.

Python Solution

from typing import List
from collections import defaultdict
import re

class Solution:
    def basicCalculatorIV(self, expression: str, evalvars: List[str], evalints: List[int]) -> List[str]:
        eval_map = dict(zip(evalvars, evalints))

        def tokenize(expr: str):
            return expr.replace('(', ' ( ').replace(')', ' ) ').split()

        def add_terms(a, b):
            res = defaultdict(int, a)
            for k, v in b.items():
                res[k] += v
            return res

        def sub_terms(a, b):
            res = defaultdict(int, a)
            for k, v in b.items():
                res[k] -= v
            return res

        def mul_terms(a, b):
            res = defaultdict(int)
            for k1, v1 in a.items():
                for k2, v2 in b.items():
                    new_key = tuple(sorted(k1 + k2))
                    res[new_key] += v1 * v2
            return res

        def parse(tokens):
            stack = []
            op_stack = []
            i = 0

            def apply_op():
                if not op_stack:
                    return
                op = op_stack.pop()
                b = stack.pop()
                a = stack.pop()
                if op == '+':
                    stack.append(add_terms(a, b))
                elif op == '-':
                    stack.append(sub_terms(a, b))
                elif op == '*':
                    stack.append(mul_terms(a, b))

            while i < len(tokens):
                token = tokens[i]
                if token.isdigit():
                    stack.append({(): int(token)})
                elif token.isalpha():
                    if token in eval_map:
                        stack.append({(): eval_map[token]})
                    else:
                        stack.append({(token,): 1})
                elif token == '(':
                    j = i
                    count = 0
                    for j in range(i, len(tokens)):
                        if tokens[j] == '(':
                            count += 1
                        elif tokens[j] == ')':
                            count -= 1
                            if count == 0:
                                break
                    stack.append(parse(tokens[i+1:j]))
                    i = j
                elif token in '+-*':
                    while op_stack and precedence(op_stack[-1]) >= precedence(token):
                        apply_op()
                    op_stack.append(token)
                i += 1

            while op_stack:
                apply_op()

            return stack[0]

        def precedence(op):
            if op == '*':
                return 2
            return 1

        tokens = tokenize(expression)
        terms = parse(tokens)

        def format_term(term):
            coeff, vars_tuple = term
            if vars_tuple:
                return f"{coeff}*" + "*".join(vars_tuple)
            return str(coeff)

        result = [(v, k) for k, v in terms.items() if v != 0]
        result.sort(key=lambda x: (-len(x[1]), x[1]))
        return [format_term(t) for t in result]

This Python solution follows the algorithm precisely. Tokenization ensures parentheses are handled correctly. Terms are represented as tuples for variable names and integers for coefficients, allowing easy merging. Addition, subtraction, and multiplication are implemented as dictionary operations, ensuring efficiency and correctness. Sorting and formatting produce the final string output.

Go Solution

package main

import (
    "sort"
    "strconv"
    "strings"
)

func basicCalculatorIV(expression string, evalvars []string, evalints []int) []string {
    evalMap := make(map[string]int)
    for i, v := range evalvars {
        evalMap[v] = evalints[i]
    }

    type term struct {
        coeff int
        vars  []string
    }

    tokenize := func(expr string) []string {
        expr = strings.ReplaceAll(expr, "(", " ( ")
        expr = strings.ReplaceAll(expr, ")", " ) ")
        return strings.Fields(expr)
    }

    addTerms := func(a, b map[string]int) map[string]int {
        res := make(map[string]int)
        for k, v := range a {
            res[k] = v
        }
        for k, v := range b {
            res[k] += v
        }
        return res
    }

    subTerms := func(a, b map[string]int) map[string]int {
        res := make(map[string]int)
        for k, v := range a {
            res[k] = v
        }
        for k, v := range b {
            res[k] -= v
        }
        return res
    }

    mulTerms := func(a, b map[string]int) map[string]int {
        res := make(map[string]int)
        for k1, v1 := range a {
            for k2, v2 := range b {
                combined := strings.Split(k1, "*")
                if k1 == "" {
                    combined = []string{}
                }
                k2split := strings.Split(k2, "*")
                if k2 == "" {
                    k2split = []string{}
                }
                combined = append(combined, k2split...)
                sort.Strings(combined)
                key := strings.Join(combined, "*")
                res[key] += v1 * v2
            }
        }
        return res
    }

    var parse func(tokens []string) map[string]int
    parse = func(tokens []string) map[string]int {
        stack := []map[string]int{}
        ops := []string{}
        i := 0

        precedence := func(op string) int {
            if op == "*" {
                return 2
            }
            return 1
        }

        applyOp := func() {
            if len(ops) == 0 {
                return
            }
            op := ops[len(ops)-1]
            ops = ops[:len(ops)-1]
            b := stack[len(stack)-1]
            stack = stack[:len