LeetCode 2759 - Convert JSON String to Object

The problem asks us to take a JSON-formatted string str and convert it into the equivalent native data structure.

LeetCode Problem 2759

Difficulty: 🔴 Hard
Topics:

Solution

Problem Understanding

The problem asks us to take a JSON-formatted string str and convert it into the equivalent native data structure. This means if the string represents an object, array, number, boolean, or null, we need to return the corresponding Python object (dict, list, int/float, bool, or None) or Go equivalent (map[string]interface{}, []interface{}, numeric types, bool, or nil).

We are explicitly forbidden from using built-in JSON parsing methods such as JSON.parse in JavaScript or json.loads in Python, so the solution must manually tokenize and interpret the string. The input is guaranteed to be valid JSON, which simplifies handling of malformed input but still requires careful parsing of nested arrays and objects, string delimiters, numbers, booleans, and null values. The string may be large (up to 105 characters), so efficiency in time and space is important.

Important edge cases include a single primitive (true, false, null, a number, or a string), deeply nested objects or arrays, empty arrays [] and empty objects {}, and string values that may include digits or letters resembling other JSON types.

Approaches

A brute-force approach would be to recursively scan the string, manually trying to match patterns using string slicing or regex. While this could work, regex is insufficient for nested structures, and repeatedly slicing strings would be inefficient, leading to O(n^2) time in the worst case due to copying substrings.

The optimal approach is to implement a recursive descent parser, processing the string character by character. We maintain an index pointer and parse elements based on their starting character: { indicates an object, [ an array, " a string, digits or - a number, t/f a boolean, and n null. This method efficiently handles nested structures with a single pass and constant-time substring operations (by using pointers rather than slicing for nested parsing).

Approach Time Complexity Space Complexity Notes
Brute Force O(n^2) O(n) Uses repeated slicing or regex; inefficient for deep nesting
Optimal O(n) O(n) Recursive descent parser with index pointer, handles nesting efficiently

Algorithm Walkthrough

  1. Initialize an index pointer at the start of the string. The pointer will be used to traverse the string without creating new substrings unnecessarily.
  2. Define a recursive function parse_value that decides what type of value to parse based on the current character at the pointer.
  3. If the character is {, start parsing an object: increment the pointer, parse key-value pairs recursively until } is reached, and store results in a dictionary/map. Skip commas as delimiters.
  4. If the character is [, start parsing an array: increment the pointer, recursively parse elements until ] is reached, and store results in a list/slice. Skip commas as delimiters.
  5. If the character is ", parse a string literal until the closing ". All characters inside are treated as the string content since escapes are not present.
  6. If the character is a digit or -, parse a number: iterate through consecutive digits, optionally handling a decimal point, and convert to integer or float.
  7. If the character begins true, false, or null, parse the corresponding boolean or null value and advance the pointer accordingly.
  8. At the top level, call parse_value starting at index 0. After parsing, the pointer should reach the end of the string.
  9. Return the resulting object from parse_value.

Why it works: The parser examines the input string exactly once, recursively handling nested structures. Each character is processed according to JSON syntax, guaranteeing correct interpretation of objects, arrays, primitives, and null values. The pointer ensures no character is skipped or double-counted, preserving correctness.

Python Solution

class Solution:
    def parseJson(self, str: str) -> object:
        self.s = str
        self.i = 0
        n = len(str)

        def parse_value():
            if self.s[self.i] == '{':
                return parse_object()
            elif self.s[self.i] == '[':
                return parse_array()
            elif self.s[self.i] == '"':
                return parse_string()
            elif self.s[self.i] in '-0123456789':
                return parse_number()
            elif self.s[self.i:self.i+4] == 'true':
                self.i += 4
                return True
            elif self.s[self.i:self.i+5] == 'false':
                self.i += 5
                return False
            elif self.s[self.i:self.i+4] == 'null':
                self.i += 4
                return None

        def parse_object():
            obj = {}
            self.i += 1  # skip '{'
            while self.s[self.i] != '}':
                key = parse_string()
                self.i += 1  # skip ':'
                value = parse_value()
                obj[key] = value
                if self.s[self.i] == ',':
                    self.i += 1  # skip ','
            self.i += 1  # skip '}'
            return obj

        def parse_array():
            arr = []
            self.i += 1  # skip '['
            while self.s[self.i] != ']':
                arr.append(parse_value())
                if self.s[self.i] == ',':
                    self.i += 1  # skip ','
            self.i += 1  # skip ']'
            return arr

        def parse_string():
            self.i += 1  # skip opening '"'
            start = self.i
            while self.s[self.i] != '"':
                self.i += 1
            result = self.s[start:self.i]
            self.i += 1  # skip closing '"'
            return result

        def parse_number():
            start = self.i
            if self.s[self.i] == '-':
                self.i += 1
            while self.i < len(self.s) and self.s[self.i].isdigit():
                self.i += 1
            if self.i < len(self.s) and self.s[self.i] == '.':
                self.i += 1
                while self.i < len(self.s) and self.s[self.i].isdigit():
                    self.i += 1
                return float(self.s[start:self.i])
            return int(self.s[start:self.i])

        return parse_value()

The Python implementation follows the algorithm directly. Each parsing function updates self.i as it consumes characters. The recursive structure handles nested arrays and objects, while primitive types and strings are parsed inline without recursion.

Go Solution

package main

import (
	"strconv"
)

func parseJson(s string) interface{} {
	i := 0
	var parseValue func() interface{}
	var parseObject func() map[string]interface{}
	var parseArray func() []interface{}
	var parseString func() string
	var parseNumber func() interface{}

	parseValue = func() interface{} {
		switch s[i] {
		case '{':
			return parseObject()
		case '[':
			return parseArray()
		case '"':
			return parseString()
		default:
			if s[i] == 't' {
				i += 4
				return true
			} else if s[i] == 'f' {
				i += 5
				return false
			} else if s[i] == 'n' {
				i += 4
				return nil
			} else {
				return parseNumber()
			}
		}
	}

	parseObject = func() map[string]interface{} {
		obj := make(map[string]interface{})
		i++ // skip '{'
		for s[i] != '}' {
			key := parseString()
			i++ // skip ':'
			value := parseValue()
			obj[key] = value
			if s[i] == ',' {
				i++
			}
		}
		i++ // skip '}'
		return obj
	}

	parseArray = func() []interface{} {
		arr := []interface{}{}
		i++ // skip '['
		for s[i] != ']' {
			arr = append(arr, parseValue())
			if s[i] == ',' {
				i++
			}
		}
		i++ // skip ']'
		return arr
	}

	parseString = func() string {
		i++ // skip opening '"'
		start := i
		for s[i] != '"' {
			i++
		}
		str := s[start:i]
		i++ // skip closing '"'
		return str
	}

	parseNumber = func() interface{} {
		start := i
		if s[i] == '-' {
			i++
		}
		for i < len(s) && s[i] >= '0' && s[i] <= '9' {
			i++
		}
		if i < len(s) && s[i] == '.' {
			i++
			for i < len(s) && s[i] >= '0' && s[i] <= '9' {
				i++
			}
			num, _ := strconv.ParseFloat(s[start:i], 64)
			return num
		}
		num, _ := strconv.Atoi(s[start:i])
		return num
	}

	return parseValue()
}

In Go, we need explicit handling for interface{} types. The recursion mirrors the Python version, with slices for arrays and maps for objects. Numeric parsing distinguishes integers and floats using strconv.

Worked Examples

Example 1: {"a":2,"b":[1,2,3]}

|