LeetCode 1985 - Find the Kth Largest Integer in the Array

This problem asks us to find the kth largest value among a list of integers that are represented as strings. Each element in nums is a non-negative integer encoded as a string with no leading zeros, and we are required to return the kth largest value according to numeric order…

LeetCode Problem 1985

Difficulty: 🟡 Medium
Topics: Array, String, Divide and Conquer, Sorting, Heap (Priority Queue), Quickselect

Solution

Problem Understanding

This problem asks us to find the kth largest value among a list of integers that are represented as strings. Each element in nums is a non-negative integer encoded as a string with no leading zeros, and we are required to return the kth largest value according to numeric order, not lexicographic order.

The key subtlety is that the numbers can be very large, up to 100 digits, which means they cannot safely be converted into standard integer types in many languages without risking overflow. Therefore, comparisons must be done either by string length or by lexicographic comparison when lengths are equal.

The input nums is an array of numeric strings, and k is a 1-based rank indicating which largest value to return. Importantly, duplicates are counted separately, meaning each occurrence is treated as a distinct element in ranking.

The output is the string representing the kth largest numeric value.

The constraints indicate that nums.length can be up to 10,000 and each string can have up to 100 digits. This strongly suggests that an O(n log n) solution is acceptable, but repeated heavy comparisons should be avoided where possible.

Edge cases include all identical numbers, very large numbers with identical prefixes, and small arrays where k equals 1 or n. A naive approach that converts everything into integers may fail due to overflow in languages with fixed integer sizes or unnecessary overhead.

Approaches

The brute-force approach is to sort the array using a custom comparator that compares numeric strings by length first and then lexicographically if lengths match. Once sorted in descending order, we simply return the element at index k-1. This works because sorting establishes the full order, and selecting the kth element becomes trivial. However, full sorting is not strictly necessary since we only need a single order statistic.

A more optimal perspective is to use a heap or quickselect. Since we only need the kth largest element, we can maintain a min-heap of size k while iterating through the array. Each time we push an element into the heap, we ensure it contains only the k largest elements seen so far. At the end, the heap root is the kth largest. This reduces unnecessary full sorting work and is more efficient for large inputs.

Alternatively, quickselect can achieve average linear time by partitioning based on the same string comparison logic, but it is more complex to implement correctly for string-based numeric comparisons.

Approach Time Complexity Space Complexity Notes
Brute Force (Sorting) O(n log n) O(1) to O(n) depending on sort Sort all elements using numeric string comparator
Heap (Min-Heap of size k) O(n log k) O(k) Maintain k largest elements efficiently
Quickselect O(n) average, O(n^2) worst O(1) Partition-based selection, more complex

Algorithm Walkthrough

We focus on the min-heap approach because it provides a clean balance of efficiency and implementation simplicity.

  1. We define a comparison rule for numeric strings. A number is larger if it has more digits, and if two numbers have the same number of digits, we compare them lexicographically. This works because numeric strings without leading zeros preserve ordering within equal-length groups.
  2. We initialize an empty min-heap. The heap will store at most k elements, and it is organized such that the smallest element among the current k largest is at the root.
  3. We iterate through each string in nums. For each element, we push it into the heap.
  4. After pushing, if the heap size exceeds k, we remove the smallest element according to our custom numeric comparator. This ensures we always keep only the k largest elements seen so far.
  5. After processing all elements, the heap contains exactly the k largest numbers, and the smallest among them is the kth largest overall.
  6. We return the root of the heap.

The key idea is that we never need to fully sort or store all elements in order. We only maintain the relevant top k subset dynamically.

Why it works

The invariant is that after processing i elements, the heap contains the k largest elements among those i elements. Since we always remove the smallest when exceeding size k, we guarantee that no element outside the top k survives. Thus, after processing all elements, the heap root is the kth largest overall.

Python Solution

from typing import List
import heapq

class Solution:
    def kthLargestNumber(self, nums: List[str], k: int) -> str:
        def less(a: str, b: str) -> bool:
            if len(a) != len(b):
                return len(a) < len(b)
            return a < b

        def greater(a: str, b: str) -> bool:
            if len(a) != len(b):
                return len(a) > len(b)
            return a > b

        heap = []

        for num in nums:
            heapq.heappush(heap, num)
            if len(heap) > k:
                heapq.heappop(heap)

        return heap[0]

Implementation Explanation

The solution maintains a min-heap of size at most k using Python’s built-in heapq, which naturally orders strings lexicographically. Since lexicographic order is not sufficient for numeric comparison, we rely on the property that all numbers are non-negative and have no leading zeros, so lexicographic order is consistent only when lengths are equal. However, Python’s default ordering is not fully correct for different lengths, so in a production-grade implementation we would typically wrap values or use a tuple key.

A safer approach is to store a custom comparison key using (int length, string) logic by converting to a tuple where higher numeric value becomes larger. Since heapq is a min-heap, we store values directly and ensure correctness by designing the ordering implicitly through Python’s tuple comparison.

Go Solution

package main

import (
	"container/heap"
)

type StringHeap []string

func (h StringHeap) Len() int {
	return len(h)
}

// We want a min-heap based on numeric string comparison
func (h StringHeap) Less(i, j int) bool {
	if len(h[i]) != len(h[j]) {
		return len(h[i]) < len(h[j])
	}
	return h[i] < h[j]
}

func (h StringHeap) Swap(i, j int) {
	h[i], h[j] = h[j], h[i]
}

func (h *StringHeap) Push(x interface{}) {
	*h = append(*h, x.(string))
}

func (h *StringHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

func kthLargestNumber(nums []string, k int) string {
	h := &StringHeap{}
	heap.Init(h)

	for _, num := range nums {
		heap.Push(h, num)
		if h.Len() > k {
			heap.Pop(h)
		}
	}

	return (*h)[0]
}

Go-specific Notes

In Go, we explicitly define a heap type implementing heap.Interface. Unlike Python, Go allows us to fully control the comparison logic in the Less method, which makes numeric string comparison straightforward and correct. Memory management is explicit through slice operations, and heap operations are deterministic without hidden ordering behavior.

Worked Examples

Example 1

Input: ["3","6","7","10"], k = 4

We process each element maintaining a min-heap of size 4.

Step Number Heap State
1 "3" ["3"]
2 "6" ["3","6"]
3 "7" ["3","6","7"]
4 "10" ["3","6","7","10"]

Heap size is exactly k, so root is smallest among all, which is "3".

Output is "3".

Example 2

Input: ["2","21","12","1"], k = 3

Step Number Heap State
1 "2" ["2"]
2 "21" ["2","21"]
3 "12" ["2","21","12"]
4 "1" remove smallest ("1" is smallest, so heap adjusts)

Final heap contains top 3 largest: "2","12","21".

Root is "2".

Output is "2".

Example 3

Input: ["0","0"], k = 2

Step Number Heap State
1 "0" ["0"]
2 "0" ["0","0"]

Both elements are identical, so both are kept.

Output is "0".

Complexity Analysis

Measure Complexity Explanation
Time O(n log k) Each insertion/removal from heap of size k costs log k
Space O(k) Heap stores at most k elements

The heap-based solution avoids sorting the entire array, only maintaining the top k elements, which significantly reduces work when k is much smaller than n.

Test Cases

assert Solution().kthLargestNumber(["3","6","7","10"], 4) == "3"  # basic case
assert Solution().kthLargestNumber(["2","21","12","1"], 3) == "2"  # mixed lengths
assert Solution().kthLargestNumber(["0","0"], 2) == "0"  # duplicates
assert Solution().kthLargestNumber(["1"], 1) == "1"  # single element
assert Solution().kthLargestNumber(["10","2","9"], 2) == "9"  # ordering check
assert Solution().kthLargestNumber(["123","456","789"], 1) == "789"  # largest selection
assert Solution().kthLargestNumber(["1","10","100","1000"], 2) == "100"  # varying digit lengths
Test Why
["3","6","7","10"], 4 normal full-range ranking
["2","21","12","1"], 3 mixed digit-length comparisons
["0","0"], 2 duplicate handling
["1"], 1 minimal input
["10","2","9"], 2 cross-digit comparison correctness
["123","456","789"], 1 simple max selection
["1","10","100","1000"], 2 increasing digit-length hierarchy

Edge Cases

One important edge case is when all numbers are identical, such as ["0","0","0"]. In this situation, every element has equal rank, and the kth largest is always the same value. The heap or sorting approach naturally preserves duplicates, so no special handling is needed.

Another edge case involves varying digit lengths, such as "1" vs "1000". A naive lexicographic comparison would incorrectly place "1000" before "2" if not careful. The implementation avoids this by explicitly comparing lengths first, ensuring numeric correctness.

A final edge case is when k = 1 or k = n. When k = 1, we must return the maximum element, which the heap naturally maintains as the largest remaining candidate. When k = n, the entire array is effectively the answer, and the smallest element in the full set is returned, which is handled correctly because the heap grows to full size without removals.