Difficulty: 🟡 Medium
Topics: Array, Sorting, Counting Sort

Solution

LeetCode 274, H-Index

Problem Understanding

The problem gives an array called citations, where each element represents how many citations a research paper has received. Each index corresponds to one paper, and the value at that index is the number of times that paper has been cited.

We must compute the researcher's h-index. The h-index is defined as the largest integer h such that the researcher has at least h papers with at least h citations each.

This definition is very important because it combines both productivity and impact. A researcher with many papers but few citations will have a low h-index. Similarly, a researcher with one extremely famous paper but very few other papers may also have a limited h-index.

For example, consider:

[3,0,6,1,5]

If we sort this array in descending order:

[6,5,3,1,0]

We can check how many papers satisfy the h-index condition:

At least 1 paper has at least 1 citation
At least 2 papers have at least 2 citations
At least 3 papers have at least 3 citations
Not at least 4 papers have at least 4 citations

So the maximum valid value is 3.

The constraints are relatively small:

1 <= n <= 5000
0 <= citations[i] <= 1000

An O(n log n) solution is completely acceptable for these limits. However, the problem also hints at counting sort as a topic, which suggests there may be a linear-time solution as well.

Several edge cases are important:

All citation counts may be zero
Citation counts may be much larger than the number of papers
The array may already be sorted or completely unsorted
Multiple papers may have the same citation count
There may only be one paper

A naive implementation can easily make off-by-one mistakes when checking whether a candidate h is valid.

Approaches

Brute Force Approach

The brute-force approach tries every possible h-index from 0 to n.

For each candidate value h, we count how many papers have at least h citations. If at least h papers satisfy that condition, then h is valid.

For example:

citations = [3,0,6,1,5]

To test h = 3:

Papers with at least 3 citations:
3
6
5

There are 3 such papers, so h = 3 is valid.

To test h = 4:

Papers with at least 4 citations:
6
5

Only 2 papers satisfy the condition, so h = 4 is invalid.

This approach is correct because it directly checks the definition of h-index. However, it is inefficient because for every possible h, we scan the entire array again.

The time complexity becomes O(n^2).

Optimal Approach

The key observation is that the exact ordering of papers does not matter, only how many papers have citation counts above a threshold.

If we sort the array in descending order, we can determine the h-index in one pass.

Suppose the sorted array is:

[6,5,3,1,0]

At index i, there are exactly i + 1 papers to the left including the current paper.

If:

citations[i] >= i + 1

then there are at least i + 1 papers with at least i + 1 citations.

We continue until this condition fails.

This transforms the problem into a simple scan after sorting.

Approach Comparison

Approach	Time Complexity	Space Complexity	Notes
Brute Force	O(n²)	O(1)	Try every possible h-index and rescan the array each time
Optimal	O(n log n)	O(1) or O(n)	Sort the array and scan once to find the maximum valid h-index

Algorithm Walkthrough

Optimal Sorting Algorithm

Sort the citations array in descending order.

We want the most cited papers first because the h-index depends on how many papers exceed a threshold. Sorting makes it easy to compare the paper count with citation counts directly. 2. Iterate through the sorted array using index i.

At position i, there are exactly i + 1 papers considered so far. 3. For each position, check whether:

citations[i] >= i + 1

This means the current paper has enough citations to support an h-index of i + 1. 4. If the condition is true, update the answer.

We continue scanning because a larger h-index may still be possible. 5. If the condition becomes false, stop checking larger values.

Since the array is sorted in descending order, all future citation counts will be smaller or equal. Therefore, no larger h-index can exist. 6. Return the maximum valid h-index found.

Why it works

After sorting in descending order, the paper at index i is the (i + 1)th most cited paper. If that paper has at least i + 1 citations, then the first i + 1 papers all satisfy the h-index condition.

Because the array is sorted, once the condition fails, every later paper will also fail for larger h-values. This guarantees that the largest valid value encountered during the scan is the correct h-index.

Python Solution

from typing import List

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)

        h_index = 0

        for index, citation_count in enumerate(citations):
            paper_count = index + 1

            if citation_count >= paper_count:
                h_index = paper_count
            else:
                break

        return h_index

The implementation begins by sorting the citations in descending order. This ensures that the most cited papers appear first.

The variable h_index stores the best valid value found so far.

The loop iterates through the sorted papers using enumerate, which provides both the index and citation count. At each step:

paper_count = index + 1

This represents how many papers have been examined.

The condition:

citation_count >= paper_count

checks whether there are at least paper_count papers with at least paper_count citations each.

If the condition is satisfied, the h-index can safely be updated. Otherwise, the loop breaks because further papers will only have smaller citation counts.

Finally, the method returns the maximum valid h-index discovered during the scan.

Go Solution

package main

import "sort"

func hIndex(citations []int) int {
	sort.Sort(sort.Reverse(sort.IntSlice(citations)))

	hIndex := 0

	for index, citationCount := range citations {
		paperCount := index + 1

		if citationCount >= paperCount {
			hIndex = paperCount
		} else {
			break
		}
	}

	return hIndex
}

The Go solution follows the same algorithmic structure as the Python version.

Go does not provide a direct descending sort for integer slices, so the implementation uses:

sort.Sort(sort.Reverse(sort.IntSlice(citations)))

The rest of the logic is identical. The loop checks whether the current citation count supports a larger h-index.

There are no integer overflow concerns because the constraints are very small. Go slices naturally handle both nil and empty behavior safely, although the problem guarantees at least one paper.

Worked Examples

Example 1

citations = [3,0,6,1,5]

Step 1: Sort Descending

[6,5,3,1,0]

Step 2: Scan Through Array

Index	Citation Count	Papers Considered	Condition	h-index
0	6	1	6 >= 1	1
1	5	2	5 >= 2	2
2	3	3	3 >= 3	3
3	1	4	1 >= 4 is false	stop

Final answer:

Example 2

citations = [1,3,1]

Step 1: Sort Descending

[3,1,1]

Step 2: Scan Through Array

Index	Citation Count	Papers Considered	Condition	h-index
0	3	1	3 >= 1	1
1	1	2	1 >= 2 is false	stop

Final answer:

Complexity Analysis

Measure	Complexity	Explanation
Time	O(n log n)	Sorting dominates the runtime
Space	O(1) or O(n)	Depends on the sorting implementation used internally

The main cost comes from sorting the array. After sorting, the algorithm performs a single linear scan through the citations.

The scan itself is O(n), but sorting requires O(n log n) time, which dominates the total complexity.

The auxiliary space complexity depends on the language implementation of sorting. Python's sorting may use additional memory internally, while Go's sort implementation is closer to in-place behavior.

Test Cases

from typing import List

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)

        h_index = 0

        for index, citation_count in enumerate(citations):
            if citation_count >= index + 1:
                h_index = index + 1
            else:
                break

        return h_index

solution = Solution()

assert solution.hIndex([3, 0, 6, 1, 5]) == 3  # standard example
assert solution.hIndex([1, 3, 1]) == 1  # small unsorted input
assert solution.hIndex([0]) == 0  # single paper with zero citations
assert solution.hIndex([100]) == 1  # single highly cited paper
assert solution.hIndex([0, 0, 0, 0]) == 0  # all papers uncited
assert solution.hIndex([1, 1, 1, 1]) == 1  # all papers cited once
assert solution.hIndex([4, 4, 4, 4]) == 4  # every paper supports maximum h-index
assert solution.hIndex([10, 8, 5, 4, 3]) == 4  # descending order input
assert solution.hIndex([25, 8, 5, 3, 3]) == 3  # citation counts larger than n
assert solution.hIndex([6, 6, 6, 6, 1]) == 4  # failure occurs near end
assert solution.hIndex([2, 2, 2]) == 2  # exact equality condition
assert solution.hIndex([11, 15]) == 2  # small array with large citations

Test Case Summary

Test	Why
`[3,0,6,1,5]`	Standard example from the problem
`[1,3,1]`	Small unsorted input
`[0]`	Minimum citation value
`[100]`	Single highly cited paper
`[0,0,0,0]`	No valid h-index greater than zero
`[1,1,1,1]`	Duplicate low citation counts
`[4,4,4,4]`	Maximum possible h-index
`[10,8,5,4,3]`	Already sorted descending
`[25,8,5,3,3]`	Citation counts larger than paper count
`[6,6,6,6,1]`	Failure condition appears late
`[2,2,2]`	Exact equality boundary
`[11,15]`	Small array with large values

Edge Cases

One important edge case occurs when every paper has zero citations, such as:

[0,0,0]

A buggy implementation might incorrectly return 1 simply because there is at least one paper. However, the h-index requires at least one paper with at least one citation. Since none exist, the correct answer is 0. The implementation handles this because the condition:

citation_count >= paper_count

fails immediately on the first iteration.

Another important case occurs when citation counts are much larger than the number of papers:

[100,100,100]

The h-index can never exceed the total number of papers. Even though each paper has 100 citations, there are only 3 papers total, so the answer must be 3. The implementation naturally enforces this because the largest possible paper_count is n.

A third edge case involves exact equality:

[3,3,3]

The h-index should be 3, not 2. Some incorrect implementations use a strict greater-than comparison instead of greater-than-or-equal-to. The correct condition is:

citation_count >= paper_count

which properly handles equality boundaries.

Another subtle edge case is when the failure occurs in the middle of the scan:

[10,8,2,1]

The algorithm correctly stops once the condition fails. Since the array is sorted descending, no later paper can restore validity for a larger h-index. This guarantees correctness and avoids unnecessary work.

LeetCode 274 - H-Index

Solution

LeetCode 274, H-Index

Problem Understanding

Approaches

Brute Force Approach

Optimal Approach

Approach Comparison

Algorithm Walkthrough

Optimal Sorting Algorithm

Why it works

Python Solution

Go Solution

Worked Examples

Example 1

Step 1: Sort Descending

Step 2: Scan Through Array

Example 2

Step 1: Sort Descending

Step 2: Scan Through Array

Complexity Analysis

Test Cases

Test Case Summary

Edge Cases