# Add gRPC endpoint with auth that can merge results from multiple Zoekt nodes
## Problem to solve
Currently, we combine search results from multiple Zoekt nodes using threads, which is inefficient and can lead to excessive memory usage when processing up to 5,000 results. We need a more efficient approach that allows for streaming results and early termination when limits are reached.
## Proposal
Implement a gRPC endpoint in `gitlab-zoekt-webserver` that enables:
- Streaming search results from multiple nodes concurrently
- Early termination when result limits are reached
- Result ranking and deduplication across nodes
- Secure access with JWT authentication
- Client-side pagination without fetching all results
This will improve scalability, reduce memory usage, and enable GitLab to display search results as they become available.
## Implementation Components
### 1. gRPC Service Definition
- Define protobuf service with streaming search endpoint
- Include support for pagination, filtering, and result limits
- Add metadata fields for ranking and relevance scoring
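The bullets above could translate into a protobuf sketch along these lines. All service, message, and field names here are illustrative assumptions, not a final API:

```proto
syntax = "proto3";

package gitlab.zoekt.v1;

// Hypothetical streaming search service; names are illustrative.
service WebserverService {
  // Server-streaming search: matches are sent as they become available,
  // so the client can stop reading once it has enough.
  rpc StreamSearch(StreamSearchRequest) returns (stream StreamSearchResponse);
}

message StreamSearchRequest {
  string query = 1;
  int32 limit = 2;           // hard cap, e.g. 5000
  int32 page_size = 3;       // client-side pagination hint
  repeated string repos = 4; // optional repository filter
}

message StreamSearchResponse {
  repeated FileMatch matches = 1;
  bool done = 2; // true on the final message of the stream
}

message FileMatch {
  string repository = 1;
  string path = 2;
  double score = 3; // relevance score used for cross-node ranking
  repeated LineMatch line_matches = 4;
}

message LineMatch {
  string line = 1;
  int32 line_number = 2;
}
```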
### 2. Node Discovery and Coordination
- Design aggregation strategy to combine results efficiently
- Use a channel-based approach to collect results from multiple nodes
- Implement early cancellation when result limit is reached
- Consider using a priority queue for merging results based on relevance
- Implement load balancing across nodes
### 3. Results Streaming and Ranking
- Stream results as they become available from each node
- Implement relevance ranking algorithm
- Support early termination when limits are reached
### 4. Authentication and Security
- Implement JWT validation for secure access
- Secure node-to-node communication
### 5. Error Handling and Resilience
- Gracefully handle node failures
- Support partial results when some nodes are unavailable
- Implement timeout and cancellation mechanisms
### 6. Client Integration (separate issue)
- Update GitLab search service to consume streaming API
- Support pagination without fetching all results
## Technical Considerations
- **Backward compatibility:**
  - Maintain the existing JSON API for backward compatibility
  - Gradually migrate clients to the gRPC API
- **Security:**
  - Ensure JWT authentication follows industry best practices
- **Deployment:**
  - Update the nginx configuration in the Helm chart to route the gRPC endpoint correctly
## Success Metrics
- Reduce memory usage by at least 50% during large searches
- Efficiently handle up to 5,000 results without performance degradation
## References
- Zoekt maintainer's comment on aggregation
- Add temporary search endpoint to zoekt indexer ... (gitlab-zoekt-indexer!310 - merged)
## Previous issue description
### Problem to solve
Currently, we have to use threads to combine results from multiple nodes. We also have gitlab-zoekt-indexer!310 (merged) to move the existing logic to the indexer, but I think our long-term solution should be a gRPC endpoint added to the webserver binary.
### Proposal
A gRPC endpoint would allow us to stream results from multiple nodes at the same time and stop as soon as we reach 5,000 results (our limit). It would also open the possibility of streaming results to GitLab (or other clients).
```mermaid
sequenceDiagram
    participant Client as Client
    participant Zoekt1 as Zoekt Node 1
    participant Zoekt2 as Zoekt Node 2
    participant Zoekt3 as Zoekt Node 3
    Client->>Zoekt1: Send search request via gRPC
    Zoekt1-->>Zoekt1: Process local search
    Zoekt1->>Zoekt2: Send search request to Zoekt Node 2
    Zoekt1->>Zoekt3: Send search request to Zoekt Node 3
    Zoekt2-->>Zoekt1: Return partial search results
    Zoekt3-->>Zoekt1: Return partial search results
    Zoekt1-->>Client: Aggregate and return results
```
### Context
Other important items:
- The new endpoint should be part of `gitlab-zoekt-webserver`, not the indexer, since we want to split reads and writes into different processes. That was the reason behind gitlab-zoekt-indexer!218 (merged) and gitlab-org/build/CNG!2227 (merged)
- We should add JWT authentication to the new gRPC endpoint
- In this comment, the Zoekt maintainer shares some ideas on how to aggregate shards efficiently
### Technical Challenges Discussed
- **Current JSON endpoint issues:**
  - Sometimes returns hundreds of megabytes of data
  - Must receive the entire payload of 5,000 results before parsing
- **Results streaming discussion:**
  - Current needs are typically the first 10 pages of results
  - Page-size requirements differ between the legacy and new UI
  - Proposal to implement streaming so that only the necessary results are received
  - Total required results can reach up to 5,000 matches, especially for the new UI
- **Pagination and limits:**
  - Discussion around defining "necessary" results and predicting final limits
  - Challenge in determining an optimal limit due to varying use cases between UIs