hyperboria/nexus/meta_api/mergers/reservoir_sampling.py
the-superpirate 43be16e4bc
  - [nexus] Update schema
  - [nexus] Remove outdated protos
  - [nexus] Development
  - [nexus] Development
  - [nexus] Development
  - [nexus] Development
  - [nexus] Development
  - [nexus] Refactor views
  - [nexus] Update aiosumma
  - [nexus] Add tags
  - [nexus] Development
  - [nexus] Update repository
  - [nexus] Update repository
  - [nexus] Update dependencies
  - [nexus] Update dependencies
  - [nexus] Fixes for MetaAPI
  - [nexus] Support for new queries
  - [nexus] Adopt new versions of search
  - [nexus] Improving Nexus
  - [nexus] Various fixes
  - [nexus] Add profile
  - [nexus] Fixes for ingestion
  - [nexus] Refactorings and bugfixes
  - [idm] Add profile methods
  - [nexus] Fix stalled nexus-meta bugs
  - [nexus] Various bugfixes
  - [nexus] Restore IDM API functionality

GitOrigin-RevId: a0842345a6dde5b321279ab5510a50c0def0e71a
2022-09-02 19:15:47 +03:00

import random
from typing import List

from summa.proto import search_service_pb2


class ReservoirSamplingMerger:
    def __init__(self, reservoir_sampling_collectors: List[search_service_pb2.ReservoirSamplingCollectorOutput]):
        self.reservoir_sampling_collectors = reservoir_sampling_collectors

    def merge(self) -> search_service_pb2.CollectorOutput:
        # Pool the sampled documents from every collector into one list.
        random_documents = []
        for reservoir_sampling_collector in self.reservoir_sampling_collectors:
            random_documents += reservoir_sampling_collector.random_documents
        # Shuffle so the merged order does not reveal which collector a document came from.
        random.shuffle(random_documents)
        # The merged sample is wrapped in a CollectorOutput, which is what callers receive.
        return search_service_pb2.CollectorOutput(
            reservoir_sampling=search_service_pb2.ReservoirSamplingCollectorOutput(
                random_documents=random_documents
            )
        )
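
# Usage sketch (not part of the original file). It mirrors the merge step
# above without the summa protobuf dependency, so it can be run on its own.
# FakeReservoirOutput and merge_reservoirs are hypothetical stand-ins
# introduced only for illustration; the real code operates on
# search_service_pb2 messages. The seeded random.Random is likewise an
# assumption, added to make the shuffle reproducible.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeReservoirOutput:
    """Hypothetical stand-in for search_service_pb2.ReservoirSamplingCollectorOutput."""
    random_documents: List[str] = field(default_factory=list)


def merge_reservoirs(outputs: List[FakeReservoirOutput], rng: random.Random) -> FakeReservoirOutput:
    """Concatenate every collector's sample, then shuffle the combined list in place."""
    merged: List[str] = []
    for output in outputs:
        merged += output.random_documents
    rng.shuffle(merged)
    return FakeReservoirOutput(random_documents=merged)


# Three collectors (for example, one per index shard); one returned nothing.
shards = [
    FakeReservoirOutput(["a1", "a2"]),
    FakeReservoirOutput(["b1"]),
    FakeReservoirOutput([]),
]
result = merge_reservoirs(shards, random.Random(0))
print(len(result.random_documents))  # 3: every sampled document is kept
```

# Note that this merge keeps every document it receives: it neither truncates
# the result to a fixed sample size nor weights collectors by how many
# documents they scanned. A caller that needs exactly k documents would have
# to slice the shuffled list itself.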