In Apache Lucene, queries are responsible for creating sorted streams of matching doc IDs. Implementing a disjunctive query boils down to taking N input queries that produce sorted streams of doc IDs and combining them into a merged sorted stream of doc IDs. The textbook approach to this problem consists of putting input streams into a min-heap data structure ordered by their …
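
To make the textbook approach concrete, here is a minimal sketch of a heap-based merge of sorted doc ID streams. It is not Lucene's implementation: `DisjunctionSketch`, `Cursor`, and `merge` are hypothetical names chosen for illustration, the streams are plain `int[]` arrays rather than Lucene iterators, and the heap is assumed to be ordered by each stream's current doc ID, which is the usual choice for this kind of merge.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class DisjunctionSketch {

    /** Hypothetical cursor over one sorted stream of doc IDs (not a Lucene class). */
    static final class Cursor {
        private final int[] docs;
        private int pos = 0;

        Cursor(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        boolean exhausted() {
            return pos >= docs.length;
        }

        int current() {
            return docs[pos];
        }

        void advance() {
            pos++;
        }
    }

    /** Merges sorted doc ID streams into one sorted stream without duplicates. */
    static List<Integer> merge(List<int[]> streams) {
        // Assumption: the min-heap is ordered by the doc ID each cursor currently points at.
        PriorityQueue<Cursor> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a.current(), b.current()));
        for (int[] stream : streams) {
            if (stream.length > 0) {
                heap.add(new Cursor(stream));
            }
        }

        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            // The top of the heap holds the smallest doc ID not yet emitted.
            int doc = heap.peek().current();
            merged.add(doc);
            // Advance every cursor positioned on this doc so it is emitted only once.
            while (!heap.isEmpty() && heap.peek().current() == doc) {
                Cursor top = heap.poll();
                top.advance();
                if (!top.exhausted()) {
                    heap.add(top); // re-insert under its new current doc ID
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Integer> merged = merge(List.of(
            new int[] {1, 4, 9},
            new int[] {2, 4, 7},
            new int[] {4, 9, 12}));
        System.out.println(merged); // prints [1, 2, 4, 7, 9, 12]
    }
}
```

Each emitted doc ID costs at least one heap removal and re-insertion, so the work per match grows with the logarithm of the number of input streams, and with the number of streams that contain that doc.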