Bringing speedups to top-k queries with many and/or high-frequency terms

Sep 11, 2023 by iHash

In Apache Lucene, queries are responsible for creating sorted streams of matching doc IDs. Implementing a disjunctive query boils down to taking N input queries that produce sorted streams of doc IDs and combining them into a merged sorted stream of doc IDs. The textbook approach to this problem consists of putting input streams into a min-heap data structure ordered by their current doc ID. This approach has been referred to as BooleanScorer2 (BS2) in Lucene.
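The heap-based merge described above can be sketched in a few lines of Python. This is only an illustration of the textbook technique, not Lucene's code: the function name `heap_disjunction` and the representation of each clause as a sorted list of `(doc_id, score)` pairs are assumptions made for this example, and scores of clauses matching the same document are simply summed.

```python
import heapq

def heap_disjunction(clauses):
    """Merge sorted (doc_id, score) streams into one sorted stream.

    Hypothetical helper for illustration: each clause is an iterable of
    (doc_id, score) pairs with strictly increasing doc IDs.
    """
    iterators = [iter(clause) for clause in clauses]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            # Order by current doc ID; idx breaks ties so scores are never compared.
            heap.append((first[0], idx, first[1]))
    heapq.heapify(heap)

    while heap:
        doc = heap[0][0]
        total = 0.0
        # Drain every clause currently positioned on this doc ID.
        while heap and heap[0][0] == doc:
            _, idx, score = heap[0]
            total += score
            nxt = next(iterators[idx], None)
            if nxt is None:
                heapq.heappop(heap)  # clause exhausted
            else:
                # Advancing a clause forces a heap rebalance.
                heapq.heapreplace(heap, (nxt[0], idx, nxt[1]))
        yield doc, total
```

Every step to the next match touches the heap at least once, and each of those operations costs on the order of log N for N clauses.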

While BS2 works nicely, it incurs some overhead from rebalancing the heap every time it moves to the next match. BS1, short for BooleanScorer, tries to reduce this overhead by splitting the doc ID space into windows of 2,048 documents. In every window, BS1 iterates through all matching doc IDs, one clause at a time. For every doc ID, it computes the index of this doc ID in the window, sets the corresponding bit in a bitset, and adds the current score to the corresponding index in a double[2048]. Iterating matches within the window then consists of iterating over the set bits of the bitset and looking up the score at the corresponding index in the double[2048]. This approach often runs faster than BS2 on queries that have many clauses or high-frequency clauses.
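The window-based approach can be sketched as follows. This is a simplified illustration under stated assumptions rather than Lucene's actual implementation: the name `windowed_disjunction`, the `max_doc` parameter (an exclusive upper bound on doc IDs), and the use of in-memory sorted lists of `(doc_id, score)` pairs are all hypothetical, and a Python integer stands in for the 2,048-bit bitset.

```python
WINDOW_SIZE = 2048

def windowed_disjunction(clauses, max_doc):
    """Yield (doc_id, summed_score) in doc ID order, one window at a time.

    Hypothetical helper for illustration: each clause is a list of
    (doc_id, score) pairs with strictly increasing doc IDs below max_doc.
    """
    positions = [0] * len(clauses)  # next unread posting of each clause
    for window_start in range(0, max_doc, WINDOW_SIZE):
        window_end = window_start + WINDOW_SIZE
        matched = 0                    # bitset of matching slots in this window
        scores = [0.0] * WINDOW_SIZE   # plays the role of the double[2048]

        # Collect matches one clause at a time; no heap is involved.
        for c, postings in enumerate(clauses):
            pos = positions[c]
            while pos < len(postings) and postings[pos][0] < window_end:
                doc, score = postings[pos]
                slot = doc - window_start  # index of the doc ID in the window
                matched |= 1 << slot
                scores[slot] += score
                pos += 1
            positions[c] = pos

        # Replay the window in doc ID order by walking the set bits.
        while matched:
            lowest = matched & -matched       # isolate the lowest set bit
            slot = lowest.bit_length() - 1
            yield window_start + slot, scores[slot]
            matched ^= lowest
```

The per-match work is a bit set and an array addition instead of a heap rebalance, which is why this style tends to pay off when there are many clauses or when clauses match a large share of each window.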

These two approaches were described in a 1997 paper called “Space Optimizations for Total Ranking” by Doug Cutting, the creator of Lucene. BS2 is called “Parallel Merge” in this paper and described in section 4.1, while BS1 is called “Block Merge” and described in section 4.2. These are arguably more descriptive names than BS1 and BS2. Note that the description of “Block Merge” in the paper is quite different from what BS1 looks like in Lucene today, but the underlying idea is the same.
