Rafał Kuć

Authologic

Author, software engineer, trainer, and consultant focused on information retrieval. In his work, he helps companies throughout the whole software lifecycle: from requirements gathering and architecture, through implementation and deployment, to scaling and tuning. In his free time, he is a novice carpenter and ultra runner, with varying degrees of success.

Sessions

Data Science
Search

Circular Dependency Fixes when Bootstrapping a Golden Set

Short Talk
For a golden set, you need queries. Even if you have them, you can’t judge all docs for each query, only the top N. But how do we rank the top N? See the circular dependency? We’ll talk about ways to untangle it: lexical search, significant terms, training an embedder from scratch, etc. By iteratively refining data and queries, we’ll get there.