Rafał Kuć
Author, software engineer, trainer and consultant focused on information retrieval. In his work, he helps companies throughout the whole software lifecycle: from requirements gathering and architecture, through implementation and deployment, to scaling and tuning. In his free time, he is a novice carpenter and ultra runner, with varying degrees of success.
Sessions
Data Science
Search
Circular Dependency Fixes when Bootstrapping a Golden Set
Short Talk
For a golden set, you need queries. Even if you have them, you can’t judge every doc for each query, only the top N. But how do we rank the top N when the ranker itself is what the golden set is meant to evaluate? See the circular dependency? We’ll talk about ways to untangle it: lexical search, significant terms, training an embedder from scratch, etc. By iteratively refining data and queries, we’ll get there.