Text-to-Struct: Fine-tuning SLMs for Query Intent
Session Abstract
Hybrid search fails on complex intent: vector search misses constraints, keywords miss nuance. This talk explores fine-tuning SLMs for ‘Query Understanding’—transforming vague inputs into structured requests. Learn to extract metadata, expand terms, and route intent to build a search engine that does the hard work for your users.
Session Description
Building a search experience that feels “intelligent” requires more than just embedding user input or matching keywords. Real-world financial queries—whether from an analyst or a “lazy” Agentic LLM—are rarely optimized for your index. They are a messy mix of semantic intent (“tech stocks sensitive to rate hikes”) and rigid constraints that simple hybrid search often ignores.
We typically see three “Intent Killers” in production:
- Time: “European bank guidance last quarter” (Vector search ignores recency; Keywords miss the fiscal calendar).
- Entities & Content Types: “CEO remarks on AI in 10-K risk factors” (Often conflated with general news or 10-Q tables).
- Ambiguity: Generic LLMs often spam search APIs with broad, unrefined queries like “crypto regulation risks” that return noise instead of specific regulatory filings.
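To make the "Time" failure mode concrete, here is a minimal sketch of resolving the relative phrase "last quarter" into an explicit date range. This is an illustration, not the talk's implementation, and it assumes a plain calendar fiscal year; real issuers have issuer-specific fiscal calendars, which is exactly the nuance the session argues a fine-tuned model must learn.

```python
from datetime import date, timedelta

def last_quarter_range(today: date) -> tuple[date, date]:
    """Resolve "last quarter" into an explicit calendar-quarter date range.

    Simplified sketch: assumes a calendar fiscal year. A production query
    layer would resolve this per issuer's fiscal calendar.
    """
    q = (today.month - 1) // 3  # current quarter index, 0-3
    year, prev_q = (today.year, q - 1) if q > 0 else (today.year - 1, 3)
    start = date(year, prev_q * 3 + 1, 1)
    end_month = prev_q * 3 + 3
    # last day of the quarter = day before the next quarter starts
    next_start = date(year + 1, 1, 1) if end_month == 12 else date(year, end_month + 1, 1)
    return start, next_start - timedelta(days=1)
```

Pure vector search never applies such a range, and plain keyword search has no notion of "last" at all; resolving it upstream turns a vague phrase into a hard filter.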
In this session, we present a robust approach: Fine-tuning a Small Language Model (SLM) to act as a dedicated “Query Understanding” layer.
We will move beyond simple RAG architectures and demonstrate how to train a small, deterministic model to parse raw text and output a valid Structured Semantic Query. The training dataset combines real user queries with synthetic data; an LLM assisted with the initial annotation (a form of knowledge distillation), and every label was then meticulously reviewed to ensure the model captures the necessary constraints and financial nuance. This shifts the burden of “knowing how to search” from the user to the system.
We will cover:
- The “Hybrid Gap”: Why combining Semantic + Lexical search is not enough. We will analyze failure cases involving strict fiscal periods, specific tickers (e.g., distinguishing “META” the company from “meta” the prefix), and document sub-types.
- The “LLM as User” Problem: How to handle the influx of queries from generic LLM Agents. We show how to translate their broad requests (e.g., “Give me macro trends”) into the specific, optimized queries your engine actually needs.
- Why Not Just Prompt a Giant Model? We demonstrate why “prompt engineering” generic LLMs is a dead end for high-performance finance search: generalist models lack the domain expertise needed to guarantee schema adherence. We compare their latency and cost against specialized SLMs that deliver 99% schema adherence.
- Query Expansion & Intent Routing: How a fine-tuned Small Language Model (SLM) intercepts the user’s initial search phrase and enriches it with specific, structured search terms before it reaches the index. Instead of merely matching keywords, the SLM translates the user’s semantic intent into precise, optimized queries. For instance, a vague term like “greenwashing” is expanded and routed as multiple concepts, such as `regulatory_risk OR esg_controversy`.
- Impact on Relevance: Real-world comparisons showing how “translating” intent upstream drastically improves retrieval quality for complex financial instruments compared to standard Hybrid Search.
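The expansion step above can be sketched as follows. In the session, the mapping from a vague term to concept tags is produced by the fine-tuned SLM; a static lookup table stands in for it here so that the downstream boolean-query construction is concrete and runnable. The table contents and tag names are illustrative assumptions.

```python
# Stand-in for the SLM: in production the model emits these concept tags.
CONCEPT_EXPANSIONS = {
    "greenwashing": ["regulatory_risk", "esg_controversy"],
}

def expand_query(term: str) -> str:
    """Rewrite a vague user term into a boolean filter over concept tags.

    Unknown terms pass through unchanged, so the layer degrades gracefully
    to plain keyword behavior rather than failing.
    """
    concepts = CONCEPT_EXPANSIONS.get(term.lower())
    if not concepts:
        return term
    return "(" + " OR ".join(concepts) + ")"
```

The design choice worth noting: expansion happens upstream of the index, so the retrieval engine itself needs no changes — it simply receives a sharper query.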