Correctness Too Cheap To Meter: Formal Verification and LLMs
Session Abstract
Formal methods are powerful tools for verifying the correctness and reliability of software systems. However, manually written system specifications are time-consuming to produce and hard to maintain. LLMs can shoulder much of this burden.
We’ll share new research on tools that automate formal-methods workflows, along with lessons on how LLMs currently perform at these tasks.
Session Description
Formal methods can mathematically prove properties of software: for example, we can guarantee that a database is deadlock-free or never crashes. Major infrastructure providers such as AWS and Azure already leverage verification, but it remains too expensive and time-consuming to deploy for most use cases. LLMs, however, can automate much of this toil.
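To make the deadlock-freedom example concrete, here is a minimal illustrative sketch (not the tooling discussed in this talk) of what a model checker does under the hood: it exhaustively enumerates every reachable state of a system model and flags states where no progress is possible. The model below is the classic two-thread, two-lock scenario in which the threads acquire locks A and B in opposite orders; all names and the state encoding are ours, chosen for illustration.

```python
from collections import deque

# State = (pc1, pc2, owner_a, owner_b).
# pc: 0 = wants first lock, 1 = wants second lock, 2 = finished.
# owner: None (free) or the thread id (1 or 2) holding the lock.

def enabled_moves(state):
    """Return every successor state reachable in one step."""
    pc1, pc2, a, b = state
    moves = []
    # Thread 1: acquire A, then B; releases both on finishing.
    if pc1 == 0 and a is None:
        moves.append((1, pc2, 1, b))
    elif pc1 == 1 and b is None:
        moves.append((2, pc2, None, None))  # got B, then released A and B
    # Thread 2: acquire B, then A (the opposite order).
    if pc2 == 0 and b is None:
        moves.append((pc1, 1, a, 2))
    elif pc2 == 1 and a is None:
        moves.append((pc1, 2, None, None))  # got A, then released B and A
    return moves

def find_deadlocks(initial=(0, 0, None, None)):
    """Breadth-first exploration of the full reachable state space."""
    seen, frontier, deadlocks = {initial}, deque([initial]), []
    while frontier:
        state = frontier.popleft()
        moves = enabled_moves(state)
        pc1, pc2 = state[0], state[1]
        # A deadlock is a stuck state where not every thread has finished.
        if not moves and not (pc1 == 2 and pc2 == 2):
            deadlocks.append(state)
        for nxt in moves:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return deadlocks

print(find_deadlocks())  # → [(1, 1, 1, 2)]: each thread holds one lock and waits on the other
```

Real model checkers such as TLC for TLA+ apply exactly this kind of exhaustive search to far richer specifications; the expensive part in practice is writing and maintaining those specifications, which is the toil LLMs can help automate.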
This talk demonstrates how we can scale formal methods from an academic luxury to a tractable engineering tool. We share novel research on applying LLMs to formalize real-world systems, including popular databases and libraries. We present benchmark results, our framework for automated formal specification generation, and where current models fall short.
In particular, we’ll touch on:
- What formal verification is, why it’s key for critical systems, and how it’s typically done
- SysMoBench: an LLM benchmark grounded in practical formal verification metrics instead of toy tasks
- Specula: an automated framework to synthesize formal specifications directly from source code, eliminating tedious dev work
- New, unpublished research on connecting specs to real source code more efficiently
Our approach lowers the implementation cost of formal methods, enabling industry to avoid outages and bugs more efficiently. Audience members will leave knowing what formal methods are and how to deploy them effectively by taking advantage of these automation opportunities.