Semantics Lifting via SPIRAL

What is it?

We investigate reversing the well-established code generation process in SPIRAL to stepwisely lift the semantics of source code, with a particular focus on scientific kernels. To achieve this, we develop a novel extension to the SPIRAL system leveraging symbolic execution and theorem proving.

Overview

The growing adoption of neural-based code generation tools, such as large language models (LLMs), presents significant challenges in ensuring the correctness and efficiency of scientific software. Although LLM-generated code may be syntactically valid, it often falls short of meeting the rigorous correctness and performance standards required for complex scientific kernels. Scientific computing demands numerical stability, domain-specific optimizations, and accurate floating-point arithmetic, which are challenging to achieve in code generated without domain expertise.

By deriving the semantics of generated kernels statically (at compile time) for cases with unknown (runtime) size parameters, we can identify potential bugs, inefficiencies, and performance bottlenecks before deploying the code. This statically derived information can also be fed back into neural-based code generation tools to iteratively improve the generated code. However, scientific computing poses unique challenges for static analysis tools. Accurate handling of floating-point arithmetic requires managing rounding errors and numerical precision, while pointers, recursion, and transcendental functions like sine and cosine further complicate static analysis.

To address these issues, this work proposes stepwise semantics lifting as an early-stage experimental solution within constrained boundaries. We develop a novel extension to the SPIRAL system, which is equipped with symbolic execution and theorem-proving capabilities. Through an LLVM-to-SPIRAL parser, we import LLM-generated scientific kernels into SPIRAL and derive their semantics using SPIRAL's formal framework and engine.

References

N. Zhang, S. Rao, M. Franusich, F. Franchetti
Towards Semantics Lifting for Scientific Computing: A Case Study on FFT
Theory and Practice of Static Analysis Workshop (TPSA), in conjunction with the ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), 2025, Extended abstract with presentation

Copyrights to many of the above papers are held by the publishers. The attached PDF files are preprints. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder. Some links to papers above connect to IEEE Xplore with permission from IEEE, and viewers must follow all of IEEE's copyright policies.