-
Chatgpt-Based Test Generation for Refactoring Engines Enhanced by Feature Analysis on Examples, (ICSE2025)
- Abstract: Software refactoring is widely employed to improve software quality. However, conducting refactorings manually is tedious, time-consuming, and error-prone. Consequently, automated and semi-automated tool support is highly desirable for software refactoring in the industry, and most of the main-stream IDEs provide powerful tool support for refactoring. However, complex refactoring engines are prone to errors, which in turn may result in imperfect and incorrect refactorings. To this end, in this p...
- Labels: program testing, differential testing
-
DiffSpec: Differential Testing with LLMs using Natural Language Specifications and Code Artifacts, (arXiv2024)
- Abstract: Differential testing can be an effective way to find bugs in software systems with multiple implementations that conform to the same specification, like compilers, network protocol parsers, and language runtimes. Specifications for such systems are often standardized in natural language documents, like Instruction Set Architecture (ISA) specifications, Wasm specifications or IETF RFC's. Large Language Models (LLMs) have demonstrated potential in both generating tests and handling large volumes o...
- Labels: program testing, differential testing, static analysis, specification inference
-
LWDIFF: an LLM-Assisted Differential Testing Framework for Webassembly Runtimes, (ICSE2025)
- Abstract: WebAssembly (Wasm) runtimes execute Wasm programs, a popular low-level language for efficiently executing high-level languages in browsers, with broad applications across diverse domains. The correctness of those runtimes is critical for both functionality and security of Wasm execution, motivating testing approaches that target Wasm runtimes specifically. However, existing Wasm testing frameworks fail to generate test cases that effectively test all three phases of runtime, i.e., decoding, vali...
- Labels: program testing, differential testing
-
METAMON: Finding Inconsistencies between Program Documentation and Behavior using Metamorphic LLM Queries, (LLM4Code2025)
- Abstract: Code documentation can, if written precisely, help developers better understand the code they accompany. However, unlike code, code documentation cannot be automatically verified via execution, potentially leading to inconsistencies between documentation and the actual behavior. While such inconsistencies can harmful for developer’s understanding of the code, checking and finding them remains a costly task due to the involvement of human engineers. This paper proposes METAMON, which uses an exis...
- Labels: program testing, differential testing
-
Nuances are the Key: Unlocking ChatGPT to Find Failure-Inducing Tests with Differential Prompting, (ASE2023)
- Abstract: Automated detection of software failures is an important but challenging software engineering task. It involves finding in a vast search space the failure-inducing test cases that contain an input triggering the software fault and an oracle asserting the incorrect execution. We are motivated to study how far this outstanding challenge can be solved by recent advances in large language models (LLMs) such as ChatGPT. However, our study reveals that ChatGPT has a relatively low success rate (28.8%)...
- Labels: program testing, differential testing
-
OpDiffer: LLM-Assisted Opcode-Level Differential Testing of Ethereum Virtual Machine, (ISSTA2025)
- Abstract: As Ethereum continues to thrive, the Ethereum Virtual Machine (EVM) has become the cornerstone powering tens of millions of active smart contracts. Intuitively, security issues in EVMs could lead to inconsistent behaviors among smart contracts or even denial-of-service of the entire blockchain network. However, to the best of our knowledge, only a limited number of studies focus on the security of EVMs. Moreover, they suffer from 1) insufficient test input diversity and invalid semantics; and 2)...
- Labels: program testing, differential testing