Lazyweb: does each new programming language require its own CSmith-like program generator in order to fuzz?

I'm pretty out of the loop when it comes to this stuff, but for example, YARPGen is a program generator for fuzzing C/C++ compilers. To fuzz, for example, Julia, would one need to write yet-another generator, JuliaGen? And for Rust, yet yet another?

Or is there some sort of reusable component that can be used to generate programs based on some definition of the language's syntax?

@modocache this is basically an open research problem. There have been some efforts in this direction, such as the work on the Redex generator for terms that satisfy arbitrary judgments. There is also Xsmith (docs.racket-lang.org/xsmith/in), though I am not sure how usable it currently is.

It is a very tricky problem because the kinds of terms you need to generate to really stress an implementation differ a lot from language to language. For Rust, for example, you're going to get very few well-typed programs if the generator isn't built to follow the borrow-checking rules, or if it only creates fairly trivial heap usage scenarios.
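To make the point above concrete, here is a minimal sketch of a generator that is reusable in the narrow sense: it only needs a grammar. The toy grammar, the function names, and the use of Python as the target language are all assumptions made for illustration; none of this comes from Csmith, YARPGen, Redex, or Xsmith. Everything it emits parses, but many outputs read a variable before assigning it, which is a small-scale version of the well-typedness problem.

```python
import random

# Hypothetical toy grammar (an assumption for this sketch): each nonterminal
# maps to a list of alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "program": [["stmt"], ["stmt", "\n", "program"]],
    "stmt": [["var", " = ", "expr"], ["print(", "expr", ")"]],
    "expr": [["var"], ["num"], ["expr", " + ", "expr"]],
    "var": [["x"], ["y"], ["z"]],
    "num": [["0"], ["1"], ["2"]],
}


def generate(grammar, symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` with random alternatives; works for any such grammar."""
    if symbol not in grammar:
        return symbol  # anything that is not a nonterminal is emitted as-is
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Past the depth budget, take the shortest alternative to terminate.
        alternatives = [min(alternatives, key=len)]
    choice = rng.choice(alternatives)
    return "".join(generate(grammar, s, rng, depth + 1, max_depth) for s in choice)


def runs_cleanly(source):
    """Return False if the program reads a variable that was never assigned."""
    try:
        # `print` is stubbed out so generated programs stay silent.
        exec(compile(source, "<generated>", "exec"), {"print": lambda *a: None})
        return True
    except NameError:
        return False


if __name__ == "__main__":
    rng = random.Random(0)
    programs = [generate(GRAMMAR, "program", rng) for _ in range(200)]
    valid = sum(runs_cleanly(p) for p in programs)
    print(f"{valid} of {len(programs)} generated programs ran without a NameError")
```

The generator knows nothing about scoping, so fixing the failure rate means teaching it language-specific rules. For something like borrow checking, those rules are far harder to encode than "assign before use".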

@regehr You are talking about generative fuzzing, right?

I suspect that in mutation-based fuzzing most of the mutator can be generic across languages, with a few language-specific mutations added on top. The grammar is, of course, specific to each programming language.

Last summer we implemented a grammar-based fuzzer for Lua (libFuzzer, libProtobufMutator, a Lua grammar described in Protobuf, and a serializer). The work was done by a student who worked part-time for about a couple of months.

In short: yes, you need to reimplement the fuzzer for each language, but most parts are common and reusable.
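A rough sketch of that split between reusable and language-specific parts, in plain Python rather than libFuzzer and Protobuf. The tuple-based AST, `subtrees`, `replace_at`, `mutate`, and `to_lua` are hypothetical names invented here; this is not the libprotobuf-mutator API and not the code from the Lua fuzzer mentioned above. It also assumes every node is an expression, so any subtree can stand in for any other.

```python
import random

# Hypothetical AST encoding (an assumption for this sketch):
#   ("num", int), ("var", str), ("binop", op, left, right)


def subtrees(node, path=()):
    """Generic part: yield (path, subtree) for every node in a tuple-based AST."""
    yield path, node
    for i, child in enumerate(node):
        if isinstance(child, tuple):
            yield from subtrees(child, path + (i,))


def replace_at(node, path, new):
    """Generic part: return a copy of `node` with the subtree at `path` replaced."""
    if not path:
        return new
    i = path[0]
    return node[:i] + (replace_at(node[i], path[1:], new),) + node[i + 1:]


def mutate(tree, rng):
    """Generic part: overwrite one random subtree with another from the same tree."""
    nodes = list(subtrees(tree))
    target_path, _ = rng.choice(nodes)
    _, donor = rng.choice(nodes)
    return replace_at(tree, target_path, donor)


def to_lua(node):
    """Language-specific part: serialize the AST to Lua expression syntax."""
    kind = node[0]
    if kind == "num":
        return str(node[1])
    if kind == "var":
        return node[1]
    if kind == "binop":
        _, op, left, right = node
        return f"({to_lua(left)} {op} {to_lua(right)})"
    raise ValueError(f"unknown node kind: {kind}")


if __name__ == "__main__":
    seed = ("binop", "+", ("var", "x"), ("binop", "*", ("num", 2), ("var", "y")))
    print(to_lua(seed))  # prints "(x + (2 * y))"
    rng = random.Random(1)
    for _ in range(3):
        print(to_lua(mutate(seed, rng)))
```

Only `to_lua` and the choice of node kinds would change for another language. The tree walking and splicing stay the same, which is roughly the sense in which most of a mutator can be shared.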
