My compilers class just wrapped up an assignment where they tried to break 133 different compilers.

OK first, this was fun and I recommend it.

The assignment was to write input programs that would be run on the collection of compilers submitted for the *previous* assignment. If a compiler's result differed from that of a reference interpreter, that compiler was considered broken. The goal was to break as many compilers as possible.
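To make the setup concrete, here is a minimal sketch of that kind of differential check. It is not the actual course harness: each "compiler" is modeled as a plain Python function from program text to output, and the names `broken_compilers`, `reference`, `good`, and `buggy_sub` are all made up for illustration, as is the toy two-operand language.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: a "compiler" is modeled as a function that takes
# program text and returns the output of running the compiled program.
Runner = Callable[[str], str]


def broken_compilers(program: str, reference: Runner,
                     compilers: Dict[str, Runner]) -> List[str]:
    """Return the names of compilers whose result differs from the reference."""
    expected = reference(program)
    broken = []
    for name, run in compilers.items():
        try:
            actual = run(program)
        except Exception:
            # Assumption: a crash counts as differing from the reference.
            broken.append(name)
            continue
        if actual != expected:
            broken.append(name)
    return broken


def reference(program: str) -> str:
    """Toy reference interpreter for programs of the form '<int> <+|-> <int>'."""
    left, op, right = program.split()
    a, b = int(left), int(right)
    return str(a + b if op == "+" else a - b)


def good(program: str) -> str:
    """A toy compiler that agrees with the reference."""
    return reference(program)


def buggy_sub(program: str) -> str:
    """A toy compiler that swaps the operands of subtraction."""
    left, op, right = program.split()
    a, b = int(left), int(right)
    return str(a + b if op == "+" else b - a)


compilers = {"good": good, "buggy_sub": buggy_sub}
```

In this toy setup, `"1 + 2"` breaks nothing, while `"5 - 3"` exposes `buggy_sub`. Note that `"3 - 3"` would not, which is the sort of thing that makes picking test inputs a skill.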

The learning objective here was to read an informal spec and write test cases that are likely to expose bugs. I think that worked. For many students it was clear they hadn't done this kind of task before and didn't really know where to start, which was surprising to me.

Students very quickly (like, within hours) found overspecifications in the behavior of the interpreter and used them to "win", but I was able to adjust the interpreter, and after the first day that kind of exploit went away.

Many students did what I expected: they wrote small tests based on the assignment spec that broke a good chunk of the compilers. With some effort, they could get ~70-80 of the 133 compilers this way.

A few students wrote tests that were *not* guided by the assignment spec; instead they wrote small examples drawn from the whole language. They found bugs in the starter code that was given to students, and thereby knocked out all 133 compilers.

One student found a bug in the parser which in two characters broke all the compilers!

Another student found a bug in the run-time system which read some memory as a uint, when it should've been int.
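For anyone who hasn't run into this class of bug, here is a rough illustration of why it matters. It is not the actual run-time system code: the 64-bit width, the little-endian layout, and the helper names `read_signed` and `read_unsigned` are assumptions made just for the example.

```python
import struct


def read_signed(raw: bytes) -> int:
    """Interpret 8 bytes as a little-endian signed 64-bit integer."""
    return struct.unpack("<q", raw)[0]


def read_unsigned(raw: bytes) -> int:
    """Interpret the same 8 bytes as a little-endian unsigned 64-bit integer."""
    return struct.unpack("<Q", raw)[0]


# The bytes that represent -1 as a 64-bit two's-complement integer.
raw = struct.pack("<q", -1)

signed_value = read_signed(raw)      # -1
unsigned_value = read_unsigned(raw)  # 2**64 - 1
```

The two readings agree on non-negative values, so a test only catches the difference if it puts a negative number through the affected code path.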

I look forward to refining and iterating on this in the future.

Almost forgot: having a leaderboard made this more of a game, which I think really helped.

@dvanhorn this is brilliant - I already knew it took three iterations to get an assignment right, because actual students will break the first two in non-trivial ways, but getting them to do it on purpose is smart.

Do you worry that the students doing the two assignments will collude with each other (e.g., with themself)?

@ccshan Nobody knew about the second part (me included) until after the first part was complete.

@dvanhorn Haha "me included". Do you worry about the future? Maybe use this semester's first part for next semester's second part?

@ccshan Maybe I should be a little worried. Without colluding with others, the most you can gain is 1 point. You also have to embed a bug that won't be found during part 1, otherwise you'll lose points. I'll have to think about it a bit more, but I think staggering past solutions is fine.

Have you explicitly introduced them to randomized property-based testing? :ferris_question:

