This is the mail archive of the mailing list for the binutils project.


Fuzzing objdump (PR 17512) and readelf (PR 17531)


I was privately asked how I fuzzed objdump in PR 17512 and I figured it could be interesting to others too. So here it is.

Short version: I used the most naive way.

Longer version: I started with the simplest approach that gave results and have improved it only slightly so far. There was simply no need for improvements -- until recently I was getting more crashes than I could analyze (i.e. run through valgrind :-). Thanks to the excellent work of Nick Clifton, crashes are harder to get now. But there is still a long way to go.

Hardware resources: one desktop that is several years old. Previous batches of crashes took anywhere from a few minutes to half an hour to get. The last one is the result of one night.

binutils was built with a plain `./configure && make`. Using address-sanitizer would probably improve the process.
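For anyone who wants to try the address-sanitizer route, a build could look roughly like the sketch below. I have not used it for these PRs; the flags are the usual GCC/Clang ones and assume a compiler with -fsanitize=address support.

```shell
# Untested sketch: build binutils with AddressSanitizer instrumentation.
# -g and -fno-omit-frame-pointer are only there to get readable reports.
./configure CFLAGS="-g -O1 -fsanitize=address -fno-omit-frame-pointer" \
            LDFLAGS="-fsanitize=address"
make
```

AddressSanitizer reserves a lot of virtual address space, so a memory limit such as zzuf's -M 100 may have to be raised or dropped for such a build.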

Commands fuzzed: `objdump -x $file` in PR 17512[1] and `readelf -a $file` in PR 17531. Someone more familiar with binutils could choose a command involving more parsing and hence improve code coverage.


Fuzzer: zzuf with more-or-less default options. It was used like this:

  zzuf -s 0:1000000 -c -C 0 -q -T 5 -M 100 -j 4 objdump -x "$file" 2> log


  -s 0:1000000 -- seeds to try, change as you wish
  -c           -- only fuzz files specified in command line (just in case)
  -C 0         -- don't stop after the first crash
  -q           -- suppress output from objdump
  -T 5         -- limit cputime to not hang on infinite loops
  -M 100       -- limit memory to not eat all of it
  -j 4         -- number of simultaneous jobs

Another option of interest is -r -- ratio of changed bits.
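As an illustration, -r takes either a single ratio or a min:max range, in which case zzuf picks a ratio from the interval depending on the seed. The bounds below are arbitrary values chosen for the example, not the ones I used:

```shell
# Same invocation as above, but with ratios picked from a range
# instead of the default 0.004. The bounds are arbitrary examples.
zzuf -s 0:1000000 -r 0.0001:0.01 -c -C 0 -q -T 5 -M 100 -j 4 objdump -x "$file" 2> log
```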

After you get a crash like this:

  zzuf[s=1448,r=0.004]: signal 11 (SIGSEGV)

you can get a fuzzed sample with the following command:

  zzuf -s 1448 -r 0.004 < "$file" > fuzzed-file
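With many crashes in the log this step can be automated. Below is a minimal sketch, assuming the log contains lines of exactly the shape shown above; the function names and the output directory layout are made up for illustration.

```shell
#!/bin/sh
# Read a zzuf log on stdin and print "seed ratio" for every line like
#   zzuf[s=1448,r=0.004]: signal 11 (SIGSEGV)
# Any other line is ignored. Note that this matches every signal,
# which may include kills caused by the -T / -M limits.
crash_params() {
    sed -n 's/^zzuf\[s=\([0-9][0-9]*\),r=\([0-9.][0-9.]*\)]: signal .*/\1 \2/p'
}

# Regenerate one fuzzed file per crash found in a log.
# $1 = original sample, $2 = zzuf log, $3 = output directory (hypothetical)
regenerate_crashes() {
    sample=$1 log=$2 outdir=$3
    mkdir -p "$outdir"
    crash_params < "$log" | sort -u | while read -r seed ratio; do
        # Files are named seed-ratio, e.g. 1448-0.004
        zzuf -s "$seed" -r "$ratio" < "$sample" > "$outdir/$seed-$ratio"
    done
}
```

It would be used like `regenerate_crashes "$file" log crashes`.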

In fact, I run zzuf from a script against different samples in batches of 10K seeds, then run a random selection of the crashes found under valgrind and collect the unique errors. When crashes are rare it's possible to run all of them through valgrind. This part of the process is also quite naive, but if you are fixing crashes as you find them it's not needed at all.
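This is not my actual script, just a rough sketch of the same idea. All function and file names are hypothetical, shuf is the GNU coreutils tool, and "unique" here only means "same innermost stack frame in the first valgrind error", which is a crude heuristic.

```shell
#!/bin/sh
# Print seed ranges "start:stop" from $1 up to $2 in steps of $3.
# Check how zzuf treats the end of a range if overlap by one seed matters.
seed_batches() {
    first=$1 last=$2 step=$3
    start=$first
    while [ "$start" -lt "$last" ]; do
        stop=$((start + step))
        if [ "$stop" -gt "$last" ]; then stop=$last; fi
        echo "$start:$stop"
        start=$stop
    done
}

# Fuzz every sample given as an argument, one batch of seeds at a time,
# writing one log per sample and batch.
fuzz_in_batches() {
    for sample in "$@"; do
        seed_batches 0 1000000 10000 | while read -r range; do
            zzuf -s "$range" -c -C 0 -q -T 5 -M 100 -j 4 \
                objdump -x "$sample" 2> "log.$(basename "$sample").$range" < /dev/null
        done
    done
}

# Run a random selection of at most $1 fuzzed files (remaining arguments)
# under valgrind, writing the report for each file to <file>.vg.
valgrind_sample() {
    count=$1; shift
    printf '%s\n' "$@" | shuf -n "$count" | while read -r fuzzed; do
        valgrind -q --log-file="$fuzzed.vg" objdump -x "$fuzzed" \
            > /dev/null 2>&1 < /dev/null
    done
}

# Read a valgrind report on stdin and print the innermost frame of the
# first error, with the address removed, as a rough signature.
vg_signature() {
    sed -n 's/^==[0-9]*== *at 0x[0-9A-Fa-f]*: //p' | head -n 1
}
```

The signatures can then be counted with something like `for f in crashes/*.vg; do vg_signature < "$f"; done | sort | uniq -c | sort -rn`.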

Samples: I started with clam.exe from ClamAV, then used 'main() { return 0; }' compiled in different ways and then switched to samples from . Looking at code coverage and optimizing the set of samples could improve the process. My uploaded samples named (\d+)-(0.004) are clam.exe zzufed with seed $1 and ratio $2; samples named (\d+)-(\d+)-(0.004) are from radare2-regressions, zzufed in the same way. If someone is interested I can provide a precise list.

License -- short version: AFAICT clam.exe is under GPLv2 and "was entirely written by hand using HIEW"[1], radare2-regressions is under GPLv3+. Feel free to use in testsuites etc.


Alexander Cherepanov
