Benchmark Intro
In Computer Architecture: A Quantitative Approach, Hennessy and Patterson classify benchmarks according to the following hierarchy, from best to worst.
1. Real applications.
2. Modified applications (e.g. with I/O removed to make them CPU-bound).
3. Kernels (key fragments of real applications).
4. Toy benchmarks (e.g. the sieve of Eratosthenes).
5. Synthetic benchmarks (code created artificially to fit a profile of particular operations, e.g. Dhrystone).
Then there are microbenchmarks, which typically measure a single short-running operation by repeating it many times. I (njn) would put microbenchmarks around level four: they don't contain code from a real program, but at least they measure something that will occur in real programs, unlike synthetic benchmarks.
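To make the "repeat it many times" idea concrete, here is a minimal sketch of a microbenchmark in Python. The helper `time_per_call`, the operation `append_one`, and the repeat count are all hypothetical choices made for illustration; none of them comes from Hennessy and Patterson or from any particular benchmark suite.

```python
import time


def time_per_call(op, repeats=100_000):
    """Return the average wall-clock seconds per call of op().

    A single call is too short to time reliably, so the operation is
    repeated and the total elapsed time is divided by the repeat count.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    elapsed = time.perf_counter() - start
    return elapsed / repeats


def append_one():
    # The single short-running operation under test (an arbitrary,
    # illustrative choice): create a list and append one element.
    items = []
    items.append(1)


if __name__ == "__main__":
    per_call = time_per_call(append_one)
    print(f"{per_call * 1e9:.1f} ns per call")
```

The reported figure includes the overhead of the loop and of the function call itself, which is one reason microbenchmark results should be read with care.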