Sampling Error - GNU gprof

6.1 Statistical Sampling Error

The run-time figures that gprof gives you are based on a sampling process, so they are subject to statistical inaccuracy. If a function runs for only a short time, so that on average the sampling process ought to catch that function in the act only once, there is a pretty good chance it will actually be caught zero times, or twice.

By contrast, the number-of-calls and basic-block figures are derived by counting, not sampling. They are completely accurate and will not vary from run to run if your program is deterministic and single-threaded. In multi-threaded applications, or single-threaded applications that link with multi-threaded libraries, the counts are only deterministic if the counting function is thread-safe. (Note: the mcount counting function in glibc is not thread-safe.) See Implementation of Profiling.

The sampling period that is printed at the beginning of the flat profile says how often samples are taken. The rule of thumb is that a run-time figure is accurate if it is considerably bigger than the sampling period.

The actual amount of error can be predicted. For n samples, the expected error is the square root of n samples. For example, if the sampling period is 0.01 seconds and foo's run-time is 1 second, n is 100 samples (1 second/0.01 seconds), sqrt(n) is 10 samples, so the expected error in foo's run-time is 0.1 seconds (10*0.01 seconds), or ten percent of the observed value. Similarly, if the sampling period is 0.01 seconds and bar's run-time is 100 seconds, n is 10000 samples, sqrt(n) is 100 samples, so the expected error in bar's run-time is 1 second, or one percent of the observed value. A run-time figure is likely to vary by this much, on average, from one profiling run to the next. (Sometimes it will vary more.)
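
The arithmetic above can be reproduced with a small shell helper. This is only a sketch: the function name expected_error is hypothetical (it is not part of gprof), and it assumes an awk implementation is available on the system.

```shell
# expected_error PERIOD RUNTIME
# Hypothetical helper, not part of gprof: applies the sqrt(n) rule
# described above to a sampling period and a run-time, both in seconds.
expected_error() {
    awk -v period="$1" -v runtime="$2" 'BEGIN {
        n = runtime / period        # number of samples taken
        err = sqrt(n) * period      # expected error, in seconds
        printf "samples=%.0f error=%.2fs (%.0f%%)\n", n, err, 100 * err / runtime
    }'
}

# The two cases from the text:
#   expected_error 0.01 1      prints: samples=100 error=0.10s (10%)
#   expected_error 0.01 100    prints: samples=10000 error=1.00s (1%)
```

Feeding it a figure from the flat profile together with the printed sampling period gives a rough idea of how far that figure can be trusted.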

This does not mean that a small run-time figure is devoid of information. If the program's total run-time is large, a small run-time for one function does tell you that that function used an insignificant fraction of the whole program's time. Usually this means it is not worth optimizing.

One way to get more accuracy is to give your program more (but similar) input data so it will take longer. Another way is to combine the data from several runs, using the `-s' option of gprof. Here is how:

1. Run your program once.

2. Issue the command `mv gmon.out gmon.sum'.

3. Run your program again, the same as before.

4. Merge the new data in gmon.out into gmon.sum with this command:

          gprof -s executable-file gmon.out gmon.sum

5. Repeat the last two steps as often as you wish.

6. Analyze the cumulative data using this command:

          gprof executable-file gmon.sum > output-file
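
The procedure above could be automated along the following lines. This is a sketch under stated assumptions, not an official script: the function name profile_runs and the names ./myprog, input.dat and profile.txt are hypothetical, the program is assumed to have been compiled and linked for profiling, and each run is assumed to exit normally so that it writes gmon.out in the current directory.

```shell
# profile_runs RUNS PROGRAM [ARGS...]
# Hypothetical helper: runs PROGRAM the given number of times and
# accumulates the profile data from all runs in gmon.sum.
profile_runs() {
    runs=$1
    shift
    prog=$1

    # First run: its gmon.out becomes the initial gmon.sum.
    "$@" || return 1
    mv gmon.out gmon.sum || return 1

    # Each further run: merge the new gmon.out into gmon.sum.
    i=2
    while [ "$i" -le "$runs" ]; do
        "$@" || return 1
        gprof -s "$prog" gmon.out gmon.sum || return 1
        i=$((i + 1))
    done
}

# Example with hypothetical names: five runs, then analyze the total.
#   profile_runs 5 ./myprog input.dat
#   gprof ./myprog gmon.sum > profile.txt
```

The helper stops at the first run that fails, so a partial gmon.sum is never silently merged with data from a broken run. If your program legitimately exits with a non-zero status, that check would need to be relaxed.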