Necko/MobileCache/MicroBenchmarks

==== Real-life verification of results from the lab ====
Telemetry monitors the real-life efficiency of the cache, so by monitoring the right values we can ensure that improvements we see in the lab also benefit real users. Exactly which measurements we need is not yet clear (work to determine this is in progress).


An important point is that to verify results like this we must measure the same (or at least very similar) values in the microbenchmarks and in telemetry. We should, of course, also measure other values from telemetry to cross-check our results, but to verify lab improvements in real life we must align the measurements. The rationale is as follows: say we want to improve characteristic A. We write a benchmark to measure A, fix the code, and see a clear improvement of A in the lab, so we push the code-fix. Now suppose telemetry does ''not'' measure A but rather some other characteristic B. What if B did not improve (or even degraded)? Is it because A and B are unrelated? Or is it because A did not actually improve in real life at all? We must first verify that A actually ''did'' improve in real life; ''then'' we can discuss why B did ''not'' improve, and finally decide whether the code-change is worth keeping. Such results will also increase our understanding of how the caching works in general.


Thus, the suggested strategy is to first introduce a telemetry-probe to gather baseline data for some characteristic we want to improve, then land a code-fix which improves that characteristic in the lab, and finally monitor telemetry to verify that the improvement from the lab also shows up in the field. In the last step, the particular characteristic should be monitored alongside other (more general) performance-probes for cross-checking.
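
As a minimal sketch of the probe side of this strategy, the C++ fragment below times a cache operation and reports the elapsed time through the Telemetry accumulation API. The histogram id <code>CACHE_ENTRY_OPEN_MS</code> and the <code>OpenEntry()</code> call are hypothetical stand-ins: a real probe would first have to be declared in Histograms.json, and the corresponding microbenchmark should record the same interval so that lab and field numbers stay directly comparable.

<pre>
// Sketch only: CACHE_ENTRY_OPEN_MS is a hypothetical histogram id; a real
// probe must first be registered in the telemetry histogram definitions.
#include "mozilla/Telemetry.h"
#include "mozilla/TimeStamp.h"

nsresult OpenEntryInstrumented()
{
  mozilla::TimeStamp start = mozilla::TimeStamp::Now();

  nsresult rv = OpenEntry();  // hypothetical stand-in for the cache operation

  // Report the elapsed time since |start| (in ms) to the probe. A
  // microbenchmark that measures the same interval yields lab numbers
  // directly comparable to the telemetry data gathered in the field.
  mozilla::Telemetry::AccumulateTimeDelta(
      mozilla::Telemetry::CACHE_ENTRY_OPEN_MS, start);
  return rv;
}
</pre>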