Changyihsin (talk | contribs) |
</pre>
= Profiling B2G with perf =
The perf utility is a performance analysis tool for Linux.
== Setup ==
Profiling data is collected on the target device, and the report is generated on the host.<br>
You need to install the perf tool on the host and create a directory containing the kernel and libraries with symbols.
* Install perf on the host (Ubuntu)
$ sudo apt-get install linux-tools
$ perf --version
perf version 3.0.17
* Create the directory for libraries with symbols<br>A B2G makefile helper creates this directory:
$ make perf-create-symfs
== Real time report ==
On the target device, use perf top to display performance counters in real time.
# perf top -p `pidof b2g`
The output looks like this:
PerfTop: 388 irqs/sec kernel:13.1% exact: 0.0% [1000Hz cycles], (target_pid: 7852)
-------------------------------------------------------------------------------
samples pcnt function DSO
_______ _____ __________________________________ _________________
403.00 31.8% _downsample_2x2_rgba8888 libGLESv2_mali.so
119.00 9.4% JaegerStubVeneer libxul.so
93.00 7.3% _raw_spin_unlock_irqrestore [kernel.kallsyms]
59.00 4.7% _m200_texture_deinterleave_16x16_b libMali.so
56.00 4.4% memcpy libc.so
40.00 3.2% finish_task_switch [kernel.kallsyms]
37.00 2.9% vfprintf libc.so
23.00 1.8% _gles_fb_tex_sub_image_2d libGLESv2_mali.so
16.00 1.3% __sfvwrite libc.so
16.00 1.3% __do_softirq [kernel.kallsyms]
15.00 1.2% __memzero [kernel.kallsyms]
13.00 1.0% getnstimeofday [kernel.kallsyms]
12.00 0.9% _gles_generate_mipmaps_sw_16x16blo libGLESv2_mali.so
12.00 0.9% snprintf libc.so
12.00 0.9% __divsi3 libmozglue.so
10.00 0.8% v7_dma_clean_range [kernel.kallsyms]
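When the list is long, it can help to see how much time each shared object accounts for in total. The following is a minimal sketch, not part of the B2G tooling: the function name <code>dso_totals</code> is made up for illustration, and it assumes the perf top output has been saved as text with data lines of the form "samples pcnt function DSO", as in the output above.

```shell
# Sum the pcnt column of saved perf top output per DSO, largest total first.
# Assumption: data lines look like "samples pcnt function DSO";
# header and separator lines are skipped because their second field
# does not end in "%".
dso_totals() {
    awk '$2 ~ /%$/ { sum[$NF] += $2 + 0 }
         END { for (d in sum) printf "%.1f%% %s\n", sum[d], d }' |
        LC_ALL=C sort -rn
}

# A few lines copied from the perf top output above.
sample='403.00 31.8% _downsample_2x2_rgba8888 libGLESv2_mali.so
119.00 9.4% JaegerStubVeneer libxul.so
93.00 7.3% _raw_spin_unlock_irqrestore [kernel.kallsyms]
23.00 1.8% _gles_fb_tex_sub_image_2d libGLESv2_mali.so'

printf '%s\n' "$sample" | dso_totals
```

For the four sample lines, the two libGLESv2_mali.so entries are merged into a single total, which makes it easier to see whether graphics, JavaScript or kernel code dominates.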
== Recording for a period and generating report ==
Record on the target device (press Ctrl-C to stop recording):
# perf record -o /data/local/perf.data -p `pidof b2g`
Generate the report on the host:
$ adb pull /data/local/perf.data .
$ perf report --symfs=/tmp/b2g_symfs_galaxys2 --vmlinux=/vmlinux
The output looks like this:
# Events: 4K cycles
#
# Overhead Command Shared Object
# ........ ....... ................. ...............................................................................................
#
8.00% b2g perf-7852.map [.] 0x438413fc
4.46% b2g [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
4.36% b2g [unknown] [.] 0x43843500
2.61% b2g [kernel.kallsyms] [k] finish_task_switch
1.69% b2g libxul.so [.] JaegerStubVeneer
1.20% b2g libxul.so [.] TypedArrayTemplate<float>::obj_getElement(JSContext*, JSObject*, JSObject*, unsigned int, J
1.06% b2g libxul.so [.] void js::mjit::stubs::SetElem<0>(js::VMFrame&)
1.05% b2g libxul.so [.] js::mjit::stubs::GetElem(js::VMFrame&)
1.01% b2g libc.so [.] pthread_mutex_lock
1.00% b2g libc.so [.] memcpy
0.90% b2g libxul.so [.] JSObject::nativeLookup(JSContext*, int)
0.88% b2g [kernel.kallsyms] [k] sub_preempt_count
0.86% b2g libGLESv2_mali.so [.] 0xa3a0
0.82% b2g [kernel.kallsyms] [k] add_preempt_count
0.80% b2g [kernel.kallsyms] [k] __do_softirq
0.79% b2g libxul.so [.] js_IsTypedArray(JSObject*)
0.78% b2g libMali.so [.] 0x13be8
0.67% b2g libxul.so [.] js::GetPropertyHelper(JSContext*, JSObject*, int, unsigned int, JS::Value*)
0.66% b2g libxul.so [.] js::PropertyTable::search(int, bool)
0.66% b2g libxul.so [.] js_GetProperty(JSContext*, JSObject*, JSObject*, int, JS::Value*)
0.65% b2g libc.so [.] pthread_mutex_unlock
0.59% b2g libxul.so [.] castNativeFromWrapper(JSContext*, JSObject*, unsigned int, nsISupports**, JS::Value*, XPCLa
0.57% b2g libmozglue.so [.] __udivsi3
0.53% b2g libxul.so [.] mozilla::gl::GLContextEGL::MakeCurrentImpl(bool)
0.52% b2g libxul.so [.] XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
0.49% b2g libxul.so [.] js::TypedArray::getTypedArray(JSObject*)
0.49% b2g libxul.so [.] js::GetPropertyOperation(JSContext*, unsigned char*, JS::Value const&, JS::Value*)
0.48% b2g [kernel.kallsyms] [k] vector_swi
0.47% b2g [kernel.kallsyms] [k] get_parent_ip
0.42% b2g libxul.so [.] DisabledGetElem(js::VMFrame&, js::mjit::ic::GetElementIC*)
== Recording with callgraph ==
Use the '-g' option to record call graphs:
# perf record -g -o /data/local/perf.data -p `pidof b2g`
Note:
# To get a correct call graph report, you need to compile the libraries with "-fno-omit-frame-pointer".
# On SGS2 devices, perf crashes easily when recording call graphs; this is an issue that has yet to be fixed.
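One possible way to pass the frame pointer flag is through the compiler flag variables in the build environment. This is only a sketch under assumptions: whether the B2G build picks up <code>CFLAGS</code> and <code>CXXFLAGS</code> from the environment is not confirmed here, and the helper name <code>add_frame_pointer_flag</code> is made up for illustration.

```shell
# Append -fno-omit-frame-pointer to a flags string unless it is already there.
# Hypothetical helper; it only edits the string, it does not run a build.
add_frame_pointer_flag() {
    case " $1 " in
        *" -fno-omit-frame-pointer "*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "${1:+$1 }-fno-omit-frame-pointer" ;;
    esac
}

# Assumption: the build reads CFLAGS/CXXFLAGS from the environment.
CFLAGS=$(add_frame_pointer_flag "${CFLAGS:-}")
CXXFLAGS=$(add_frame_pointer_flag "${CXXFLAGS:-}")
export CFLAGS CXXFLAGS
```

The helper leaves the string unchanged when the flag is already present, so it is safe to source the snippet more than once. The libraries have to be rebuilt and pushed to the device before recording again.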
== System-wide and specific application profiling ==
Use the '-a' option for system-wide profiling:
# perf record -o /data/local/perf.data -a
To profile a specific command, pass it to perf record:
# perf record -o /data/local/perf.data /system/b2g/b2g
Use the '-p' option to profile an existing process (some devices have no pidof, so you need to use ps to find the b2g PID):
# perf record -o /data/local/perf.data -p `pidof b2g`
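On devices without pidof, the PID can be extracted from the ps output instead. The sketch below assumes the Android toolbox ps layout, where the PID is the second column and the process name is the last one; check the ps output on your device first, because the layout may differ. The function name <code>pid_from_ps</code> is made up for illustration.

```shell
# Print the PID of the first process whose name (last column) equals $1.
# Reads ps output on stdin and skips the header line.
# Assumption: PID is column 2, as in "USER PID PPID VSIZE RSS WCHAN PC NAME".
pid_from_ps() {
    awk -v name="$1" 'NR > 1 && $NF == name { print $2; exit }'
}

# Canned ps output for illustration; the values are invented.
sample='USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
root      1     0     420    284   c00d1a20 0000875c S /init
root      7852  1     181576 61572 ffffffff 400f2e04 S /system/b2g/b2g'

printf '%s\n' "$sample" | pid_from_ps /system/b2g/b2g
```

On the device, the same function could be fed from the real process list, for example <code>ps | pid_from_ps /system/b2g/b2g</code>, and the result passed to the '-p' option.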
== Makefile helpers for perf ==
The following B2G makefile helpers generate perf reports on the host.
* Create the directory for libraries with symbols
$ make perf-create-symfs
* Remove the directory for libraries with symbols
$ make perf-clean-symfs
* Real time perf report, system-wide
$ make perf-top
* Real time perf report for the B2G process
$ make perf-top-b2g
* Summary perf report, system-wide
$ make perf-report
* Summary perf report for the B2G process
$ make perf-report-b2g
* Change the recording duration<br>The perf-report-* targets record for 10 seconds by default and then generate the report. You can change this with the "RECORD_DURATION" argument.<br>The following example records for 30 seconds:
$ make perf-report RECORD_DURATION=30
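For reference, a timed record-and-report cycle can be put together by hand from the commands shown earlier. The sketch below is not the actual makefile implementation, which is not shown on this page. The function name <code>perf_report_plan</code> is made up, the symfs path is the one from the earlier example, and the recording is bounded by running <code>sleep</code> as the profiled command. The function only prints the commands so they can be checked before running them.

```shell
# Hypothetical sketch: print the commands for a timed, system-wide
# record-and-report cycle. Nothing is executed on the device.
# Usage: perf_report_plan [duration_seconds] [symfs_dir]
perf_report_plan() {
    duration="${1:-10}"                     # 10 seconds, like the default above
    symfs="${2:-/tmp/b2g_symfs_galaxys2}"   # path taken from the earlier example
    data=/data/local/perf.data
    # Record system-wide on the device for as long as sleep runs.
    echo "adb shell perf record -o $data -a sleep $duration"
    # Copy the profile to the host and generate the report there.
    echo "adb pull $data ."
    echo "perf report --symfs=$symfs --vmlinux=/vmlinux"
}

perf_report_plan 30
```

Once the printed commands look right, they can be run one by one on the host.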