User:GijsKruitbosch/JS Debugging/Profiling

From MozillaWiki
Jump to: navigation, search

Results

For the impatient:

http://www.gijsk.com/mozilla/soc/jprof2.html (790kb, fast machine, less data) http://www.gijsk.com/mozilla/soc/jprof.html (1.7MB, slow machine, more data)


Main machine (fast)

On a 3.4Ghz P4, 1GB of DDR2 RAM in Dual Channel, with lots of diskspace to spare, running Ubuntu Dapper and GNOME, Venkman took less than 3 seconds to start. The full profile can be found here (roughly 790kb html file). A summary of the top of the flat profile is as follows:

68   5.1     js_LineNumberToPC
61   4.5     js_SrcNoteLength
45   3.4     js_Interpret
33   2.5     js_PCToLineNumber
23   1.7     js_SearchScope
21   1.6     free
17   1.3     pthread_mutex_lock
15   1.1     __i686.get_pc_thunk.bx
13   1.0     _PR_x86_AtomicDecrement
13   1.0     XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
13   1.0     js_LookupPropertyWithFlags
11   0.8     JS_HashTableRawLookup
11   0.8     pthread_mutex_unlock
10   0.7     PR_Unlock
10   0.7     js_Mark
10   0.7     .L206
10   0.7     js_GetSrcNoteOffset
9   0.7     __i686.get_pc_thunk.bx
9   0.7     _PR_x86_AtomicIncrement
8   0.6     js_GetProperty
8   0.6     __i686.get_pc_thunk.bx
8   0.6     JS_GetPrivate
7   0.5     pthread_equal
7   0.5     .L2285
7   0.5     jsd_FunctionCallHook
7   0.5     js_CompareAndSwap
7   0.5     .plt
7   0.5     nsCOMPtr_base::~nsCOMPtr_base()
7   0.5     js_GetGCThingFlags
7   0.5     js_Invoke
6   0.4     js_GetToken
6   0.4     nsXPTType::nsXPTType()
6   0.4     nsID::Equals(nsID const&) const
6   0.4     __pthread_getspecific
6   0.4     PR_AtomicIncrement
6   0.4     js_AddScopeProperty
6   0.4     PR_Lock
6   0.4     __libc_malloc
5   0.4     js_atom_marker
5   0.4     JS_GetClass
5   0.4     fegetexcept
5   0.4     memcpy
5   0.4     PR_GetCurrentThread
5   0.4     JS_GetContextThread
5   0.4     .L2258
5   0.4     XPCCallContext::XPCCallContext(XPCContext::LangType, JSContext*, JSObject*, JSObject*, long, unsigned int, long*, long*)
5   0.4     __libc_enable_secure
5   0.4     GetPropertyTreeChild
5   0.4     memmove
5   0.4     js_SetProperty
5   0.4     js_NewGCThing

Laptop (slow)

On a 850Mhz Celeron, 384MB RAM, running the same Ubuntu Dapper and GNOME, Venkman took slightly less than 20 seconds to start. The full profile can be found here (warning: 1.7MB html file). A summary of the top of the flat profile is as follows:

359   4.9     js_Interpret
254   3.5     js_LineNumberToPC
171   2.3     js_SearchScope
165   2.2     js_SrcNoteLength
148   2.0     js_LookupPropertyWithFlags
143   1.9     js_PCToLineNumber
137   1.9     free
91   1.2     js_Invoke
79   1.1     JS_HashTableRawLookup
75   1.0     js_GetProperty
74   1.0     pthread_mutex_lock
72   1.0     __i686.get_pc_thunk.bx
66   0.9     nsID::Equals(nsID const&) const
64   0.9     .L1306
55   0.7     __libc_malloc
55   0.7     .L2258
54   0.7     js_Mark
51   0.7     js_GetSrcNoteOffset
51   0.7     .L206
51   0.7     .L2285
49   0.7     JS_GetPrivate
49   0.7     __i686.get_pc_thunk.bx
48   0.7     js_SetProperty
47   0.6     XPCWrappedNative::GetWrappedNativeOfJSObject(JSContext*, JSObject*, JSObject*, JSObject**, XPCWrappedNativeTearOff**)
46   0.6     malloc_usable_size
43   0.6     pthread_mutex_unlock
42   0.6     XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
37   0.5     PR_Unlock
37   0.5     .plt
36   0.5     __i686.get_pc_thunk.bx
36   0.5     __libc_enable_secure
35   0.5     js_AddScopeProperty
34   0.5     PR_GetCurrentThread
34   0.5     __pthread_getspecific
33   0.4     __libc_calloc
33   0.4     JS_DHashTableOperate
32   0.4     js_FindProperty
32   0.4     __i686.get_pc_thunk.bx
32   0.4     js_NewGCThing
31   0.4     JS_GetClass
31   0.4     nsAString_internal::~nsAString_internal()
30   0.4     PR_Lock
29   0.4     GetPropertyTreeChild
29   0.4     __i686.get_pc_thunk.bx
29   0.4     js_GC
28   0.4     GetChar
28   0.4     SearchTable
27   0.4     js_MarkScript
26   0.4     js_GetGCThingFlags
26   0.4     nsWindowSH::GetProperty(nsIXPConnectWrappedNative*, JSContext*, JSObject*, long, long*, int*)

Preliminary conclusions

First of all, major kudos to Silver aka James Ross for helping me out here. It was much appreciated.

Now, if you look at the tops of the profiles, you can see that js_LineNumberToPC is taking up a fair amount of time. The (speculated) reason for this is that this line is taking up a large amount of time, because Venkman will be called back and do magic for each script jsd knows about (note: this eventually calling this, which is what you see happening in the profile).

So, this call seems to be taking up a lot of time, but without them you can't debug anything but new scripts you load. At which point I thought, well, a webdeveloper won't care about most of the scripts are loaded (compare the 'exclude chrome' filter in Venkman). It would be much better if we were to somehow only care about a subset of the loaded scripts, reducing the amount of time spent trying to figure out everything we have loaded.

Silver came up with a number of possible solutions:

  1. filter inside JSD so it only enumerates what you claim you're interested in. That'd mean Venkman needs to re-ask it when the user says to load them all.
  2. filter in the enumerator hook in Venkman, so it knows about them all but only loads the full shebang for each when it wants.
  3. see if you can't stop Venkman getting the line map in onScriptCreated.

(of course, for 'Venkman', one could read $FRONTEND_DEBUGGER if necessary - but the separation between front end and jsd is relevant).

Filtering inside JSD would require new API functions/methods/constants/what-have-you. We can't do that on the 1.8 Branch without new interfaces, and we can't do it at all after June 20th, at least such is the current schedule. I'm reluctant in making promises that I can produce a new API like that, being sufficiently stable to be used as-is in less than two weeks (keep in mind it needs review and baking). Thoughts? Filtering in Venkman/FE would be easier, though I'm wondering what the tradeoff between filtering in the backend compared to filtering in the front-end would be.

I think that's all for now - anyone reading this, feel free to direct comments to email, discussion page here, comment on my blog, or bug 338978, etc.