User:GijsKruitbosch/JS Debugging/Profiling
Results
For the impatient:
http://www.gijsk.com/mozilla/soc/jprof2.html (790kb, fast machine, less data) http://www.gijsk.com/mozilla/soc/jprof.html (1.7MB, slow machine, more data)
Main machine (fast)
On a 3.4Ghz P4, 1GB of DDR2 RAM in Dual Channel, with lots of diskspace to spare, running Ubuntu Dapper and GNOME, Venkman took less than 3 seconds to start. The full profile can be found here (roughly 790kb html file). A summary of the top of the flat profile is as follows:
68 5.1 js_LineNumberToPC 61 4.5 js_SrcNoteLength 45 3.4 js_Interpret 33 2.5 js_PCToLineNumber 23 1.7 js_SearchScope 21 1.6 free 17 1.3 pthread_mutex_lock 15 1.1 __i686.get_pc_thunk.bx 13 1.0 _PR_x86_AtomicDecrement 13 1.0 XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) 13 1.0 js_LookupPropertyWithFlags 11 0.8 JS_HashTableRawLookup 11 0.8 pthread_mutex_unlock 10 0.7 PR_Unlock 10 0.7 js_Mark 10 0.7 .L206 10 0.7 js_GetSrcNoteOffset 9 0.7 __i686.get_pc_thunk.bx 9 0.7 _PR_x86_AtomicIncrement 8 0.6 js_GetProperty 8 0.6 __i686.get_pc_thunk.bx 8 0.6 JS_GetPrivate 7 0.5 pthread_equal 7 0.5 .L2285 7 0.5 jsd_FunctionCallHook 7 0.5 js_CompareAndSwap 7 0.5 .plt 7 0.5 nsCOMPtr_base::~nsCOMPtr_base() 7 0.5 js_GetGCThingFlags 7 0.5 js_Invoke 6 0.4 js_GetToken 6 0.4 nsXPTType::nsXPTType() 6 0.4 nsID::Equals(nsID const&) const 6 0.4 __pthread_getspecific 6 0.4 PR_AtomicIncrement 6 0.4 js_AddScopeProperty 6 0.4 PR_Lock 6 0.4 __libc_malloc 5 0.4 js_atom_marker 5 0.4 JS_GetClass 5 0.4 fegetexcept 5 0.4 memcpy 5 0.4 PR_GetCurrentThread 5 0.4 JS_GetContextThread 5 0.4 .L2258 5 0.4 XPCCallContext::XPCCallContext(XPCContext::LangType, JSContext*, JSObject*, JSObject*, long, unsigned int, long*, long*) 5 0.4 __libc_enable_secure 5 0.4 GetPropertyTreeChild 5 0.4 memmove 5 0.4 js_SetProperty 5 0.4 js_NewGCThing
Laptop (slow)
On a 850Mhz Celeron, 384MB RAM, running the same Ubuntu Dapper and GNOME, Venkman took slightly less than 20 seconds to start. The full profile can be found here (warning: 1.7MB html file). A summary of the top of the flat profile is as follows:
359 4.9 js_Interpret 254 3.5 js_LineNumberToPC 171 2.3 js_SearchScope 165 2.2 js_SrcNoteLength 148 2.0 js_LookupPropertyWithFlags 143 1.9 js_PCToLineNumber 137 1.9 free 91 1.2 js_Invoke 79 1.1 JS_HashTableRawLookup 75 1.0 js_GetProperty 74 1.0 pthread_mutex_lock 72 1.0 __i686.get_pc_thunk.bx 66 0.9 nsID::Equals(nsID const&) const 64 0.9 .L1306 55 0.7 __libc_malloc 55 0.7 .L2258 54 0.7 js_Mark 51 0.7 js_GetSrcNoteOffset 51 0.7 .L206 51 0.7 .L2285 49 0.7 JS_GetPrivate 49 0.7 __i686.get_pc_thunk.bx 48 0.7 js_SetProperty 47 0.6 XPCWrappedNative::GetWrappedNativeOfJSObject(JSContext*, JSObject*, JSObject*, JSObject**, XPCWrappedNativeTearOff**) 46 0.6 malloc_usable_size 43 0.6 pthread_mutex_unlock 42 0.6 XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) 37 0.5 PR_Unlock 37 0.5 .plt 36 0.5 __i686.get_pc_thunk.bx 36 0.5 __libc_enable_secure 35 0.5 js_AddScopeProperty 34 0.5 PR_GetCurrentThread 34 0.5 __pthread_getspecific 33 0.4 __libc_calloc 33 0.4 JS_DHashTableOperate 32 0.4 js_FindProperty 32 0.4 __i686.get_pc_thunk.bx 32 0.4 js_NewGCThing 31 0.4 JS_GetClass 31 0.4 nsAString_internal::~nsAString_internal() 30 0.4 PR_Lock 29 0.4 GetPropertyTreeChild 29 0.4 __i686.get_pc_thunk.bx 29 0.4 js_GC 28 0.4 GetChar 28 0.4 SearchTable 27 0.4 js_MarkScript 26 0.4 js_GetGCThingFlags 26 0.4 nsWindowSH::GetProperty(nsIXPConnectWrappedNative*, JSContext*, JSObject*, long, long*, int*)
Preliminary conclusions
First of all, major kudos to Silver aka James Ross for helping me out here. It was much appreciated.
Now, if you look at the tops of the profiles, you can see that js_LineNumberToPC is taking up a fair amount of time. The (speculated) reason for this is that this line is taking up a large amount of time, because Venkman will be called back and do magic for each script jsd knows about (note: this eventually calling this, which is what you see happening in the profile).
So, this call seems to be taking up a lot of time, but without them you can't debug anything but new scripts you load. At which point I thought, well, a webdeveloper won't care about most of the scripts are loaded (compare the 'exclude chrome' filter in Venkman). It would be much better if we were to somehow only care about a subset of the loaded scripts, reducing the amount of time spent trying to figure out everything we have loaded.
Silver came up with a number of possible solutions:
- filter inside JSD so it only enumerates what you claim you're interested in. That'd mean Venkman needs to re-ask it when the user says to load them all.
- filter in the enumerator hook in Venkman, so it knows about them all but only loads the full shebang for each when it wants.
- see if you can't stop Venkman getting the line map in onScriptCreated.
(of course, for 'Venkman', one could read $FRONTEND_DEBUGGER if necessary - but the separation between front end and jsd is relevant).
Filtering inside JSD would require new API functions/methods/constants/what-have-you. We can't do that on the 1.8 Branch without new interfaces, and we can't do it at all after June 20th, at least such is the current schedule. I'm reluctant in making promises that I can produce a new API like that, being sufficiently stable to be used as-is in less than two weeks (keep in mind it needs review and baking). Thoughts? Filtering in Venkman/FE would be easier, though I'm wondering what the tradeoff between filtering in the backend compared to filtering in the front-end would be.
I think that's all for now - anyone reading this, feel free to direct comments to email, discussion page here, comment on my blog, or bug 338978, etc.