Tamarin:WeeklyUpdates: Difference between revisions

From MozillaWiki

Revision as of 22:28, 22 July 2008

These updates concern Tamarin and related projects only.

Meeting Details

  • 2:00pm Pacific Time (21:00 UTC) on Tuesdays
    • (5PM Eastern US, 11PM Oslo, 6AM (Wed) Seoul, 7AM (Wed) Melbourne)
  • Location: Tel: 866 705 2554 (us), 913 227 1201 (int’l)
    • Passcode: 914008
  • Duration: 60 minutes
  • Join #tamarin on irc.mozilla.org for attendance taking and questions

Next meeting's Agenda Items:

July 22nd 2008

Attendees

  • intel: moh, carmen, shengnan, jungwoo
  • mozilla: joel, dave, andreas
  • adobe: mason, dan, edwin s, rob, steve, vlad, jennifer, rick

Updates

  • (Andreas) SSE2 calling conventions. SSE2-only JIT? Shall we specialize with SSE2-only?
    • This may make slower processors even slower, which would affect Flash. It should be investigated, though. Would compile-time be enough? Need to look into this. SSE2 or later machines would make sense.
    • bug 440601
      • gregor at UCI posted some python code to look at functions on mac
      • what are the opinions on ____ at build-time? inline or static linking?
        • edwin thinks it's okay to have the utility only optimize for functions that we know are safe
        • performance-critical stuff seems to be in lib.c -- include them directly? seems to sound okay!
      • would anyone be willing to buddy up with andreas for linux?
        • edwin suggests andreas do a little profiling on some built-ins to see if performance is improved first.
  • Register usage analysis for builtins (andreas/gregor)
    • andreas will try to provide some numbers for the next call
  • (edwin) vprop for GCC
    • working on some patches for this
      • will push patches, pending successful testing on windows
  • (edwin) fragmento
    • assembler is long-lived but we want it to be short-lived
    • not re-using exit pages was using a lot of memory.
      • now, when recompiling a tree, optimizations make cache usage about 75% faster
    • andreas asks: will the incremental growth patch still be needed?
      • mason did the testing on bug 443111 -- some tests fail on linux, but other platforms look fine.
      • edwin will look into these crashes.
      • andreas will wait to make his check-in until edwin has investigated a bit further.
    • edwin also did a lot of code clean-up with this patch.
    • what about getting rid of the exit block entirely? (rick)
      • this would be nice, but haven't thought about the details yet.
    • in the end, edwin thinks we will want to use shared side exits.
      • andreas has not tried this yet.
  • (moh) Speeding up the hot loop of MMgc PinStackObjects using SSE2 parallel compares
    • https://bugzilla.mozilla.org/show_bug.cgi?id=446556
      • a large amount of stack scanning is happening.
      • moh has gotten about 7% performance improvements so far
      • there is a similar loop in MarkItem, but adjusting that has not yielded as much improvement.
      • edwin thinks this is a valuable patch
  • (Jungwoo) Update on parallel JIT'ing in TT.
    • Is trying to move all patches to the interpreter side.
    • 15-30% speedups could be possible!
      • edwin would like numbers with and without trees (these are without trees)
      • did intel look into MD5? no, not yet.
  • (Steven) Is tracking down performance issues for a Flex-based app for Tamarin Tracing. New TT vs. TC issues are coming up in doing this.
    • working on improving superwords -- preliminary patch is posted for review
      • roughly 20% interpreter speed-up, but no tracing improvement
      • trying to identify some other areas to work on
  • (Edwin) tri-server
    • shared sandbox is online! everyone is welcome to use this now.
    • working on modifying the build system to pull patches from different users' repositories so they get queued up properly.
  • (dan) automated performance metrics
    • we have now published a performance page.
      • once a checkin is made, buildbot runs performance tests on various platforms and posts results.
      • weekly performance data is also summarized.
    • how to track feedback to this system? log in bugzilla.
  • (dave) TraceMonkey:
    • still working on additional patches
      • will write up a wiki article soon
    • basic ARM testing is underway
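
For readers unfamiliar with the "SSE2 parallel compares" idea from moh's PinStackObjects item, here is a minimal sketch. It is not the patch from bug 446556. It assumes the hot loop filters 32-bit stack words by checking whether each falls inside a heap address range, and every name here (countCandidatesScalar, countCandidatesSSE2, heapStart, heapEnd) is hypothetical. The vector path tests four words per iteration and skips the per-word work when none of them can be a heap pointer. It falls back to the scalar loop when SSE2 is not available at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical scalar reference: counts words inside [heapStart, heapEnd).
static size_t countCandidatesScalar(const uint32_t* words, size_t n,
                                    uint32_t heapStart, uint32_t heapEnd) {
    size_t hits = 0;
    for (size_t i = 0; i < n; ++i)
        if (words[i] >= heapStart && words[i] < heapEnd)
            ++hits;
    return hits;
}

// Hypothetical SSE2 version: same result, four words per compare.
static size_t countCandidatesSSE2(const uint32_t* words, size_t n,
                                  uint32_t heapStart, uint32_t heapEnd) {
    assert(heapStart <= heapEnd);
    size_t hits = 0;
    size_t i = 0;
#if defined(__SSE2__)
    // SSE2 only offers signed 32-bit compares. Flipping the sign bit of
    // both operands maps unsigned ordering onto signed ordering.
    const __m128i bias = _mm_set1_epi32(static_cast<int>(0x80000000u));
    const __m128i lo = _mm_set1_epi32(static_cast<int>(heapStart ^ 0x80000000u));
    const __m128i hi = _mm_set1_epi32(static_cast<int>(heapEnd ^ 0x80000000u));
    for (; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(words + i));
        v = _mm_xor_si128(v, bias);
        const __m128i below = _mm_cmplt_epi32(v, lo);      // word < heapStart
        const __m128i underEnd = _mm_cmplt_epi32(v, hi);   // word < heapEnd
        // (~below) & underEnd: lanes with heapStart <= word < heapEnd.
        const __m128i inRange = _mm_andnot_si128(below, underEnd);
        const int mask = _mm_movemask_epi8(inRange);
        if (mask == 0)
            continue;  // none of the four words can point into the heap
        for (int k = 0; k < 4; ++k)
            if (mask & (1 << (4 * k)))
                ++hits;  // a real scanner would pin the object here
    }
#endif
    // Scalar tail (or the whole array when SSE2 is unavailable).
    hits += countCandidatesScalar(words + i, n - i, heapStart, heapEnd);
    return hits;
}
```

Both functions should agree on any input, so the scalar version can serve as a check on the vector one. The early `continue` is where a saving would come from if most stack words are not heap pointers, which is an assumption about typical stacks rather than a measured result.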

Older meetings