Webdev:Meetings:2009-09-01


Open Items

  • AMO team is seeking ideas: We generate CSVs from our statistics data for add-on authors. There are 3 date groupings and 8 different ways to plot the data. The problem is, the historical data continues to grow and we're running out of memory building these huge CSVs on the fly. Ideas:
    • When the uninstall survey started dying on CSVs we used a cron to build them and cache them to disk. If we do this for all our add-ons that's well over 200,000 files and growing. Perhaps we can combine this with one of the other ideas.
    • Provide less historical data. Right now it goes all the way back. Restricting that is weak sauce.
    • Reduce the number of groupings/plots. What if we just provided CSVs for daily downloads with a couple of sets of columns? That's only ~15000 files per set of columns. Still a lot.
      • 1 row = 1 day, right? What if, for data older than $x weeks, we only offered weekly or monthly totals? E.g. data older than 6 months becomes 1 row = 1 week (or 1 month).
    • Generate CSVs for add-ons with more than $x weeks of history. Eventually we'll have #1.
    • Write something way lighter weight to build CSVs on the fly. We can't scale this way forever though.
    • Limit the number of rows returned but provide paging params to view older ranges of data
    • Output CSV as it is generated and bypass Cake views, thus avoiding the need to generate huge arrays of data
    • Each add-on id gets its own tables stats.addonid.* and some data is only offered for a year:
      • *.downloads: date, version, number of downloads
      • *.usage_total: date, sum of update pings
      • *.usage_apps: date, app, update pings
      • *.usage_ly_versions: date, version, update pings (only for last year)
      • *.usage_ly_apps_and_versions: date, version, app, appversion, update pings, userEnabled pings, incompatible pings (only for last year; is there a need for needsDependencies or blocklisted?)
      • *.usage_ly_os: date, app, os, update pings (only for last year)
    • Maybe a service to mail the developer the CSV data once per week/month
    • Can metrics do this for us?
    • Why get all the data in memory at once? We could have a little service that builds the CSV and streams it out to disk. Let that stay cached for however long is appropriate.
    • add more ideas! thx
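
Two of the ideas above (writing the CSV out as it is generated, and offering only monthly totals for older data) could be combined roughly as follows. This is only a sketch in Python, not the AMO code: the `(date, downloads)` row shape, the function names `rollup_old_rows` and `stream_csv`, and the assumption that rows arrive sorted by date ascending are all hypothetical.

```python
import csv
import io
from datetime import date
from typing import Iterable, Iterator, TextIO, Tuple

Row = Tuple[date, int]  # hypothetical row shape: (day, download count)


def rollup_old_rows(rows: Iterable[Row], cutoff: date) -> Iterator[Tuple[str, int]]:
    """Collapse rows older than `cutoff` into monthly totals; pass newer rows through.

    Assumes `rows` is sorted by date ascending. Only one month's running
    total is held in memory at a time.
    """
    month_key = None
    month_total = 0
    for day, count in rows:
        if day < cutoff:
            key = day.strftime("%Y-%m")
            if key != month_key:
                if month_key is not None:
                    yield month_key, month_total  # finished the previous month
                month_key, month_total = key, 0
            month_total += count
        else:
            if month_key is not None:
                yield month_key, month_total  # flush the last rolled-up month
                month_key = None
            yield day.isoformat(), count
    if month_key is not None:
        yield month_key, month_total


def stream_csv(rows: Iterable[Tuple[str, int]], out: TextIO,
               header: Tuple[str, str] = ("date", "downloads")) -> int:
    """Write rows to `out` one at a time and return the number of data rows written."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


if __name__ == "__main__":
    sample = [
        (date(2009, 1, 1), 5),
        (date(2009, 1, 2), 7),
        (date(2009, 2, 1), 1),
        (date(2009, 3, 1), 4),
        (date(2009, 3, 2), 6),
    ]
    buf = io.StringIO()
    stream_csv(rollup_old_rows(sample, cutoff=date(2009, 3, 1)), buf)
    print(buf.getvalue())
```

Because both functions work on iterators, `rows` could be a database cursor and `out` could be a file on disk (for the cron/cache idea) or the HTTP response stream; either way the full result set is never held in one array. If the cutoff falls mid-month, the days before it are rolled into a partial monthly total and the rest of that month stays daily.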