Breakpad/WorkWeek

From MozillaWiki

Latest revision as of 22:54, 13 March 2009

Mission

Make the web better.

  • Harvest crash report data
  • Generate aggregate statistics on data to understand trends
  • Publish individual reports for drill-downs and forensics
  • Determine the cause and frequency of all major crashes
  • Help Mozilla focus development resources in areas of highest impact

Goals this Week

  • Understand why reporter queries time out
  • Evaluate partitioning strategy
  • Evaluate hardware configuration
  • Evaluate backup and replication methods
  • Develop better/best practices for PostgreSQL

TODO

  • Performance
    • postgresql.conf changes
    • fs/os changes
    • query tuning
      • monitor
      • reports
      • interactive query tuning
        • memory checks
    • index/schema check
    • performance monitoring
      • install tools
      • check results
    • re-tuning & hw recommendations
    • application performance check
      • monitor/processor -- python
      • reporting -- php
    • full text for signatures
  • Failover
    • Uptime requirements
    • Prepare failover plan
    • Failover config & scripts
  • Partitioning
    • Partition scheme check
    • Check for bad queries
    • Rolloff plans
    • Deal with archive partition
  • IT Admin
    • Monitoring -- plans
      • Implementation
    • Backup & Redundancy plan
      • Implementation
  • Capacity Planning
    • (see above) Archiving Discussion
      • Implement archiving?
      • RRD?
    • Data growth analysis
    • New feature planning
  • DumpingDumpTables (https://intranet.mozilla.org/Socorro:DumpingDumpTables)

Schedule

  • Tuesday
    • Partitioning
      • RRD/Archiving?
    • Python Check
    • Full Text
    • Initial Replication
  • Wednesday
    • Continue replication
    • Query analysis etc.
    • Capacity planning data
    • Bug list and future feature planning
    • Interactive query checking
  • Thursday
    • Capacity Planning session
      • Schema changes?
    • PHP analysis?
    • Deploy replication?
  • Friday
    • Query analysis pt.2
    • Troubleshooting

Travel Schedules

  • aking
    • Arrive SJ @ 10:30 - will miss first half of Monday's meetings
    • Depart SJ @ 17:10 - cab @ 3pmish
    • Wild Palms
  • lars
    • Arrive SJ @ 17:00 Sunday, March 8
    • Depart SF @ 22:12 Saturday, March 14
    • Sweaty Palms
  • morgamic
    • Arrive SJ @ 22:30 Sunday, March 8
    • Depart SJ @ 19:20 Friday, March 13
    • Wild Palms

Indexes to Remove

  • productdims.productdims_product_version_key (not used, table is small)
  • reports_########.reports_########_product_version_key
  • topcrashers.idx_topcrashers_total
  • urldims.urldims_pkey
  • drop urldims.id altogether
  • urldims.urldims_url_domain_key
  • create pkey on url
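
The list above could be turned into DDL along the following lines. This is only a sketch that builds SQL text and executes nothing. It assumes that the primary key named in the list sits on urldims.id (so dropping the column removes it), that `url` is a column of urldims, and that the `########` in the partition names is a per-partition suffix supplied by the caller; the function name and the suffix in the usage line are hypothetical. If any listed index backs a UNIQUE constraint, PostgreSQL will refuse DROP INDEX and ALTER TABLE ... DROP CONSTRAINT would be needed instead. If other tables reference urldims.id, the DROP COLUMN would also need to be revisited.

```python
# Sketch only: returns SQL strings for review, never touches a database.

# Indexes from the list above that exist once (not per partition).
GLOBAL_INDEXES = [
    "productdims_product_version_key",
    "idx_topcrashers_total",
    "urldims_url_domain_key",
]


def index_cleanup_sql(partition_suffixes):
    """Build the DDL implied by the 'Indexes to Remove' list.

    partition_suffixes: the values standing in for ######## in the
    reports_######## partition names (assumed to be known by the caller).
    """
    stmts = ["DROP INDEX IF EXISTS %s;" % name for name in GLOBAL_INDEXES]

    # One product/version index per reports partition.
    for suffix in partition_suffixes:
        stmts.append(
            "DROP INDEX IF EXISTS reports_%s_product_version_key;" % suffix
        )

    # Assumption: the existing primary key is on urldims.id, so dropping
    # the column also drops that key and its index.
    stmts.append("ALTER TABLE urldims DROP COLUMN id;")
    # Assumption: url is a column of urldims and is unique and non-null.
    stmts.append("ALTER TABLE urldims ADD PRIMARY KEY (url);")
    return stmts


if __name__ == "__main__":
    # "20090309" is a made-up suffix used purely for illustration.
    for statement in index_cleanup_sql(["20090309"]):
        print(statement)
```

Generating the statements as text first leaves room to review them, and to run them inside a transaction, before anything is dropped.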

Notable Queries

  • select * from pg_stat_user_tables where seq_scan > 100;
  • select * from pg_stat_user_tables where seq_scan > 100 and pg_relation_size(relid) > 800000;
  • select * from pg_statio_user_tables where relname = 'urldims';
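
One way to read the results of these queries is sketched below. The first helper applies the same thresholds as the second query to rows that have already been fetched. The second computes a heap cache hit ratio from the `heap_blks_hit` and `heap_blks_read` columns of pg_statio_user_tables. The function names, the `size_bytes` key (standing in for the result of pg_relation_size) and the sample rows in the usage lines are assumptions made for illustration, not part of the original notes.

```python
def seq_scan_candidates(rows, min_scans=100, min_bytes=800000):
    """Return names of tables that are both frequently seq-scanned and large.

    rows: dicts with 'relname', 'seq_scan' and 'size_bytes' keys, e.g. built
    from pg_stat_user_tables plus pg_relation_size (size_bytes is an assumed
    key name).
    """
    return [
        r["relname"]
        for r in rows
        if r["seq_scan"] > min_scans and r["size_bytes"] > min_bytes
    ]


def heap_hit_ratio(row):
    """Fraction of heap block requests served from shared buffers.

    row: dict with 'heap_blks_hit' and 'heap_blks_read', as found in
    pg_statio_user_tables. Returns None when the table has no recorded I/O.
    """
    hit = row["heap_blks_hit"]
    read = row["heap_blks_read"]
    total = hit + read
    if total == 0:
        return None
    return float(hit) / total


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        {"relname": "urldims", "seq_scan": 500, "size_bytes": 900000},
        {"relname": "productdims", "seq_scan": 500, "size_bytes": 8192},
    ]
    print(seq_scan_candidates(sample))
    print(heap_hit_ratio({"heap_blks_hit": 90, "heap_blks_read": 10}))
```

A large table with many sequential scans is a candidate for a new index or a rewritten query. A small table that is scanned sequentially is usually fine, which is why the size filter is there.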