Services/Sync/Server/Archived/HereComesEverybody
Revision as of 17:04, 8 February 2010
The "Here Comes Everybody" release is intended to scale services.mozilla.com and Weave Sync up to match potential usage as a built-in feature in Firefox.
Storage
- What are our selection criteria?
- Cost per user?
- Ops complexity?
- Feature development constraints?
- Notes on our data characteristics
Candidates
- MySQL variants
- Classic - single WBO table, InnoDB, no memcached
- Current implementation
- Need to add memcached caching for collection counts / dates?
- tests/python/run_server_tests.py yields failures=11, errors=2
- Table-per-user, MyISAM
- http://code.flickr.com/blog/2010/02/08/using-abusing-and-scaling-mysql-at-flickr/
- Can MySQL handle millions of tables, perhaps with directory hashing?
- MongoDB
- telliott has a PHP storage backend
- tests/python/run_server_tests.py yields failures=22, errors=4
- Cassandra
- User:LesOrchard is working on a PHP storage backend
- Cassandra backend implementation notes
- tests/python/run_server_tests.py yields failures=25, errors=10
- HBase?
- Hypertable?
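The directory-hashing question under the table-per-user candidate can be made concrete with a small sketch. This is only an illustration under stated assumptions: the `wbo_` prefix, the use of SHA-1, and the bucket count of 256 are all hypothetical and do not come from the current implementation.

```python
import hashlib


def user_table_name(username: str, prefix: str = "wbo_") -> str:
    """Derive a per-user table name from a hash of the username.

    The prefix and the choice of SHA-1 are assumptions for illustration.
    """
    digest = hashlib.sha1(username.encode("utf-8")).hexdigest()
    return prefix + digest


def user_bucket(username: str, num_buckets: int = 256) -> int:
    """Pick a bucket (a database or a directory) for the user's table.

    Spreading tables over a fixed number of buckets keeps any single
    database directory from holding millions of table files.
    """
    digest = hashlib.sha1(username.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_buckets
```

Both functions are deterministic, so any web head can locate a user's table without a lookup service. Whether MySQL behaves well with this many tables per bucket is exactly the open question above.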
Rejected
- MySQL variants
- Collection-per-table, InnoDB
- Not enough space savings to justify the rework and the restrictions it introduces
- CouchDB
- Rejected already? Concerns over the ability to keep a large-scale installation running
- SQLite DB per user
- MHanson has experimented with a sqlite-based server that creates one db per user. In the default configuration it runs very slowly, because of fsync overhead on ext3. Turning off file synchronization helps a lot (250x speedup) but is obviously more dangerous. A different file system could help. Note that the Minimal Server does run on sqlite, because we do not expect heavy concurrent usage.
- Amazon Web Services
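For reference, the SQLite-per-user trade-off described above can be sketched with Python's standard `sqlite3` module. This is not MHanson's server; the file layout, the `wbo` table, and its columns are hypothetical. The only real knob shown is SQLite's `PRAGMA synchronous`, which controls whether commits wait for data to reach disk.

```python
import os
import sqlite3


def open_user_db(base_dir: str, username: str, durable: bool = True) -> sqlite3.Connection:
    """Open (or create) one SQLite database file per user.

    The path layout and schema are assumptions for illustration; a real
    server would also need to sanitize the username before using it in a path.
    """
    path = os.path.join(base_dir, f"{username}.db")
    conn = sqlite3.connect(path)
    # FULL waits for writes to reach disk on commit (safe, slow on ext3);
    # OFF skips that wait, which is much faster but risks corruption
    # if the machine loses power or the OS crashes.
    conn.execute("PRAGMA synchronous = " + ("FULL" if durable else "OFF"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS wbo ("
        "id TEXT PRIMARY KEY, collection TEXT, modified REAL, payload TEXT)"
    )
    return conn
```

Calling `open_user_db(base_dir, "someuser", durable=False)` corresponds to the "turning off file synchronization" configuration mentioned above.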
Service
There's been mention of rebuilding the service in a new language / framework.
Needs investigation: is the current implementation lacking, and if so, how and why? Is the system mostly storage-bound, or does the service itself currently introduce significant CPU / latency issues?
- What are our selection criteria?
- Cost per user?
- Ops complexity?
- Feature development constraints?
Evaluating
- PHP
- Current implementation in plain PHP
- Kohana?
- May not be worth it (vs switching to Python) unless there's a significant web UI added atop the REST API
- Python
- web.py
- Hosted atop apache / mod_wsgi
- Minimal python web framework
- May be too minimal to use beyond a REST API
- Django
- Hosted atop apache / mod_wsgi
- May be overkill for just a REST service, but could help with an added web UI
- See also, AMO / Zamboni
- Twisted
- Event-driven networking engine
- Can be high-performance, but is semi-exotic
- Tornado
- Non-blocking / event-driven web server
- Used by FriendFeed
- Comparable to Twisted, possibly less exotic, though still somewhat unusual
Capacity / Load Testing
Methodology
- Need to get our science on.
- Formulate the questions / testing criteria
- Max users per cluster unit?
- Cost / user
- Develop a traffic / load model
- Based on 1 day / 1 week of current Weave service logs?
- Employ a load cluster
- Machines to apply load to the test cluster
- e.g. using Grinder or similar
- Do we already have this available?
- We have been using pm-weave05.mozilla.com as our load initiator.
- Build out a test cluster based on a storage / service arch prototype
- We have pm-weave06.mozilla.com available to act as a prototype; it talks to pm-weavefs03.mozilla.com as a DB. pm-weavefs06 has a clean database setup on it as well.
- Perform experimental run of the load model using load cluster on the test cluster
- Variables to monitor over time during load test:
- Concurrent users
- Service
- CPU load
- Latency per request
- Network usage
- Storage
- CPU load
- Disk space usage
- Disk I/O usage
- Repeat load model runs, explore capacity by increasing intensity until failure modes are encountered
- What failure modes?
- Unacceptable latency
- Storage space exhausted
- Connection refusal due to runtime resource exhaustion (e.g. LDAP socket usage, MySQL socket usage)
- Form conclusions to answer questions
- Use shared spreadsheet [1] to gather results.
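The "increase intensity until failure modes are encountered" step could be driven by a loop along these lines. It is a rough sketch, not an existing tool: `RunResult`, `find_failure_point`, the `run_load` callback, and the 2-second latency threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    """Measurements from one load-model run (fields are assumptions)."""
    p95_latency_s: float
    disk_free_bytes: int
    refused_connections: int


def find_failure_point(
    run_load: Callable[[int], RunResult],
    start_users: int,
    step: int,
    max_users: int,
    max_latency_s: float = 2.0,
) -> Optional[Tuple[int, str]]:
    """Raise concurrent users step by step until a failure mode appears.

    Returns (user_count, failure_mode), or None if no failure was seen
    up to max_users.
    """
    users = start_users
    while users <= max_users:
        result = run_load(users)
        if result.p95_latency_s > max_latency_s:
            return users, "unacceptable latency"
        if result.disk_free_bytes <= 0:
            return users, "storage space exhausted"
        if result.refused_connections > 0:
            return users, "connection refusal"
        users += step
    return None
```

Here `run_load` stands in for whatever actually drives the load cluster against the test cluster. The user count at which each storage / service prototype first fails is the kind of number that would go into the shared spreadsheet.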
Plan / Schedule
- We need one.
- Who does what and by when?
- We need to select & build the storage / service prototypes
- Build a schedule for setting up each prototype and running the experiment
Load Model
- Do we have logs / log analysis?
- Reads per day? Rate & amount
- Per each hour of a day, to model usage patterns?
- Writes per day? Rate & amount
- Per each hour of a day, to model usage patterns?
- Reads per day? Rate & amount
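If the service logs turn out to be usable, the per-hour read / write profile could be extracted roughly as follows. This sketch assumes the logs are in Apache common / combined log format, which has not been confirmed here. Treating GET and HEAD as reads and every other method as a write is also an assumption, and the sketch counts requests only, not bytes transferred.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches the timestamp and request method of an Apache-style log line, e.g.
# ... [08/Feb/2010:17:04:00 -0800] "GET /path HTTP/1.1" ...
LOG_LINE = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [^\]]*\] "(?P<method>[A-Z]+) '
)

READ_METHODS = {"GET", "HEAD"}


def hourly_profile(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count read and write requests for each hour of the day (0-23)."""
    reads: Counter = Counter()
    writes: Counter = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        hour = int(match.group("hour"))
        if match.group("method") in READ_METHODS:
            reads[hour] += 1
        else:
            writes[hour] += 1
    return reads, writes
```

Run over one day or one week of logs, the two counters give the hour-by-hour shape of the traffic, which the load model can then replay at higher intensity.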
Testing Hardware
- What do we have, what can we get?
- Need a cluster of machines to retask for experimental evaluation of each storage tech under consideration
- Need machines to apply load to the experiment cluster
Tools
- Grinder
- ab
- log_replay
- Ask oremj?
- XDebug
- For profiling PHP, though it's doubtful there's enough PHP in play for it to make an order-of-magnitude difference vs storage concerns.
- We have a hand-crafted load tool, which is a blessing and a curse: [2]. It can simulate more complex interactions, e.g. create a bunch of users, do a bunch of inserts, delete users, with a rolling window, which might be harder to do with scripted tools.
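The rolling-window interaction that the hand-crafted tool simulates might look like this in outline. This is not the actual tool: `InMemoryServer` is a stand-in so the sketch runs without a network, and its method names, the `loadtest-` user naming, and the payload size are all hypothetical.

```python
from collections import deque


class InMemoryServer:
    """Stand-in for the real service; a real tool would issue HTTP requests."""

    def __init__(self):
        self.users = {}

    def create_user(self, name):
        self.users[name] = []

    def insert(self, name, item):
        self.users[name].append(item)

    def delete_user(self, name):
        del self.users[name]


def rolling_window_load(server, total_users, window, inserts_per_user):
    """Create users one by one, insert data, and delete the oldest user
    whenever more than `window` users are active.

    Returns the largest number of users present after any step.
    """
    active = deque()
    peak = 0
    for i in range(total_users):
        name = f"loadtest-{i}"
        server.create_user(name)
        active.append(name)
        for n in range(inserts_per_user):
            server.insert(name, {"id": n, "payload": "x" * 64})
        if len(active) > window:
            server.delete_user(active.popleft())
        peak = max(peak, len(server.users))
    return peak
```

The window keeps the number of live test users steady while still exercising user creation and deletion, which is the kind of stateful sequence that is awkward to express in a purely scripted replay.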
Developer Relations
- https://wiki.mozilla.org/Labs/Weave/Developer
- http://mozillalabs.com/weave/2010/02/05/weave-sync-new-apis-and-resources-for-developers/
- Need to build a developers.service.mozilla.com?
- See also: https://addons.mozilla.org/en-US/developers
- Present Weave / services.mozilla.com as an open service
- Messaging / copy / design needed
- Forums?
- Offer updated service docs and example clients
- Weave/Experimental_Clients/Web
- Weave/Experimental_Clients/iPhone
- Weave/Experimental_Clients/WebOS
- User:LesOrchard has worked on this in free time - time to apply official time to it?
- python command-line client
- User:Mhanson is working on this.
Marketing
- Marketing the Weave Sync addon
- Who to target?
- What expected growth?