Services/Sync/Server/Archived/HereComesEverybody
The "Here Comes Everybody" release is intended to scale services.mozilla.com and Weave Sync up to match potential usage as a built-in feature in Firefox.
Storage
- What are our selection criteria?
- Cost per user?
- Ops complexity?
- Feature development constraints?
- Concurrent users?
- Latency?
- Notes on our data characteristics
Candidates
- MySQL variants
- Classic - single WBO table, InnoDB, no memcached
- Current implementation
- Need to add memcached caching for collection counts / dates?
- Passes tests/test_1.0.php
- tests/python/run_server_tests.py yields failures=10
- Table-per-user, MyISAM
- Can MySQL handle millions of tables, e.g. by hashing them across directories?
- Maybe Flickr has some useful MySQL hacks?
- MongoDB
- telliott has a PHP storage backend
- Passes tests/test_1.0.php
- tests/python/run_server_tests.py yields failures=22, errors=4
- Cassandra
- User:LesOrchard is working on a PHP storage backend
- Cassandra backend implementation notes
- Passes tests/test_1.0.php
- tests/python/run_server_tests.py yields failures=15
- Most of the test failures look like issues with the tests themselves (e.g. order-sensitive where not necessary, or issues with index.php rather than with the storage backend)
- HBase?
- Hypertable?
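As an illustration of the directory-hash question raised for the table-per-user MySQL variant, here is a minimal sketch of how a username might be mapped to a database and table name. It relies on the fact that MySQL keeps each database in its own directory, so spreading per-user tables over a fixed number of databases bounds the number of table files per directory. The `weave_` and `wbo_` prefixes, the bucket count of 256, and the function name are all hypothetical, not part of the current server.

```python
import hashlib
from typing import Tuple


def table_location(username: str, num_databases: int = 256) -> Tuple[str, str]:
    """Map a username to a (database, table) pair.

    Hypothetical naming scheme: the database name carries the hash bucket,
    the table name carries a prefix of the username hash.
    """
    digest = hashlib.sha1(username.encode("utf-8")).hexdigest()
    # Use the first 32 bits of the hash to pick a database (i.e. a directory).
    bucket = int(digest[:8], 16) % num_databases
    return "weave_%03d" % bucket, "wbo_%s" % digest[:16]
```

The mapping is deterministic, so no lookup table is needed to find a user's data, but changing `num_databases` later would move most users to a different database.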
Rejected
- MySQL variants
- Collection-per-table, InnoDB
- Not enough space savings to justify reworking and restrictions introduced
- CouchDB
- Rejected already? Concerns over the ability to keep a large-scale installation running
- We should un-reject this candidate until after some discussion. Given the proper architecture, we could distribute the collections across many databases, many of them outside Mozilla's control, and CouchDB would be an interesting candidate for that. See TimelessN (tellis at mozillaDotCom) for details. (Note that we will probably re-reject this after the discussion.)
- SQLite DB per user
- MHanson has experimented with a SQLite-based server that creates one db per user. In the default configuration it runs very slowly, because of fsync overhead on ext3. Turning off file synchronization helps a lot (250x speedup) but is obviously more dangerous. A different file system could help. Note that the Minimal Server does run on SQLite, because we do not expect heavy concurrent usage there.
- Amazon Web Services
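For reference, the one-db-per-user SQLite experiment described above can be sketched roughly as follows. This is not MHanson's code. It assumes that "turning off file synchronization" corresponds to SQLite's `PRAGMA synchronous = OFF`, and the `wbo` table schema and the `open_user_db` helper are hypothetical.

```python
import os
import sqlite3
import tempfile


def open_user_db(root, username, durable=True):
    """Open (or create) the SQLite file for one user.

    With durable=False, SQLite stops waiting for fsync, which is much
    faster but can corrupt the file on a crash or power loss.
    Assumes `username` has already been validated as a safe file name.
    """
    conn = sqlite3.connect(os.path.join(root, username + ".sqlite"))
    conn.execute("PRAGMA synchronous = %s" % ("FULL" if durable else "OFF"))
    # Hypothetical minimal schema for Weave Basic Objects.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS wbo ("
        "id TEXT, collection TEXT, modified REAL, payload TEXT, "
        "PRIMARY KEY (collection, id))"
    )
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        conn = open_user_db(root, "alice", durable=False)
        conn.execute("INSERT INTO wbo VALUES (?, ?, ?, ?)",
                     ("abc", "tabs", 1.0, "{}"))
        conn.commit()
        conn.close()
```

Timing the same insert loop with `durable=True` and `durable=False` on the target file system would be one way to reproduce the speedup figure before relying on it.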
Service
There's been mention of rebuilding the service in a new language / framework.
Needs investigation: is the current implementation lacking, and if so, how and why? Is the system mostly storage-bound, or does the service layer itself introduce significant CPU load or latency?
- What are our selection criteria?
- Cost per user?
- Ops complexity?
- Feature development constraints?
Considerations:
- High latency is okay
- Tabs are updated very frequently; history somewhat frequently; everything else less so.
Candidates
This isn't (yet?) a concern in terms of performance / storage capacity. It's more a concern about maintainability and feature velocity.
- PHP
- Current implementation in plain PHP
- Kohana?
- May not be worth it (vs switching to Python) unless there's a significant web UI added atop the REST API
- Python
- web.py
- Hosted atop apache / mod_wsgi
- Minimal Python web framework
- May be too minimal to use beyond a REST API
- Django
- Implementation in progress (jensd)
- Hosted atop apache / mod_wsgi
- May be overkill for just a REST service, but could help with an added web UI
- See also, AMO / Zamboni
- A lot of Django's goodness comes from the ORM, which is not available if you're not on a SQL db.
- Twisted
- Event-driven networking engine
- Can be high-performance, but is semi-exotic
- Tornado
- Implementation in progress (mhanson)
- Non-blocking / event-driven web server
- Used by FriendFeed
- Comparable to Twisted, possibly less exotic, though still somewhat unusual
Capacity / Load Testing
Methodology
- Need to get our science on.
- Formulate the questions / testing criteria
- Max users per cluster unit?
- Cost / user
- Develop a traffic / load model
- Based on 1 day / 1 week of current Weave service logs?
- Employ a load cluster
- Machines to apply load to the test cluster
- e.g. using Grinder or similar
- Do we already have this available?
- We have been using pm-weave05.mozilla.com as our load initiator.
- Build out a test cluster based on a storage / service arch prototype
- We have pm-weave06.mozilla.com available to act as a prototype; it talks to pm-weavefs03.mozilla.com as a DB. pm-weavefs06 has a clean database setup on it as well.
- Perform experimental run of the load model using load cluster on the test cluster
- Variables to monitor over time during load test:
- Concurrent users
- Service
- CPU load
- Latency per request
- Network usage
- Storage
- CPU load
- Disk space usage
- Disk I/O usage
- Repeat load model runs, explore capacity by increasing intensity until failure modes are encountered
- What failure modes?
- Unacceptable latency
- Storage space exhausted
- Connection refusal due to runtime resource exhaustion (e.g. LDAP socket usage, MySQL socket usage)
- Form conclusions to answer questions
- Use shared spreadsheet [1] to gather results.
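The "repeat runs, increase intensity until a failure mode is hit" step of the methodology could be driven by a loop along these lines. This is only a sketch: `run_load` stands in for whatever actually applies load (Grinder, the hand-crafted tool, etc.), and the result keys `p95_latency_ms` and `refused` are assumed names for two of the failure modes listed above.

```python
def find_capacity(run_load, start_users, step, max_latency_ms, max_users):
    """Increase concurrent users until a failure mode appears.

    run_load(users) must return a dict with 'p95_latency_ms' and 'refused'
    (count of refused connections). Returns (last_ok, failed_at), where
    failed_at is None if no failure was seen up to max_users.
    """
    last_ok = None
    users = start_users
    while users <= max_users:
        result = run_load(users)
        too_slow = result["p95_latency_ms"] > max_latency_ms
        if too_slow or result["refused"] > 0:
            return last_ok, users
        last_ok = users
        users += step
    return last_ok, None


def fake_run_load(users):
    # Stand-in for a real load run: latency grows linearly with users.
    return {"p95_latency_ms": users * 2, "refused": 0}


last_ok, failed_at = find_capacity(fake_run_load, 100, 100, 500, 1000)
print(last_ok, failed_at)  # prints "200 300"
```

The `last_ok` figure is a candidate answer to "max users per cluster unit"; dividing the cost of the cluster unit by it gives a first estimate of cost per user.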
Plan / Schedule
Load Model
- Do we have logs / log analysis?
- Reads per day? Rate & amount
- Per each hour of a day, to model usage patterns?
- Writes per day? Rate & amount
- Per each hour of a day, to model usage patterns?
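If the service logs are available in Apache common/combined log format (an assumption; the actual log format is not stated here), a per-hour read/write profile could be extracted with something like the sketch below. Treating GET as a read and PUT/POST/DELETE as a write is also an assumption, and the sample paths are illustrative only.

```python
import re
from collections import defaultdict

# Matches the timestamp, request line, status and response size of an
# Apache common/combined log line.
LINE_RE = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>[A-Z]+) \S+ [^"]*" \d{3} (?P<size>\d+|-)'
)
WRITE_METHODS = {"PUT", "POST", "DELETE"}


def hourly_profile(lines):
    """Count reads, writes and response bytes per hour of day (0-23)."""
    profile = defaultdict(lambda: {"reads": 0, "writes": 0, "bytes_sent": 0})
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        bucket = profile[int(match.group("hour"))]
        if match.group("method") in WRITE_METHODS:
            bucket["writes"] += 1
        else:
            bucket["reads"] += 1  # GET, HEAD, etc. are counted as reads
        if match.group("size") != "-":
            bucket["bytes_sent"] += int(match.group("size"))
    return dict(profile)


sample = [
    '1.2.3.4 - alice [05/Feb/2010:13:04:05 -0800] "GET /1.0/alice/storage/tabs HTTP/1.1" 200 5120',
    '1.2.3.4 - alice [05/Feb/2010:13:10:00 -0800] "PUT /1.0/alice/storage/tabs/abc HTTP/1.1" 200 13',
    '1.2.3.4 - bob [05/Feb/2010:14:00:00 -0800] "GET /1.0/bob/info/collections HTTP/1.1" 200 -',
]
profile = hourly_profile(sample)
```

Note that the default log format records response size only, so the amount of data written would need an extra log field (or a different data source) to measure.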
Testing Hardware
- What do we have, what can we get?
- Need a cluster of machines to retask for experimental evaluation of each storage tech under consideration
- Need machines to apply load to the experiment cluster
Tools
- Grinder
- ab
- log_replay
- Ask oremj?
- XDebug
- For profiling PHP, though it's doubtful there's enough PHP in play for it to make an order-of-magnitude difference vs storage concerns.
- We have a hand-crafted load tool, which is a blessing and a curse: [2]. It can simulate more complex interactions, e.g. create a bunch of users, do a bunch of inserts, delete users, with a rolling window, which might be harder to do with scripted tools.
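To make the "rolling window" interaction concrete, here is a small sketch of the pattern: create users, insert items for each, and delete the oldest user once more than a fixed number are active. It is not the hand-crafted tool itself. `FakeClient` is an in-memory stand-in with a hypothetical interface; a real run would replace it with calls against the service.

```python
from collections import deque


class FakeClient:
    """In-memory stand-in for a service client (hypothetical interface)."""

    def __init__(self):
        self.users = {}

    def create_user(self, name):
        self.users[name] = []

    def insert(self, name, item):
        self.users[name].append(item)

    def delete_user(self, name):
        del self.users[name]


def rolling_window_load(client, total_users, window, inserts_per_user):
    """Create users one by one, keeping at most `window` of them alive."""
    active = deque()
    for i in range(total_users):
        name = "loadtest%d" % i
        client.create_user(name)
        for j in range(inserts_per_user):
            client.insert(name, {"id": j})
        active.append(name)
        if len(active) > window:
            # Retire the oldest user so the active set stays bounded.
            client.delete_user(active.popleft())
    return list(active)


client = FakeClient()
remaining = rolling_window_load(client, total_users=10, window=3,
                                inserts_per_user=5)
print(remaining)  # prints "['loadtest7', 'loadtest8', 'loadtest9']"
```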
Developer Relations
- https://wiki.mozilla.org/Labs/Weave/Developer
- http://mozillalabs.com/weave/2010/02/05/weave-sync-new-apis-and-resources-for-developers/
- Need to build a developers.service.mozilla.com?
- See also: https://addons.mozilla.org/en-US/developers
- Present Weave / services.mozilla.com as an open service
- Messaging / copy / design needed
- Forums?
- Offer updated service docs and example clients
- Weave/Experimental_Clients/Web
- Weave/Experimental_Clients/iPhone
- Weave/Experimental_Clients/WebOS
- User:LesOrchard has worked on this in free time - time to apply official time to it?
- Python command-line client
- User:Mhanson is working on this.
Marketing
- Marketing the Weave Sync add-on
- Who to target?
- What growth do we expect?