User:Ffledgling/Senbonzakura

This service (I'm calling it Senbonzakura for now) will generate partial .mar (Mozilla ARchive) files for updates from Version A to Version B on demand.

Benefits

  • Generate updates on the fly
  • Generate updates on a need-only basis
  • Separate the update mar generation process from the build process (speed up ze builds!)
  • Greater flexibility in what update paths we need/want

Structure

Function Signature (?)

Input: URL for Cmar1, URL for Cmar2, Cmar1 Hash, Cmar2 Hash
Output: Pmar1-2
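
A minimal sketch of what this might look like in Python (the function name and return convention are placeholders, not a settled interface):

<pre>
# Hypothetical signature; all names are placeholders.
def generate_partial_mar(cmar1_url, cmar2_url, cmar1_hash, cmar2_hash):
    """Fetch the two complete mars, verify their hashes, and return
    the path to the generated partial mar (Pmar1-2)."""
    raise NotImplementedError
</pre>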

Web API implementation

There are a few questions we might want to answer before we begin designing our API.

  • What kind of requests will be sent to the API?
  • Who will be sending them?
  • What will they look like? What format will they use?
    • HEAD: return pmar's meta information?
    • GET: retrieve pmar if it exists, else error?
    • POST: request pmar generation, given the input mar URLs and hashes
    • PUT: ?
    • PATCH: ?
    • DELETE: Delete mentioned update path? (do we want this?)
  • Do we need a separate admin API?
  • What might we need it for?
    • Cache control (invalidation, flush, changes?)
    • Starting/restarting service (or will this be done via ssh?)
    • ?
  • What kind of information is to be exposed by the API?
    • Available update paths?
    • ?

Resources:

  • http://blog.luisrei.com/articles/rest.html
  • http://blog.luisrei.com/articles/flaskrest.html
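
Since one of the linked resources covers Flask, here is a rough Flask sketch of the GET/HEAD/POST ideas above; the routes, field names, and helpers (lookup_pmar, trigger_generation) are assumptions, not a settled design:

<pre>
from flask import Flask, request, jsonify, abort

app = Flask(__name__)

# Hypothetical stubs standing in for the real cache and generator.
def lookup_pmar(identifier):
    return None

def trigger_generation(url1, url2, hash1, hash2):
    return '%s-%s' % (hash1[:8], hash2[:8])

@app.route('/partial/<identifier>', methods=['GET', 'HEAD'])
def get_partial(identifier):
    meta = lookup_pmar(identifier)
    if meta is None:
        abort(404)  # GET: error if the pmar does not exist
    # Flask serves HEAD as GET without a body, so this also covers
    # "HEAD: return pmar's meta information".
    return jsonify(meta)

@app.route('/partial', methods=['POST'])
def request_partial():
    # POST: request pmar generation, given the input mar URLs and hashes.
    data = request.get_json()
    identifier = trigger_generation(data['cmar1_url'], data['cmar2_url'],
                                    data['cmar1_hash'], data['cmar2_hash'])
    # 202 Accepted leaves room for asynchronous generation
    # (see the Pertinent Questions section).
    return jsonify({'identifier': identifier}), 202
</pre>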

Internally

  • Fetch the Cmars.
    Use a resilient retry library here.
  • Verify the hashes (sanity check).
  • Cache the Cmars.
    Where and how needs to be decided, so ideally have abstraction functions approximating: storage of a Cmar, lookup of a Cmar by its hash, and retrieval of a Cmar by its hash (see the sketch after this list).
  • Determine which versions of the mar and mbsdiff tools to use, and use them.
    These probably need to be cached as well, maybe keyed on their own version, maybe on the Gecko version; simply keep a function that decides which one to use and points you to the right one, and treat everything behind it as an abstraction.
    We might also have to cache these based on the versions in the update paths we're given.
  • Generate the partial mar file from the input mars using the chosen mar and mbsdiff tools.
  • Cache the generated partial mar file, keyed on the update path or on a combination of the hashes of the input mar files.
    Where and how the partial mars are actually cached again depends on our caching strategy; we simply use our abstraction functions.
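
As a rough illustration of the storage/lookup/retrieval abstraction above, here is a sketch where a flat local directory stands in for whatever backing store (local disk, S3, ...) is eventually chosen; all names and the location are placeholders:

<pre>
import os

CACHE_DIR = '/tmp/senbonzakura-cache'  # placeholder location

def cache_store(hash_, data):
    """Store a blob (e.g. a Cmar) keyed by its hash."""
    if not os.path.isdir(CACHE_DIR):
        os.makedirs(CACHE_DIR)
    with open(os.path.join(CACHE_DIR, hash_), 'wb') as f:
        f.write(data)

def cache_lookup(hash_):
    """Return True if a blob with this hash is cached."""
    return os.path.exists(os.path.join(CACHE_DIR, hash_))

def cache_retrieve(hash_):
    """Return the cached blob for this hash, or None if absent."""
    path = os.path.join(CACHE_DIR, hash_)
    if not os.path.exists(path):
        return None
    with open(path, 'rb') as f:
        return f.read()
</pre>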

API & Frontend

  • have a web API that allows one to request partial mar generation between two given mar files. (Priority)
  • have a GUI/webpage front end that does roughly the same

Scaling, Resilience and Caching

It is probably best to design for scalability, resilience and caching from the ground up, so things to keep in mind are:

  • Retry, retry, retry.
  • Log more than enough to debug.
  • Have our application/service start up from a config file.
  • Do not trust your machine to store state; keep it on disk or in a file?
  • Abstraction, abstraction, abstraction?

When trying to combine scaling and caching, we need to think about how and where we'll store all our cached stuff:

  • locally on each machine?
  • S3?

How do we optimize caching? That will depend on the caching strategy.
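
In the "retry, retry, retry" spirit, a minimal retry-with-backoff helper might look like the following; requests is an assumed dependency, and the attempt count and delay are arbitrary:

<pre>
import time
import requests

def fetch_with_retries(url, attempts=5, delay=1.0):
    """GET url, retrying with exponential backoff before giving up."""
    for attempt in range(attempts):
        try:
            response = requests.get(url)
            response.raise_for_status()
            return response.content
        except requests.RequestException:
            if attempt == attempts - 1:
                raise
            time.sleep(delay * 2 ** attempt)
</pre>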

Level 1 Caching/Storage

We simply store partialMar.versionA.versionB somewhere, perhaps centrally on an FTP server or on S3.
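
A direct translation of that naming scheme (the exact key format and backend are still open):

<pre>
def pmar_cache_key(version_a, version_b):
    """Key under which the partial mar for A -> B is stored."""
    return 'partialMar.%s.%s' % (version_a, version_b)

# e.g. pmar_cache_key('29.0', '30.0') == 'partialMar.29.0.30.0'
</pre>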

Level 2 Caching

A lot of the bigger stuff between releases, like the XUL libs on every platform, remains the same despite different locales; this locale-independent stuff should probably be cached and re-used. Since we plan to do things at the file level, we might also want to cache the diffs between the commonly used files to speed things up further. What kind of speedup will this give us? (Is this possible with the way our scripts currently work? I think it is; confirmation needed.)
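
One way this could work, sketched under the assumption that we diff file-by-file: cache each per-file diff keyed on the (source, destination) content hashes, so identical locale-independent files reuse a single diff. run_mbsdiff stands in for invoking the real mbsdiff tool, and SHA-512 is an arbitrary choice:

<pre>
import hashlib

def file_hash(path):
    """Hash of a file's contents."""
    with open(path, 'rb') as f:
        return hashlib.sha512(f.read()).hexdigest()

_diff_cache = {}  # (src_hash, dst_hash) -> diff bytes

def cached_file_diff(src_path, dst_path, run_mbsdiff):
    """Return the binary diff between two files, reusing a cached diff
    when the same (source, destination) pair was diffed before."""
    key = (file_hash(src_path), file_hash(dst_path))
    if key not in _diff_cache:
        _diff_cache[key] = run_mbsdiff(src_path, dst_path)
    return _diff_cache[key]
</pre>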

Signing and Certs

Still very hazy on how this plugs into the rest of the system, where it's needed, and how, if at all, it changes things. Feedback needed from catlee, nthomas, bhearsum.

Pertinent Questions

  • does the client require the request to be synchronous or asynchronous?
  • does the client require any progress information?
  • will any client need to ask if the partial mar already exists?
  • how will cache maintenance/invalidation be handled? (same api, admin api, cli, scripts, docs?)
  • what type of docs are planned?

Issues

  1. Catlee's partials on demand vs. nthomas's ... something else (https://bugzilla.mozilla.org/show_bug.cgi?id=770995#c0)
  2. Signing explanation
  3. What do we do about the tool versioning?

Deliverables

I do not have a concrete idea of the deliverables, so everything below is subject to possibly radical change, but for now, this is what makes sense to me:

Prototype 0.1

The initial prototype will simply be a bunch of Python that takes the input MAR URLs, diffs them, and spits out the result.

Prototype 0.2

The second prototype starts to add the caching functions, resilience logic, and mar/mbsdiff tool versioning logic, and generally attempts to map out the entire structure/flow of the code.
We should probably have some ideas about the certs at this point as well.

Deliverable 1.0

Have all the basic services up and running with our partial mar (Level 1) caching in place; we should ideally try deployment on a machine in the cloud and let it run for a bit to see how things go.

Deliverable 1.x

Change things around based on feedback from various team members, fine-tune the system, add requested features, and, most importantly, iron out glitches and swat those bugs.

Unit Tests

Unit-Test as much code as possible
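
A token example in this spirit, exercising the hypothetical pmar_cache_key helper sketched in the Level 1 caching section (redefined here so the test is self-contained):

<pre>
import unittest

def pmar_cache_key(version_a, version_b):  # from the Level 1 sketch
    return 'partialMar.%s.%s' % (version_a, version_b)

class TestCacheKeys(unittest.TestCase):
    def test_pmar_cache_key(self):
        self.assertEqual(pmar_cache_key('29.0', '30.0'),
                         'partialMar.29.0.30.0')

if __name__ == '__main__':
    unittest.main()
</pre>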

Docs

Keep documenting stuff being done

Environment

What's required for:

  • Dev environment
  • Deployment/production

Possible stuff at the moment:

  1. Python
  2. pip
  3. virtualenv

People to contact

In no particular order:

  1. bhearsum
  2. catlee
  3. nthomas
  4. hwine

Related Bug #s