Storage: Difference between revisions

From MozillaWiki
(add disk seek performance optimization as a proposed development priority)
(further contextualize rust stuff)
 
(3 intermediate revisions by the same user not shown)
Storage is a SQLite database API.  API documentation is available in the [http://developer.mozilla.org/en/docs/Storage Mozilla Developer Center].

== Proposed Future Development Priorities ==

=== Full Text Indexing ===
* separation of indexing from storage
** Firefox could use this to index visited web pages without having to store the contents of the pages themselves.
** Thunderbird could use this to index email messages without having to store their contents in SQLite (ultimately it might make sense to store contents as well, but that's a larger, longer-term change).
** Komodo could use this to index source code stored in files on disk.
* built-in tokenization of Unicode text (i.e. without having to either compile in a large separate library or write custom functions);
* tokenization of marked up text (HTML, PDF, etc.).

=== Performance ===
Dr. Hipp posted to the SQLite users list about a disk seek performance optimization that could be a boon to Places performance:

  Creating an index on A,B is equivalent to sorting on A,B.
  The sorting algorithm currently used by SQLite requires
  O(NlogN) comparisons, which is optimal.  But it also requires
  O(N) disk seeks, which is very suboptimal.  You don't notice
  all these seeks if your database fits in cache.  But when you
  get into databases of about 3GB, the seeking really slows you
  down.

  A project on our to-do list is to implement a new sorter
  that uses O(1) seeks.  We know how to do this.  It is just
  finding time to do the implementation.

Storage can mean many things in Gecko and Firefox.

Possibilities include:
* mozStorage, the API that lives under [https://searchfox.org/mozilla-central/source/storage storage/] in the source tree.  This is a C++ wrapper around SQLite that is also exposed to JS via both XPCOM/XPConnect ([https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Storage API docs]) and via a friendlier JS wrapper, [https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Sqlite.jsm SQLite.jsm].  For Rust, the presumption is that Rust code will use [https://github.com/jgallagher/rusqlite Rusqlite] for interacting with SQLite.  However, there are prototype Rust bindings on [https://bugzilla.mozilla.org/show_bug.cgi?id=1482608 Bug 1482608] that could end up making some sense for code that is slowly migrating to Rust and has existing mozStorage consumers that are highly coupled.
* The profile "storage/" directory that holds per-origin storage managed by the QuotaManager subsystem and its clients: IndexedDB, the Cache API, and the asm.js cache (which will be removed).

== Storage Alternatives ==
There are many options for storing data in Gecko.  The Browser Architecture Group has [https://github.com/mozilla/firefox-data-store-docs cataloged] many of the existing uses.

If you are looking for a way to store data in Firefox, here are the usual options:
* JSON flat files.  This usually is not a viable option from C++ because we don't have convenient ways to manipulate JS objects from C++.
* [https://github.com/mozilla/rkv rkv], a "simple, humane, typed Rust interface to LMDB" available in Gecko since [https://bugzilla.mozilla.org/show_bug.cgi?id=1445451 Firefox 63]. LMDB is a memory-mapped flat file database.
* SQLite
** For JS code: Use [https://developer.mozilla.org/en-US/docs/Mozilla/JavaScript_code_modules/Sqlite.jsm SQLite.jsm]. The XPConnect interface to the mozStorage API used by C++ is also a possibility, but please consider enhancing SQLite.jsm if it's missing something you need.
** For C++ code: Use [https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Storage mozStorage].
** For Rust: This hasn't happened yet, but the presumption is that Rust code will use [https://github.com/jgallagher/rusqlite Rusqlite], as that's what's being used by the application services group and that's what mentat had used.  As noted up top, there are prototype Rust bindings on [https://bugzilla.mozilla.org/show_bug.cgi?id=1482608 Bug 1482608] that could make sense for a legacy consumer like Places, but there are a lot of questions and flux here.  Please reach out to the Storage Advisory Group to be part of the conversation on this if you're moving towards implementations with Rust and SQLite.
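
The remark in the quoted SQLite users list message, that creating an index on A,B is equivalent to sorting on A,B, can be checked with a small experiment. The following is only a minimal sketch, assuming Python's standard-library sqlite3 module as a stand-in for mozStorage, SQLite.jsm, or Rusqlite; the table name t, the index name t_a_b, and the helper functions are hypothetical and exist only for illustration. It compares the query plan for an ORDER BY a, b query with and without an index on (a, b), and confirms that the indexed query returns rows in sorted order.

```python
import sqlite3


def make_db(with_index: bool) -> sqlite3.Connection:
    """Build a small in-memory table, optionally indexed on (a, b)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, payload TEXT)")
    # Insert rows in an order that is not already sorted on (a, b).
    rows = [(n % 7, (n * 31) % 11, f"row{n}") for n in range(200)]
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    if with_index:
        conn.execute("CREATE INDEX t_a_b ON t (a, b)")
    return conn


def order_by_plan(with_index: bool) -> str:
    """Return the EXPLAIN QUERY PLAN detail text for ORDER BY a, b."""
    conn = make_db(with_index)
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT a, b FROM t ORDER BY a, b"
    ).fetchall()
    conn.close()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(str(row[-1]) for row in plan)


def ordered_pairs(with_index: bool) -> list:
    """Return the (a, b) pairs as SQLite orders them."""
    conn = make_db(with_index)
    pairs = conn.execute("SELECT a, b FROM t ORDER BY a, b").fetchall()
    conn.close()
    return pairs


if __name__ == "__main__":
    print("without index:", order_by_plan(False))
    print("with index:   ", order_by_plan(True))
```

Without the index, the plan should mention a temporary B-tree built for the ORDER BY, which is the sorting step the message describes. With the index, SQLite can walk the index in order and no separate sort step appears. This says nothing about disk seek counts, which depend on database size and cache behaviour.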

Latest revision as of 17:02, 13 September 2018

Storage can mean many things in Gecko and Firefox.

Possibilities include:

  • mozStorage, the API that lives under storage/ in the source tree. This is a C++ wrapper around SQLite that is also exposed to JS via both XPCOM/XPConnect (API docs) and via a friendlier JS wrapper, SQLite.jsm. For Rust, the presumption is that Rust code will use Rusqlite for interacting with SQLite. However, there are prototype Rust bindings on Bug 1482608 that could end up making some sense for code that is slowly migrating to Rust and has existing mozStorage consumers that are highly coupled.
  • The profile "storage/" directory that holds per-origin storage managed by the QuotaManager subsystem and its clients: IndexedDB, the Cache API, and the asm.js cache (which will be removed).

Storage Alternatives

There are many options for storing data in Gecko. The Browser Architecture Group has cataloged many of the existing uses.

If you are looking for a way to store data in Firefox, here are the usual options:

  • JSON flat files. This usually is not a viable option from C++ because we don't have convenient ways to manipulate JS objects from C++.
  • rkv, a "simple, humane, typed Rust interface to LMDB" available in Gecko since Firefox 63. LMDB is a memory-mapped flat file database.
  • SQLite
    • For JS code: Use SQLite.jsm. The XPConnect interface to the mozStorage API used by C++ is also a possibility, but please consider enhancing SQLite.jsm if it's missing something you need.
    • For C++ code: Use mozStorage.
    • For Rust: This hasn't happened yet, but the presumption is that Rust code will use Rusqlite, as that's what's being used by the application services group and that's what mentat had used. As noted up top, there are prototype Rust bindings on Bug 1482608 that could make sense for a legacy consumer like Places, but there are a lot of questions and flux here. Please reach out to the Storage Advisory Group to be part of the conversation on this if you're moving towards implementations with Rust and SQLite.