Places:FsyncApproach: Difference between revisions

(→‎Long Open Transaction Approach: Updated cons and swag)
 
(9 intermediate revisions by 2 users not shown)
Line 23: Line 23:
* modify all queries that interact w/ the partitioned table(s)
* periodic flush of temp to permanent table (ensuring pk sync)
swag: 3+ weeks
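
The temp-table flush in the bullets above could be sketched roughly as follows. This is only an illustration using Python's sqlite3 module: the table names (<code>visits</code>, <code>visits_temp</code>), the view <code>visits_all</code>, and the helper functions are hypothetical, and keeping primary keys in sync by taking the maximum id across both partitions is one assumed strategy, not necessarily what Places would do.

```python
import sqlite3

# Autocommit mode: transactions are opened and closed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE visits (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
    CREATE TEMP TABLE visits_temp (id INTEGER PRIMARY KEY, url TEXT NOT NULL);
    -- Readers must see both partitions, which is why every query changes.
    CREATE TEMP VIEW visits_all AS
        SELECT id, url FROM visits
        UNION ALL
        SELECT id, url FROM visits_temp;
""")


def add_visit(conn, url):
    """Insert into the temp partition with an id unique across both tables."""
    new_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) + 1 FROM visits_all"
    ).fetchone()[0]
    conn.execute("INSERT INTO visits_temp (id, url) VALUES (?, ?)", (new_id, url))
    return new_id


def flush(conn):
    """Move all temp rows to the permanent table in a single transaction."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO visits (id, url) SELECT id, url FROM visits_temp")
        conn.execute("DELETE FROM visits_temp")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


first = add_visit(conn, "http://example.com/a")
second = add_visit(conn, "http://example.com/b")
flush(conn)  # the only step that has to reach the permanent table
third = add_visit(conn, "http://example.com/c")
```

Only <code>flush()</code> touches the permanent table, so that is where the sync cost would be concentrated in a file-backed database.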


=Long Open Transaction Approach=
Line 39: Line 41:


Pros:
* Fewer fsyncs overall
* Easy (almost painless)


Cons:
* Can lose data if we crash with an open transaction
* Still involves writes to disk
* If any other (non-Places) code calls fsync, we still pay the cost
* Can no longer make a group of operations atomic with transactions (no nested transaction support).
swag: 1-2 weeks
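
A minimal sketch of the idea, using Python's sqlite3 module for illustration only. The <code>BatchedWriter</code> class and the 60 second limit are assumptions made for this example, not the proposed implementation. All writes share one open transaction, which is committed (and therefore synced) only once it is old enough.

```python
import sqlite3
import time


class BatchedWriter:
    """Keeps one transaction open and commits it only after max_age seconds."""

    def __init__(self, conn, max_age_seconds=60.0):
        self.conn = conn
        self.max_age = max_age_seconds
        self.opened_at = None  # None means no transaction is open

    def execute(self, sql, params=()):
        if self.opened_at is None:
            self.conn.execute("BEGIN")
            self.opened_at = time.monotonic()
        self.conn.execute(sql, params)
        if time.monotonic() - self.opened_at >= self.max_age:
            self.commit()

    def commit(self):
        if self.opened_at is not None:
            self.conn.execute("COMMIT")  # the sync happens here, once per batch
            self.opened_at = None


conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute("CREATE TABLE visits (url TEXT)")
writer = BatchedWriter(conn)
for i in range(100):
    writer.execute("INSERT INTO visits VALUES (?)", (f"http://example.com/{i}",))
open_during_batch = conn.in_transaction
writer.commit()

# The last con above: while the long transaction is open, a caller cannot
# start its own transaction to make a group of operations atomic.
writer.execute("INSERT INTO visits VALUES (?)", ("http://example.com/extra",))
try:
    conn.execute("BEGIN")
    nested_begin_allowed = True
except sqlite3.OperationalError:
    nested_begin_allowed = False
writer.commit()
```

Anything written since the last <code>commit()</code> is lost if the process crashes, which is the first con listed above.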


=Split Database Approach=
Line 46: Line 56:
From Shaver:
<blockquote>We could also just use two databases, one for history (with sync=off) and one for bookmarks (with sync=normal) and query against them both.  Then you would get an fsync/s_f_r only when you updated a bookmark, and not during normal browser operation, and we would not be risking loss of bookmark data during crash.</blockquote>
Pros:
* The common task of adding a history visit is cheaper
Cons:
* Rests on the likely false assumption that history is less important than bookmarks (the smart location bar changed this)
* Tight coupling between the two databases
* New history visits likely require a write to both databases
swag: 1-2+ months
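
As a rough illustration of the quote above, the sketch below attaches a history database to a bookmarks connection and sets a different <code>synchronous</code> level on each. It uses Python's sqlite3 module; the file names, table names, and columns are hypothetical and do not match the real Places schema.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
# Autocommit mode, because PRAGMA synchronous cannot change inside a transaction.
conn = sqlite3.connect(os.path.join(workdir, "bookmarks.sqlite"), isolation_level=None)
conn.execute(
    "ATTACH DATABASE ? AS history", (os.path.join(workdir, "history.sqlite"),)
)

# Bookmarks keep their durability; history trades it away for speed.
conn.execute("PRAGMA main.synchronous = NORMAL")
conn.execute("PRAGMA history.synchronous = OFF")

conn.execute("CREATE TABLE main.bookmarks (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
conn.execute("CREATE TABLE history.visits (id INTEGER PRIMARY KEY, url TEXT, visited_at INTEGER)")

# The common case (a visit) only writes to the unsynced history database.
conn.execute("INSERT INTO history.visits (url, visited_at) VALUES (?, ?)", ("http://example.com/", 1))
conn.execute("INSERT INTO history.visits (url, visited_at) VALUES (?, ?)", ("http://example.com/", 2))
conn.execute("INSERT INTO main.bookmarks (url, title) VALUES (?, ?)", ("http://example.com/", "Example"))

# "Query against them both": one statement can join across the two files.
rows = conn.execute(
    """
    SELECT b.title, COUNT(v.id)
    FROM main.bookmarks AS b
    LEFT JOIN history.visits AS v ON v.url = b.url
    GROUP BY b.id
    """
).fetchall()

sync_levels = (
    conn.execute("PRAGMA main.synchronous").fetchone()[0],
    conn.execute("PRAGMA history.synchronous").fetchone()[0],
)
```

The cross-database join is where the tight coupling listed in the cons shows up: nothing stops the two files from drifting apart after a crash that loses history writes.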
=Grow File in Large Hunks Approach=
Summary from Andrew Morton:
<blockquote>
If the file does need to grow during regular operation then I'd suggest
that it be grown in "large" hunks - say, extend it (with write()) by a
megabyte at a time.  Then fsync it to get the metadata written.  Then
proceed to use sync_file_range() for the now-non-extending small writes.
So the large-write-plus-fsync is "rare".
</blockquote>


Pros:
* Can be implemented reasonably easily in the VFS layer via sqlite3_io_methods.


Cons:
* Only a solution for Linux.
* Still waits regularly on [https://bugzilla.mozilla.org/show_bug.cgi?id=421482#c152 less significant disk access], so it may need to be combined with async IO or another approach for optimal behavior.
swag: 2 weeks
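
The pattern in the quote above can be sketched as follows. This is a Python illustration, not the VFS code itself: the <code>HunkFile</code> class is hypothetical, and because Python's <code>os</code> module does not expose <code>sync_file_range()</code>, the sketch substitutes <code>os.fdatasync</code> (or <code>os.fsync</code> where that is missing) as a coarser stand-in for the small writes.

```python
import os
import tempfile

HUNK = 1024 * 1024  # "a megabyte at a time"

# Stand-in for sync_file_range(), which Python's os module does not provide.
_sync_data = getattr(os, "fdatasync", os.fsync)


class HunkFile:
    """Grows the file rarely and in large hunks; small writes never extend it."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self.allocated = os.fstat(self.fd).st_size
        self.grow_count = 0

    def _grow(self, needed):
        # Rare path: extend with real writes, then fsync so metadata is on disk.
        while self.allocated < needed:
            os.pwrite(self.fd, b"\0" * HUNK, self.allocated)
            self.allocated += HUNK
        os.fsync(self.fd)
        self.grow_count += 1

    def write(self, offset, data):
        if offset + len(data) > self.allocated:
            self._grow(offset + len(data))
        # Common path: the write lands inside already-allocated space.
        os.pwrite(self.fd, data, offset)
        _sync_data(self.fd)

    def close(self):
        os.close(self.fd)


path = os.path.join(tempfile.mkdtemp(), "places.sqlite")
f = HunkFile(path)
for page in range(10):
    f.write(page * 4096, b"x" * 4096)  # ten page writes, one growth
size_after_small_writes = os.path.getsize(path)
f.write(HUNK, b"y")  # crosses the hunk boundary and triggers a second growth
f.close()
```

A real implementation would also have to track the logical length of the database separately from the allocated size, since the zero-filled tail is not valid data.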
=Journaling Virtual Filesystem Approach=
Summary from Karl:
  <blockquote>
  By implementing a virtual filesystem with data journaling, the vfs can sync
  and empty its journal when it chooses.  The filesystem would satisfy the
  SQLITE_IOCAP_SEQUENTIAL characteristics and so the database would remain
  consistent even when it has synchronous == OFF.
  </blockquote>
  <blockquote>
  Such a virtual filesystem implemented using sqlite3_io_methods could log
  changes immediately to its own replay journal file on the underlying
  filesystem, and maintain a lookup table of the pages that have not yet been
  written to their specific real files.
  </blockquote>
  <blockquote>
  Pages in the journal would need to be occasionally synced
  (possibly with sync_file_range), and then copied to their real files.  The
  pages in the real files would need to be synced before their corresponding
  journal pages become free to be reused.  In a single-thread implementation,
  this could be done when the lookup table is too large to maintain
  efficiently and then the journal can be emptied.  In a background-thread
  implementation the journal could be a ring buffer and all syncing would
  take place on the background thread.
  </blockquote>
  <blockquote>
  (This is related to a
  [http://www.mail-archive.com/sqlite-users@sqlite.org/msg35212.html replay journal discussion].)
  </blockquote>
Pros:
* All databases could benefit without requiring individual code changes.
Cons:
* More complicated than the Grow File in Large Hunks Approach
* Data is written twice, so there is twice as much IO. This could be counterbalanced by disabling SQLite's own journal, provided the VFS can detect transaction completion.
swag: 2-3 months
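
A toy model of the single-thread variant described above, written in Python for illustration only. A real version would implement sqlite3_io_methods in C. The <code>JournalingStore</code> class, the record layout (an 8-byte page number followed by the page), and the <code>max_pending</code> threshold are all assumptions of this sketch, and crash replay of the journal is not implemented.

```python
import os
import struct
import tempfile

PAGE = 4096
HEADER = struct.Struct("<Q")  # assumed record header: the page number


class JournalingStore:
    """One journaled file: writes go to the journal, reads consult a lookup table."""

    def __init__(self, data_path, journal_path, max_pending=64):
        self.data_fd = os.open(data_path, os.O_RDWR | os.O_CREAT, 0o600)
        self.journal_fd = os.open(journal_path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        self.journal_end = 0
        self.pending = {}  # page number -> journal offset of its newest copy
        self.max_pending = max_pending

    def write_page(self, page_no, data):
        assert len(data) == PAGE
        # Log the change immediately and sequentially; no sync yet.
        os.pwrite(self.journal_fd, HEADER.pack(page_no) + data, self.journal_end)
        self.pending[page_no] = self.journal_end + HEADER.size
        self.journal_end += HEADER.size + PAGE
        if len(self.pending) > self.max_pending:
            self.checkpoint()  # lookup table too large to maintain

    def read_page(self, page_no):
        if page_no in self.pending:
            return os.pread(self.journal_fd, PAGE, self.pending[page_no])
        return os.pread(self.data_fd, PAGE, page_no * PAGE).ljust(PAGE, b"\0")

    def checkpoint(self):
        os.fsync(self.journal_fd)  # 1. journal pages are durable
        for page_no, offset in sorted(self.pending.items()):
            page = os.pread(self.journal_fd, PAGE, offset)
            os.pwrite(self.data_fd, page, page_no * PAGE)
        os.fsync(self.data_fd)  # 2. real file is durable
        os.ftruncate(self.journal_fd, 0)  # 3. only now is the journal reusable
        self.journal_end = 0
        self.pending.clear()


workdir = tempfile.mkdtemp()
data_path = os.path.join(workdir, "places.sqlite")
journal_path = os.path.join(workdir, "places.replay-journal")
store = JournalingStore(data_path, journal_path)
store.write_page(3, b"a" * PAGE)
store.write_page(3, b"b" * PAGE)  # newer copy replaces the lookup entry
data_size_before = os.path.getsize(data_path)
read_before = store.read_page(3)
store.checkpoint()
read_after = store.read_page(3)
```

The ordering in <code>checkpoint()</code> follows the third quoted paragraph: the real file is synced before the journal pages are freed. The background-thread variant would replace the truncate with a ring buffer.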