Electrolysis/Meetings/2009-09-21-session history

#content IRC meeting about Session History Unification

Held on 2009/09/21 between bz, bsmedberg and fred23

(14:00:00) bz: So I went through Fred's list of current SHistory consumers
(14:00:11) bz: and what I know about the way it's used
(14:00:21) bz: and sort of broke down the types of uses by what's being done
(14:00:27) bz: https://wiki.mozilla.org/Content_Processes/places_that_need_session_history#Conceptual_breakdown_of_session_history_usage
(14:00:35) bz: In brief, chrome uses session history for all sorts of stuff
(14:00:44) bz: serializing it and deserializing it
(14:00:55) bz: using it to identify page data (e.g. view source)
(14:01:04) bz: populating menus
(14:01:06) bz: button state
(14:01:13) bz: getting the POST data
(14:01:52) bz: Content use of session history is, I _think_ limited to length/back/forward/go
(14:02:02) bz: location.reload/replace
(14:02:05) bz: that's it
(14:02:35) bz: (there are more interface members on nsIDOMHistory, but I think we should remove them)
(14:02:58) bz: it seems to me that back/forward/go/reload/replace can be async
(14:03:03) bz: (heck, already basically are)
(14:03:24) bsmedberg: they never throw exceptions based on the current history state?
(14:03:25) bz: but we can make them be more explicitly async on m-c if we want
(14:03:28) bz: hmm
(14:03:41) ***bz looks
(14:04:34) bz: they do
(14:04:49) bz: e.g. forward() does if there's nowhere to go
(14:04:50) bz: hmm
(14:05:06) bsmedberg: the content-facing APIs can be sync
(14:05:10) ***bz checks that
(14:05:42) bz: no, I lied
(14:05:43) bz: ok
(14:05:44) bz: so
(14:05:52) bz: the nsIWebNavigation stuff can do things like throw
(14:05:58) bz: but dom-facing....
(14:06:04) bsmedberg: it eats the error?
(14:06:10) bz: back()/forward() only throw if there is no history object on the docshell
(14:06:14) bsmedberg: ok
(14:06:17) bz: or if it fails to QI to nsIWebNav
(14:06:31) bz: go() same thing
(14:07:24) bz: .length same thing
(14:07:29) bz: (minus the QI)
(14:07:36) bz: .length might need to be sync
(14:07:52) bz: so it sounds to me like the right thing to do is to maintain all session history in the parent process
(14:08:10) bsmedberg: that sorta sounds right
(14:08:18) bsmedberg: there are issues with bfcache, no?
(14:08:24) bsmedberg: and frame navigation?
(14:08:28) bz: right
(14:08:38) bz: frame navigation is perhaps less of a problem
(14:08:55) bz: it (like any other navigation) just adds entries to the history
(14:08:55) fred23: guys, could you briefly remind me what those issues are?
(14:09:14) bz: the caveat is that it clones the history tree
(14:09:20) bz: at the moment synchronously with the navigation
(14:09:24) bsmedberg: bz: do those entries not hold references of some sort to the frames that were navigated?
(14:09:32) bsmedberg: or is it all done by name?
(14:09:34) bz: fred23: do you have a general idea of how shistory works?
(14:09:51) bz: bsmedberg: it's done by position in the child list
(14:09:55) bz: bsmedberg: it's pretty broken!
(14:10:06) fred23: I've looked at the code, but never hacked it... so my knowledge is limited I guess
(14:10:08) bsmedberg: ah
(14:10:35) bz: bsmedberg: smaug has a task item for whenever he has time to rewrite shistory to work with pages that dynamically add iframes or something ajaxy like that
(14:10:48) bsmedberg: fred23: https://developer.mozilla.org/En/Using_Firefox_1.5_caching describes the bfcache
(14:10:54) fred23: k
(14:10:58) bz: bsmedberg: on the other hand, docshell _does_ have a direct handle to its session history entry
(14:11:01) bsmedberg: basically when you navigate we don't actually throw away the old page for a bit
(14:11:02) bz: or rather both of them
(14:11:06) bz: the current one and the loading one
(14:11:19) bz: it uses the latter to decide what to do in various cases
(14:11:24) bz: and it's all kinda fragile.  :(
(14:11:38) bz: ("do" both in terms of where in the load it is, and in terms of managing the session history tree)
(14:11:55) bsmedberg: I must admit the tree part is something of a mystery to me.
(14:12:00) bz: I can explain that
(14:12:17) bz: essentially, session history entries are organized as a list of trees
(14:12:28) bz: each session history entry represents one web page
(14:12:36) bz: its kids are the session history entries for its subframes
(14:12:58) bz: when a navigation occurs, we do the following (conceptually):
(14:13:18) bz: 1) Clone the currently selected tree (the one that represents where we are in the history)
(14:13:29) bz: 2)  Locate the entry in that tree that corresponds to the docshell being navigated
(14:13:40) bz: 2)  Remove the subtree rooted at that entry from the tree
(14:13:45) bz: er, that was 3)
(14:14:08) bz: 4) Add an entry for the new location of the relevant docshell at the position where we removed the subtree
(14:14:23) bz: then as that document loads and finds iframes it'll add them to the tree under its current session history entry
(14:14:29) bz: which it knows because the docshell has a pointer to it
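
The four steps bz lists can be sketched roughly as follows. This is an illustrative Python model, not Gecko code: `Entry`, `clone`, and `navigate` are hypothetical names, the docshell is identified by its child-list positions from the root (following the "position in the child list" remark earlier in the log), and discarding forward history on navigation is an assumption based on ordinary browser behavior rather than anything stated in the meeting.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entry:
    """Hypothetical stand-in for a session history entry: one page plus its subframes."""
    uri: str
    children: List["Entry"] = field(default_factory=list)

    def clone(self) -> "Entry":
        # Deep copy so the previously selected tree stays untouched.
        return Entry(self.uri, [child.clone() for child in self.children])


def navigate(history: List[Entry], index: int, path: Tuple[int, ...], new_uri: str) -> int:
    """Model navigating the docshell found at `path` (child positions from the root).

    Returns the index of the newly selected tree in `history`.
    """
    # 1) Clone the currently selected tree.
    root = history[index].clone()
    if not path:
        # Top-level navigation: the whole tree is replaced by a fresh entry.
        root = Entry(new_uri)
    else:
        # 2) Locate the entry for the docshell being navigated (via its parent).
        parent = root
        for pos in path[:-1]:
            parent = parent.children[pos]
        # 3) and 4) Drop the old subtree and put a fresh entry in its place.
        parent.children[path[-1]] = Entry(new_uri)
    # Assumption: navigating discards any forward history before appending.
    del history[index + 1:]
    history.append(root)
    return len(history) - 1
```

Subframes discovered while the new document loads would then be appended to the fresh entry's `children`, which corresponds to the docshell holding a pointer to its current entry.
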
(14:14:40) bsmedberg: We do all this tree business because you can have back-forward inside of an individual frame independent of the toplevel back-forward?
(14:14:47) bz: yes
(14:14:49) bsmedberg: damn
(14:14:55) bsmedberg: Are there tests?
(14:14:57) bz: mmmm
(14:15:06) bz: you raise an interesting question!
(14:15:08) bz: there are a few
(14:15:15) bz: and now there's a test _framework_
(14:15:20) bz: thanks to jgriffin
(14:15:27) bz: but yes, in general it's terribly undertested
(14:15:40) bz: implementation should certainly be preceded by lots of test-writing
(14:15:55) bz: ideally semi-exhaustive test-writing
(14:16:09) bz: best of all, "correct" behavior is undefined
(14:16:21) bz: but at least we can land tests for common situations plus old bugs.....
(14:16:38) bz: so the way back/forward work is you take the two trees
(14:16:41) bz: and compare them
(14:16:41) bsmedberg: Yeah, I can imagine lots of situations where frame navigation mixed with toplevel navigation gives you interesting possibilities
(14:17:03) bz: find the first node in the tree (in some traversal order; not really that well-defined) that differs
(14:17:12) bz: and perform navigation in the corresponding docshell 
(14:17:24) bz: using LoadHistoryEntry
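
A minimal sketch of that comparison, under stated assumptions: `Entry` and `first_difference` are hypothetical names, entries are compared by URI only, and the traversal is pre-order. The log itself says the real traversal order is not well defined, so this shows one possible reading rather than what docshell actually does.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical stand-in for a session history entry."""
    uri: str
    children: List["Entry"] = field(default_factory=list)


def first_difference(
    old: Entry, new: Entry, path: Tuple[int, ...] = ()
) -> Optional[Tuple[int, ...]]:
    """Return the child-position path of the first differing node, or None.

    Assumptions: pre-order traversal, comparison by URI only. Children present
    in only one of the two trees are ignored because zip() stops at the
    shorter child list.
    """
    if old.uri != new.uri:
        return path
    for pos, (old_child, new_child) in enumerate(zip(old.children, new.children)):
        found = first_difference(old_child, new_child, path + (pos,))
        if found is not None:
            return found
    return None
```

The returned path identifies the docshell in which the history load would be performed; an empty path means the top-level docshell.
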
(14:17:47) bz: now the thing is....
(14:18:01) bz: session history entries hold on to various data about the docshell
(14:18:06) bz: but not the docshell itself
(14:18:21) bz: data is uri, principal, post data, layout state
(14:18:33) bz: some strings and booleans
(14:18:40) bz: strings and booleans are easy
(14:18:47) bz: I assume we have a sane solution for URIs
(14:18:53) bz: principal is a bit of a pain, right?
(14:19:09) bz: since for those object identity matters
(14:19:15) bsmedberg: Not sure we've really dealt with URIs or principals sanely yet
(14:19:19) bz: ok
(14:19:38) bsmedberg: except I'm hoping that we don't ever have to compare principals across process boundaries
(14:19:39) bz: fred23: I'm worried about your silence
(14:19:50) ***fred23 is still listening...
(14:20:03) bz: fred23: sure; I'm just surprised you have no questions.  ;)
(14:20:32) bz: bsmedberg: chrome needs to be able to perform security checks with the principal of a content process.  But that doesn't have to be the same object as the one in the content process, true
(14:20:49) bz: bsmedberg: for session history, however, we do have to serialize and deserialize principals for sessionstore
(14:20:55) fred23: bz: I have a lot, but don't want to interrupt atm
(14:21:07) ***fred23 is reading scrollback
(14:21:11) bsmedberg: bz: my *hope* is that sessionstore for a tab will be serialized in the content process
(14:21:18) bsmedberg: perhaps with some kind of sanity-checking higher up
(14:21:22) bz: hmm
(14:21:28) ***bsmedberg isn't sure what kind of sanity checking is necessary
(14:21:44) bz: it depends on how much we're trying to sandbox
(14:21:45) bsmedberg: but until some sessionrestore guru is willing to work on it...
(14:22:08) bz: long-term, it seems to me we should never trust any self-reported principal or origin URI from a child process
(14:23:09) bz: So that's basically normal session history in a nutshell
(14:23:12) bz: not counting bfcache
(14:23:17) bz: the bfcache case is interesting
(14:23:29) bz: because in that case the session history holds a direct owning reference to the document viewer
(14:23:49) bz: and vice versa, iirc
(14:23:51) ***bz checks
(14:24:14) bz: yes
(14:24:54) bz: Now all the document viewer does with mSHEntry is:
(14:24:59) bz: 1) Condition some stuff on it
(14:25:21) bz: 2)  Call SetSticky(mIsSticky) on it when navigating away from the page that's going into bfcache
(14:25:46) bz: 3)  Tell it about itself (or tell it to drop all pointers to itself)
(14:26:17) bz: (well, and enable the actual restoration)
(14:26:32) bz: the key is that the document viewer itself doesn't want much out of the history entry
(14:26:35) bz: other than for it to keep it alive
(14:26:43) bz: and to be able to get it from the shentry
(14:26:48) bz: (when restoring)
(14:27:04) bz: we could just as easily change to having the shentry store some sort of unique id
(14:27:15) bz: and have some way in the content process to map ids to document viewers
(14:27:23) bz: some way that keeps the latter alive
(14:27:34) bz: would need to be a little careful about ownership
(14:27:38) bz: but shouldn't be too bad, I hope
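
The id-based idea could look something like the sketch below. Everything here is hypothetical (`DocumentViewer`, `ViewerRegistry`, and the `store`/`take`/`evict` names); it only illustrates a content-process table that keeps viewers alive while the parent-side history entry holds nothing but an integer id.

```python
import itertools
from typing import Dict, Optional


class DocumentViewer:
    """Hypothetical stand-in for a content-process document viewer."""

    def __init__(self, uri: str) -> None:
        self.uri = uri


class ViewerRegistry:
    """Content-process table mapping ids to viewers; the dict is the owning reference."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._viewers: Dict[int, DocumentViewer] = {}

    def store(self, viewer: DocumentViewer) -> int:
        # Called when a page goes into bfcache; the returned id is what the
        # parent-side history entry would record.
        viewer_id = next(self._next_id)
        self._viewers[viewer_id] = viewer
        return viewer_id

    def take(self, viewer_id: int) -> Optional[DocumentViewer]:
        # Called when restoring: hand the viewer back and drop our reference.
        return self._viewers.pop(viewer_id, None)

    def evict(self, viewer_id: int) -> None:
        # Called when session history drops the entry; unknown ids are ignored.
        self._viewers.pop(viewer_id, None)
```

With this shape, eviction is a one-way message carrying an id, which fits the point that session history already tends to drop its viewer reference asynchronously.
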
(14:28:23) bz: shistory needs to be able to drop the ref it has to the content viewer
(14:29:08) fred23: Are Document viewers on the child side ?
(14:29:09) bz: but it already usually does that async
(14:29:16) bz: document viewers are on the child side, yes
(14:29:19) fred23: gotcha
(14:29:44) bz: a document viewer is basically an object that exists for every "document being displayed" (so not XHR result document, say)
(14:29:49) bz: and has effectively document lifetime
(14:29:56) bz: so sorta like inner window
(14:30:23) bz: So I think we can make bfcache work
(14:30:35) bz: if shistory is in parent only
(14:30:44) bz: certainly with at worst some sync child-to-parent calls
(14:30:48) bz: but possibly even without those
(14:31:31) bsmedberg: bz: so how is it going to work when there is a navigation which needs to be rendered by a different content process?
(14:31:50) bsmedberg: will we just say "no bfcache" in that case?
(14:32:06) bsmedberg: or will we save state temporarily in the old content process and evict it later?
(14:32:16) bz: the latter seems feasible to me
(14:32:21) bz: when we get to that point
(14:32:33) bz: store the "which process" handle in the history
(14:32:37) bsmedberg: yeah
(14:32:45) bz: need to be careful about managing the child process lifetime
(14:32:48) bsmedberg: we'll probably need to do that sooner for various about: pages...
(14:32:58) bz: oh, right
(14:33:00) bz: good, good
(14:33:09) ***bz is _really_ happy about that, actually
(14:33:17) ***fred23 nods
(14:34:40) bz: so I think our task list here is:
(14:34:42) bz: 1)  Write tests.
(14:36:55) bz: 2)  Make all navigations async, including anchor scrolls
(14:35:01) bz: 3)  Separate session history management out of docshell into a separate beastie
(14:35:45) bz: 4)  Figure out how to deal with principals / uris sanely
(14:36:00) bz: 5)  See what parts of (3) can be made async
(14:40:34) bz: 6)  Make nsDocShell not hold direct nsISHEntry pointers
(14:36:29) bz: (though we could keep it all sync for now, I suppose; there aren't that many calls into session history stuff really)
(14:36:56) bz: if we can
(14:36:59) bz: should check html5
(14:38:18) fred23: bz: by a "separate beastie", do you mean something totally new ?  or merging to existing stuff ?
(14:38:51) bz: excellent question
(14:39:01) bz: I have to admit to a temptation to move it into nsSHistory.  ;)
(14:39:14) bz: it seems right, somehow
(14:39:27) bz: add whatever API we need on nsISHistory to make it work
(14:39:42) fred23: right
(14:39:43) bz: basically, have a higher-level API than "add this shentry to this other shentry"
(14:40:07) bz: and instead move most of AddToSessionHistory into nsSHistory