Places/StatusMeetings/2006-10-26 and 2006-10-27
Places meeting: 2006-10-26 4pm PST
sspitzer thunder / myk / dietrich / et al: meeting time?
thunder hola
dietrich howdy all
myk sspitzer: yes, i'm here
dietrich sspitzer: i'll try out the new patch on bug 356487 asap
dietrich anyone have a chance to check out the ERD i sent out?
sspitzer dietrich: yes
thunder yeah
sspitzer I had a question
myk dietrich: one more naming suggestion is to call the moz_anno_names table moz_attributes, since the moz_annos table is an entity-attribute-value (EAV) table, and moz_anno_names stores the attributes that the moz_annos table uses
dietrich myk: moz_attributes sounds kinda generic.. moz_anno_names is specific to annotations
dietrich moz_anno_attributes?
myk dietrich: sure
sspitzer my question was also about moz_annos
sspitzer and this is not new to your ERD, but:
sspitzer are moz_annos required to have a moz_anno_name?
sspitzer can we have un-nammed annotations?
sspitzer I think it should be a "may have a " relationship
myk dietrich: i generally prefer one-word names, but perhaps expressiveness is better than brevity in this case
sspitzer instead of a "has a"
myk sspitzer: they are required to have a name
dietrich myk: we could remove the moz_* prefix from all the tables
dietrich sspitzer: i'm not sure what the use case for un-typed annotations is
myk dietrich: yeah, i've thought about that as well, but it makes some sense to have it given that sqlite doesn't support namespaces or cross-db queries
sspitzer dietrich: here's my use case
myk dietrich: the use case i usually consider is that multiple extensions want to store some related data in the database; we don't want them to create colliding tables (neither colliding with us nor with each other)...
dietrich myk: yeah, prefixing to avoid collisions makes sense
myk dietrich: large "enterprise" RDBMSes use namespaces to accomplish this; medium-scale RDBMSes like MySQL allow you to join across databases, effectively making databases be namespaces
myk dietrich: incidentally, i dislike prefixes too and think they are often used unnecessarily, but in this case i think they make sense
dietrich myk: yeah, the lack of cross-db queries means extensions are likely to add to our db as opposed to creating their own
dietrich ugh
myk dietrich: yeah :-)
sspitzer well, I was thinking about extension developers, who might use annotations in ways other than tagging (which is how I have been thinking about them, mostly). in ways where all annotations are of the same type, but to play nice with other extensions and our own code, they might need a moz_anno_name, like "extension_xyz", so maybe my use case is wrong.
dietrich sspitzer: yeah, anno_names are basically being used as a loose typing system
sspitzer ok, then ignore my question.
myk dietrich: hmm, actually it looks like i'm wrong; sqlite does support joining across databases using the ATTACH DATABASE syntax
myk although "There is a compile-time limit of 10 attached database files."
thunder it must be fairly inefficient if they do that
myk (http://www.sqlite.org/lang_attach.html)
thunder s/do that/have that restriction/
sspitzer speaking of SQLite, I think thunder is working on updating to a new version, one with full text indexing.
thunder oh; yes
thunder I have a patch waiting to be reviewed by vlad
thunder (well, and committed by vlad, since I don't have an account)
dietrich myk: thx for the link. yeah we should encourage extension devs to use external dbs :)
dietrich thunder: cool, is there a bug for that?
myk dietrich: yeah, you're right, we really should; i wonder what it means for our prefixes
thunder yeah, looking
thunder just a sec
thunder 341137
thunder https://bugzilla.mozilla.org/show_bug.cgi?id=341137
myk sspitzer: brett encouraged third parties to use namespaces in the annotation names to avoid collisions
myk sspitzer: f.e., in my implementation of microsummaries on top of places, i used microsummary/ as the namespace, f.e. the annotation name for the generator URI is microsummary/generator_uri (or something like that; would have to look at source to recall the exact name)
dietrich myk: if we can't restrict table creation in our db, then we should keep the prefixes
sspitzer myk: ok, that makes sense. thanks for the background info.
myk dietrich: i think i concur, especially given that compile-time limit to the number of other databases one can attach
thunder dietrich: my patch doesn't enable the text searching stuff
thunder (yet)
thunder I thought I'd get this reviewed first, then add the interface to mozStorage
dietrich fyi: i sent mail to todd agulnick, asking for him to take a look at the proposed changes
dietrich i'll hit up the yahoo people also
thunder cool
thunder um, do we have an agenda of some sort?
dietrich thunder: i think just to report any progress on the task list, and then any specific issues people want to bring up
thunder ok,
thunder I'm in the process of making a couple of new tests for tinderbox
thunder TpMH (medium history) and TpLH (large history)
thunder they are basically Tp, but I copy in a history file
thunder however, I don't have good history files to copy in :)
dietrich thunder: QA might be able to help getting history files
thunder I'm going to shoot for ~1MB for the medium and ~10MB for the large
thunder ah good call
thunder I'll ask
thunder other than that, I just need to get this patch checked in and deployed, and we'll have new testing data
thunder hopefully it'll be useful
dietrich cool
dietrich i'm sure it will. far more realistic baselines for testing Tp
thunder yeah
thunder other than that, nothing new and exciting to report
thunder the 3.3.8 update seth mentioned
thunder I'll add interfaces for the text search stuff when I figure out how to do that :)
dietrich anyone else have anything to discuss?
dietrich the only other thing i wanted to mention is that we should be thinking about where we want UI to go after we get Fx2 parity
dietrich eg, considering stuff here: http://wiki.mozilla.org/Firefox/Feature_Brainstorming:Bookmarks
thunder yeah.
dietrich as well as adding any ideas to that list
thunder that is where the exciting stuff starts :-)
dietrich exactly!
thunder I want cheese-based bookmarks
dietrich mmmm
thunder but we should offer a lactose-free version
dietrich on that note: meeting adjourned :)
thunder um, with full text search in the db, we should have smart bookmark folders
thunder haha :)
dietrich thunder: yes, that'd be v. cool
dietrich hey i don't think that's on the list
thunder really
* thunder adds
* myk votes for cheese-based bookmarks
thunder hooray!
thunder okay, it has been totally added
sspitzer I don;'t have much to report, except a new patch (that address the comments from dietrich).
thunder woo
thunder commit!
thunder :-P
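Editor's note: the ATTACH DATABASE point that myk raised in this meeting can be illustrated with a minimal sketch using Python's standard sqlite3 module. Everything specific here is an assumption made for illustration: the two-column moz_places table is heavily simplified, and the extension database (extension_xyz.sqlite) and its notes table are hypothetical. It only shows that an extension could keep its data in its own file and still join against the main database.

```python
import os
import sqlite3
import tempfile


def cross_db_join():
    """Join a table in the main database against a table in an attached one."""
    tmp = tempfile.mkdtemp()
    places_path = os.path.join(tmp, "places.sqlite")
    # Hypothetical extension-owned database, kept in its own file.
    ext_path = os.path.join(tmp, "extension_xyz.sqlite")

    ext = sqlite3.connect(ext_path)
    ext.execute("CREATE TABLE notes (uri TEXT, note TEXT)")
    ext.execute("INSERT INTO notes VALUES ('http://example.com/', 'read later')")
    ext.commit()
    ext.close()

    conn = sqlite3.connect(places_path)
    # Simplified stand-in for the real table, for illustration only.
    conn.execute("CREATE TABLE moz_places (id INTEGER PRIMARY KEY, uri TEXT, title TEXT)")
    conn.execute(
        "INSERT INTO moz_places (uri, title) VALUES ('http://example.com/', 'Example')"
    )
    conn.commit()  # ATTACH cannot run inside an open transaction

    # Attach the extension file under the schema name "ext".
    conn.execute("ATTACH DATABASE ? AS ext", (ext_path,))
    rows = conn.execute(
        "SELECT p.title, n.note "
        "FROM moz_places AS p JOIN ext.notes AS n ON n.uri = p.uri"
    ).fetchall()
    conn.execute("DETACH DATABASE ext")
    conn.close()
    return rows


if __name__ == "__main__":
    print(cross_db_join())
```

Because the attached tables are addressed as ext.notes, the schema name acts as a namespace, which is the property the prefix discussion was looking for. The limit on the number of attached databases quoted in the log still applies.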
Places meeting: 2006-10-27 2pm PST
sspitzer dietrich / thunder / myk / et al
dietrich hi
sspitzer dietrich: first, thanks for the tip about the storage inspector.
sspitzer that's a very useful extension
dietrich yah
sspitzer https://addons.mozilla.org/firefox/3072/
sspitzer so, a couple questions that relate to the ERD
sspitzer http://wiki.mozilla.org/Places:BookmarksComments#Places_ERD
sspitzer we have title and user_title in the moz_places title (moz_history if you use the storage inspector on your trunk with places enabled "as is" today)
sspitzer those are usually the same, right?
sspitzer can we do a sort of copy-on-write trickery, where we don't store it twice if they are the same?
sspitzer when we do our query, we ask for both columns, and if user_title is null, we use title?
sspitzer or, is there a reason we do what we are doing?
dietrich sspitzer: iirc user_title is basically for bookmarks
dietrich title retains the original title from history
dietrich say you bookmark a page and title the bookmark differently, it populates user_title
dietrich user_title may be obsoleted by the schema changes actually
sspitzer how is that?
dietrich my opinion is that a bookmark title is one of the most-likely to be changed, and represented in multiple places, properties of a bookmark
dietrich and therefore should be tied to the bookmark, not to the history entry
dietrich i think that user_title should be removed, and a title property added to moz_bookmarks
sspitzer ok, I follow you. I thought you were saying it has been changed in your new erd
sspitzer you are saying that we should.
dietrich right
sspitzer I agree, but this leads me into my next question, which is really something mconnor asked me (and he might have asked you already, too)
thunder dietrich: agreed
thunder I think
dietrich that's basically what i say in the wiki, but i forgot to change that in the diagram
thunder nod
sspitzer why are bookmarks and history in the same data model? Could we have two databses, and when we need to, join across them?
dietrich sspitzer: since history is unique on URI and bookmarks isn't anymore, that's a possibility
dietrich however,
thunder I was thinking about that too
dietrich there might be performance repercussions of loading the 2 files separately
myk dietrich: putting user_title into moz_bookmarks seems like the right thing to do
dietrich i think that there are close ties between history and annos, bookmarks and annos, but not as close an association between bookmarks and history
dietrich wrt to how they're used in practice
myk dietrich: since it's a property of the bookmark, not the place itself
dietrich myk: yep
dietrich sspitzer: i think you bring up a larger question: what does a tight integration between history and bookmarks in the data model buy us?
myk sspitzer: it's a good question; one might also ask the corollary question, however: why ever have more than one database, no matter how many different kinds of data we start storing in sqlite?
sspitzer for myks question, do we want extensions to be using our database?
sspitzer not that I have a problem with extension authors, but it seems like they could impact the browser, even after the extension is removed.
sspitzer I was thinking (maybe naively) that having one giant database with everything could be costly on startup.
sspitzer at yesterdays 4pm meeting, we chatted here about table prefixes
sspitzer about why we were doing moz_*
sspitzer about my naive thinking, I don't know if sqlite is designed for one db with lots of tables, or better to have mutliple dbs. we already have:
myk sspitzer: that may be true, since databases are represented as a single file; on the other hand, perhaps sqlite just loads a small portion of that file on startup; i don't know
sspitzer myk: me neither
sspitzer we already have:
sspitzer bookmarks_history.sqlite urlclassifier.sqlite
sspitzer formhistory.sqlite urlclassifier2.sqlite
sspitzer search.sqlite
sspitzer each of those are a separate db on disk.
sspitzer dietrich asked:
sspitzer what does a tight integration between history and bookmarks in the data model buy us?
dietrich i think that the page URI singleton model demands a tight integration between them
myk sspitzer: as an aside, perhaps we should rename bookmarks_history.sqlite -> places.sqlite
myk (that is, unless it turns out that we should be sticking everything into a single database)
dietrich eg: if we remove the place_id FK in moz_bookmarks, what breaks?
dietrich anything that modifies a bookmark will now be annotating a bookmark URI, not the place URI
thunder it might make it slower to update the history table when one visits the place
thunder since you'd have to look up which row it is
thunder though, probably not
thunder since that is/could be indexed anyway
dietrich yeah
thunder (to look up by uri)
sspitzer does separating them make "clear private data" faster?
sspitzer if we don't have to worry about bookmarks being lost?
dietrich sspitzer: possibly
dietrich however, i think it's important to think about what we want from a functionality perspective first, then optimize on the requirements of those goals
myk fwiw, i agree with dietrich
myk we should avoid premature optimization and instead focus on accurately modeling the concepts with the database schema
thunder hmm
thunder sure; but from that perspective
myk once we have an accurate model, then it makes sense to measure its performance and modify the schema accordingly (adding indexed, de-normalizing in exceptional cases, etc.)
thunder bookmarks is a separate concept from history
thunder so should live in its own table :)
myk thunder: right, which is why we represent the two concepts with their own tables
thunder ah, I thought the argument was geared toward not separating them
dietrich hence my question: after removing the bookmark singleton approach, is there a reason to keep that place_id FK in the moz_bookmarks table?
thunder my confusion, sorry
sspitzer dietrich: thinking...
dietrich i'm trying to think of use-cases for showing history data for a bookmark
thunder there are
dietrich but in those cases, you could join on URI instead of place_id
thunder but, right
thunder which is why I think it's not necessary to have the FK there
dietrich also: say you wanted visitation stats for a bookmark, those would have to be stored against the bookmark URI, not the original URI
thunder URI?
thunder or row?
dietrich thunder: same thing wrt to annotations
dietrich thunder: right now there are place: uris for folders, queries, etc
thunder it could be useful to know how many times I've been to a site vs how many times I've clicked on this bookmark
thunder both could be useful, really, depending on context
dietrich one of my recommendations was to provide URIs for all bookmarks datastore objects, for annotations, etc
myk dietrich: removing the place_id FK essentially turns the uri column into a FK; i don't see a problem with that offhand
dietrich even if that URI is something simple like place:bookmark:{PKID}
dietrich myk: exactly - place_id is redundant at the point
thunder yeah, it does do that, with the difference that we don't need to maintain it pointing to an actual row
thunder (right?)
dietrich yep
myk dietrich: there's a minor space hit, but i'd say it's insigificant, since users don't generally have too many bookmarks
thunder so you could have a bookmark with a uri that is not in the history table
dietrich thunder: i think so
myk dietrich: one consideration is that joins might be more expensive
thunder whereas with the FK route it seems bad if it doesn't point anywhere (though, it could just be null, I guess? - but even then you still have to walk the bookmarks when you clear history)
dietrich well, a bookmark's URI *is* a URI that's not already in history
thunder er
thunder no, I mean
dietrich myk: how so?
thunder the uri pointed to by a bookmark
myk dietrich: as we'd be joining on URI string rather than integer ID
dietrich myk: good point - given that we'd already be indexing the URI cols, i wonder how big that hit would be
myk dietrich: here's a use case that requires a join from places -> bookmarks: say i wanted to search my history for some site. in the search results, i'd probably want to know that a particular result is bookmarked rather than being just some random site i visited once
myk f.e. the search results might include an icon for each result, and the icon for results that are also bookmarks would be different (f.e. a bookmark symbol overlaid over the favicon)
dietrich myk: so that was my next question: in use-cases like that, should it be done at the db layer?
myk dietrich: well, my first thought is that the db layer solution would be simplest, but maybe that's just because i know how to do it
thunder that is a valid use-case, but we don't know how much slower that would be sans-integer-FK
dietrich myk: i think it would be faster to implement that use-case at the db layer than it would be to call out to the bookmark service from the front-end
thunder nor do we know much much faster other operations (e.g. clear history) would be
myk dietrich: sure; in that case, i suspect that a join from places -> bookmarks would be faster on an integer key than a URI, but i don't know that for sure
dietrich any interaction w/ annotations are likely joins on URI cols
dietrich in the current model, and with the proposed changes
dietrich and likely to occur more often than the use-case we're discussing
myk another consideration is that it isn't clear to me that "places" is conceptually the same thing as "history"; i imagine that interesting applications could be enabled by differentiating between the two
dietrich so that's really an issue we have in the status quo (if it's even an issue)
dietrich myk: i totally agree. i think that using annotations as an extensibility mechanism is key to that.
myk dietrich: yes, annotations is important, but we would also need to provide for the possibility that a place in the places table doesn't necessarily represent an item in "history", if "history" means the set of places the user has visited
thunder hrm
dietrich myk: i think that the separation of moz_visits and moz_places effectively does that
myk dietrich: yes, indeed
myk dietrich: but we seem to have been talking about the places table as if it's history, and i wanted to make sure we were differentiating between them conceptually
dietrich ok
dietrich hm, i wonder if clearing history in places clears moz_places entries (ne moz_history), or just entries in moz_visits
dietrich b/c entries in moz_places are kind of like an implicit visit
sspitzer dietrich: didn't brett also tells us recently that the clearing of history happens "in chunks"?
dietrich sure you can put non history stuff in there, but history URIs are there because you visited them
dietrich i guess if there's no moz_visits entry, there's no way to tell how the moz_places entry got there
myk dietrich: perhaps the solution is to clear entries which are neither bookmarks, nor history, nor have any annotations in the annotations database to indicate that some other code cares about those URIs
sspitzer looking for his comments...
dietrich myk: sure that might do it
dietrich when clearing history, clear moz_visits *and* remove "un-attached" entries in moz_places
myk dietrich: you mentioned earlier that interactions between places and annotations are likely joins between URI columns, but it looks like your ERD has them joined by ID
sspitzer "places history expiration happens incrementally as you browse (instead of delaying shutdown)"
dietrich sspitzer: yeah, i wonder if clearing history manually forces removal from the db, or if it does the expire-over-time thing
dietrich eg: via cpd
myk dietrich: also, there's a partial model for expiring annotations; i've previously proposed adding "expire when bookmark goes away" functionality; perhaps there should also be "expire when history cleared"
sspitzer for more on expiration, see brettw's comments on http://wiki.mozilla.org/Places/StatusMeetings/2006-10-12
sspitzer he writes:
dietrich myk: yeah i was mistaken, it's by place_id
dietrich myk: sure. i think brettw referenced a bug for that
dietrich maybe u filed it
dietrich :)
myk :-)
myk i probably filed the bookmarks one; if so, i'll add history to it
myk hope i'm not dumping too many considerations on y'all
dietrich myk: not at all :)
dietrich we're still early in the process, so it's great to have more ideas to churn on
dietrich i think the takeaways wrt to schema, that i need to update ERD to:
sspitzer dietrich: look at http://lxr.mozilla.org/seamonkey/source/toolkit/components/places/src/nsNavHistoryExpire.cpp, clearing history manually doesn't appear to do the expire over time thing.
dietrich - remove place_id FK from bookmarks
dietrich and rename the db file to moz_places.sqlite?
dietrich or just places
sspitzer I think places.sqlite
sspitzer based on the names of the other .sqlite files
dietrich k
sspitzer one more item, in addition to "- remove place_id FK from bookmarks"
sspitzer what about user_title?
sspitzer or, did you already have that, but not in the ERD image?
dietrich sspitzer: right, that also needs to be added to the diagram
sspitzer any objection to me making today's chat part of yesterday's 4pm chat log, that I'll post on http://wiki.mozilla.org/Places/StatusMeetings?
myk i'm not so sure removing place_id is actually the right thing to do
dietrich myk: yeah that one's wobbly
dietrich i think there aren't any super arguments for keeping it
dietrich but no args for removing it either
dietrich which leaves performance, which i think would be worse (just not sure to what degree)
dietrich i leave it in for now
dietrich however, thunder, this means that you cannot bookmark something that isn't in moz_places :)
sspitzer dietrich: then how does it work when I right click and bookmark a link I've never been to?
sspitzer do we create an entry, and set the visited to null, or some special value?
sspitzer (in the moz_places table)
dietrich sspitzer: right now it creates entries in both
dietrich both moz_history and moz_bookmarks
dietrich i'm not sure what we do for visits
dietrich sspitzer: http://lxr.mozilla.org/seamonkey/source/toolkit/components/places/src/nsNavHistory.cpp#868
dietrich that's what's called when you add a bookmark
dietrich "create a new hidden, untyped, unvisited entry"
sspitzer ok, and then the moz_bookmark references that?
dietrich yep
sspitzer ok, thanks for clarifying.
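Editor's note: three ideas from this meeting can be sketched together with Python's standard sqlite3 module. They are the title fallback sspitzer asked about, myk's use case of flagging bookmarked results in a history search, and the "clear moz_visits and remove un-attached entries" approach to clearing history. The schema below is an assumption made for illustration, not the actual Places schema: the column sets are reduced to a minimum, the bookmark title lives in moz_bookmarks as proposed in the chat, place_id is kept as the integer key, and all helper names (open_db, add_place, search_history, clear_history) are hypothetical.

```python
import sqlite3

# Simplified, assumed schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE moz_places (id INTEGER PRIMARY KEY, uri TEXT UNIQUE, title TEXT);
CREATE TABLE moz_visits (id INTEGER PRIMARY KEY,
                         place_id INTEGER REFERENCES moz_places(id));
CREATE TABLE moz_bookmarks (id INTEGER PRIMARY KEY,
                            place_id INTEGER REFERENCES moz_places(id),
                            title TEXT);
CREATE TABLE moz_annos (id INTEGER PRIMARY KEY,
                        place_id INTEGER REFERENCES moz_places(id),
                        name TEXT, content TEXT);
"""


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_place(conn, uri, title, visited=True):
    """Insert a place; an unvisited place gets no moz_visits row."""
    cur = conn.execute(
        "INSERT INTO moz_places (uri, title) VALUES (?, ?)", (uri, title)
    )
    if visited:
        conn.execute("INSERT INTO moz_visits (place_id) VALUES (?)", (cur.lastrowid,))
    return cur.lastrowid


def search_history(conn, term):
    """Return (uri, display title, is_bookmarked) for places matching term.

    COALESCE falls back to the page title when the bookmark has no title.
    A place with several bookmarks would appear once per bookmark.
    """
    return conn.execute(
        """
        SELECT p.uri,
               COALESCE(b.title, p.title) AS display_title,
               b.id IS NOT NULL AS is_bookmarked
        FROM moz_places AS p
        LEFT JOIN moz_bookmarks AS b ON b.place_id = p.id
        WHERE p.uri LIKE ?
        ORDER BY p.uri
        """,
        ("%" + term + "%",),
    ).fetchall()


def clear_history(conn):
    """Delete all visits, then places that no bookmark or annotation uses."""
    conn.execute("DELETE FROM moz_visits")
    conn.execute(
        """
        DELETE FROM moz_places
        WHERE NOT EXISTS (SELECT 1 FROM moz_bookmarks AS b
                          WHERE b.place_id = moz_places.id)
          AND NOT EXISTS (SELECT 1 FROM moz_annos AS a
                          WHERE a.place_id = moz_places.id)
        """
    )


if __name__ == "__main__":
    db = open_db()
    a = add_place(db, "http://example.com/a", "Page A")
    add_place(db, "http://example.com/b", "Page B")
    db.execute("INSERT INTO moz_bookmarks (place_id, title) VALUES (?, ?)", (a, "My A"))
    print(search_history(db, "example"))
    clear_history(db)
    print(db.execute("SELECT uri FROM moz_places").fetchall())
```

The join here uses the integer place_id. The alternative debated in the chat, joining moz_bookmarks to moz_places on the URI string, would change only the ON clause, so the two variants could be timed against each other before deciding whether to drop the key.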