User:Sspitzer/GlobalFrecency: Difference between revisions

Line 19: Line 19:
3)  Upon first run, I have to add the frecency column to the moz_places table, create an index for that column, and make sure that livemark items and "place:" urls get a frecency of 0, so that they do not appear in the ac results.  Another way for a place to have a 0 frecency is if the url only has "embedded" visits.


5)  If I don't know the frecency of a place, the value is -1.  This is what I call an "invalid" frecency.  If something has an invalid frecency, it will show up in the ac results.


6)  the url bar drop down shows "typed" sites, ordered by frecency descending.
Line 25: Line 25:
7)  When inserting a bookmark, we attempt to calculate a frecency for it.  This will impact the performance of bookmark import and also fx 2 -> fx 3 migration.  (spin off bug coming about how to deal with it.)


8)  for how we calculate a frecency for a site, see http://wiki.mozilla.org/User:Mconnor/PlacesFrecency (and option 3).  I use the 10 most recent visits, and this is pref controlled, as are all the buckets, weights and bonus values.


9)  if we don't have any visits for a site, I make an attempt to estimate the frecency.  more on this in a spin off bug (including when we should estimate and when we should not.)
Line 83: Line 83:


'''Additional notes (need to clean these up, log spin off bugs, etc).  These are mostly for dietrich so he knows what I've done and why and the known issues.'''
0)
<dietrich> / something bookmarked and typed will have a higher frecency than
<dietrich> / something just typed or just bookmarked.
<dietrich> you mean here that something bookmarked/typed longer ago is weighted more than something recently bookmarked/typed?
We might call nsNavHistory::CalculateFrecencyInternal() with a place id of -1, which means the place doesn't exist yet.
One example of this is when inserting a bookmark.


1)
Line 114: Line 124:
5)


issue:  when autocompleting, we prefer the moz_place title over the bookmark title, so when showing a match, we show that one.  because of the fix for bug #407292 – When adding a bookmark with no title, we should use the uri as the title, you might get the uri as the title if it matches the user text, as we prefer it.


6)
Line 195: Line 205:
21)


for the ac queries, sort by frecency, then typed, then visit_count because we might not have frecency.  in the case of 3b2 migration or clear all private data, lots of places have frecency = -1 (until idle), so we will have lots of frecency ties, so use typed and visit_count to break the ties and provide better results.
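
The tie-breaking order described above can be illustrated outside of SQL with a small comparator. This is only a sketch: the Place struct, the function names and the sample rows are hypothetical stand-ins, not the real moz_places schema or the actual autocomplete query.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical in-memory stand-in for a moz_places row.
struct Place {
  int id;
  int frecency;    // -1 means the frecency has not been calculated yet
  int typed;       // 1 if the url was typed, else 0
  int visitCount;
};

// Mirrors ORDER BY frecency DESC, typed DESC, visit_count DESC:
// typed and visitCount only matter when the frecency values tie.
bool RanksBefore(const Place& a, const Place& b) {
  return std::tie(a.frecency, a.typed, a.visitCount) >
         std::tie(b.frecency, b.typed, b.visitCount);
}

// Sorts the candidate places into the order the drop down would show them.
void SortForAutocomplete(std::vector<Place>& places) {
  std::stable_sort(places.begin(), places.end(), RanksBefore);
}
```

With many places stuck at frecency = -1, the second and third keys decide almost every comparison, which is the situation right after a migration or after clearing private data.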


22)
Line 226: Line 236:
25)


improved mDBOldFrecency query, explain why to dietrich


26)
Line 266: Line 276:
29) would like to order autocomplete by frecency DESC, typed DESC, visit_count DESC to break ties, but this is slow, even with an index.  (spin off bug)


30) comment about why we add typed bonus to bookmark bonus for frecency of  


     // not the same logic above, as a single visit could not both
Line 289: Line 299:


'''todo'''
<dietrich> / something bookmarked and typed will have a higher frecency than
<dietrich> / something just typed or just bookmarked.
<dietrich> you mean here that something bookmarked/typed longer ago is weighted more than something recently bookmarked/typed? 
You wrote "longer ago", but in this case, these places have no visits.  If we had visits, we would not hit this code.  We would have returned here:
  if (numSampledVisits) {
    *aFrecency = (PRInt32) NS_ceilf(aVisitCount * pointsForSampledVisits / numSampledVisits);
    return NS_OK;
  }
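
To make the arithmetic of that early return concrete, here is a minimal standalone sketch. The function name and the sample numbers are hypothetical, and std::ceil stands in for NS_ceilf; it only mirrors the formula quoted above, not the surrounding nsNavHistory code.

```cpp
#include <cmath>

// Hypothetical helper mirroring the early return quoted above: the points
// earned by the sampled visits are scaled up to the full visit count.
// Returns false when no visits were sampled, which is the case where the
// real code would go on to estimate a frecency instead.
bool FrecencyFromSampledVisits(int aVisitCount,
                               float aPointsForSampledVisits,
                               int aNumSampledVisits,
                               int* aFrecency) {
  if (aNumSampledVisits <= 0) {
    return false;
  }
  *aFrecency = static_cast<int>(
      std::ceil(aVisitCount * aPointsForSampledVisits / aNumSampledVisits));
  return true;
}
```

For example, 25 total visits with 10 sampled visits worth 900 points in total give ceil(25 * 900 / 10) = 2250.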
I am having a hard time explaining this with words, so let me try this.  The current code gives us the following:
f(zero visits, bookmarked, typed) > f(zero visits, bookmarked, not typed) > f(zero visits, not bookmarked, typed) = f(zero visits, not bookmarked, not typed) = 0
Where f() is the frecency calculation function.
Here's how you can end up with these four scenarios:
zero visits, bookmarked, typed:  type in http://www.google.com and then bookmark it.  clear all history, giving it a frecency of -1.  when we recalc on idle, we will give it a frecency of 340 (bookmark bonus + typed bonus, see the browser.frecency.unvisited* prefs).  We have no visits, but since it is a bookmark, we want a non-zero frecency.
In irc where I wrote:  "while working on the answer to your question above, I noticed a small issue in that code, I'll answer it in a spin off bug as soon as I confirm the issue."  This is what I was referring to.  I think we should be returning 200 here, instead of 340.
The change would be:
Replace:
  // not the same logic above, as a single visit could not both
  // a bookmark visit and a typed visit.  but when estimating a frecency
  // for a place that doesn't have any visits, this will make it so
  // something bookmarked and typed will have a higher frecency than
  // something just typed or just bookmarked.
  if (aIsBookmarked)
    bonus += mUnvisitedBookmarkBonus;
  if (aTyped)
    bonus += mUnvisitedTypedBonus;
with:
  // assuming mUnvisitedTypedBonus > mUnvisitedBookmarkBonus
  // this makes it so an unvisited, typed bookmark frecency > unvisited, untyped bookmark frecency
  if (aTyped)
    bonus = mUnvisitedTypedBonus;
  else if (aIsBookmarked)
    bonus = mUnvisitedBookmarkBonus;
zero visits, bookmarked, not typed:  create a new bookmark by right clicking on a url that you've never visited.  frecency will be 140 (browser.frecency.unvisitedBookmarkBonus)
zero visits, not bookmarked, typed:  type in http://www.google.com, and annotate it but don't bookmark it.  clear all private data, giving it a frecency of -1.  on idle, we'd give this thing a value of 0 because visit count is 0.
zero visits, not bookmarked, not typed:  this will give us a frecency of 0, which is desired.
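
The four scenarios can be checked against a small standalone sketch that compares the current bonus logic with the proposed change. The constants are assumed from the figures quoted above (140, and 340 - 140 = 200); in the real code they come from the browser.frecency.unvisited* prefs. The function names are hypothetical, and the rule that only a bookmark gets a non-zero estimate is a simplification of the "visit count is 0" behaviour described above, not the actual implementation.

```cpp
// Assumed values, taken from the numbers quoted in the scenarios above.
const int kUnvisitedBookmarkBonus = 140;
const int kUnvisitedTypedBonus = 200;

// Bonus as the current code computes it: the two bonuses are added.
int CurrentUnvisitedBonus(bool aIsBookmarked, bool aTyped) {
  int bonus = 0;
  if (aIsBookmarked)
    bonus += kUnvisitedBookmarkBonus;
  if (aTyped)
    bonus += kUnvisitedTypedBonus;
  return bonus;
}

// Bonus with the proposed change: typed wins, otherwise the bookmark bonus.
int ProposedUnvisitedBonus(bool aIsBookmarked, bool aTyped) {
  if (aTyped)
    return kUnvisitedTypedBonus;
  if (aIsBookmarked)
    return kUnvisitedBookmarkBonus;
  return 0;
}

// Simplifying assumption: an unvisited place only ends up with a non-zero
// estimate when it is bookmarked, matching the four scenarios above.
int EstimateUnvisitedFrecency(bool aIsBookmarked, int aBonus) {
  return aIsBookmarked ? aBonus : 0;
}
```

Under these assumptions the current logic yields 340, 140, 0 and 0 for the four scenarios, while the proposed logic yields 200, 140, 0 and 0, so the ordering between typed and untyped bookmarks is preserved.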


x) after fx 2 / fx 3b2 migration, force a few idles?  do a massive frecency recalc?  how long does each take?  do a few on a timer, not on idle to improve the first impression
Line 319: Line 383:
x)


for calc frecency, never calc 0 unless place: or unvisited livemark item, so do 1.


x)
Line 335: Line 399:
x)


when calc frecency, never calc 0 unless place: or unvisited livemark item, so do 1.


x)
Line 372: Line 436:


2 -> 3 migration, frecency = -1, but do we have visit counts?  force place: and unvisited livemarks to be zero, do one on idle, wait for rest?
3b2 -> 3 migration, frecency = -1, but we do have visit counts.  force place: and unvisited livemarks to be zero, do on idle, wait for rest.
clear all private data, frecency = -1, but we do have visit counts.  force place: and unvisited livemarks to be zero, do on idle, wait for rest.


small area:
partial expiration, frecency = -1 for a few places, but we do have visit counts.  don't think we need to force place: and unvisited livemarks (maybe we do for unvisited livemarks), do on idle, wait for rest.


small, but could be big: