User talk:Andyed

Current interfaces for browsing history focus on URLs, when perhaps they should also consider sessions (à la browseMaps). One of the most salient session experiences is the search, characterized by the number of clicks and the time spent both searching (e.g. struggling) and browsing (e.g. learning).

Extracting searches and the resulting visits is not currently possible with the Places API, but it is achievable with raw SQL. I've noted on the about:me page that search efficacy might be an interesting thing to display, though search volume is perhaps more casually interesting.

Search Volume

Demo adapted from about:me. Install about:search from http://surfmind.com/lab/mozilla/aboutsearch/aboutsearch_022.xpi

SELECT rev_host, SUM(visit_count) AS visits
FROM moz_places
WHERE (url LIKE '%?q=%' OR url LIKE '%&q=%')
GROUP BY rev_host
ORDER BY SUM(visit_count) DESC LIMIT 10
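
As a rough illustration of running the search volume query outside the browser, here is a minimal Python sketch using the standard sqlite3 module. It runs against a tiny in-memory stand-in for places.sqlite; the helper names (search_volume, make_demo_db) and the demo rows are hypothetical, and only the moz_places columns used by the query above are modelled.

```python
import sqlite3

# Same query as above, with the row limit passed in as a parameter.
# The '?' inside the LIKE string literals is a literal character, not a placeholder.
SEARCH_VOLUME_SQL = """
SELECT rev_host, SUM(visit_count) AS visits
FROM moz_places
WHERE url LIKE '%?q=%' OR url LIKE '%&q=%'
GROUP BY rev_host
ORDER BY visits DESC
LIMIT ?
"""


def search_volume(conn, limit=10):
    """Return (rev_host, visits) rows for hosts whose URLs carry a q= parameter."""
    return conn.execute(SEARCH_VOLUME_SQL, (limit,)).fetchall()


def make_demo_db():
    """Build an in-memory table shaped like moz_places (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE moz_places ("
        "id INTEGER PRIMARY KEY, url TEXT, rev_host TEXT, visit_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO moz_places (url, rev_host, visit_count) VALUES (?, ?, ?)",
        [
            ("http://www.google.com/search?q=firefox", "moc.elgoog.www.", 5),
            ("http://www.google.com/search?hl=en&q=places", "moc.elgoog.www.", 3),
            ("http://www.bing.com/search?q=sqlite", "moc.gnib.www.", 2),
            ("http://example.com/about", "moc.elpmaxe.", 9),  # not a search URL
        ],
    )
    return conn


if __name__ == "__main__":
    for rev_host, visits in search_volume(make_demo_db()):
        print(rev_host, visits)
```

To try it on real history, one option is to point sqlite3.connect at a copy of the profile's places.sqlite rather than the in-memory demo database.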

Search Efficacy Metrics

The motivation for these metrics is described in a blog post. In short, here are two queries that, when combined, provide the strongest metric of the effectiveness of search experiences with a provider: failed (abandoned) query chains.

The submetrics of query volume per provider and clicks per query are also interesting, though less discriminating.

Query to Click Patterns (all queries)

SELECT count(distinct s.session) as N, count(distinct s2.from_visit) as clicks, rev_host 
FROM moz_historyvisits s join  moz_places p ON  s.place_id = p.id 
LEFT OUTER JOIN  moz_historyvisits s2  
ON s2.from_visit = s.id 
WHERE ( 
rev_host like '%moc.elgoog.www%' 
OR rev_host like '%moc.hcraes.nsm%' 
OR rev_host like '%moc.evil.hcraes%' 
OR rev_host like 'moc.oohay.hcraes.%' 
OR rev_host like 'moc.gnib.%') 
AND (p.url like '%q=%' or p.url like '%p=%') 
GROUP BY rev_host

Query to Click Patterns (unabandoned)

SELECT count(distinct s.session) as N, count(distinct s2.from_visit) as clicks, rev_host 
FROM moz_historyvisits s, moz_places p, moz_historyvisits s2 
WHERE s.place_id = p.id AND s2.from_visit = s.id AND ( 
rev_host like '%moc.elgoog.www%' 
OR rev_host like '%moc.hcraes.nsm%' 
OR rev_host like '%moc.evil.hcraes%' 
OR rev_host like 'moc.oohay.hcraes.%' 
OR rev_host like 'moc.gnib.%') 
AND (p.url like '%q=%' or p.url like '%p=%') 
GROUP BY rev_host

Computing % Abandoned

Abandonment rate is a direct measure of expectation violation in search. The user thought that their query was sufficient to retrieve the item or topic of interest, and that expectation was unmet.

( AllQueries.N - Unabandoned.N ) / AllQueries.N = AbandonmentRate

Note: N here is count(distinct moz_historyvisits.session)

This is a solid metric for providing users objective feedback on their experience with search engines.
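
The arithmetic can be sketched in a few lines of Python. This assumes each query's result is available as a list of (N, clicks, rev_host) tuples, in the column order of the queries above; the function names and the sample numbers are hypothetical. A provider that appears in the all-queries result but not in the unabandoned result is treated as having zero unabandoned sessions, since the inner join drops it.

```python
def abandonment_rate(all_n, unabandoned_n):
    """(AllQueries.N - Unabandoned.N) / AllQueries.N, or None when there are no queries."""
    if all_n == 0:
        return None
    return (all_n - unabandoned_n) / all_n


def rates_by_host(all_rows, unabandoned_rows):
    """Combine the two result sets into {rev_host: abandonment rate}.

    Each row is (N, clicks, rev_host). Hosts missing from the
    unabandoned result count as 0 unabandoned sessions.
    """
    unabandoned = {rev_host: n for n, _clicks, rev_host in unabandoned_rows}
    return {
        rev_host: abandonment_rate(n, unabandoned.get(rev_host, 0))
        for n, _clicks, rev_host in all_rows
    }


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    all_rows = [(40, 30, "moc.elgoog.www."), (10, 0, "moc.gnib.www.")]
    unabandoned_rows = [(30, 30, "moc.elgoog.www.")]
    print(rates_by_host(all_rows, unabandoned_rows))
```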

Conclusions

  • It's pretty easy to query for search volume in Places, though the % partial match is computationally expensive.
  • Looking at click chains (by joining moz_historyvisits to itself) is a powerful technique.
  • Peer review & QA of this code is needed and welcome
  • Ideally the AbandonmentRate could be computed in one query. Any ideas?
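
One possible answer to the last point, offered as a sketch rather than a reviewed solution: keep the LEFT OUTER JOIN from the all-queries form and count unabandoned sessions conditionally, so both values of N come out of one pass. The version below wraps the SQL in Python's sqlite3 module and runs it against a small in-memory database. The table layout covers only the columns the queries on this page use (including moz_historyvisits.session, assumed to exist), and the helper name and demo rows are hypothetical. Every host pattern here uses LIKE, since a % wildcard has no effect with =.

```python
import sqlite3

# COUNT(DISTINCT ...) ignores NULLs, so the CASE expression counts only
# sessions in which at least one search visit led to a follow-on visit.
ABANDONMENT_SQL = """
SELECT p.rev_host,
       COUNT(DISTINCT s.session) AS n_all,
       COUNT(DISTINCT CASE WHEN s2.id IS NOT NULL THEN s.session END)
           AS n_unabandoned,
       (COUNT(DISTINCT s.session)
        - COUNT(DISTINCT CASE WHEN s2.id IS NOT NULL THEN s.session END)) * 1.0
       / COUNT(DISTINCT s.session) AS abandonment_rate
FROM moz_historyvisits s
JOIN moz_places p ON s.place_id = p.id
LEFT OUTER JOIN moz_historyvisits s2 ON s2.from_visit = s.id
WHERE (p.rev_host LIKE '%moc.elgoog.www%'
       OR p.rev_host LIKE '%moc.hcraes.nsm%'
       OR p.rev_host LIKE '%moc.evil.hcraes%'
       OR p.rev_host LIKE 'moc.oohay.hcraes.%'
       OR p.rev_host LIKE 'moc.gnib.%')
  AND (p.url LIKE '%q=%' OR p.url LIKE '%p=%')
GROUP BY p.rev_host
ORDER BY p.rev_host
"""


def make_demo_visits_db():
    """In-memory stand-in for places.sqlite with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, rev_host TEXT)"
    )
    conn.execute(
        "CREATE TABLE moz_historyvisits ("
        "id INTEGER PRIMARY KEY, from_visit INTEGER, place_id INTEGER, session INTEGER)"
    )
    conn.executemany(
        "INSERT INTO moz_places (id, url, rev_host) VALUES (?, ?, ?)",
        [
            (1, "http://www.google.com/search?q=a", "moc.elgoog.www."),
            (2, "http://www.google.com/search?q=b", "moc.elgoog.www."),
            (3, "http://example.com/page", "moc.elpmaxe."),
            (4, "http://www.bing.com/search?q=c", "moc.gnib.www."),
        ],
    )
    conn.executemany(
        "INSERT INTO moz_historyvisits (id, from_visit, place_id, session) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, 0, 1, 100),  # Google search, session 100
            (2, 1, 3, 100),  # click from visit 1 to a result page
            (3, 0, 2, 101),  # Google search, session 101, no click
            (4, 0, 4, 102),  # Bing search, session 102, no click
        ],
    )
    return conn


if __name__ == "__main__":
    for row in make_demo_visits_db().execute(ABANDONMENT_SQL):
        print(row)
```

As with the two separate queries, a follow-on visit to another results page (for example a reformulated query) still counts as a click here, so this reproduces the same definition of N rather than refining it.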