User:Ondrej/Bugs/385245

From MozillaWiki
Jump to: navigation, search

bug 385245

History sidebar supports multiple views, for each view there is one query performed which reads all the data to be displayed on the view. The more data is in the database the longer it takes to display the sidebar.

Testing conditions

  • No favicons (they seems to slow down display of the sidebar and should be solved separately).
  • Loading time is just the time spent in the database engine.
  • Time is always in ms unless specified else.
  • Three databases of various size:
Database Places Visits
Small 88 15124
Mid 589 76250
Bug 3276 255363

Views

View by Date and Site

Current Situation

Loads all places and their visits, performs grouping by date and site in memory.

Future Situation

This just combines "View by Date" and "View by Site" - there will be on demand loading on two leves, first by the date and then by the site. Only last visits to each place in given site and time range will be displayed. This will be the fastest view.

View by Site

Current Situation

Loads all places and their visits, performs grouping by site in memory.

Future Situation

Only list of sites will be loaded initially.
SELECT DISTINCT rev_host
FROM moz_places
WHERE hidden <> 1
ORDER BY 1 DESC;
Database Loading Time Records
Small 1.76 48
Mid 12.64 517
Bug 77.53 2831

Even for big database, the time seems to be acceptable. Pages inside the sites will be loaded on demand using the rev_host as filter, we will be selecting only latest visits to each place on the given site.

View by Date

Current Situation

Loads all places and all related visits to memory, processes the data in memory and creates groups by days. This takes with big database more then 30 seconds. When user clicks on the day, more processing is done, with big database and "Older than 6 days" range this can take 3 minutes. Collapsing and expanding this range takes the same time.

Database Loading Time Records
Small 294 1525
Mid 1366 67423
Bug 5156 245441

Future Situation

There will be 8 queries performed which will check for which ranges we have visits (Today, Yesterday, 2 days ago, ...). No visits will be loaded, just the ranges will be presented. A query to check whether there is a visit for the given date:
SELECT visit_date
FROM   moz_historyvisits
WHERE  visit_date
       BETWEEN
           CAST(strftime("%s",current_timestamp, "-1 days") AS UNSIGNED)*1000000
       AND
           CAST(strftime("%s",current_timestamp) AS UNSIGNED)*1000000
LIMIT 1;
This query takes about 0.5ms on any size database, together these 8 queries take 4ms. So the sidebar would be displayed immediately. When the user clicks on any range, only places visited in the period will be loaded and sorted by most recent visits:
SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, MAX(v.visit_date), f.url, v.session, null
FROM moz_places h
    JOIN moz_historyvisits v ON h.id = v.place_id
    LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id
WHERE
    visit_date
       BETWEEN
           CAST(strftime("%s",current_timestamp, "-1 days") AS UNSIGNED)*1000000
       AND
           CAST(strftime("%s",current_timestamp) AS UNSIGNED)*1000000
  AND h.hidden <> 1
  AND v.visit_type NOT IN (0,4)
GROUP BY h.id
ORDER BY MAX(v.visit_date) DESC;
Database Today Yesterday Older than 6 days
Small 3.42 3.30 77.01
Mid 7.71 8.41 394.81
Bug 21.49 26.32 1654.24

This gives reasonable times with exception of the "Older than 6 days", which would be slow even for mid database. It requires incremental loading.

View by Most Visited

Current Situation

Joins places with history, this could be avoided if the visit_count is only updated for proper visit_types (to be checked).

Future Situation

Requires incremental loading.

View by Last Visited

Current Situation

Correct query, loads all the places and their last visit.

Future Situation

Requires incremental loading.

Searching

Current Situation

Searching loads all places from the database and performs searching on UI. This is done for each group of letters typed.

Future Situation

We should repeat the query only when user removes some characters from the search, not when he types more (in such case, we should do filtering in memory). We should be searching using LIKE for title and url. This leads to full scan on the table, but this full scan is done anyway, so filtering on backend should speed the searching up.

Incremental loading

Even with possible on demand loading, there are still some queries, that would be too slow. Fortunately all such views can be loaded in chunks using existing indexes. We should not load all the data in the sidebar immediately, but instead we should load the data as the user scrolls down.