Update:Remora Database Migration: Difference between revisions
| Line 489: | Line 489: | ||
|PreviewURI||thumbdata<br/>''data from download cooked into an image of 200x150 with aspect ratio perserved (add filler if needed)'' | |PreviewURI||thumbdata<br/>''data from download cooked into an image of 200x150 with aspect ratio perserved (add filler if needed)'' | ||
|- | |- | ||
|PreviewURI|| | |PreviewURI||thumbtype<br/>''the MIME type from download'' | ||
|- | |- | ||
|ID||addon_id | |ID||addon_id | ||
Revision as of 22:22, 29 December 2006
This page is about the migration from the v1 database to the new (remora) database structures.
The Utility
The Python script that does the migration has all sorts of fun features, exciting options, as well as providing thrilling output. It is a wholesome command line program.
Installation
The script is called migration.py and lives in svn at .../trunk/services/bin
The script has two Python library dependencies:
- cse -- using svn at GoogleCode: http://csepython.googlecode.com/svn/trunk/
- MySQLdb -- http://sourceforge.net/projects/mysql-python
Command Line Options
lars@deerlodge:~/project> ./migration.py -?
m1 0.2
This routine migrates data from the old AMO database schems to the new one.
-a, --addons
a quoted comma delimited list of the ids of addons OR the word
"all" (default: all)
--addonStatusWhenNoFilesApproved
the status to set an addon if no files approved
(null, sandbox, pending, nominated, public, disabled)
(default: sandbox)
--addonStatusWhenSomeFilesApproved
the status to set an addon if some files approved
(null, sandbox, pending, nominated, public, disabled)
(default: public)
--clear
clear all exisiting information from the new database
--clearAddons
clear all exisiting addons in the "addons" option list
information from the new database
-c, --config
specify the location and name of the config file
(default: ./migration.conf)
-?, --help
print this message
-l, --locale
the locale to use when migrating addons (default: en-US)
--logPathName
a progressive log of all runs of the migration script
(default: ./migration.log)
-A, --migrateAddons
if present, this switch causes the addons in the "addons"
option list to be migrated
-M, --migrateMetaData
if present, this switch causes the metadata tables to
be migrated
--newDatabaseName
the name of the new database within the server (default: )
--newPassword
the password for the user in the new database (default: )
--newServerName
the name of the new database server (default: )
--newUserName
the name of the user in the new database (default: )
-n, --not
reverses the meaning of the "addon" option. If the "addon"
option has a list, then specify everything except what's on
the list
--oldDatabaseName
the name of the old database within the server (default: )
--oldPassword
the password for the user in the old database (default: )
--oldServerName
the name of the old database server (default: )
--oldUserName
the name of the user in the old database (default: )
--previewURIPrefix
a prefix to add to the URI in the preview table to enable
downloading (default: https://addons.mozilla.org)
--tests
run the test cases
-v, --verbose
print status information as it runs
Usage examples
Migrate all metadata (-M) and all addons (-A) and talk alot about what's going on (-v):
lars@deerlodge:~/project> ./migration.py -MAv
2006-08-21 21:05:56.80 beginning migration version 0.2 with options:
addons=all
config=./migration.conf
migrateAddons
migrateMetaData
newDatabaseName=noo
newPassword=**********
newServerName=deerlodge
newUserName=lars
oldDatabaseName=old
oldPassword=**********
oldServerName=deerlodge
oldUserName=lars
verbose
version=0.2
2006-08-21 21:05:56.80 beginning metadata migration
2006-08-21 21:05:56.80 clearing metadata tables...
done.
2006-08-21 21:05:58.19 beginning applicationsToApplications...
done.
2006-08-21 21:05:58.65 beginning categoriesToTags...
done.
2006-08-21 21:05:58.76 beginning osToPlatforms...
done.
2006-08-21 21:05:58.79 beginning userprofilesToUsers...
done.
2006-08-21 21:06:29.47 beginning addons migration
2006-08-21 21:06:29.47 clearing addons tables...
done.
2006-08-21 21:06:29.47 beginning mainToAddOns...
done.
2006-08-21 21:06:33.21 beginning authorxrefToAddons_users...
done.
2006-08-21 21:06:34.87 beginning versionToVerions...
2006-08-21 21:07:02.08 WARNING -- version ID 15798 of addon 2506 for application 79 fails to migrate.
(1452, 'Cannot add or update a child row: a foreign key constraint fails
(`noo/applications_versions`, CONSTRAINT `applications_versions_ibfk_1` FOREIGN KEY
(`application_id`) REFERENCES `applications` (`id`))')
2006-08-21 21:07:02.08 WARNING -- version ID 15799 of addon 2507 for application 79 fails to migrate.
(1452, 'Cannot add or update a child row: a foreign key constraint fails
(`noo/applications_versions`, CONSTRAINT `applications_versions_ibfk_1` FOREIGN KEY
(`application_id`) REFERENCES `applications` (`id`))')
done.
2006-08-21 21:07:09.55 beginning categoryxrefToAddons_tags...
done.
2006-08-21 21:07:13.81 beginning previewsToPreviews...
done.
done.
Clear all the Addon information from the new database silently:
lars@deerlodge:~/project> ./migration.py --clearAddons
Copy only addons 3110, 3112 and 3113 to the new database:
lars@deerlodge:~/project> ./m1.py -A -a 3110,3112,3113 -v
2006-08-21 21:14:42.80 beginning migration version 0.2 with options:
addons=3110,3112,3113
config=./migration.conf
migrateAddons
newDatabaseName=noo
newPassword=**********
newServerName=deerlodge
newUserName=lars
oldDatabaseName=old
oldPassword=**********
oldServerName=deerlodge
oldUserName=lars
verbose
version=0.2
2006-08-21 21:14:42.80 beginning setupAddonSelectionTable...
done.
2006-08-21 21:14:42.97 beginning addons migration
2006-08-21 21:14:42.97 clearing addons tables...
done.
2006-08-21 21:14:42.97 beginning mainToAddOns...
done.
2006-08-21 21:14:43.01 beginning authorxrefToAddons_users...
done.
2006-08-21 21:14:43.02 beginning versionToVerions...
done.
2006-08-21 21:14:43.14 beginning categoryxrefToAddons_tags...
done.
2006-08-21 21:14:43.17 beginning previewsToPreviews...
done.
done.
Copy all addons except 3110, 3112 and 3113 to the new database:
lars@deerlodge:~/project> ./migration.py -A -na 3110,3112,3113 -v
2006-08-21 21:16:16.01 beginning migration version 0.2 with options:
addons=3110,3112,3113
config=./migration.conf
migrateAddons
newDatabaseName=noo
newPassword=**********
newServerName=deerlodge
newUserName=lars
not
oldDatabaseName=old
oldPassword=**********
oldServerName=deerlodge
oldUserName=lars
verbose
version=0.2
2006-08-21 21:16:16.01 beginning setupAddonSelectionTable...
done.
2006-08-21 21:16:18.25 beginning addons migration
2006-08-21 21:16:18.25 clearing addons tables...
done.
2006-08-21 21:16:18.25 beginning mainToAddOns...
done.
2006-08-21 21:16:21.79 beginning authorxrefToAddons_users...
done.
2006-08-21 21:16:23.15 beginning versionToVerions...
2006-08-21 21:16:50.91 WARNING -- version ID 15798 of addon 2506 for application 79 fails to migrate.
(1452, 'Cannot add or update a child row: a foreign key constraint fails
(`noo/applications_versions`, CONSTRAINT `applications_versions_ibfk_1` FOREIGN KEY
(`application_id`) REFERENCES `applications` (`id`))')
2006-08-21 21:16:50.91 WARNING -- version ID 15799 of addon 2507 for application 79 fails to migrate.
(1452, 'Cannot add or update a child row: a foreign key constraint fails
(`noo/applications_versions`, CONSTRAINT `applications_versions_ibfk_1` FOREIGN KEY
(`application_id`) REFERENCES `applications` (`id`))')
done.
2006-08-21 21:16:58.47 beginning categoryxrefToAddons_tags...
done.
2006-08-21 21:17:02.80 beginning previewsToPreviews...
done.
done.
Remove Addons 2506 and 2507 from the new database:
lars@deerlodge:~/project> ./migration.py --clearAddons -a 2506,2507
Reinsert Addon 2506 as a German language addon:
lars@deerlodge:~/project> ./migration.py -A -a 2506 -l de
Independent Information (metadata)
applications to applications, appversions
| Old Database | New Database |
|---|---|
| AppID | applications.id |
| GUID | applications.guid |
| AppName | translations.localized_string insert id into applications.name |
| AppName | translations.localized_string insert id into applications.shortname |
| major | no mapping |
| minor | no mapping |
| release | no mapping |
| Old Database | New Database |
|---|---|
| AppID | appversions.id |
| AppID | appversions.application_id use AppID for originally encountered application |
| Version | appversions.version |
| appversions.created time.now | |
| appversions.modified use default value |
categories to tags
| Old Database | New Database |
|---|---|
| CategoryID | id |
| CatName | translations.localized_string insert id into name |
| CatDesc | translations.localized_string insert id into description |
| CatType | addontype_id lookup in addontypes.name assuming mapping of E->extensions, T->themes... |
| CatApp | application_id lookup in applications.name |
| created time.now | |
| modified use default value |
os to platforms
| Old Database | New Database |
|---|---|
| OSID | id |
| OSName | translations.localized_string insert id into name |
| OSName | translations.localized_string insert id into shortname |
| created time.now | |
| modified use default value |
userprofiles to users
| Old Database | New Database |
|---|---|
| UserID | id |
| UserName | parse to firstname, lastname |
| nickname no source for data, Null not allowed, no default - I put in an empty string | |
| UserEmail | |
| UserWebsite | homepage |
| UserPass | password |
| UserMode | no mapping |
| UserTrusted | no mapping |
| UserEmailHide | emailhidden |
| UserLastLogin | no mapping |
| ConfirmationCode | confirmationcode |
| created time.now | |
| modified use default value | |
| notes null |
authorxref to addons_users
| Old Database | New Database |
|---|---|
| ID | addon_id |
| UserID | user_id |
| created time.now | |
| modified use default value |
add-on specific information (per addon ID)
main to addons, translations
| Old Database | New Database |
|---|---|
| ID | id |
| GUID | guid |
| Name | translations.localized_string insert into translations, then fetch id name |
| Type | addontype_id will have to find type in type table |
| DateAdded | created |
| DateUpdated | modified |
| Homepage | translations.localized_string insert id into homepage |
| Description | translations.localized_string insert into translations, then fetch id description |
| Rating | averagerating |
| downloadcount | weeklydownloads |
| TotalDownloads | totaldownloads |
| devcomments | translations.localized_string insert id into developercomments |
| icon use default value | |
| Description | translations.localized_string take up to 250 characters, do not break inside a word, add ellipsis if break is not at the end of sentence insert into translations, then fetch id summary |
| inactive use default value | |
| prerelease use default value | |
| adminreview use default value | |
| sitespecific use default value | |
| externalsoftware use default value | |
| approvalnotes use NULL | |
| eula use NULL | |
| privacypolicy use NULL |
version to versions, files, applications_versions
| Old Database | New Database |
|---|---|
| vID | versions.id |
| ID | versions.addon_id |
| Version | versions.version |
| MinAppVer_int | no mapping |
| MaxAppVer_int | no mapping |
| DateAdded | versions.created |
| DateUpdated | versions.modified |
| Notes | translations.localized_string insert id into versions.releasenotes |
| addons.approvalnotes empty string | |
| versions.created time.now | |
| modified use default value |
| Old Database | New Database |
|---|---|
| AppID | applications_versions.application_id |
| vID | applications_versions.version_id |
| MinAppVer | applications_versions.min look up in appversions table ambiguous versions are translated via the technique outlined in update:ToublesInAddonMigration. Missing versions are added to the appversiontable |
| MaxAppVer | applications_versions.max look up in appversions table ambiguous versions are translated via the technique outlined in update:ToublesInAddonMigration. Missing versions are added to the appversiontable |
| applications_versions.created time.now | |
| applications_versions.modified use default value |
| Old Database | New Database |
|---|---|
| id auto-increment | |
| vID | files.version_id |
| OSID | files.platform_id |
| URI | files.filename use base name only |
| Size | files.size |
| hash | files.hash |
| DateApproved | files.dateapproved |
| approved | files.approved mapping -- "YES": 2, "NO": 3, "?":1, "DISABLED":6 |
| files.created time.now | |
| files.modified use default value |
categoryxref to addons_tags
| Old Database | New Database |
|---|---|
| ID | addon_id |
| CategoryID | tag_id |
previews to previews
| Old Database | New Database |
|---|---|
| PreviewID | id |
| PreviewURI | filedata data from download |
| PreviewURI | filetype the MIME type from download |
| PreviewURI | thumbdata data from download cooked into an image of 200x150 with aspect ratio perserved (add filler if needed) |
| PreviewURI | thumbtype the MIME type from download |
| ID | addon_id |
| caption | translations.localized_string insert id from transaltions into into caption |
| preview | highlight conversion -- "YES":1, "NO":0 |
| created time.now | |
| modified use default value |
Feedback
If we don't migrate add-ons which are only compatible with fx 1.0, then users can upload add-ons with the same em:ID and then users who might have had the old add-ons still installed but disabled (or who are still using fx 1.0) will be updated to the new one.
Previews
- No previews were migrated -- but we need to solve this issue to help get images over.
- In order to pull them we'd need to generate the URI and grab them so we can insert them into the proper blob in the previews table
Old db:
mysql> describe previews;
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| PreviewID | int(11) | | PRI | NULL | auto_increment |
| PreviewURI | varchar(200) | | | | |
| ID | int(5) | | MUL | 0 | |
| caption | varchar(255) | | | | |
| preview | enum('YES','NO') | | | NO | |
+------------+------------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)
- "PreviewURI" column points to the relative path of the preview.
- "ID" is the FK to main.ID
- "caption" would get ported to dynamic l10n
- "preview" would be "highlight" in the new previews table
Moving to new db:
- Use "https://addons.mozilla.org".$PreviewURI to grab the image and insert it into the filedata blog in the new db.
- Filetype should be preserved as well so it can display properly in the images controller.
Duplicate Files
- (clouserw) I think there were some duplications within the `files` table. For example:
mysql> select * from files where filename like 'stumbleupon-2.1%'; +--------+------------+-------------+---------------------------------+------+------+------------+--------+---------------------+---------------------+---------------------+ | id | version_id | platform_id | filename | size | hash | codereview | status | datestatuschanged | created | modified | +--------+------------+-------------+---------------------------------+------+------+------------+--------+---------------------+---------------------+---------------------+ | 133663 | 7507 | 1 | stumbleupon-2.1-fx+fl+mz+ns.xpi | 101 | | 0 | 4 | 0000-00-00 00:00:00 | 2006-12-22 00:20:46 | 0000-00-00 00:00:00 | | 133664 | 7507 | 1 | stumbleupon-2.1-fx+fl+mz+ns.xpi | 101 | | 0 | 4 | 0000-00-00 00:00:00 | 2006-12-22 00:20:46 | 0000-00-00 00:00:00 | | 133665 | 7507 | 1 | stumbleupon-2.1-fx+fl+mz+ns.xpi | 101 | | 0 | 4 | 0000-00-00 00:00:00 | 2006-12-22 00:20:46 | 0000-00-00 00:00:00 | | 133666 | 7507 | 1 | stumbleupon-2.1-fx+fl+mz+ns.xpi | 101 | | 0 | 4 | 0000-00-00 00:00:00 | 2006-12-22 00:20:46 | 0000-00-00 00:00:00 | +--------+------------+-------------+---------------------------------+------+------+------------+--------+---------------------+---------------------+---------------------+ 4 rows in set (0.00 sec)
- (morgamic) I think the script may have created duped rows in files because in the old 'version' table rows would be duped except for the app/appversion info -- in this database that data is spliced off. Might want to start there.
- (lars) files are duplicated because sometimes the the same physical file is specified for more than one application (Firefox & Mozilla) - it is very easy to prevent this from resulting in two entries in to the file table. However, if the two entries may not match exactly (hash, approval, etc), the one seen later will superceed the one seen earlier...
Duplicate Categories/Tags
- I haven't looked at the results since last week, but as I recall, there are many duplicate categories. My guess is that this is because in v1/v2, each application has to have its own category, so there might be an "Emo" category for Firefox and a different "Emo" category for Thunderbird, and the add-on is in both of them, but it only shows one because we grouped by name before. We will need to look at this and figure out whether to fix this in the code or migration. Fligtar 16:28, 27 December 2006 (PST)
- (lars) the database schema for the `tags` table requires that we duplicate the entries because it has foreign keys to both the addontypes and applications tables. There are six tags called "Miscellaneous" because we need entries for the addontypes "extension" and "theme" for the applications "Firefox", "Mozilla" and "Thunderbird"
Other
- There are blank filenames in the `files` table - is this a result of the migration?
- (lars) empty filenames in the migration are faithful reproductions from the old database. There are entries in the olddb version table where the URI column is blank. What would you like me to do with them?
- Some sha sums are calculated in the files table, some aren't. Since we're migrating data, would it be useful to calculate hashes while we're at it? (Would it be easily doable?)