Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Wikipedia: Wikitech

Guidelines for db schema changes

 

 

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded


robla at wikimedia

Apr 24, 2012, 5:52 PM

Post #1 of 10 (600 views)
Permalink
Guidelines for db schema changes

Hi everyone,

As we do more frequent deploys, it's going to become critical that we
get database schema changes correct, and that we do so in a way that
gives us time to prepare for said changes and roll back to old
versions of the software should a deploy go poorly. This applies both
to MediaWiki core and to WMF-deployed extensions.

I'd like to propose that we make the following standard practice:

1. All schema changes must go through a period of being optional.
For example, instead of changing the format of a column, create a new
column, make all writes happen to the old and new column (if it
exists) and deprecate use of the old column. Check if the new column
exists before blindly assuming that it does. Only eliminate support
for the old column after it's clear the schema migration has happened
and there's no chance that we'll need to roll back to the old version
of the software.

2. There might be cases where rule #1 will be prohibitive from a
performance perspective. However, schema changes like that should be
rare to begin with, and should have prominent discussion on this list.
In the case where it's impossible to follow rule #1, it is still
critical to write scripts to roll back to the pre-change state.

3. For anything that involves a schema change to the production dbs,
make sure Asher Feldman (afeldman [at] wikimedia) is on the reviewer
list. He's already keeping an eye on this stuff the best he can, but
it's going to be easy for him to miss changes in extensions should
they happen.

I don't have a strong opinion about whether we need to follow rule #1
above through an iteration of our six month tarball release cycle, but
we at least need to follow it through the two week deployment cycle.

Assuming this seems sensible to everyone, I can update this page with this:
http://www.mediawiki.org/wiki/Development_policy

(/me desperately tries to avoid yak shaving and updating the policy
above for Git)

Rob

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


roan.kattouw at gmail

Apr 24, 2012, 6:15 PM

Post #2 of 10 (581 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

On Tue, Apr 24, 2012 at 5:52 PM, Rob Lanphier <robla [at] wikimedia> wrote:
> I don't have a strong opinion about whether we need to follow rule #1
> above through an iteration of our six month tarball release cycle, but
> we at least need to follow it through the two week deployment cycle.
>
I think rule #1 is unnecessary for tarballs, but it's definitely a
good thing for deployments.

Other than that this all makes perfect sense to me.

Roan

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


faidon at wikimedia

Apr 24, 2012, 11:27 PM

Post #3 of 10 (573 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

On Tue, Apr 24, 2012 at 05:52:24PM -0700, Rob Lanphier wrote:
> As we do more frequent deploys, it's going to become critical that we
> get database schema changes correct, and that we do so in a way that
> gives us time to prepare for said changes and roll back to old
> versions of the software should a deploy go poorly. This applies both
> to MediaWiki core and to WMF-deployed extensions.
>
> I'd like to propose that we make the following standard practice:

I'm still new around here so pardon me if this sounds infeasible for us:

In other systems I've worked before, such problems have been solved by
each schema-breaking version providing schema *and data* migrations for
both forward *and backward* steps.

This means that the upgrade transition mechanism knew how to add or
remove columns or tables *and* how to fill them with data (say by
concatenating two columns of the old schema). The same program would
also take care to do the exact opposite steps in a the migration's
backward method, in case a rollback was needed.

The migrations themselves can be kept in the source tree, perhaps even
versioned and with the schema version kept in the database, so that both
us and external users can at any time forward their database to any
later version, automagically.

I think that both Ruby on Rails and Python/Django (with South) employ
such schemas and I've seen them work well in practice before.

Regards,
Faidon

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


hashar+wmf at free

Apr 25, 2012, 12:12 AM

Post #4 of 10 (573 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

Le 25/04/12 02:52, Rob Lanphier a écrit :
> 3. For anything that involves a schema change to the production dbs,
> make sure Asher Feldman (afeldman [at] wikimedia) is on the reviewer
> list. He's already keeping an eye on this stuff the best he can, but
> it's going to be easy for him to miss changes in extensions should
> they happen.

I am pretty sure Jenkins could detect a change is being made on a .sql
file and then add a specific reviewer using Gerrit CLI tool.

Logged as:
https://bugzilla.wikimedia.org/36228


--
Antoine "hashar" Musso


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


afeldman at wikimedia

Apr 25, 2012, 9:58 AM

Post #5 of 10 (572 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

I am generally in favor of all of this and in the meeting that proceeded
Rob's email, proposed that we develop a new schema migration tool for
mediawiki along similar lines. Such a beast would have to work in all
deployment cases without modifications (stock single wiki installs and at
wmf with many wikis across multiple masters with tiered replication), be
idempotent when run across many databases, track version and state per
migration, and include up/down steps in every migration.

There are opensource php migration tools modeled along those used by the
popular ruby and python frameworks. I deployed
https://github.com/davejkiger/mysql-php-migrations at kiva.org a couple
years ago where it worked well and is still in use. Nothing will meet our
needs off the shelf though. A good project could at best be forked into
mediawiki with modifications if the license allows it, or more likely serve
as a model for our own development.

On Tue, Apr 24, 2012 at 11:27 PM, Faidon Liambotis <faidon [at] wikimedia>wrote:

>
> In other systems I've worked before, such problems have been solved by
> each schema-breaking version providing schema *and data* migrations for
> both forward *and backward* steps.


> This means that the upgrade transition mechanism knew how to add or
> remove columns or tables *and* how to fill them with data (say by
> concatenating two columns of the old schema). The same program would
> also take care to do the exact opposite steps in a the migration's
> backward method, in case a rollback was needed.
>

Down migrations aid development; I find them most useful as documentation
of prior state, making a migration readable as a diff. They generally
aren't useful in production environments at scale though, which developers
removed from the workings of production need to be aware of. Even with
transparent execution of migrations, the time it takes to apply changes
will nearly always be far outside of the acceptable bounds of an emergency
response necessitating a code rollback. So except in obvious cases such as
adding new tables, care is needed to keep forward migration backwards
compatible with code as much as possible.

The migrations themselves can be kept in the source tree, perhaps even
> versioned and with the schema version kept in the database, so that both
> us and external users can at any time forward their database to any
> later version, automagically.


Yep. That we have to pull in migrations from both core and many extensions
(many projects, one migration system) while also running different sets of
extensions across different wikis intermingling on the same database
servers adds some complexity but we should get there.

-Asher
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


afeldman at wikimedia

Apr 25, 2012, 10:02 AM

Post #6 of 10 (575 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

Thanks, hashar!

On Wed, Apr 25, 2012 at 12:12 AM, Antoine Musso <hashar+wmf [at] free> wrote:

> Le 25/04/12 02:52, Rob Lanphier a écrit :
> > 3. For anything that involves a schema change to the production dbs,
> > make sure Asher Feldman (afeldman [at] wikimedia) is on the reviewer
> > list. He's already keeping an eye on this stuff the best he can, but
> > it's going to be easy for him to miss changes in extensions should
> > they happen.
>
> I am pretty sure Jenkins could detect a change is being made on a .sql
> file and then add a specific reviewer using Gerrit CLI tool.
>
> Logged as:
> https://bugzilla.wikimedia.org/36228
>
>
> --
> Antoine "hashar" Musso
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l [at] lists
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


otto at wikimedia

Apr 26, 2012, 8:30 AM

Post #7 of 10 (568 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

I once wrote a pretty decent schema migration tool that fits most if not all of these requirements. It was built for the Kohana PHP framework, but a lot of it is pretty independent of that. If someone ends up working on this I'd love to help and maybe share some code and ideas.

-Andrew Otto

http://ottomata.org
http://www.flickr.com/photos/OttomatonA
http://www.couchsurfing.org/people/otto


On Apr 25, 2012, at 12:58 PM, Asher Feldman wrote:

> I am generally in favor of all of this and in the meeting that proceeded
> Rob's email, proposed that we develop a new schema migration tool for
> mediawiki along similar lines. Such a beast would have to work in all
> deployment cases without modifications (stock single wiki installs and at
> wmf with many wikis across multiple masters with tiered replication), be
> idempotent when run across many databases, track version and state per
> migration, and include up/down steps in every migration.
>
> There are opensource php migration tools modeled along those used by the
> popular ruby and python frameworks. I deployed
> https://github.com/davejkiger/mysql-php-migrations at kiva.org a couple
> years ago where it worked well and is still in use. Nothing will meet our
> needs off the shelf though. A good project could at best be forked into
> mediawiki with modifications if the license allows it, or more likely serve
> as a model for our own development.
>
> On Tue, Apr 24, 2012 at 11:27 PM, Faidon Liambotis <faidon [at] wikimedia>wrote:
>
>>
>> In other systems I've worked before, such problems have been solved by
>> each schema-breaking version providing schema *and data* migrations for
>> both forward *and backward* steps.
>
>
>> This means that the upgrade transition mechanism knew how to add or
>> remove columns or tables *and* how to fill them with data (say by
>> concatenating two columns of the old schema). The same program would
>> also take care to do the exact opposite steps in a the migration's
>> backward method, in case a rollback was needed.
>>
>
> Down migrations aid development; I find them most useful as documentation
> of prior state, making a migration readable as a diff. They generally
> aren't useful in production environments at scale though, which developers
> removed from the workings of production need to be aware of. Even with
> transparent execution of migrations, the time it takes to apply changes
> will nearly always be far outside of the acceptable bounds of an emergency
> response necessitating a code rollback. So except in obvious cases such as
> adding new tables, care is needed to keep forward migration backwards
> compatible with code as much as possible.
>
> The migrations themselves can be kept in the source tree, perhaps even
>> versioned and with the schema version kept in the database, so that both
>> us and external users can at any time forward their database to any
>> later version, automagically.
>
>
> Yep. That we have to pull in migrations from both core and many extensions
> (many projects, one migration system) while also running different sets of
> extensions across different wikis intermingling on the same database
> servers adds some complexity but we should get there.
>
> -Asher
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l [at] lists
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


otto at wikimedia

Apr 27, 2012, 1:53 PM

Post #8 of 10 (565 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

Here's the migrations library I wrote. :)
https://github.com/ottomata/cs_migrations


-Andrew Otto


On Apr 26, 2012, at 11:30 AM, Andrew Otto wrote:

> I once wrote a pretty decent schema migration tool that fits most if not all of these requirements. It was built for the Kohana PHP framework, but a lot of it is pretty independent of that. If someone ends up working on this I'd love to help and maybe share some code and ideas.
>
> -Andrew Otto
>
> http://ottomata.org
> http://www.flickr.com/photos/OttomatonA
> http://www.couchsurfing.org/people/otto
>
>
> On Apr 25, 2012, at 12:58 PM, Asher Feldman wrote:
>
>> I am generally in favor of all of this and in the meeting that proceeded
>> Rob's email, proposed that we develop a new schema migration tool for
>> mediawiki along similar lines. Such a beast would have to work in all
>> deployment cases without modifications (stock single wiki installs and at
>> wmf with many wikis across multiple masters with tiered replication), be
>> idempotent when run across many databases, track version and state per
>> migration, and include up/down steps in every migration.
>>
>> There are opensource php migration tools modeled along those used by the
>> popular ruby and python frameworks. I deployed
>> https://github.com/davejkiger/mysql-php-migrations at kiva.org a couple
>> years ago where it worked well and is still in use. Nothing will meet our
>> needs off the shelf though. A good project could at best be forked into
>> mediawiki with modifications if the license allows it, or more likely serve
>> as a model for our own development.
>>
>> On Tue, Apr 24, 2012 at 11:27 PM, Faidon Liambotis <faidon [at] wikimedia>wrote:
>>
>>>
>>> In other systems I've worked before, such problems have been solved by
>>> each schema-breaking version providing schema *and data* migrations for
>>> both forward *and backward* steps.
>>
>>
>>> This means that the upgrade transition mechanism knew how to add or
>>> remove columns or tables *and* how to fill them with data (say by
>>> concatenating two columns of the old schema). The same program would
>>> also take care to do the exact opposite steps in a the migration's
>>> backward method, in case a rollback was needed.
>>>
>>
>> Down migrations aid development; I find them most useful as documentation
>> of prior state, making a migration readable as a diff. They generally
>> aren't useful in production environments at scale though, which developers
>> removed from the workings of production need to be aware of. Even with
>> transparent execution of migrations, the time it takes to apply changes
>> will nearly always be far outside of the acceptable bounds of an emergency
>> response necessitating a code rollback. So except in obvious cases such as
>> adding new tables, care is needed to keep forward migration backwards
>> compatible with code as much as possible.
>>
>> The migrations themselves can be kept in the source tree, perhaps even
>>> versioned and with the schema version kept in the database, so that both
>>> us and external users can at any time forward their database to any
>>> later version, automagically.
>>
>>
>> Yep. That we have to pull in migrations from both core and many extensions
>> (many projects, one migration system) while also running different sets of
>> extensions across different wikis intermingling on the same database
>> servers adds some complexity but we should get there.
>>
>> -Asher
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l [at] lists
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


robla at wikimedia

May 14, 2012, 7:25 PM

Post #9 of 10 (518 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

On Tue, Apr 24, 2012 at 5:52 PM, Rob Lanphier <robla [at] wikimedia> wrote:
> Assuming this seems sensible to everyone, I can update this page with this:
> http://www.mediawiki.org/wiki/Development_policy

And this is done now.

In case you aren't using a threaded mail client, here's the original discussion:
http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/60967

Rob

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


afeldman at wikimedia

Jun 13, 2012, 4:56 PM

Post #10 of 10 (471 views)
Permalink
Re: Guidelines for db schema changes [In reply to]

Hi all,

I'd like to remind everyone involved in development that requires db schema
migrations - please keep in mind the three related guidelines in our
official deployment policies -
http://www.mediawiki.org/wiki/Development_policy#Database_patches -
especially the third, which is to make schema changes optional.

Once a migration has been reviewed, please update
http://wikitech.wikimedia.org/view/Schema_changes with all pertinent
details, then get in touch for deployment scheduling. There are good and
legitimate reasons to not follow the "make schema changes optional" policy
but if that's the case, please provide 3-7 days of lead time, depending on
the size of tables and number of effected wikis.

Best,
Asher

On Mon, May 14, 2012 at 7:25 PM, Rob Lanphier <robla [at] wikimedia> wrote:

> On Tue, Apr 24, 2012 at 5:52 PM, Rob Lanphier <robla [at] wikimedia> wrote:
> > Assuming this seems sensible to everyone, I can update this page with
> this:
> > http://www.mediawiki.org/wiki/Development_policy
>
> And this is done now.
>
> In case you aren't using a threaded mail client, here's the original
> discussion:
> http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/60967
>
> Rob
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l [at] lists
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.