Login | Register For Free | Help
Search for: (Advanced)

Mailing List Archive: Wikipedia: Wikitech

Bools/ints in database wrappers

 

 

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded


brion at wikimedia

Aug 21, 2008, 3:13 PM

Post #1 of 8 (627 views)
Permalink
Bools/ints in database wrappers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

greg [at] svn wrote:
> Force inserted bools to be ints, per Tim's suggestion on bug 15148.
> This should be fine for now as Postgres uses SMALLINTs for BOOLs, to
> match MySQLs BOOL/TINYINT aliasing and subsequent MW coding assumptions.
> It's possible this will cause bad effects if inserted values are called
> in a boolean context when they are going into a non-SMALLINT column, but we can handle
> those as they come up.
[snip]
> function addQuotes( $s ) {
> if ( is_null( $s ) ) {
> return 'NULL';
> + } else if ( is_bool( $s ) ) {
> + return intval( $s );

It probably wouldn't hurt to do this on general principle in the base
class. Some quick background:

PHP's boolean type interpolates to string as:

true -> "1"
false -> ""

MySQL's "bool" column type is an alias to "tinyint" (which we use, as
"bool" didn't exist in earlier MySQL versions), where you generally
store 1 or 0. However if you stick '' in there it gets automatically
interpreted as 0... (does this happen in MySQL's strict mode, though?)

PostgreSQL has a "bool" data type also, which seems to accept a variety
of values:
http://www.postgresql.org/docs/8.1/static/datatype-boolean.html
none of which are "". ;)

It's also much whingier about its numeric types, and won't let us insert
an empty string, which is what a boolean false interpolates to.


Now, most of the time we explicitly use 1 and 0 when storing boolean
values in MediaWiki, like:

'page_is_new' => ($lastRevision === 0) ? 1 : 0,
'page_is_redirect' => $rt !== NULL ? 1 : 0,

But it'd be nice not to break if people forget and use a genuine bool...
for that matter it'd be nice to be able to do that as a matter of course. ;)

- -- brion
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkit6J0ACgkQwRnhpk1wk45XUgCgy2BL/ggyfKmHMXbauzXGPq3Q
pwAAoI2tKNJKhuU7o90pbFG9w5S/IEfR
=NrDM
-----END PGP SIGNATURE-----

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Simetrical+wikilist at gmail

Aug 21, 2008, 3:24 PM

Post #2 of 8 (607 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

On Thu, Aug 21, 2008 at 6:13 PM, Brion Vibber <brion [at] wikimedia> wrote:
> Now, most of the time we explicitly use 1 and 0 when storing boolean
> values in MediaWiki, like:
>
> 'page_is_new' => ($lastRevision === 0) ? 1 : 0,
> 'page_is_redirect' => $rt !== NULL ? 1 : 0,
>
> But it'd be nice not to break if people forget and use a genuine bool...
> for that matter it'd be nice to be able to do that as a matter of course. ;)

Would it make sense for the database abstraction layer to know our
schema, so that it can automatically cast things to bool (or whatever
is necessary) before handing them down to the actual database? If we
wanted to get hardcore it could also do stuff like handle truncation
for limited-length fields, but that's scarier (e.g. if someone
manually alters their schema to lengthen a field, they'd have to tell
MediaWiki about it).

Of course, if the abstraction layer doesn't know about a field it
could fall back to current ignorant behavior.

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


brion at wikimedia

Aug 21, 2008, 4:23 PM

Post #3 of 8 (597 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aryeh Gregor wrote:
> Would it make sense for the database abstraction layer to know our
> schema, so that it can automatically cast things to bool (or whatever
> is necessary) before handing them down to the actual database? If we
> wanted to get hardcore it could also do stuff like handle truncation
> for limited-length fields, but that's scarier (e.g. if someone
> manually alters their schema to lengthen a field, they'd have to tell
> MediaWiki about it).

In my opinion that would be a bit horrifyingly scary. ;) Domain-specific
knowledge is better placed in domain-specific classes...

Proper behavior for overlong input can vary. Sometimes we want to reject
it outright, since it's invalid -- an overlong page title, username, or
group name wouldn't function correctly truncated.

Other times where truncation is appropriate might be edit comments,
where we might want a pretty ellipsis... and we might want to truncate
part of a comment _inside_ a quoted string, or something fancy like that
that's totally unsuitable for the DB layer. :D

Further, we should probably expect that the database may outright reject
long fields rather than silently truncating them -- like MySQL in strict
mode -- and should ensure that our behavior is sane.

- -- brion
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkit+N4ACgkQwRnhpk1wk46anQCg1j77KLeaFUm+gLR3eL/VuIkW
RXsAnRGAbYm3dOPL+exNkiXnGAqMd6f7
=PPbE
-----END PGP SIGNATURE-----

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Simetrical+wikilist at gmail

Aug 21, 2008, 4:45 PM

Post #4 of 8 (597 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

On Thu, Aug 21, 2008 at 7:23 PM, Brion Vibber <brion [at] wikimedia> wrote:
> In my opinion that would be a bit horrifyingly scary. ;) Domain-specific
> knowledge is better placed in domain-specific classes...

Well, having pgsql and all the other BOOL-supporting DBMSes forced to
use integer types just because devs are too lazy to make sure things
work properly on all databases is exactly the sort of thing the
abstraction layer is meant to prevent. Each database should be able
to properly use its own features. Introducing domain-specific
knowledge is kind of icky, but I can't think of any other way to do
it.

Of course this logic could be properly factored out of Database.php.
For instance, we could add another argument to the constructor (or add
a mutator, which could allow extensions to join the fun) that would
allow you to pass in an array of fieldname => type mappings. Then
whatever instantiates the Database object we use could pass that in
when it constructs the object.

> Proper behavior for overlong input can vary. Sometimes we want to reject
> it outright, since it's invalid -- an overlong page title, username, or
> group name wouldn't function correctly truncated.

I suggested truncation because that's what MySQL would normally do,
and so it's what we're probably relying on in some cases already.
More sensible behavior would be to throw an exception, rather than
having it silently sort of work on MySQL but fail elsewhere (including
MySQL in strict mode).

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


roan.kattouw at home

Aug 22, 2008, 1:45 AM

Post #5 of 8 (599 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

Aryeh Gregor schreef:
> On Thu, Aug 21, 2008 at 7:23 PM, Brion Vibber <brion [at] wikimedia> wrote:
>
>> In my opinion that would be a bit horrifyingly scary. ;) Domain-specific
>> knowledge is better placed in domain-specific classes...
>>
>
> Well, having pgsql and all the other BOOL-supporting DBMSes forced to
> use integer types just because devs are too lazy to make sure things
> work properly on all databases is exactly the sort of thing the
> abstraction layer is meant to prevent. Each database should be able
> to properly use its own features. Introducing domain-specific
> knowledge is kind of icky, but I can't think of any other way to do
> it.
>
> Of course this logic could be properly factored out of Database.php.
> For instance, we could add another argument to the constructor (or add
> a mutator, which could allow extensions to join the fun) that would
> allow you to pass in an array of fieldname => type mappings. Then
> whatever instantiates the Database object we use could pass that in
> when it constructs the object.
Don't all DBMSes interpret 1 as true and 0 as false? Then why don't we
just check if something is a bool and convert it to 1 or 0? Problem
solved (I think).

Roan Kattouw (Catrope)

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Simetrical+wikilist at gmail

Aug 22, 2008, 7:14 AM

Post #6 of 8 (588 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

On Fri, Aug 22, 2008 at 4:45 AM, Roan Kattouw <roan.kattouw [at] home> wrote:
> Don't all DBMSes interpret 1 as true and 0 as false? Then why don't we
> just check if something is a bool and convert it to 1 or 0? Problem
> solved (I think).

Hmm, yes. That's what Greg just committed, actually, on Brion's
advice. So does this mean pgsql can now use BOOLs without having to
worry about scary keeping track of types and stuff?

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


brion at wikimedia

Aug 22, 2008, 9:22 AM

Post #7 of 8 (587 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aryeh Gregor wrote:
> On Thu, Aug 21, 2008 at 7:23 PM, Brion Vibber <brion [at] wikimedia> wrote:
>> In my opinion that would be a bit horrifyingly scary. ;) Domain-specific
>> knowledge is better placed in domain-specific classes...
>
> Well, having pgsql and all the other BOOL-supporting DBMSes forced to
> use integer types just because devs are too lazy to make sure things
> work properly on all databases is exactly the sort of thing the
> abstraction layer is meant to prevent.

You *do* realize that "1" and "0" are valid values for PostgreSQL's
genuine BOOL type, right?

> Each database should be able
> to properly use its own features. Introducing domain-specific
> knowledge is kind of icky, but I can't think of any other way to do
> it.

We use type-specific wrappers for formatting timestamp values
appropriately, so clearly there is another way to do it. :)

>> Proper behavior for overlong input can vary. Sometimes we want to reject
>> it outright, since it's invalid -- an overlong page title, username, or
>> group name wouldn't function correctly truncated.
>
> I suggested truncation because that's what MySQL would normally do,
> and so it's what we're probably relying on in some cases already.

We must not rely on that, since MySQL may be configured in strict mode,
which will whine at truncation.

> More sensible behavior would be to throw an exception, rather than
> having it silently sort of work on MySQL but fail elsewhere (including
> MySQL in strict mode).

Right, that would throw an exception due to the database error.

- -- brion
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkiu59UACgkQwRnhpk1wk452pQCgl37pBBMscVZROntJYoJ8E82o
LJUAoMqETMHsegJ08ob3J51X2GhlgcZ9
=WPW1
-----END PGP SIGNATURE-----

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Simetrical+wikilist at gmail

Aug 22, 2008, 9:27 AM

Post #8 of 8 (593 views)
Permalink
Re: Bools/ints in database wrappers [In reply to]

On Fri, Aug 22, 2008 at 12:22 PM, Brion Vibber <brion [at] wikimedia> wrote:
> You *do* realize that "1" and "0" are valid values for PostgreSQL's
> genuine BOOL type, right?

Yeah, I'm not very awake right now, I think. So is pgsql going to
switch to a proper BOOL type now or what?

> We use type-specific wrappers for formatting timestamp values
> appropriately, so clearly there is another way to do it. :)

Touché. (Although people do periodically forget to use those.)
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Wikipedia wikitech RSS feed   Index | Next | Previous | View Threaded
 
 


Interested in having your list archived? Contact Gossamer Threads
 
  Web Applications & Managed Hosting Powered by Gossamer Threads Inc.