
dan_the_man at telus
May 11, 2008, 2:25 AM
Post #3 of 16
(254 views)
Permalink
|
parse_str isn't actually broken at all. It does exactly what it was meant to do. Duplicate what PHP does with $_SERVER['QUERY_STRING'] to create the $_GET variable. The issue is that PHP has gpc quotes on those. And because of the definition of parse_str it does that to. So we're merely handling parse_str in the same way as we handle PHP's $_GET variable. By undoing that. parse_str is avoided, the code never touches it unless it is absolutely necessary. The only case where it may be necessary is if we're given a string for the query, and the developer asks us to merge that with the defaults. And so the code makes sure to avoid it unless that is requested. parse_str also can't be completely avoided if we want to break apart the string. Firstly we would likely mangle the query string by trying to do it ourselves and suddenly encountering complex data, which at some point would be attempted to be merged back into a string. Because we did not split apart the string in the same way that PHP did, and thus have been passing around data in a format that MediaWiki does not handle (invalid broken query string) the end result will be a bad string which is outputted into the page. Incidentally, I also remember reading something about php coded functions made to duplicate what parse_str does to be more inefficient than it. In other words, it's more of a performance issue than simply handling it the same way we handle $_GET. ~Daniel Friesen(Dantman) of: -The Gaiapedia (http://gaia.wikia.com) -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) -and Wiki-Tools.com (http://wiki-tools.com) Brion Vibber wrote: > A) there's no need for addslashes() on URL-encoded output > > B) if parse_str is that broken, please avoid it. > > -- brion vibber (brion @ wikimedia.org) > > On May 9, 2008, at 18:13, DanTMan <dan_the_man[at]telus.net> wrote: > > >> http://svn.wikimedia.org/viewvc/mediawiki/?view=rev&revision=34541 >> http://svn.wikimedia.org/viewvc/mediawiki/?view=rev&revision=34542 >> >> Since brion had to go I'll post here and detail the magic quotes >> stuff. >> >> Though, when it's reviewed and if it's ok, please don't un-revert it. >> There were 3 accidental PHP syntax errors committed, I fixed all >> three, >> but brion reverted for different reasons before I could commit the >> fix. >> So if marked as ok, let me to the commit since I have the syntax >> error-less version. But if you were wondering, Missing $ on 2374, >> and a >> comma instead of a : on 2388 and 2389. >> >> To make some improvements to the Title::getLocalURL and >> Title::getFullURL functions I did some changes: >> >> Firstly, the point. I'm trying to add in some functionality which we >> nicely give out in things like $db->select, but is missing and would >> be >> quite helpful in the two Title functions. >> The idea is to allow an array or object as input into the query >> parameter rather than restricting to a plain string. >> In that way a developer can call something like: >> $wgTitle->getLocalURL(array( >> 'foo' => $<Dangerous user supplied variable>, >> 'bar' => 'baz' >> )); >> Rather than: >> $wgTitle->getLocalURL("foo=" . addslashes($<Dangerous user supplied >> variable>) . "&bar=baz"); >> It cleans up the input and allows developers to finally do the same >> things done inside of the clean Database functions as well as the Xml >> functions. >> >> To do this I had to introduce a new function, similar to an existing >> one. The new function named wfBuildQuery (because it is a substitute >> for >> http_build_query, keeping the naming convention), which was to replace >> the old wfArrayToCgi, which has two issues itself, the arguments are >> in >> an unintuitive order (The reason it supports 2 arrays is so that you >> can >> merge parameters into an array of defaults, however in wfArrayToCgi >> you >> need to put the defaults first if you want to use them, so you can't >> properly document the parameters since they change in definition >> depending on how many params you pass), and the second issue is the >> name... The definition here is that we can pass it a array, object, or >> query string and it'll output a query string for us to use. But >> ArrayToCgi doesn't fit that definition anymore. >> >> Ok, so now how the train went... >> We want to support both arrays and objects in inputs now, as well as >> strings. >> However, the hitch. wfArrayToCgi is under it's definition supposed >> to be >> similar to wfBuildQuery but it supports having two arrays so you can >> merge a defaults array with the query array. >> >> First change... In wfArrayToCgi the order of the + operator ends up >> making it so that you need to put the default first... So, if you pass >> one parameter that is the query, if you pass two, then suddenly that >> query parameter becomes the defaults parameter and the second becomes >> the query parameter. That does not follow the convention of parameters >> which we follow. So I reversed the order in wfBuildQuery and gave it >> explicit $query and $default variables. >> >> But now... remember we're now supporting arrays and objects. So, to >> fix >> that, "$query = ((array)$defaults) + ((array)$query);" basically we're >> typesetting the objects into arrays so we're making sure that we >> combine >> two like types rather than is being possible to combine an array and >> an >> object and get a potential error about trying to combine unlike types. >> If you don't get what the dual braces are for, that's because if we >> didn't wrap the typesets then they would apply to the whole thing and >> not the individual variables, and we'd be back at square one trying to >> combine two potentially unlike objects. >> >> Ok, now onto our third method of input. Rather than forcing our code >> to >> test the type of something before it passes it onto the function, we >> should allow strings to passthrough rather than return fatal errors >> when >> they try to. >> So, we have a is_string down at the bottom, if it's a string it passes >> through, if it isn't then we build a query string from it. >> >> Ok, now we have one last valid input case which we have not handled. >> What if we have a defaults array, and the query is a string? We can't >> merge a strings the same way we can merge arrays. >> So, the solution is we're going to have to take the string and parse >> it >> into a query array so that it can be merged. >> Ok, we've got two ways of doing this now... >> A) Build some insane set of code to split up the query string which is >> bound to fail on anything past basic data. >> B) Use PHP's built in parse_str to turn the query string into a query >> array with the exact same output as PHP. Quite a bit more reliable >> than >> writing your own kludge by hand. >> >> So, ok... We're using php's parse_str now to break apart the strings, >> then we merge them back into an array. So we can turn the final result >> into a good query string. >> Two issues. Firstly, we don't want to split a string then merge it >> back >> together all the time do we? No, that's stupid and inefficient. So, >> the >> string parsing is contained inside the same if that the merge is >> inside >> of. We only split the query string into an array if we have a default >> query that we need to merge with. Just for sanity's sake we also make >> sure that the defaults param is an array to and also split it if >> it's a >> string. But if we only pass a single string and don't have a default, >> then that is all skipped by and because of the is_string down below >> the >> string simply passes in from the input, and out as the return value >> sucessfully allowing any valid input and outputting a string all the >> time, additionally that means it's safe so a query can be passed >> through >> twice without anything being broken. >> >> Ok, final issue, the one which caused brion to raise an eyebrow. There >> is an issue with parse_str, it's even documented on it's php doc page. >> "*Note:* The magic_quotes_gpc setting affects the output of this >> function, as parse_str() uses the same mechanism that PHP uses to >> populate the $_GET, $_POST, etc. variables." >> Eeew... parse_str has the same issues we have to deal with inside of >> $wgRequest... So, what do we do? >> We test for get_magic_quotes_gpc just like $wgRequest does. And then >> if >> it returns true we use the WebRequest's public fix_magic_quotes >> function >> to clean up the evilness that parse_str gave us. >> >> And voila, we have a function which does the best it can to do things >> with clean code, and is easy to use inside of code. >> >> Though on the note of that, before someone decides to unrevert. I >> think >> it would be a good idea to create a wfParseQuery function as a safe >> wrapper for parse_str in all it's evil gpc horror in case someone >> needs >> to break up a query string again. >> >> If you're wondering about the other changes: >> We have the pair of wfMemcKey and wfForeignMemcKey, however we don't >> have a pair for wfWikiID. >> So I created the wfForeignWikiID. It's not apparent now, but with the >> new ability to share specific tables people are going to find useful >> ways to make advantage of that, and it's going to be a good idea to >> have >> a way to get a key for the shared data or data of another wiki. >> Helpfully the wfForeignWikiID function takes input for you to specify >> the database and prefix, if you don't pass data to it then it will >> fallback to the shared database's data. Which 85% of the time, is what >> we want, something is shared and we want the shared ID to go along >> with >> it. And finally, if you have no shared database set, we fallback to >> the >> local database which will be what we want. >> >> The MemcKey functions were also duplicating the functionality of the >> WikiID functions. They had the same conditional expression inside of >> them that the WikiID functions had. I made them make use of those and >> cut them down to 3 liners. >> Technically, the Local functions could actually be made to use the >> Foreign functions. That would eliminate more code duplication. I could >> do that to. >> >> -- >> ~Daniel Friesen(Dantman) of: >> -The Gaiapedia (http://gaia.wikia.com) >> -Wikia ACG on Wikia.com (http://wikia.com/wiki/Wikia_ACG) >> -and Wiki-Tools.com (http://wiki-tools.com) >> >> _______________________________________________ >> Wikitech-l mailing list >> Wikitech-l[at]lists.wikimedia.org >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l >> > > _______________________________________________ > Wikitech-l mailing list > Wikitech-l[at]lists.wikimedia.org > https://lists.wikimedia.org/mailman/listinfo/wikitech-l > > _______________________________________________ Wikitech-l mailing list Wikitech-l[at]lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
|