
Mailing List Archive: Wikipedia: Wikitech

Re: Requests about log system

 

 



aschulz4587 at gmail

Sep 4, 2011, 12:11 PM

Post #1 of 21
Re: Requests about log system

It would be nice to have standard functions that support storing associative
arrays in log_params rather than fragile ordered lists. I ended up hacking
up a quick function in FlaggedRevs for this. Newer log types could make use
of this, and existing ones could too if they had some b/c code.


Bugzilla from niklas.laxstrom [at] gmail wrote:
>
> Hello, I'm currently partially rewriting the log system, because the
> current one doesn't support i18n well enough.
>
> I'm trying to avoid any radical changes like changes to the database
> schema. My changes mostly touch
> handling log entries and formatting them.
>
> So, if you know of any defects in the current log system, or have a wish
> for what the new one should do, or know someplace where these kinds of
> wishes exist, please tell me.
> I have scanned the list of bugs in bugzilla quickly, but it is a bit
> hard to find relevant bugs when there is no logging component.
>
> I'm aiming to solve at least these bugs:
> https://bugzilla.wikimedia.org/30737 User names should be moved into
> log messages.
> https://bugzilla.wikimedia.org/24156 Messages of log entries should
> support GENDER
> https://bugzilla.wikimedia.org/24620 Log entries are difficult to
> localize; rewrite logs system
> https://bugzilla.wikimedia.org/21716 Log entries assume sentence starts
> with username
>
> -Niklas
> --
> Niklas Laxström
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l [at] lists
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>

--
View this message in context: http://old.nabble.com/Requests-about-log-system-tp32396608p32397174.html
Sent from the Wikipedia Developers mailing list archive at Nabble.com.


_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


niklas.laxstrom at gmail

Sep 8, 2011, 2:51 AM

Post #2 of 21
Re: Requests about log system

I just committed many changes to the logging code. There is more to come,
but I think this is a suitable place to write in more detail about what is
going on. I'd also like to request code review and testing :)

Thus far I have committed new formatting code and small cleanups. Both
LogEventsList and RecentChanges are using the formatters now.
I haven't committed my last patch, which changes Title.php to generate
log entries using my new code. That will also fix page histories and
the IRC feed, which use a static version of the log action text that is
generated when the new log item is inserted into the database.

There are two major parts in the new logging system: LogEntry and LogFormatter.
LogEntry is a model around one log entry. It has multiple subclasses.
To construct a new log entry, you create a new ManualLogEntry
and fill in the necessary info, after which you can call insert() and
publish(). If you are loading entries from the database, you can simply
call DatabaseLogEntry::newFromRow( $row ). It supports rows from both the
logging and recentchanges tables. Usually you want to go directly to
LogFormatter and call newFromEntry or the handy newFromRow shortcut.
LogFormatter provides a getActionText() method, which formats the log
action for you, taking those pesky LogPage::DELETED_FOO restrictions
into account. The action text includes the username, to support
different word orders. There is also getPlainActionText(), which
formats the log entry so that it is suitable for page histories and
IRC feeds.

LogEntries can have parameters. Parameters should be an associative
array. When saved to the database, it is encoded as JSON. If you can pass
parameters directly to the message which is used to format the action
text, you can name the keys like "#:foobar", where # is a number that
should start from 4, because parameters 1, 2 and 3 are already
reserved and should be common to all log entries. Those are the user name
link, the username for gender and the target page link, respectively.
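
To make the key convention concrete, here is a minimal sketch of how keys in
the "#:foobar" format could be mapped to numbered message parameters. The
helper name extractMessageParams() and the sample keys are made up for
illustration; this is not the code used by LogFormatter itself.

```php
<?php
// Illustrative only: extractMessageParams() is a hypothetical helper,
// not a function that exists in MediaWiki.

/**
 * Pick out the parameters whose keys look like "<number>:<name>" and
 * return them indexed by that number, ready to be passed to a message.
 */
function extractMessageParams( array $params ) {
	$messageParams = array();
	foreach ( $params as $key => $value ) {
		// A key such as "4:target" becomes message parameter 4.
		if ( preg_match( '/^(\d+):/', (string)$key, $m ) ) {
			$messageParams[(int)$m[1]] = $value;
		}
	}
	ksort( $messageParams );
	return $messageParams;
}

// Made-up sample parameters for a log entry.
$params = array(
	'4:target' => 'New page title',
	'5:noredirect' => '1',
	'internalnote' => 'not passed to the message',
);

print_r( extractMessageParams( $params ) );
```

Keys without a numeric prefix are simply skipped here, which matches the idea
that only a formatter subclass would know what to do with them.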

If the key is not in #:foobar format, it is not automatically
available for the action text message. By subclassing LogFormatter you
can do whatever you want with the parameters. Be aware of the
$this->plaintext value though; it indicates whether we can use any
markup or just plaintext. The MoveLogFormatter is registered as shown
below. I've added a type/* shortcut to avoid some repetition. If
the value is an existing class, it will be used. Otherwise the old
behavior of calling the function is used through LegacyLogFormatter.

$wgLogActionsHandlers = array(
    // move, move_redir
    'move/*' => 'MoveLogFormatter',
);
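
As a rough illustration of the lookup described above, the sketch below
resolves a handler for a given type and subtype. It assumes that an exact
"type/subtype" key is checked before the "type/*" wildcard; that order is an
assumption, not something stated here. resolveLogHandler() and the empty
MoveLogFormatter stand-in class are hypothetical and exist only so that the
example runs on its own.

```php
<?php
// Sketch only: resolveLogHandler() is a made-up name, and the returned
// 'LegacyLogFormatter' string merely stands for the legacy code path.

function resolveLogHandler( array $handlers, $type, $subtype ) {
	$fullType = "$type/$subtype";
	$wildcard = "$type/*";

	// Assumed order: exact match first, then the type/* shortcut.
	if ( isset( $handlers[$fullType] ) ) {
		$handler = $handlers[$fullType];
	} elseif ( isset( $handlers[$wildcard] ) ) {
		$handler = $handlers[$wildcard];
	} else {
		$handler = null;
	}

	// An existing class name is used as the formatter; anything else
	// is treated as an old-style callback and takes the legacy path.
	if ( is_string( $handler ) && class_exists( $handler ) ) {
		return $handler;
	}
	return 'LegacyLogFormatter';
}

// Stand-in class so that class_exists() succeeds in this standalone sketch.
class MoveLogFormatter {
}

$handlers = array(
	'move/*' => 'MoveLogFormatter',
);

echo resolveLogHandler( $handlers, 'move', 'move_redir' ), "\n";
echo resolveLogHandler( $handlers, 'delete', 'delete' ), "\n";
```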

So what does this all bring to us?
* Flexible word order
* The most complex piece of log formatting is done only once, and it
also takes care of hiding any restricted items
* Gender is supported
* Ability to store parameters as an associative array
* New message naming conventions to reduce boilerplate
* Anonymous users can make log entries that are actually shown
* Global logs should be easier to implement now, but they are not
directly supported by the current code.
* Two simple methods: getActionText and getPlainActionText, instead of
the mess of making log entries all over the place
* All code for one log type is now in a single place, instead of lots of
switch $type in different places.

So once more, please test, review and comment. I still have lots to
do; all the log types need to be converted one by one to the new
system to take full benefit of the improved i18n. The easiest way to find
the commits is probably this page:
http://www.mediawiki.org/wiki/Special:Code/MediaWiki/author/nikerabbit

-Niklas

--
Niklas Laxström



aschulz4587 at gmail

Sep 8, 2011, 3:18 AM

Post #3 of 21
Re: Requests about log system

Yay for log_params. I was thinking JSON would be appropriate here, so I'm
glad to see that.

I'll toss these revs onto my review queue.





maxsem.wiki at gmail

Sep 8, 2011, 3:36 AM

Post #4 of 21
Re: Requests about log system

On Thu, Sep 8, 2011 at 2:18 PM, Aaron Schulz <aschulz4587 [at] gmail> wrote:

>
> Yay for log_params. I was thinking JSON would be appropriate here, so I'm
> glad to see that.
>
>
Even though data in those fields is small enough, can
serialize()/unserialize() be used instead? It's faster and doesn't require
the mess of ServicesJSON to work correctly.

--
Best regards,
Max Semenik ([[User:MaxSem]])


niklas.laxstrom at gmail

Sep 8, 2011, 4:25 AM

Post #5 of 21
Re: Requests about log system

On 8 September 2011 13:36, Max Semenik <maxsem.wiki [at] gmail> wrote:
> On Thu, Sep 8, 2011 at 2:18 PM, Aaron Schulz <aschulz4587 [at] gmail> wrote:
>
>>
>> Yay for log_params. I was thinking JSON would be appropriate here, so I'm
>> glad to see that.
>>
>>
> Even though data in those fields is small enough, can
> serialize()/unserialize() be used instead? It's faster and doesn't require
> the mess of ServicesJSON to work correctly.

Do those cause actual problems or is it just a matter of preference? In
my opinion JSON is much better for anyone who wants to dig through the logs
without using PHP. Also, is (un)serialize guaranteed to be stable
across PHP versions?
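
For comparison, this is roughly what one parameter array looks like in each
format, using PHP's built-in json_encode() and serialize(). The parameter
name and value are made-up examples.

```php
<?php
// Illustration with a made-up parameter; not taken from MediaWiki.
$params = array( '4:target' => 'Foo' );

// JSON: readable by practically any client.
echo json_encode( $params ), "\n";  // {"4:target":"Foo"}

// PHP's native format: type and length prefixes, PHP-specific.
echo serialize( $params ), "\n";    // a:1:{s:8:"4:target";s:3:"Foo";}
```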

-Niklas

--
Niklas Laxström



lists at nadir-seen-fire

Sep 8, 2011, 7:57 AM

Post #6 of 21
Re: Requests about log system

On 11-09-08 04:25 AM, Niklas Laxström wrote:
> On 8 September 2011 13:36, Max Semenik <maxsem.wiki [at] gmail> wrote:
>> On Thu, Sep 8, 2011 at 2:18 PM, Aaron Schulz <aschulz4587 [at] gmail> wrote:
>>
>>> Yay for log_params. I was thinking JSON would be appropriate here, so I'm
>>> glad to see that.
>>>
>>>
>> Even though data in those fields is small enough, can
>> serialize()/unserialize() be used instead? It's faster and doesn't require
>> the mess of ServicesJSON to work correctly.
> Do those cause actual problems or is it just a matter of preference? In
> my opinion JSON is much better for anyone who wants to dig through the logs
> without using PHP. Also, is (un)serialize guaranteed to be stable
> across PHP versions?
>
> -Niklas
We already use serialize in HistoryBlob/Revision, the job queue,
caching, file metadata, the localization cache, ...

So if you add any new fields to the db you should really stick to
(un)serialize.
We're already using serialize everywhere and we even use binary storage
which is troublesome for anyone trying to stare at the database with
most phpmyadmin installs. People being mildly inconvenienced when
reading the database raw is the least of our issues.
If you want to argue for the irrelevant minority that would be slightly
inconvenienced reading the database raw, I'll argue for the irrelevant
minority that would be slightly inconvenienced trying to do db queries
externally to mw code and having to parse json, which isn't as simple as
(un)serialize.
;) I'll also wager that HipHop makes the gap in speed between
(un)serialize and json wider.

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]




niklas.laxstrom at gmail

Sep 8, 2011, 10:00 AM

Post #7 of 21
Re: Requests about log system

On 8 September 2011 17:57, Daniel Friesen <lists [at] nadir-seen-fire> wrote:
> On 11-09-08 04:25 AM, Niklas Laxström wrote:
>> On 8 September 2011 13:36, Max Semenik <maxsem.wiki [at] gmail> wrote:
>>> On Thu, Sep 8, 2011 at 2:18 PM, Aaron Schulz <aschulz4587 [at] gmail> wrote:
>>>
>>>> Yay for log_params. I was thinking JSON would be appropriate here, so I'm
>>>> glad to see that.
>>>>
>>>>
>>> Even though data in those fields is small enough, can
>>> serialize()/unserialize() be used instead? It's faster and doesn't require
>>> the mess of ServicesJSON to work correctly.
>> Do those cause actual problems or is it just a matter of preference? In
>> my opinion JSON is much better for anyone who wants to dig through the logs
>> without using PHP. Also, is (un)serialize guaranteed to be stable
>> across PHP versions?
>>
>>   -Niklas
> We already use serialize in HistoryBlob/Revision, the job queue,
> caching, file metadata, the localization cache, ...
>
> So if you add any new fields to the db you should really stick to
> (un)serialize.
> We're already using serialize everywhere and we even use binary storage
> which is troublesome for anyone trying to stare at the database with
> most phpmyadmin installs. People being mildly inconvenienced when
> reading the database raw is the least of our issues.
> If you want to argue for the irrelevant minority that would be slightly
> inconvenienced reading the database raw, I'll argue for the irrelevant
> minority that would be slightly inconvenienced trying to do db queries
> externally to mw code and having to parse json, which isn't as simple as
> (un)serialize.
> ;) I'll also wager that HipHop makes the gap in speed between
> (un)serialize and json wider.

Very well, r96585.


--
Niklas Laxström



niklas.laxstrom at gmail

Sep 9, 2011, 1:15 AM

Post #8 of 21
Re: Requests about log system

A big thank you to everyone who has already looked at and tested the code,
especially Aaron. I have fixed the few issues that have come up.

Have we reached an agreement to serialize the parameters instead of
formatting them as JSON? I am going to commit code that actually
creates log entries using this new system, so I'd rather be sure we
are comfortable with what we have chosen, to avoid an unnecessary mix of
different formats in the database.

-Niklas




--
Niklas Laxström



aschulz4587 at gmail

Sep 9, 2011, 2:28 PM

Post #9 of 21
Re: Requests about log system

I'd still prefer JSON for offline/non-PHP use. I'm not sure it's a huge deal
though.





agarrett at wikimedia

Sep 9, 2011, 6:22 PM

Post #10 of 21
Re: Requests about log system

On Thu, Sep 8, 2011 at 8:36 PM, Max Semenik <maxsem.wiki [at] gmail> wrote:
> Even though data in those fields is small enough, can
> serialize()/unserialize() be used instead? It's faster and doesn't require
> the mess of ServicesJSON to work correctly.

I'd prefer JSON. I don't care about the speed, it's not a critical
code path, and JSON is stable, well-defined and can be read by any
client, whereas serialize is some scary PHP format that may or may not
change without notice.

--
Andrew Garrett
Wikimedia Foundation
agarrett [at] wikimedia



lists at nadir-seen-fire

Sep 9, 2011, 10:00 PM

Post #11 of 21
Re: Requests about log system

On 11-09-09 06:22 PM, Andrew Garrett wrote:
> On Thu, Sep 8, 2011 at 8:36 PM, Max Semenik <maxsem.wiki [at] gmail> wrote:
>> Even though data in those fields is small enough, can
>> serialize()/unserialize() be used instead? It's faster and doesn't require
>> the mess of ServicesJSON to work correctly.
> I'd prefer JSON. I don't care about the speed, it's not a critical
> code path, and JSON is stable, well-defined and can be read by any
> client, whereas serialize is some scary PHP format that may or may not
> change without notice.
- We already (un)serialize data in and out of the database.
- (un)serialize can't change; if it does we already have problems.
- These are for database storage; we have no reason to put data into a
private database in a format that expects people to read the data back from
other clients.
- json in php requires a mess of code and potentially 3rd-party
libraries because:
-- the built-in json_{en,de}code library functions may not be installed
-- the built-in json library in some cases actually has a bug that makes
it encode/decode json incorrectly

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]




innocentkiller at gmail

Sep 9, 2011, 10:20 PM

Post #12 of 21
Re: Requests about log system

On Sat, Sep 10, 2011 at 1:00 AM, Daniel Friesen
<lists [at] nadir-seen-fire> wrote:
> - json in php requires a mess of code and potentially 3rd-party
> libraries because:
> -- the built-in json_{en,de}code library functions may not be installed
> -- the built-in json library in some cases actually has a bug that makes
> it encode/decode json incorrectly
>

Well, that's why we have the FormatJson wrapper. Could use some tests
to make sure that the output from Services_Json and json_{en,de}code
are identical. I'm whipping up some trivial ones now.

-Chad



mail at tgries

Sep 9, 2011, 11:30 PM

Post #13 of 21
Re: Requests about log system

On 10.09.2011 07:20, Chad wrote:
>
>> - json in php requires a mess of code and potentially 3rd-party
>> libraries because:
>> -- the built-in json_{en,de}code library functions may not be installed
>> -- the built-in json library in some cases actually has a bug that makes
>> it encode/decode json incorrectly
>>
> Well, that's why we have the FormatJson wrapper. Could use some tests
> to make sure that the output from Services_Json and json_{en,de}code
> are identical. I'm whipping up some trivial ones now.
>
> -Chad
>
This is a good idea. Filed as
https://bugzilla.wikimedia.org/show_bug.cgi?id=30841
"Development of unit tests for FormatJson.php versus PHP built-in
json_encode() json_decode() "


innocentkiller at gmail

Sep 9, 2011, 11:34 PM

Post #14 of 21
Re: Requests about log system

On Sat, Sep 10, 2011 at 2:30 AM, Thomas Gries <mail [at] tgries> wrote:
> On 10.09.2011 07:20, Chad wrote:
>>
>>> - json in php requires a mess of code and potentially 3rd-party
>>> libraries because:
>>> -- the built-in json_{en,de}code library functions may not be installed
>>> -- the built-in json library in some cases actually has a bug that makes
>>> it encode/decode json incorrectly
>>>
>> Well, that's why we have the FormatJson wrapper. Could use some tests
>> to make sure that the output from Services_Json and json_{en,de}code
>> are identical. I'm whipping up some trivial ones now.
>>
>> -Chad
>>
> This is a good idea. Filed as
> https://bugzilla.wikimedia.org/show_bug.cgi?id=30841
> "Development of unit tests for FormatJson.php versus PHP built-in
> json_encode() json_decode() "
>

I said I was already whipping them up, and indeed I had committed
them before you e-mailed. Resolving FIXED ;-)

-Chad



lists at nadir-seen-fire

Sep 10, 2011, 12:13 AM

Post #15 of 21
Re: Requests about log system

On 11-09-09 10:00 PM, Daniel Friesen wrote:
> On 11-09-09 06:22 PM, Andrew Garrett wrote:
>> On Thu, Sep 8, 2011 at 8:36 PM, Max Semenik <maxsem.wiki [at] gmail> wrote:
>>> Even though data in those fields is small enough, can
>>> serialize()/unserialize() be used instead? It's faster and doesn't require
>>> the mess of ServicesJSON to work correctly.
>> I'd prefer JSON. I don't care about the speed, it's not a critical
>> code path, and JSON is stable, well-defined and can be read by any
>> client, whereas serialize is some scary PHP format that may or may not
>> change without notice.
> - We already (un)serialize data in and out of the database.
> - (un)serialize can't change; if it does we already have problems.
> - These are for database storage; we have no reason to put data into a
> private database in a format that expects people to read the data back from
> other clients.
> - json in php requires a mess of code and potentially 3rd-party
> libraries because:
> -- the built-in json_{en,de}code library functions may not be installed
> -- the built-in json library in some cases actually has a bug that makes
> it encode/decode json incorrectly
Here's another kick:
- Using JSON in php, when you decode what you encoded you don't always
get the same thing back (serialize you of course do)

> var_dump(FormatJson::encode(array(1=>1,2=>2)));
string(13) "{"1":1,"2":2}"
var_dump(FormatJson::decode('{"1":1,"2":2}'));
object(stdClass)#20 (2) {
  ["1"]=>
  int(1)
  ["2"]=>
  int(2)
}

array in, object out.

There is a FormatJson argument to return assoc arrays instead of
objects, but this means rather than always getting the right type of
data back you have to specifically note when you want assoc arrays and
when you want objects.
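
The same behaviour can be reproduced with PHP's built-in functions. A small
sketch, in which plain json_encode()/json_decode() stand in for the
FormatJson wrapper (an assumption made so that it runs without MediaWiki):

```php
<?php
$original = array( 1 => 1, 2 => 2 );

// Integer keys that do not start at 0 are encoded as a JSON object.
$json = json_encode( $original );

// A default decode yields a stdClass object; passing true as the second
// argument asks for an associative array instead.
$asObject = json_decode( $json );
$asArray = json_decode( $json, true );

// PHP's native format restores the array type without any extra flag.
$native = unserialize( serialize( $original ) );

var_dump( $asObject instanceof stdClass );
var_dump( $asArray === $original );
var_dump( $native === $original );
```

So the array does come back intact from JSON, but only if the caller
remembers to ask for an associative array on decode.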

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]




innocentkiller at gmail

Sep 10, 2011, 12:27 AM

Post #16 of 21
Re: Requests about log system

On Sat, Sep 10, 2011 at 3:13 AM, Daniel Friesen
<lists [at] nadir-seen-fire> wrote:
> On 11-09-09 10:00 PM, Daniel Friesen wrote:
>> On 11-09-09 06:22 PM, Andrew Garrett wrote:
>>> On Thu, Sep 8, 2011 at 8:36 PM, Max Semenik <maxsem.wiki [at] gmail> wrote:
>>>> Even though data in those fields is small enough, can
>>>> serialize()/unserialize() be used instead? It's faster and doesn't require
>>>> the mess of ServicesJSON to work correctly.
>>> I'd prefer JSON. I don't care about the speed, it's not a critical
>>> code path, and JSON is stable, well-defined and can be read by any
>>> client, whereas serialize is some scary PHP format that may or may not
>>> change without notice.
>> - We already (un)serialize data in and out of the database.
>> - (un)serialize can't change; if it does we already have problems.
>> - These are for database storage; we have no reason to put data into a
>> private database in a format that expects people to read the data back from
>> other clients.
>> - json in php requires a mess of code and potentially 3rd-party
>> libraries because:
>> -- the built-in json_{en,de}code library functions may not be installed
>> -- the built-in json library in some cases actually has a bug that makes
>> it encode/decode json incorrectly
> Here's another kick:
> - Using JSON in php, when you decode what you encoded you don't always
> get the same thing back (serialize you of course do)
>
>> var_dump(FormatJson::encode(array(1=>1,2=>2)));
> string(13) "{"1":1,"2":2}"
> var_dump(FormatJson::decode('{"1":1,"2":2}'));
> object(stdClass)#20 (2) {
>  ["1"]=>
>  int(1)
>  ["2"]=>
>  int(2)
> }
>
> array in, object out.
>
> There is a FormatJson argument to return assoc arrays instead of
> objects, but this means rather than always getting the right type of
> data back you have to specifically note when you want assoc arrays and
> when you want objects.
>

Only mildly annoying, and not enough of a reason to stop using JSON
imo. That's the behavior of json_decode() anyway, so we're not diverging
from upstream.

-Chad



roan.kattouw at gmail

Sep 10, 2011, 2:23 AM

Post #17 of 21
Re: Requests about log system

On Sat, Sep 10, 2011 at 9:13 AM, Daniel Friesen
<lists [at] nadir-seen-fire> wrote:
> Here's another kick:
> - Using JSON in php, when you decode what you encoded you don't always
> get the same thing back (serialize you of course do)
>
Even serialize() doesn't round-trip cleanly in certain cases:
https://bugs.php.net/bug.php?id=55495

Roan Kattouw (Catrope)



greg at endpoint

Sep 10, 2011, 8:00 AM

Post #18 of 21 (961 views)
Re: Requests about log system

> - Using JSON in php, when you decode what you encoded you don't always
> get the same thing back (serialize you of course do)

How is PHP's YAML support? I prefer YAML to JSON, as it's even easier
to read and parse. May not be best here, but could be an option if
the interface is less broken than the JSON one. :)

--
Greg Sabino Mullane greg [at] endpoint
End Point Corporation
PGP Key: 0x14964AC8


niklas.laxstrom at gmail

Sep 10, 2011, 8:31 AM

Post #19 of 21 (961 views)
Re: Requests about log system

On 10 September 2011 18:00, Greg Sabino Mullane <greg [at] endpoint> wrote:
>> - Using JSON in php, when you decode what you encoded you don't always
>> get the same thing back (serialize you of course do)
>
> How is PHP's YAML support? I prefer YAML to JSON, as it's even easier
> to read and parse. May not be best here, but could be an option if
> the interface is less broken than the JSON one. :)

It's awful. It needs external libraries or code, all of which are broken
in one way or another. Besides, JSON is a subset of YAML, so by
definition YAML cannot be easier to parse than JSON.

-Niklas

--
Niklas Laxström



niklas.laxstrom at gmail

Sep 10, 2011, 11:28 AM

Post #20 of 21 (960 views)
Re: Requests about log system

On 10 September 2011 12:23, Roan Kattouw <roan.kattouw [at] gmail> wrote:
> On Sat, Sep 10, 2011 at 9:13 AM, Daniel Friesen
> <lists [at] nadir-seen-fire> wrote:
>> Here's another kick:
>> - Using JSON in php, when you decode what you encoded you don't always
>> get the same thing back (serialize you of course do)
>>
> Even serialize() doesn't round-trip cleanly in certain cases:
> https://bugs.php.net/bug.php?id=55495

Another point came up in IRC discussions: "JSON converts UTF-8
characters to unicode escape sequences. serialize() does not."

This effectively negates any size advantages JSON might have. I
noticed JSON is used extensively in resource loader.
Roan: did you consider this point when choosing JSON format for
storing message resources?
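
A rough sketch of the size difference for one short, non-ASCII-only string.
The sample text and variable names are only for illustration, and the
JSON_UNESCAPED_UNICODE flag assumes PHP 5.4 or later.

```php
<?php
// "äöü" written as explicit UTF-8 bytes so the file encoding does not matter.
$text = "\xC3\xA4\xC3\xB6\xC3\xBC";

// By default json_encode() turns each non-ASCII character into a \uXXXX escape.
$escaped = json_encode( $text );

// serialize() stores the raw bytes behind a length prefix.
$serialized = serialize( $text );

// Assumption: PHP 5.4 or later, where this flag keeps the UTF-8 bytes as they are.
$unescaped = json_encode( $text, JSON_UNESCAPED_UNICODE );

printf( "json_encode (escaped):   %d bytes\n", strlen( $escaped ) );
printf( "serialize:               %d bytes\n", strlen( $serialized ) );
printf( "json_encode (unescaped): %d bytes\n", strlen( $unescaped ) );
```

The escaped form still decodes back to the same string; only the stored size
differs.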

-Niklas

--
Niklas Laxström



roan.kattouw at gmail

Sep 11, 2011, 7:38 AM

Post #21 of 21 (962 views)
Re: Requests about log system

On Sat, Sep 10, 2011 at 8:28 PM, Niklas Laxström
<niklas.laxstrom [at] gmail> wrote:
> Another point came up in IRC discussions: "JSON converts UTF-8
> characters to unicode escape sequences. serialize() does not."
>
> This effectively negates any size advantages JSON might have. I
> noticed JSON is used extensively in resource loader.
> Roan: did you consider this point when choosing JSON format for
> storing message resources?
>
No, I didn't know this and didn't think about it. The reason I chose
JSON is because the primary use case of the message blobs was to be
sent to JavaScript and used for i18n there, so it was convenient to
keep it as JSON the whole time and not have to do any decoding or
encoding for the primary use case. The only case in which PHP needs to
read the JSON data is when a single message is changed in the
MediaWiki: namespace, in which case the blob is decoded, the message
contents are swapped out, and the blob is re-encoded.
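
Roughly, the decode/swap/re-encode step looks like the sketch below.
replaceMessageInBlob() is a hypothetical helper written for this example,
not the actual ResourceLoader code, and the message keys are made up.

```php
<?php
// Hypothetical helper for illustration; not the actual ResourceLoader code.
function replaceMessageInBlob( $blob, $key, $newText ) {
	// Decode the stored blob into an associative array of key => message text.
	$messages = json_decode( $blob, true );

	// Leave the blob alone if it is not valid JSON or does not contain the key.
	if ( !is_array( $messages ) || !array_key_exists( $key, $messages ) ) {
		return $blob;
	}

	// Swap out the one message and re-encode the whole blob.
	$messages[$key] = $newText;
	return json_encode( $messages );
}

$blob = '{"hello":"Hello","bye":"Goodbye"}';
echo replaceMessageInBlob( $blob, 'hello', 'Hi' ), "\n";
// prints {"hello":"Hi","bye":"Goodbye"}
```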

Roan Kattouw (Catrope)

