
Mailing List Archive: Wikipedia: Wikitech

How sort key works?

 

 



heldergeovane at gmail

Aug 10, 2009, 12:09 PM

Post #1 of 8
How sort key works?

Hi!

Some time ago I was categorizing some pages at pt.wikibooks and I found the
following curious situation:
I've used the code [[Category:Test|*]] in these pages:
http://pt.wikibooks.org/w/index.php?title=User:Heldergeovane/Biologia_celular/%C3%8Dndice&action=edit
http://pt.wikibooks.org/w/index.php?title=User:Heldergeovane/Bonsai_no_Brasil:_%C3%8Dndice&action=edit

So, we expect:
* Both pages should appear at "Category:Test" under "*", and
* "User:Heldergeovane/B*i*ologia_celular/Índice" should come before
"User:Heldergeovane/B*o*nsai_no_Brasil:_Índice" (since "i" comes before
"o").

However, what we found at
http://pt.wikibooks.org/w/index.php?title=Category:Test
is the reverse order.

1) What are the criteria for ordering pages when two pages have the
same sort key?
2) Is there anything wrong with the ordering of the two pages above?

Helder
_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


roan.kattouw at gmail

Aug 10, 2009, 12:18 PM

Post #2 of 8
Re: How sort key works?

2009/8/10 Helder Geovane Gomes de Lima <heldergeovane [at] gmail>:
> 1) What are the criteria for ordering pages when two pages have the
> same sort key?
I think they're ordered by page ID, but I'm not sure. For all
practical purposes, the ordering of pages with the same sortkey is
undefined.

Roan Kattouw (Catrope)



brion at wikimedia

Aug 10, 2009, 1:07 PM

Post #3 of 8
Re: How sort key works?

On 8/10/09 12:18 PM, Roan Kattouw wrote:
> 2009/8/10 Helder Geovane Gomes de Lima<heldergeovane [at] gmail>:
>> 1) What are the criteria for ordering pages when two pages have the
>> same sort key?
> I think they're ordered by page ID, but I'm not sure. For all
> practical purposes, the ordering of pages with the same sortkey is
> undefined.

To clarify, here's the information that's available when sorting a
category membership list:

* category name (fixed, since we're looking at a particular category)
* sort key (normally the page title, unless you overrode it)
* page ID (roughly corresponds to page creation time)

The page title can only be applied to the sorting if it's actually *in*
the sort key. If you've overridden it, then *only* the sort key you
provided will have any relevance in ordering; page ID will serve as a
'tiebreaker' but isn't really predictable.

-- brion
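
The ordering described above can be sketched in a few lines of Python. This is only an illustration of "sort key first, page ID as tiebreaker", not MediaWiki's actual code; the CategoryLink type, the category_order function and the page IDs are made up for the example.

```python
from typing import NamedTuple


class CategoryLink(NamedTuple):
    sortkey: str  # custom sort key, or the page title if none was given
    page_id: int  # hypothetical page IDs; lower roughly means created earlier
    title: str    # kept for display only; the sort never looks at it


def category_order(links):
    """Order category members by sort key, breaking ties by page ID."""
    return sorted(links, key=lambda link: (link.sortkey, link.page_id))


# Both pages use [[Category:Test|*]], so their sort keys are identical.
links = [
    CategoryLink("*", 2001, "User:Heldergeovane/Biologia_celular/Índice"),
    CategoryLink("*", 1001, "User:Heldergeovane/Bonsai_no_Brasil:_Índice"),
]

ordered_titles = [link.title for link in category_order(links)]
```

With these assumed IDs the "Bonsai" page is listed first even though "Biologia" comes first alphabetically, because the title takes no part in the comparison.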



Simetrical+wikilist at gmail

Aug 10, 2009, 1:50 PM

Post #4 of 8
Re: How sort key works?

On Mon, Aug 10, 2009 at 4:07 PM, Brion Vibber<brion [at] wikimedia> wrote:
> To clarify, here's the information that's available when sorting a
> category membership list:
>
> * category name (fixed, since we're looking at a particular category)
> * sort key (normally the page title, unless you overrode it)
> * page ID (roughly corresponds to page creation time)
>
> The page title can only be applied to the sorting if it's actually *in*
> the sort key. If you've overridden it, then *only* the sort key you
> provided will have any relevance in ordering; page ID will serve as a
> 'tiebreaker' but isn't really predictable.

We could break ties by appending the page title to custom sort keys,
if this is a problem.



heldergeovane at gmail

Aug 10, 2009, 3:10 PM

Post #5 of 8
Re: How sort key works?

2009/8/10 Aryeh Gregor <Simetrical+wikilist [at] gmail>:

> We could break ties by appending the page title to custom sort keys,
> if this is a problem.
>

I think it would be good! =)

(We actually have manually used "*{{PAGENAME}}" for a while... =S)

Helder


petr.kadlec at gmail

Aug 11, 2009, 1:35 AM

Post #6 of 8
Re: How sort key works?

2009/8/11 Helder Geovane Gomes de Lima <heldergeovane [at] gmail>:
> 2009/8/10 Aryeh Gregor
>> We could break ties by appending the page title to custom sort keys,
>> if this is a problem.
>
> I think it would be good! =)

I don’t (at least not in the way it is expressed). If you want to use
the page title as a tiebreaker, then add it as a new column to the
index (before the page_id), not (as I read the original sentence) by
appending the title to the sort key.

Otherwise, you’ll have to separate the sort key from the title with
some control character under U+0020 (to ensure correct ordering of
different-length sort keys – you need a separator which sorts before
any valid character), which would be messy.

But still, I don’t see the point in doing that. You don’t want a page
called “Aaa” to come after a page called “Abc” when you set their
sortkeys both to the same value? Don’t do that then. Set the sortkey
according to what you want.

(OBTW, a different thing is that category paging is probably buggy in
this tiebreaking aspect – even though the index is correctly defined
to be unique, the page_id column is not included in the &from= paging
parameter. But this bug will probably appear only in extreme cases,
like 300 articles with an identical sortkey.)

-- [[cs:User:Mormegil | Petr Kadlec]]
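
The separator point above (a character under U+0020 between sort key and title) can be checked with a small Python sketch. The function names and sample pages are hypothetical, and U+000A is just one possible choice of separator, assumed here never to occur in a sort key or a title. Python compares strings by code point, which stands in for a binary collation.

```python
SEPARATOR = "\n"  # U+000A, below U+0020; assumed absent from keys and titles


def naive_key(sortkey, title):
    """Append the title directly to the custom sort key."""
    return sortkey + title


def separated_key(sortkey, title):
    """Append the title after a separator that sorts before any valid character."""
    return sortkey + SEPARATOR + title


# (custom sort key, page title); sort key "A" should always precede "AB".
pages = [("A", "Zebra"), ("AB", "Apple")]

naive_order = sorted(pages, key=lambda p: naive_key(*p))
separated_order = sorted(pages, key=lambda p: separated_key(*p))

# Identical sort keys now fall back to the title.
tied = [("*", "Bonsai"), ("*", "Biologia")]
tied_order = sorted(tied, key=lambda p: separated_key(*p))
```

Without the separator, "ABApple" sorts before "AZebra", so the page with the longer sort key jumps ahead. With it, the shorter sort key stays first and ties are resolved by title.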



roan.kattouw at gmail

Aug 11, 2009, 2:46 AM

Post #7 of 8
Re: How sort key works?

2009/8/11 Petr Kadlec <petr.kadlec [at] gmail>:
> I don’t (at least not in the way it is expressed). If you want to use
> the page title as a tiebreaker, then add it as a new column to the
> index (before the page_id), not (as I read the original sentence) by
> appending the title to the sort key.
>
The page title is not in the categorylinks table, so we can't add it
to the index.

> Otherwise, you’ll have to separate the sort key from the title with
> some control character under U+0020 (to ensure correct ordering of
> different-length sort keys – you need a separator which sorts before
> any valid character), which would be messy.
>
> But still, I don’t see the point in doing that. You don’t want a page
> called “Aaa” to come after a page called “Abc” when you set their
> sortkeys both to the same value? Don’t do that then. Set the sortkey
> according to what you want.
>
Exactly. When using identical sortkeys, you shouldn't complain that
MediaWiki doesn't magically know in which order you want to sort them.
You can make it predictable by using a (more) unique sortkey.

Roan Kattouw (Catrope)



Simetrical+wikilist at gmail

Aug 11, 2009, 8:51 AM

Post #8 of 8
Re: How sort key works?

On Tue, Aug 11, 2009 at 4:35 AM, Petr Kadlec<petr.kadlec [at] gmail> wrote:
> (OBTW, a different thing is that category paging is probably buggy in
> this tiebreaking aspect – even though the index is correctly defined
> to be unique, the page_id column is not included in the &from= paging
> parameter. But this bug will probably appear only in extreme cases,
> like 300 articles with an identical sortkey.)

It will return slightly wrong results whenever two articles with the
same sort key happen to hit a page boundary. It's not a huge deal,
since sortkeys are usually fairly unique, but it shouldn't be hard to
fix if cl_from is already part of the sortkey index -- which it is, on
trunk, although I can't say for sure whether that matches the deployed
version.
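
The page-boundary problem can be reproduced with a simplified model in Python. This is a sketch under assumptions, not MediaWiki's query code: the rows, the page size of 2 and both function names are invented, and "continue from a sort key" is modelled as "all rows whose sort key is greater than or equal to the given value".

```python
# (sortkey, cl_from) rows in index order; three pages share the sort key "B".
rows = sorted([("A", 1), ("B", 2), ("B", 3), ("B", 4), ("C", 5)])
LIMIT = 2  # hypothetical page size


def page_by_sortkey(rows, from_key, limit=LIMIT):
    """Continue from a sort key alone: rows with sortkey >= from_key."""
    return [r for r in rows if r[0] >= from_key][:limit]


def page_by_sortkey_and_id(rows, from_pair, limit=LIMIT):
    """Continue from a (sortkey, cl_from) pair, which is unique per row."""
    return [r for r in rows if r >= from_pair][:limit]


first_page = rows[:LIMIT]  # ends in the middle of the "B" group
next_row = rows[LIMIT]     # first row that was not shown yet

second_by_key = page_by_sortkey(rows, next_row[0])
second_by_pair = page_by_sortkey_and_id(rows, next_row)
```

Continuing from the sort key alone shows ("B", 2) again at the top of the second page, while continuing from the full pair resumes exactly at ("B", 3).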

