
Mailing List Archive: Wikipedia: Wikitech

Re: [Foundation-l] Question to post...

 

 



beesley at gmail

Aug 13, 2009, 5:28 PM

Post #1 of 11 (1018 views)
Re: [Foundation-l] Question to post...

On Fri, Aug 14, 2009 at 5:23 AM, Gregory Maxwell<gmaxwell [at] gmail> wrote:
> On Thu, Aug 13, 2009 at 2:56 PM, Cox, Serita<Serita.Cox [at] bridgespan> wrote:
>> Google's new search engine, Caffeine, is supposedly kicking Wikipedia
>> entries further down results page. Thoughts? Comments?
>> http://software.silicon.com/applications/0,39024653,39484015,00.htm
> [from my comments in #wikimedia-tech the other day]
> "So— I tried 20 random words, and the WP result was lower in four of
> them, the same in the rest."
> "No pattern really...  We still have the problem with "article at funny name;
> redirect from common name; common name search on google gives squat",
> which I consider to be much more major."

A simple solution to this is using the canonical tags which all major
search engines started supporting earlier this year.

<http://www.mattcutts.com/blog/canonical-link-tag/>
Wikia's GPL code to add this to MediaWiki is available here:
<https://wikia-code.com/wikia/trunk/extensions/wikia/CanonicalHref/CanonicalHref.php>
More info on it in Nick's blog post at
<http://www.techyouruniverse.com/wikia/google-canonical-href-with-mediawiki>

Angela

_______________________________________________
Wikitech-l mailing list
Wikitech-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


brion at wikimedia

Aug 13, 2009, 6:13 PM

Post #2 of 11 (958 views)
Re: [Foundation-l] Question to post... [In reply to]

On 8/13/09 5:28 PM, Angela wrote:
> On Fri, Aug 14, 2009 at 5:23 AM, Gregory Maxwell<gmaxwell [at] gmail> wrote:
>> "So— I tried 20 random words, and the WP result was lower in four of
>> them, the same in the rest."
>> "No pattern really... We still have the problem with "article at funny name;
>> redirect from common name; common name search on google gives squat",
>> which I consider to be much more major."
>
> A simple solution to this is using the canonical tags which all major
> search engines started supporting earlier this year.

That's been deployed for a while, eg:

<link rel="canonical" href="/wiki/Foobar" />
at http://en.wikipedia.org/wiki/Foo

-- brion
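
For anyone who wants to check a page for this tag, here is a minimal Python sketch (not MediaWiki code) that pulls the rel="canonical" link out of an HTML document and resolves it against the page URL. The names CanonicalLinkFinder and canonical_url are made up for illustration, and the sample HTML only mirrors the example above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalLinkFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.href = attr["href"]


def canonical_url(page_url, html):
    """Return the absolute canonical URL declared in html, or None."""
    finder = CanonicalLinkFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    # The href may be relative, as in the example above
    return urljoin(page_url, finder.href)


sample = '<html><head><link rel="canonical" href="/wiki/Foobar" /></head></html>'
print(canonical_url("http://en.wikipedia.org/wiki/Foo", sample))
```

With the sample above this should resolve the relative href to http://en.wikipedia.org/wiki/Foobar.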



questpc at rambler

Aug 14, 2009, 4:21 AM

Post #3 of 11 (960 views)
Re: [Foundation-l] Question to post... [In reply to]

* Brion Vibber <brion [at] wikimedia> [Thu, 13 Aug 2009 18:13:38 -0700]:
> That's been deployed for a while, eg:
>
> <link rel="canonical" href="/wiki/Foobar" />
> at http://en.wikipedia.org/wiki/Foo
>
I haven't found such code in MediaWiki 54916 snapshot from SVN
(currently seems to be running at WMF). Am I missing the code (I've
looked into monobook and grepped for "canonical" through the subtree),
or does it use some kind of extension? The most logical place for it is
the monobook skin source code.
Dmitriy



petr.kadlec at gmail

Aug 14, 2009, 4:28 AM

Post #4 of 11 (962 views)
Re: [Foundation-l] Question to post... [In reply to]

2009/8/14 Dmitriy Sintsov <questpc [at] rambler>:
> * Brion Vibber <brion [at] wikimedia> [Thu, 13 Aug 2009 18:13:38 -0700]:
>> That's been deployed for a while, eg:
>>
>> <link rel="canonical" href="/wiki/Foobar" />
>> at http://en.wikipedia.org/wiki/Foo
>>
> I haven't found such code in MediaWiki 54916 snapshot from SVN

You were not looking closely enough. See
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Article.php?revision=54916&view=markup
(function showRedirectedFromHeader())

> The most logical place for it is
> the monobook skin source code.

Nope, there is no reason to limit the functionality to one skin.

-- [[cs:User:Mormegil | Petr Kadlec]]
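
To illustrate why this belongs in the article code rather than in a skin, here is a rough Python sketch of the idea: the tag is built once, whenever a redirect was followed, and any skin can then output it. This is not the PHP from Article.php; the function name canonical_link_for_redirect, the "/wiki/" article path, and the space-to-underscore title mapping are all assumptions made for the example.

```python
from html import escape
from urllib.parse import quote


def canonical_link_for_redirect(requested_title, target_title, article_path="/wiki/"):
    """Return a <link rel="canonical"> tag if a redirect was followed, else None."""
    if requested_title == target_title:
        return None  # the page was requested under its own name
    # Assumption: titles become URLs by replacing spaces with underscores
    href = article_path + quote(target_title.replace(" ", "_"))
    return '<link rel="canonical" href="%s" />' % escape(href, quote=True)


# "Foo" redirects to "Foobar", as in the example quoted above
print(canonical_link_for_redirect("Foo", "Foobar"))
# A direct view gets no canonical link in this sketch
print(canonical_link_for_redirect("Foobar", "Foobar"))
```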



questpc at rambler

Aug 25, 2009, 1:24 AM

Post #5 of 11 (854 views)
Re: [Foundation-l] Question to post... [In reply to]

* Petr Kadlec <petr.kadlec [at] gmail> [Fri, 14 Aug 2009 13:28:32 +0200]:
> 2009/8/14 Dmitriy Sintsov <questpc [at] rambler>:
> > * Brion Vibber <brion [at] wikimedia> [Thu, 13 Aug 2009 18:13:38
> -0700]:
> >> That's been deployed for a while, eg:
> >>
> >> <link rel="canonical" href="/wiki/Foobar" />
> >> at http://en.wikipedia.org/wiki/Foo
> >>
> > I haven't found such code in MediaWiki 54916 snapshot from SVN
>
> You were not looking closely enough. See
>
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/Article.php?revision=54916&view=markup
> (function showRedirectedFromHeader())
>
Why is this being added only for redirects? In one of my wikis (old v
1.11) there's no such method showRedirectedFromHeader() in the Article
class. I've tried to add the canonical link with $wgOut->addLink() in
Article::view(), but then the canonical link is not displayed for
action=edit, for example. Then I placed it in Article::outputWikiText(),
with the same behavior. What's the proper place for this code in MediaWiki
1.11?

Can't update that wiki (patched code and many exotic extensions) just
yet.
Dmitriy



Platonides at gmail

Aug 25, 2009, 3:55 AM

Post #6 of 11 (852 views)
Re: [Foundation-l] Question to post... [In reply to]

Dmitriy Sintsov wrote:
> Why is this being added only for redirects? In one of my wikis (old v
> 1.11) there's no such method showRedirectedFromHeader() in the Article
> class. I've tried to add the canonical link with $wgOut->addLink() in
> Article::view(), but then the canonical link is not displayed for
> action=edit, for example. Then I placed it in Article::outputWikiText(),
> with the same behavior. What's the proper place for this code in MediaWiki
> 1.11?
>
> Can't update that wiki (patched code and many exotic extensions) just
> yet.
> Dmitriy

The proper fix would be to move your patched code, and update your
exotic extensions so you can get up to date from now on, instead of
patching it still more.




gerard.meijssen at gmail

Aug 25, 2009, 4:05 AM

Post #7 of 11 (851 views)
Re: [Foundation-l] Question to post... [In reply to]

Hoi,
I know how it feels not to be able to update your wiki. My recommendations
are: make sure that your extensions are in the Wikimedia Foundation's SVN
code repository, spend time on making your exotic extensions conform to
development standards, and move as much as possible of your core changes
into extensions; think hooks instead.

One thing is clear: you cannot compare your functionality with what exists
in release 1.16a and, with more usability initiative changes going in, you
will regret even more that it is hard for you to update.
Thanks,
GerardM



Simetrical+wikilist at gmail

Aug 25, 2009, 6:35 AM

Post #8 of 11 (862 views)
Re: [Foundation-l] Question to post... [In reply to]

On Tue, Aug 25, 2009 at 4:24 AM, Dmitriy Sintsov<questpc [at] rambler> wrote:
> Why is this being added only for redirects?

What else should it be added for?

> I've tried to add the canonical link with $wgOut->addLink() in
> Article::view(), but then the canonical link is not displayed for
> action=edit, for example. Then I placed it in Article::outputWikiText(),
> with the same behavior.

What purpose would a canonical link serve on action=edit?

> What's the proper place for this code in MediaWiki
> 1.11?

It's unlikely anyone else is going to spend time hunting through
nearly two-year-old code for you. Two-year-old code which, by the
way, very possibly has known, unpatched security vulnerabilities,
since it hasn't been supported in a year or so. If you're not willing
to upgrade for whatever reason, you'll probably have to figure this
kind of thing out yourself.



questpc at rambler

Aug 25, 2009, 9:30 AM

Post #9 of 11 (855 views)
Re: [Foundation-l] Question to post... [In reply to]

* Aryeh Gregor <Simetrical+wikilist [at] gmail> [Tue, 25 Aug 2009
09:35:25 -0400]:
> On Tue, Aug 25, 2009 at 4:24 AM, Dmitriy Sintsov<questpc [at] rambler>
> wrote:
> > Why is this being added only for redirects?
>
> What else should it be added for?
>
For every invocation of the same article with any action that produces
HTML output.

> > I've tried to add the canonical link with $wgOut->addLink() in
> > Article::view(), but then the canonical link is not displayed for
> > action=edit, for example. Then I placed it in Article::outputWikiText(),
> > with the same behavior.
>
> What purpose would a canonical link serve on action=edit?
>
Wouldn't action=edit be indexed by robots when we have no proper
robots.txt? Or will there be meta noindex, nofollow in the head of such a
page? Anyway, it seems that Yandex crawler doesn't like the meta noindex
rules in the header of the page, giving an error (warning) message in
the stats of their webmaster tools. I thought that the purpose of the
canonical link was to treat the multiple actions of a page as a
single page for the web indexer, thus improving the ranks.

> > What's the proper place for this code in MediaWiki
> > 1.11?
>
> It's unlikely anyone else is going to spend time hunting through
> nearly two-year-old code for you. Two-year-old code which, by the
> way, very possibly has known, unpatched security vulnerabilities,
> since it hasn't been supported in a year or so. If you're not willing
> to upgrade for whatever reason, you'll probably have to figure this
> kind of thing out yourself.
>
I am willing to upgrade, just not yet. It's not my fault that the wiki
wasn't upgraded for such a long time; I have only recently started working
with it. My other wikis run 1.14.1. Yes, it uses some custom-made extensions
which are neither in SVN nor on www.mediawiki.org. I'll try to figure it out
myself, of course.
Dmitriy



Simetrical+wikilist at gmail

Aug 25, 2009, 10:13 AM

Post #10 of 11 (863 views)
Re: [Foundation-l] Question to post... [In reply to]

On Tue, Aug 25, 2009 at 12:30 PM, Dmitriy Sintsov<questpc [at] rambler> wrote:
> For every invocation of the same article with any action that produces
> HTML output.

That's wrong. The canonical version of a page must be a page with
substantially identical content. Edit pages serve totally different
HTML; rel=canonical pointing to the article will just be ignored by
search engines. See here for a discussion of how rel=canonical works:

http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html

Note, e.g., "We allow slight differences, e.g., in the sort order of a
table of products. We also recognize that we may crawl the canonical
and the duplicate pages at different points in time, so we may
occasionally see different versions of your content." Totally
different content, no.

> Wouldn't action=edit be indexed by robots when we have no proper
> robots.txt? Or will there be meta noindex, nofollow in the head of such a
> page?

Yes, we set noindex on edit pages.

> Anyway, it seems that Yandex crawler doesn't like the meta noindex
> rules in the header of the page, giving an error (warning) message in
> the stats of their webmaster tools.

What does the warning say? Ideally, of course, you should ban them in
robots.txt, so the search engine doesn't have to bother fetching the
URL.
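
As a rough illustration of the robots.txt approach, the Python sketch below checks a rule set with the standard library's urllib.robotparser. The layout is an assumption: articles under /wiki/ and script URLs (edit, history and so on) under /w/, so that a single Disallow line covers the action pages. The host name and the "ExampleBot" user agent are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumption: articles are served from /wiki/ and index.php from /w/
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

article = "http://mywiki.org/wiki/Foobar"
edit_page = "http://mywiki.org/w/index.php?title=Foobar&action=edit"

# The article stays crawlable; the edit URL is never fetched at all
print(parser.can_fetch("ExampleBot", article))
print(parser.can_fetch("ExampleBot", edit_page))
```

If article and script URLs share one prefix, a simple prefix rule like this cannot separate them, which is one reason to keep the two paths distinct.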

> I thought that the purpose of the
> canonical link was to treat the multiple actions of a page as a
> single page for the web indexer, thus improving the ranks.

The purpose is to tell search engines which URL you'd prefer them to
present to users, if the same content is being served under multiple
URLs. It is not meant to artificially inflate rankings by counting
unindexed pages as contributing to some entirely different page of
your choosing, and using it that way won't actually work. Since
search engines were already using heuristics to identify duplicate
content, and might well continue to use those exact same heuristics to
validate rel=canonical, it might not improve rankings at all.
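
To make the "hint, not a command" point concrete, here is a toy Python model of how an indexer might pick the URL to present. It is only a sketch of the behaviour described above, not how any real search engine is implemented; url_to_present is a made-up name, and content_matches stands in for whatever duplicate-detection heuristic the engine applies.

```python
from urllib.parse import urljoin


def url_to_present(page_url, canonical_href, content_matches):
    """Toy model of an indexer treating rel=canonical as a hint.

    content_matches is supplied by the caller; this sketch does not try
    to compute whether two pages are substantially identical.
    """
    if canonical_href and content_matches:
        return urljoin(page_url, canonical_href)
    return page_url  # hint ignored: keep the URL that was crawled


# A redirect serving the same article: the hint is followed
print(url_to_present("http://en.wikipedia.org/wiki/Foo", "/wiki/Foobar", True))

# An edit form pointing at the article: different content, hint ignored
print(url_to_present(
    "http://en.wikipedia.org/w/index.php?title=Foobar&action=edit",
    "/wiki/Foobar",
    False,
))
```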



questpc at rambler

Aug 25, 2009, 12:02 PM

Post #11 of 11 (852 views)
Re: [Foundation-l] Question to post... [In reply to]

* Aryeh Gregor <Simetrical+wikilist [at] gmail> [Tue, 25 Aug 2009
13:13:56 -0400]:
> That's wrong. The canonical version of a page must be a page with
> substantially identical content. Edit pages serve totally different
> HTML; rel=canonical pointing to the article will just be ignored by
> search engines. See here for a discussion of how rel=canonical works:
>
>
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
>
Thanks for pointing that out.

> Note, e.g., "We allow slight differences, e.g., in the sort order of a
> table of products. We also recognize that we may crawl the canonical
> and the duplicate pages at different points in time, so we may
> occasionally see different versions of your content." Totally
> different content, no.
>
Well, semantically an edit page and action=view page are not totally
different, for sure. Both of these will contain very similar
information. But I cannot go against standards, that's impossible.
That's something like law, you don't always like it, but you have to
obey it.

> > Anyway, it seems that Yandex crawler doesn't like the meta noindex
> > rules in the header of the page, giving an error (warning) message in
> > the stats of their webmaster tools.
>
> What does the warning say? Ideally, of course, you should ban them in
> robots.txt, so the search engine doesn't have to bother fetching the
> URL.
>
I've banned them in robots.txt. The warning is produced for
non-existing titles, which also have meta noindex. There are some links
from foreign sites to non-existing titles, which I obviously cannot
disable, something like "http://mywiki.org/wiki/nonexsitingtitle".
Yandex gives the warning "Document contains meta-tag noindex"
(approximately translated from Russian). There are lots of such warnings.
It is a bit strange that this is a warning at all. Google doesn't give
such a warning.

> The purpose is to tell search engines which URL you'd prefer them to
> present to users, if the same content is being served under multiple
> URLs. It is not meant to artificially inflate rankings by counting
> unindexed pages as contributing to some entirely different page of
> your choosing, and using it that way won't actually work. Since
> search engines were already using heuristics to identify duplicate
> content, and might well continue to use those exact same heuristics to
> validate rel=canonical, it might not improve rankings at all.
>
I am not so sure that such inflation is artificial. It would be artificial
if the article/revision were not the same, or if MediaWiki-generated HTML
were mixed with other HTML. But anyway, I cannot change how the search
engines will interpret it.
Dmitriy
