
tstarling at wikimedia
Jun 20, 2008, 3:59 PM
Post #2 of 3
Re: Help with crawling Special:AllPages for small proprietary wiki
Christopher Desmarais (Contractor) wrote:
> We have a small proprietary wiki, and we would like to be able to search
> the entire wiki content daily with sharepoint.
>
> It looks like the easiest way to do that would be to start a crawl at
> special:allpages site. However, sharepoint immediately stops any such
> crawl because the site has:
>
> <meta name="robots" content="noindex,nofollow" />
>
> We looked for but can't seem to find any configuration options that we
> set to include those tags. There is no robots.txt file in the root
> directory, and we haven't set anything in LocalSettings or
> DefaultSettings to prevent robots from following the page (eg.
> DefaultSettings.php has $wgNamespaceRobotPolicies = array(); and local
> settings has no robot directives at all)
>
> 1) Is this a default setting for the special pages?

It's hard-coded for all special pages.

> 2) If it isn't where can we look for things we might have set that we
> can turn off?
> 3) If it is, is there anything we can turn on to stop that tag from
> being put in the page?

Index: includes/specials/Allpages.php
===================================================================
--- includes/specials/Allpages.php	(revision 36353)
+++ includes/specials/Allpages.php	(working copy)
@@ -12,6 +12,8 @@
 function wfSpecialAllpages( $par=NULL, $specialPage ) {
 	global $wgRequest, $wgOut, $wgContLang;
 
+	$wgOut->setRobotPolicy( '' );
+
 	# GET values
 	$from = $wgRequest->getVal( 'from' );
 	$namespace = $wgRequest->getInt( 'namespace' );

_______________________________________________
MediaWiki-l mailing list
MediaWiki-l [at] lists
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
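
To check whether a page still carries a blocking robots meta tag after applying a change like the one above, something along these lines may help. It is only a sketch: the class and function names are made up for illustration, it inspects HTML you have already fetched, and it assumes the crawler treats "noindex", "nofollow" or "none" as a reason to stop.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.policies = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.policies.append(attr.get("content", ""))


def robots_directives(html):
    """Return the set of lower-cased robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    directives = set()
    for policy in parser.policies:
        for part in policy.split(","):
            part = part.strip().lower()
            if part:
                directives.add(part)
    return directives


def blocks_crawl(html):
    """True if any robots meta tag asks crawlers not to index or follow."""
    return bool(robots_directives(html) & {"noindex", "nofollow", "none"})
```

Passing the saved source of Special:AllPages to blocks_crawl() before and after the patch should show whether the tag is still in the way.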