randy at theoryx5
Jul 24, 2002, 10:16 AM
Post #1 of 9
I set up a mirror site of modperl-docs at
It was pretty straightforward to do, even getting
the local search to work - nicely designed! I've
appended below a bit of a how-to on setting up a mirror,
in case anyone wants to try.
mirror - mirroring the mod_perl site
=item * Perl
=item * a cvs client
=item * swish-e (version 2.1-dev), from http://www.swish-e.org/
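Before starting, a quick check along the following lines could confirm that the tools above are on your PATH. This is only a sketch: the function name C<check_tools> is made up for illustration, and it assumes the binaries are installed under the names C<perl>, C<cvs> and C<swish-e>. It does not check the swish-e version.

```shell
#!/bin/sh
# Report which of the given tools are missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if anything is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}

# Assumed binary names; adjust if yours are installed differently.
check_tools perl cvs swish-e || echo "install the tools listed above first"
```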
Mirroring the mod_perl site starts off by obtaining the
sources from C<cvs>; this may be done as
% cd /usr/local/src
% cvs -d :pserver:anoncvs [at] cvs:/home/cvspublic login
% cvs -d :pserver:anoncvs [at] cvs:/home/cvspublic co modperl-docs
which will place the sources under F</usr/local/src/modperl-docs>.
Next, decide the URL under which you want the mod_perl
documents to appear on your site. For example, to serve them
under http://your.server/modperl/, you could use
the following directives in F<httpd.conf>:
Alias /modperl/ "/usr/local/src/modperl-docs/dst_html/"
Options Indexes MultiViews
Allow from all
SetEnv SWISH_BINARY_PATH "/usr/local/bin/swish-e"
SetEnv PERL5LIB "/usr/local/src/modperl-docs/dst_html/search/modules"
AddHandler cgi-script cgi
Here, I<SWISH_BINARY_PATH> is the path to your swish-e binary.
You can then build the document set with the following commands (these
could also go in a shell script run under cron to keep your site current):
% cd /usr/local/src/modperl-docs
% cvs -z9 up -dR
% export MODPERL_SITE='http://your.server/modperl'
% export SWISH_BINARY_PATH='/usr/local/bin/swish-e'
Use the syntax appropriate for your shell when setting
the I<MODPERL_SITE> and I<SWISH_BINARY_PATH> environment
variables. You may see some errors from C<bin/build> about missing
Perl modules; these are available from CPAN. Also, if
your perl binary is not at F</usr/local/bin/perl>, you should
create the appropriate symbolic link.
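For the symbolic link, a small helper like the one below could print the command to run rather than running it, since creating the link usually needs root. The helper name C<suggest_perl_link> is hypothetical, and the sketch assumes that the perl found on your PATH is the one you want to use.

```shell
#!/bin/sh
# Print the ln command that would make perl available at the expected
# path. Prints nothing if that path already exists or if no perl is
# found on PATH.
suggest_perl_link() {
    expected=$1
    actual=$(command -v perl) || return 0
    if [ ! -e "$expected" ]; then
        echo "ln -s $actual $expected"
    fi
}

# Path taken from the text above.
suggest_perl_link /usr/local/bin/perl
```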
The swish-e index files are built using a spidering program
which indexes the pages under what you set for I<MODPERL_SITE>.
A subtlety arises if the modperl documents contain any links to
other parts of your site, as the spidering
program will then start to follow these links. This can be
prevented by creating a temporary F<robots.txt> under
your I<DocumentRoot> which excludes everything outside
of your I<MODPERL_SITE>. The progress of the spidering
program can be monitored in your server's access log.
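One way to produce such a temporary F<robots.txt> is sketched below. It assumes the rest of your site lives as ordinary files and directories directly under I<DocumentRoot>, so it will not see other aliased locations. The function name C<make_robots> and its arguments are invented for this example. It uses only C<Disallow> lines, which match by prefix, so any entry whose name is a prefix of the mirror's name is left out to avoid blocking the mirror itself.

```shell
#!/bin/sh
# Emit robots.txt rules that disallow every top-level entry of a
# DocumentRoot except the one holding the mirror.
#   $1 = DocumentRoot directory
#   $2 = top-level name to leave crawlable (e.g. modperl)
make_robots() {
    docroot=$1
    keep=$2
    echo "User-agent: *"
    for entry in "$docroot"/*; do
        # An unmatched glob is left as a literal pattern; skip it.
        if [ ! -e "$entry" ]; then
            continue
        fi
        name=${entry##*/}
        if [ "$name" = "robots.txt" ]; then
            continue
        fi
        # Disallow rules match by prefix, so "/mod" would also block
        # "/modperl"; skip names that are a prefix of (or equal to) keep.
        case "$keep" in
            "$name"*) continue ;;
        esac
        echo "Disallow: /$name"
    done
}
```

It could be used as C<make_robots /path/to/DocumentRoot modperl E<gt> /path/to/DocumentRoot/robots.txt>, with the file removed again once indexing has finished.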
If all goes well, you should then create a shell script
to be run daily via cron to keep your site current - only the pages
changed since the last run will be regenerated.
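A possible shape for that script is sketched below. It assumes C<bin/build> is run from the top of the checkout after the two variables are exported, and it does not show any separate indexing step your setup may need. The C<DRY_RUN> switch and the function names are additions for this example, not part of modperl-docs; with C<DRY_RUN=1> (the default here) each step is only printed.

```shell
#!/bin/sh
# Sketch of a daily update script for the mirror.
SRC=${SRC:-/usr/local/src/modperl-docs}
DRY_RUN=${DRY_RUN:-1}

# Run a command, or just print it when DRY_RUN is 1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

update_mirror() {
    run cd "$SRC" || return 1
    run cvs -z9 up -dR || return 1
    # Placeholder values from the text above; set your own.
    MODPERL_SITE=${MODPERL_SITE:-http://your.server/modperl}
    SWISH_BINARY_PATH=${SWISH_BINARY_PATH:-/usr/local/bin/swish-e}
    export MODPERL_SITE SWISH_BINARY_PATH
    # Assumed invocation: bin/build from the checkout root.
    run bin/build
}

update_mirror
```

Once the printed steps look right, the script can be run from cron with C<DRY_RUN=0> set in its environment.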