Gossamer Forum

regex and google sitemap

Hi

In the filters section of config.xml, the following lines are given as examples:

<!-- Exclude URLs that end with a '~' (IE: emacs backup files) -->
<filter action="drop" type="wildcard" pattern="*~" />

<!-- Exclude URLs within UNIX-style hidden files or directories -->
<filter action="drop" type="regexp" pattern="/\.[^/]*" />

Taking Glinks 3 as an example:

How do I exclude everything from, say, /path_to/cgi-bin/admin/ downwards? Similarly, for any generic directory that needs to be excluded from sitemap generation, what would the regex be?

Generic directory name: directory

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================

Last edited by HyperTherm: Jul 13, 2005, 6:53 AM
Re: [HyperTherm] regex and google sitemap
Try this:

<filter action="drop" type="wildcard" pattern="*/path_to/cgi-bin/admin/*" />
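
To cover the generic case as well, here is a rough sketch of both filter styles. A pattern with `*` at either end is glob syntax and belongs with type="wildcard". The regexp form should not need the `*` at all, assuming regexp patterns are matched anywhere in the URL (which the hidden-files example in config.xml seems to imply). The name "directory" is just the placeholder from the question, so substitute your own.

```xml
<!-- Wildcard style: drop any URL that passes through /directory/ -->
<filter action="drop" type="wildcard" pattern="*/directory/*" />

<!-- Regexp style (assumes the pattern may match anywhere in the URL) -->
<filter action="drop" type="regexp" pattern="/directory/" />
```

Only one of the two is needed for a given directory. The wildcard form is probably the easier one to read.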

Virginia
Re: [Virginia] regex and google sitemap
Thanks, that worked.
Any feedback on Google Sitemaps? I find that although sitemap.xml is downloaded frequently, crawling never really takes place. The regular crawl, however, keeps going as usual.

Thanks
HyTC