Gossamer Forum

Google Sitemap and Glinks3

Hi

https://www.google.com/webmasters/sitemaps/docs/en/protocol.html


Could Glinks3 have a feature that makes everything easy for the newly launched Google Sitemaps program? This should take rewritten URLs and the like into account, so that the static-looking rewritten URLs of detailed pages etc. are automatically included in the XML file with every build (complete or changed, whichever fits their program policy best).

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================
Re: [HyperTherm] Google Sitemap and Glinks3
Google Launches Program To Help Web Sites Get Indexed By Kevin W. FitzMaurice

E-Commerce Times
06/03/05 3:05 PM PT
The Google Sitemaps program does not replace the crawl technology the search engine company currently uses to find Web content; it is designed to enhance it. Also, using Sitemaps will neither guarantee that a Web site will be included in the index nor will it have any effect on how sites are ranked on results pages, Google said.
Google (Nasdaq: GOOG) today announced its latest salvo in the hot competition in the search landscape: a free online service intended to allow webmasters to automatically submit their Web pages to Google to be indexed in a bid to improve the results of online searches.
Google Sitemaps is a beta release and is designed to be a collaborative program between Google and webmasters intended to provide the search company with more information about Web content.
In an interview with SearchEngineWatch published in conjunction with the launch, Google Engineering Director Shiva Shivakumar explained that "currently Web crawling [searching] is limited. Crawlers don't know all the pages at a Web site (e.g., dynamic pages), when those pages change, how often to recrawl pages, how much load to put on a Web site. So they try to guess.
"We want to work collaboratively with webmasters to get a big picture of all the URLs we should be crawling, and how often they should be recrawled."

No Effect on Ranking
The program does not replace the crawl technology Google currently uses to find Web content; it is designed to enhance it. Also, using Sitemaps will neither guarantee that a Web site will be included in the index nor will it have any effect on how sites are ranked on results pages, Google said.
The new offering is the latest in a long line of product releases in recent months in the heated battle for consumers among the various online search companies: from Yahoo's Mindset service, which lets users control the search sort, to Ask Jeeves' Zoom and Web Answers features aimed at providing more accurate searches, to MSN's Virtual Earth for local search, and Google's own Desktop Search tool and its satellite maps feature.
Google said the Sitemaps launch demonstrates its ongoing efforts to create innovative Web search technologies that make finding relevant information faster and easier.
Until now, webmasters could only publish their pages to the Internet and simply wait for Google to crawl their site for inclusion in the Google search index, or they could submit their home page to Google's "Add URL" page and hope the search engine found all the pages linked from it.

Technology Will Be Shared
With Google Sitemaps, webmasters can inform Google about all their existing Web pages, prioritize the pages they want crawled first, and tell the search engine company when pages are updated so that Google can index new content faster.
In a blog on the Google Web site, Shivakumar wrote: "We're undertaking an experiment called Google Sitemaps that will either fail miserably, or succeed beyond our wildest dreams, in making the Web better for webmasters and users alike. It's a beta 'ecosystem' that may help webmasters with two current challenges: keeping Google informed about all of your new Web pages or updates, and increasing the coverage of your Web pages in the Google index."
According to the blog entry, Google will be sharing the technology so other search sites can improve their service as well. "This project doesn't just pertain to Google, either: we're releasing it under the Attribution/Share Alike Creative Commons license so that other search engines can do a better job as well. Eventually we hope this will be supported natively in webservers (e.g. Apache, Lotus Notes, IIS)," Shivakumar wrote.
Google said the new service is intended for all Web sites, from those containing only a single Web page to companies with millions of ever-changing pages.

How It Works
As for whether some Web sites might try to use the new service to spam the search index in bulk, Shivakumar told SearchEngineWatch, "We are always developing new techniques to manage index spam. All those techniques will continue to apply with the Google Sitemaps."
Here's how Sitemaps works: webmasters can sign up for the program at www.google.com/webmasters/sitemaps. They then generate and submit an XML-formatted site list. This file can be created using the Sitemap Generator, a free open-source tool that generates an XML sitemap for a few simple use cases.
Webmasters do not need a Google account to generate and submit sitemaps. However, if they do sign up, they can log in to check the status of their sitemaps and view diagnostic information for their submissions. The product is available in the U.S. English and German language interfaces today. Google said it hopes to expand its availability to other languages in the near future.
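The file format the article describes is simple enough to produce directly. Below is a minimal sketch in Python, not Google's official generator; the URL, date, and output layout are placeholder assumptions:

```python
# Minimal sketch of a Sitemap XML builder (not Google's official tool).
# The URL and date below are placeholders, not from any real site.
from xml.sax.saxutils import escape

def build_sitemap(urls):
    """Return a UTF-8 Sitemap XML string for an iterable of (loc, lastmod) pairs."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">']
    for loc, lastmod in urls:
        lines.append('  <url>')
        lines.append('    <loc>%s</loc>' % escape(loc))      # all values must be XML-encoded
        lines.append('    <lastmod>%s</lastmod>' % lastmod)  # W3C date, e.g. 2005-06-03
        lines.append('  </url>')
    lines.append('</urlset>')
    return '\n'.join(lines)

print(build_sitemap([('http://www.example.com/catalog?item=1', '2005-06-03')]))
```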
Re: [HyperTherm] Google Sitemap and Glinks3
It should theoretically be possible to adapt the XMLResults plugin for this purpose, and somehow save the results to a static page or file.

The template could be edited to reflect the Google tags.


It would be possible to do a dynamic search displaying the results in xml_search_results.xml but there would be a limit to the number of results you could get.

Re: [Alba] Google Sitemap and Glinks3
I've been trying to figure out *why* Google did this and what benefit it had.

There are two clues in the protocol.

1) Please note that the priority you assign to a page has no influence on the position of your URLs in a search engine's result pages. Search engines use this information when selecting between URLs on the same site, so you can use this tag to increase the likelihood that your more important pages are present in a search index.

2) any site where certain pages are only accessible via a search form would benefit from creating a Sitemap and submitting it to search engines
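For reference, the priority tag from clue #1 takes a per-URL value between 0.0 and 1.0 (0.5 is the default). A hypothetical fragment, with made-up example.com URLs:

```xml
<url>
  <loc>http://www.example.com/niche-content.html</loc>
  <priority>1.0</priority> <!-- favor this page over the rest of the same site -->
</url>
<url>
  <loc>http://www.example.com/archive.html</loc>
  <priority>0.2</priority>
</url>
```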

So, the clues to the value of this are: telling the search engines which are the most important pages, regardless of which keywords the crawlers seem to want to rank pages by. If your second-level pages are more important than your home page, you can tell the crawlers that, and the indexes may or may not consider it.

The other reason is more useful. Because much content today is dynamic, crawlers are indexing a lot of bogus content. If you tell a crawler what's important on your site by using #1, then you can also "target" content pages by creating searches, recommended searches, and such, which can highlight the important content on your site. Using dynamic rewrites, you can use static-looking URLs and provide direct links to "niche" or "targeted" content that is buried within your site.

This isn't going to help the spammers, but *might* help you compete if you properly analyze your site, create new links to niche content, and prioritize the pages that are most important.

Simply exporting your site as XML isn't going to do anything except raise your bandwidth.

Before implementing this, going through a "site redesign" would help. Give links, or categories, a "Search Engine Priority" as in #1, and create suggested searches that target high-content but "hidden" pages.

There is potential here for a new plugin to identify high-value keywords within your site (e.g. keywords that return good content). As a kludge you can use the number of results returned in the search logger, but pure quantity doesn't mean quality.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Re: [pugdog] Google Sitemap and Glinks3
As part of my 3.x integration, I already have priority as a field within some of my sites.

This is the ideal time for me to label all the categories/links I'm adding with their priority, and therefore have it in a format that can easily be exported to XML.

I have the 2.x XML plugin installed and can see how a similar plugin could work if it were adapted to export the XML to a file. This would save me a lot of work because I don't have any programs that will write XML.
Re: [Alba] Google Sitemap and Glinks3
A couple of challenges:

1. The sitemap.xml file MUST be UTF-8.
2. All data values, including URLs, in your Sitemap files must be XML-encoded.
3. Due to the New/Updated inheritance bug/feature in GLinks, a correct list of updated categories is a bit tricky to make. Not critical, I guess.
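On point 2, the XML-encoding is the part most likely to bite with query-string URLs (the & in them). A small sketch of what the encoding must do; the URL is made up:

```python
# Sketch of requirement 2: every data value, URLs included, must be
# XML-encoded before it goes into the sitemap. The sample URL is made up.
from xml.sax.saxutils import escape

def xml_encode(value):
    # escape() handles & < > by default; the quote entities are added explicitly.
    return escape(value, {'"': '&quot;', "'": '&apos;'})

print(xml_encode('http://www.example.com/page.cgi?g=cats&d=1'))
# -> http://www.example.com/page.cgi?g=cats&amp;d=1
```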

Having said that, the XML itself is not the challenge. Make a global that shows all your categories, and use PageBuilder to create the XML file. Check My Sitemap for an example.

That's made with a template:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<%AllCats%>
<%loop new_categories%>
  <url>
    <loc><%URL%></loc>
    <lastmod><%Newest_Link%></lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
<%endloop%>
</urlset>
and a global called AllCats:
Code:
sub {
    # Fetch all categories (ordered by name) and build each one's URL.
    my $category = $DB->table('Category');
    $category->select_options('ORDER BY Name');
    my $sth = $category->select( ['Newest_Link', 'Full_Name'] );
    my @cats;
    while (my $cat = $sth->fetchrow_hashref) {
        $cat->{URL} = "$CFG->{build_root_url}/"
                    . $category->as_url($cat->{Full_Name})
                    . "/$CFG->{build_index}";
        push @cats, $cat;
    }
    return { new_categories => \@cats };
}
Both in the pagebuilder areas.

Now, can anyone tell me how this priority thing works?

John

Last edited by:

gotze: Jun 7, 2005, 2:23 PM
Re: [gotze] Google Sitemap and Glinks3
Glad to see this discussion. I came over today to ask if someone had created something that would generate a nice sitemap file. And NO, the file DOES NOT have to be XML. It can be a nice simple txt file of all your page links.

So I was wondering if anyone had a way to make a text file of all the category pages, including subcategories? That'll be a start.
Re: [loxly] Google Sitemap and Glinks3
Loxly,

Please check http://www.google.com/webmasters/sitemaps/docs/en/protocol.html

The Sitemap Protocol uses an XML Sitemap format. That's XML ;)

The global I show above produces a list of all categories, including subcategories.
Re: [gotze] Google Sitemap and Glinks3
From the FAQ:
Quote:

9. What is the simplest sitemap I can submit?
We strongly recommend that you use an XML format such as Sitemap or OAI for your sitemaps, since they allow you to associate additional information with each URL. However, we can also accept sitemaps in the form of a text file containing a simple list of URLs. The simple sitemap format consists of a list of URLs with one URL per line. For example:
http://www.example.com/catalog?item=1
http://www.example.com/catalog?item=11
...
Notes about this format:
  • Your URLs must not include embedded newlines.
  • You must fully specify URLs because Google tries to crawl the URLs exactly as you provide them.
  • Your sitemap files must use UTF-8 encoding.

Doesn't HAVE to be XML; it can be a text file.
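The plain-text variant is easy to emit. Here's a sketch honoring the three notes above; the URLs and filename are placeholders:

```python
# Minimal sketch of the plain-text sitemap the FAQ describes:
# one fully specified URL per line, UTF-8 encoded, no embedded newlines.
# The URLs and output filename are placeholders, not a real site.
def write_text_sitemap(urls, path):
    with open(path, 'w', encoding='utf-8') as fh:
        for url in urls:
            assert '\n' not in url          # URLs must not include embedded newlines
            assert url.startswith('http')   # URLs must be fully specified
            fh.write(url + '\n')

write_text_sitemap(['http://www.example.com/catalog?item=1',
                    'http://www.example.com/catalog?item=11'],
                   'sitemap.txt')
```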

Forgot the direct link:
https://www.google.com/.../docs/en/faq.html#s9

Last edited by:

loxly: Jun 8, 2005, 11:00 AM
Re: [loxly] Google Sitemap and Glinks3
OK, I missed that *blush*
Anyway, if you have PageBuilder, just use a simplified template:
Code:
<%AllCats%>
<%loop new_categories%>
<%URL%>
<%endloop%>
and the global I provided.
Re: [gotze] Google Sitemap and Glinks3
I don't have pagebuilder.

I actually figured out how to make a simple category/subcategory sitemap using a table dump of the category table and doing some tweaking. Took about 5 minutes for 15,000 categories :) Submitted it to Google; it will let me know if there are errors. If it works, I'll post :)
Re: [HyperTherm] Google Sitemap and Glinks3
Any thoughts on this from GT?

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================
Re: [gotze] Google Sitemap and Glinks3
Could you use a similar global to make a sitemap of detailed pages?

I've been trying to use an adapted XML template but cannot get it to list more than 200 links in any single search.
Re: [HyperTherm] Google Sitemap and Glinks3
Hi,

Found this in the Google forum:

http://www.sitemaptools.com/

I am using the official Python generator but have to generate a list of my static URLs.

Regards

n || i || k || o
Re: [Alba] Google Sitemap and Glinks3
Also having the same problem. Can this global/template be amended easily to show all URLs generated when we 'build all' (e.g. homepage, search page, categories, detailed and PageBuilder pages)?

Alex



Indigo Clothing is a t-shirt printing company based in the UK.
Indigo Clothing | Promotional Clothing | T-Shirt Printing | Embroidery
Re: [el noe] Google Sitemap and Glinks3
In Reply To:

I am using the official Python generator but have to generate a list of my static URLs.



The problem gets compounded when these are rewritten URLs; otherwise the Python generator works fine.
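One way around that is to reconstruct the rewritten form yourself before handing the list to the generator. A hypothetical sketch, assuming detailed pages land at /Detailed/<ID>.html (a common GLinks layout; adjust to your own rewrite rules):

```python
# Hypothetical sketch: turn dynamic detail-page IDs into the static-looking
# rewritten form before feeding them to a sitemap generator. The
# /Detailed/<ID>.html pattern is an assumption, not a universal GLinks rule.
def rewritten_detail_url(base_url, link_id):
    return '%s/Detailed/%d.html' % (base_url.rstrip('/'), link_id)

urls = [rewritten_detail_url('http://www.example.com', i) for i in (1, 2, 3)]
print('\n'.join(urls))
```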

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================

Last edited by:

HyperTherm: Aug 24, 2005, 9:06 PM
Re: [loxly] Google Sitemap and Glinks3
In Reply To:
I actually figured out how to make a simple category/subcategory sitemap using a table dump of the category table and doing some tweaking. Took about 5 minutes for 15,000 categories :) Submitted it to Google; it will let me know if there are errors. If it works, I'll post :)

Just a follow-up: using the dump of the category table, pulling out the field that has the full category path, appending the base URL for the domain, and saving it as a plain text file worked fine. The sitemap is pulled on a regular basis by Google. Traffic was up and is now back to level, but appears to be more targeted. We'll see as the 4th quarter rolls around and I can compare traffic and sales against last year, to see if all this has helped or not!
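For anyone wanting to reproduce that, here's a rough sketch of the same dump-and-prepend idea in Python. The column name "Full_Name" and the CSV dump format are assumptions, not necessarily what was used above:

```python
# Sketch of the table-dump approach: take the full-category-path column
# from a CSV dump of the Category table, prepend the site's base URL, and
# write one URL per line as a plain-text sitemap. Column name and dump
# format are assumptions; a real GLinks dump may differ.
import csv

def categories_to_sitemap(dump_path, base_url, out_path):
    with open(dump_path, newline='', encoding='utf-8') as dump, \
         open(out_path, 'w', encoding='utf-8') as out:
        for row in csv.DictReader(dump):
            out.write('%s/%s/\n' % (base_url.rstrip('/'), row['Full_Name']))

# Tiny made-up dump to show the round trip.
with open('category_dump.csv', 'w', encoding='utf-8') as fh:
    fh.write('ID,Full_Name\n1,Arts\n2,Arts/Music\n')

categories_to_sitemap('category_dump.csv', 'http://www.example.com', 'sitemap.txt')
print(open('sitemap.txt', encoding='utf-8').read())
```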