Gossamer Forum

charset encoding / search engines

I have been using charset=shift_jis to encode Japanese pages. This has worked well, except that some pages contain English-language characters, which are displayed incorrectly. Google does translate this page.

I was advised to change the encoding to UTF-8. This displays all characters correctly, but the page is then not translated by Google.

Note that when the text is saved in the MySQL database fields, it often reverts to ASCII (or some other encoding). Text inserted in the templates is unchanged.
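A symptom like this often comes from a charset mismatch between the application and the database connection, though nothing in this post confirms that is the cause here. A minimal Python sketch, using a made-up sample string and no real database, of how UTF-8 bytes read back under a different charset turn into garbage, and how plain ASCII cannot hold Japanese at all (Latin-1 is assumed as the "wrong" charset purely for illustration):

```python
# Hypothetical sample text; not taken from the site in question.
text = "日本語 and English"

# Bytes as a UTF-8 page or form submission would carry them.
utf8_bytes = text.encode("utf-8")

# Reading those bytes back under the wrong charset (Latin-1 is an
# assumption for this sketch) yields mojibake, not the original text.
garbled = utf8_bytes.decode("latin-1")

# The damage is reversible as long as no bytes were dropped.
recovered = garbled.encode("latin-1").decode("utf-8")

# ASCII has no Japanese characters, so a lossy conversion replaces them.
ascii_lossy = text.encode("ascii", errors="replace").decode("ascii")
```

If the stored text looks like `garbled`, the bytes are probably intact and only the declared charset is wrong; if it looks like `ascii_lossy`, the characters were lost at save time.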

This only applies to the Japanese pages; the other template sets on the website display correctly with the original GT charset.

Does anyone know whether Google can read and display UTF-8 encoded pages, or whether the search engine will be unable to crawl the text correctly?
Re: [Alba] charset encoding / search engines
Google's servers run a stripped-down version of Red Hat Linux. My guess is that they (and the Google-specific software) fully support thousands of different encodings. Are you sending your pages with the proper charset?

Content-Type: text/<xml|html|something>; charset=UTF-8

I use Unicode encodings exclusively on my web pages now and haven't had issues with bots.
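A minimal sketch of what sending the proper charset can look like, here as a bare Python WSGI application. The function name `app` and the sample page are illustrative assumptions, not anything from the original site; the point is only that the charset named in the header and the bytes actually sent agree.

```python
def app(environ, start_response):
    """Hypothetical WSGI app that serves one UTF-8 HTML page."""
    html = (
        "<!DOCTYPE html><html><head><title>テスト</title></head>"
        "<body><p>日本語 and English</p></body></html>"
    )
    # Encode the body with the same charset the header declares.
    body = html.encode("utf-8")
    headers = [
        ("Content-Type", "text/html; charset=UTF-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` will serve it, and the response headers can then be inspected in a browser's network tools.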

Last edited by: mkp: Mar 13, 2006, 7:31 PM
Re: [mkp] charset encoding / search engines
Interesting, thanks.