Gossamer Forum

foreign characters: danish dmoz import

Hello all,

I have imported the section "world/Dansk" from dmoz. No problem using Andy's script.

But the Danish alphabet has three extra letters, æ, ø and å (I wonder if they will even print on this board).

My directory now has:

æ listed as "æ"
ø listed as "ø"
å listed as "Ø"

This is a problem for at least two reasons: first, it looks like crap; second, it makes it impossible to search for words containing æ, ø or å.

Example: http://www.buhuu.dk/1204/1437/index.html

If you have any ideas for solving this problem, please let me know.

Regards, Dane
Re: [dane] foreign characters: danish dmoz import In reply to
Try changing the encoding on your page to UTF-8 (I just did this and it looks much better). You can do this with a metatag.
Re: [afinlr] foreign characters: danish dmoz import In reply to
In Reply To:
Try changing the encoding on your page to UTF-8 (I just did this and it looks much better). You can do this with a metatag.
Sorry to be so dumb, but how do I do that?
Re: [dane] foreign characters: danish dmoz import In reply to
Try adding this to your template between the <head></head> tags.

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
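
To see why the declared charset changes what shows up on the page, here is a small sketch in Python (not part of Gossamer Links; the sample string is made up for illustration). It assumes the imported dmoz data is stored as UTF-8 bytes, which this thread suggests but does not confirm, and shows how the same bytes look when read as Latin-1 versus UTF-8.

```python
# Hypothetical sample: stands in for text stored on the server as UTF-8 bytes.
text = "æøå"
utf8_bytes = text.encode("utf-8")

# A browser that assumes Latin-1 turns each Danish letter into two odd characters.
seen_as_latin1 = utf8_bytes.decode("latin-1")

# A browser told the page is UTF-8 (for example by the meta tag) shows the real letters.
seen_as_utf8 = utf8_bytes.decode("utf-8")

print(len(seen_as_latin1), seen_as_latin1)
print(len(seen_as_utf8), seen_as_utf8)
```

If the imported data really is UTF-8, this is roughly the difference the meta tag makes.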

Last edited by afinlr: Oct 29, 2003, 2:48 PM
Re: [afinlr] foreign characters: danish dmoz import In reply to
In Reply To:
Try adding this to your template between the <head></head> tags.

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
Hey, it works :) Thanks, mate.
Re: [dane] foreign characters: danish dmoz import In reply to
It does not work 100%. The pages imported from dmoz now display correctly, but searches still don't work with æ, ø or å (as far as I can tell).
Re: [dane] foreign characters: danish dmoz import In reply to
Not sure whether you've done something to fix this - but I just searched for Århus Amt and it seemed to work.
Re: [afinlr] foreign characters: danish dmoz import In reply to
In Reply To:
Not sure whether you've done something to fix this - but I just searched for Århus Amt and it seemed to work.
Strange, I get nothing:

http://www.buhuu.dk/cgi-bin/search.cgi?query=%C5rhus+Amt

http://www.buhuu.dk/....cgi?query=århus+Amt
Re: [dane] foreign characters: danish dmoz import In reply to
Strange - I copied and pasted the words into the search box and this is what the url became:

http://www.buhuu.dk/...query=%C3%85rhus+Amt
Re: [afinlr] foreign characters: danish dmoz import In reply to
I don't get it. Help!
Re: [dane] foreign characters: danish dmoz import In reply to
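I *think* this may be a difference between the pages you serve as UTF-8 and those you don't. %C3%85 is Å encoded as UTF-8, whereas %C5 is Å encoded as Latin-1. A browser normally submits a form in the character set of the page the form is on, which would explain the two different URLs.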
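
A quick way to check the two encodings is a short Python sketch like the one below. It uses only the standard library's urllib.parse; the variable names are illustrative, and it only mimics what a browser does when it submits a search form, which is an assumption about how the two URLs in this thread were produced.

```python
from urllib.parse import quote_plus, unquote_plus

query = "Århus Amt"

# Form submitted from a Latin-1 page: Å is the single byte 0xC5.
from_latin1_page = quote_plus(query, encoding="latin-1")

# Form submitted from a UTF-8 page: Å is the two bytes 0xC3 0x85.
from_utf8_page = quote_plus(query, encoding="utf-8")

print(from_latin1_page)  # %C5rhus+Amt
print(from_utf8_page)    # %C3%85rhus+Amt

# Decoding with the wrong charset does not give back the original word,
# so a search script expecting one form will not match the other.
misread = unquote_plus(from_latin1_page, encoding="utf-8", errors="replace")
print(misread == query)  # False
```

So the same search word can reach search.cgi as two different byte sequences, depending on the charset of the page it was typed into.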
Re: [afinlr] foreign characters: danish dmoz import In reply to
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
was missing in the search* templates. I have now added it, and they just produce blank pages??

http://www.buhuu.dk/cgi-bin/search.cgi
Re: [dane] foreign characters: danish dmoz import In reply to
Sorry, there are lots of problems like this when you change the encoding of your pages. You tend to get lots of ? characters when Latin-1 text is read as UTF-8, and strange As all over the place when UTF-8 text is read as Latin-1. I had this problem when I was trying to import XML files that used a different encoding from my site, and I decided to convert the records before inserting them into my database. I'm not sure whether this is possible with Andy's plugin, but if so you could try this instead:

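# Replace each character from U+0080 to U+FFFF in $code with a numeric entity such as &#229; (assumes decoded characters, not raw bytes)
$code =~ s/([\x{80}-\x{FFFF}])/'&#' . ord($1) . ';'/gse;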
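
For comparison, here is roughly the same idea in Python, as a sketch rather than anything taken from the plugin. The function name to_numeric_entities and the sample record are hypothetical. Note one difference from the Perl line: the xmlcharrefreplace error handler converts every non-ASCII character, not just those up to U+FFFF.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#229;."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Hypothetical record, standing in for one imported dmoz entry.
record = "Århus Amt: æ, ø, å"
converted = to_numeric_entities(record)
print(converted)  # &#197;rhus Amt: &#230;, &#248;, &#229;

# The conversion is reversible, so no information is lost.
restored = html.unescape(converted)
print(restored == record)  # True
```

Since the stored text then contains only ASCII, the page's charset no longer affects how it displays. A search for a word typed with æ, ø or å would presumably need the same conversion applied to the query first.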
Re: [afinlr] foreign characters: danish dmoz import In reply to
I'm not sure what Andy's plugin can do, and I certainly can't code anything myself. Maybe Andy could comment if he's on?
Re: [dane] foreign characters: danish dmoz import In reply to
Hi. Unfortunately, all my plugin does is act as an intermediary for nph-import.cgi (by putting the code in dmoz_cron.cgi) and add all the error-checking code etc. It basically just gives you an 'all-in-one' method to run an import, and a simple GUI.

I was thinking about writing my own RDF parser, but I'm not sure how fast it would be. It would also be a little while before I have time to start such a project :(

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] foreign characters: danish dmoz import In reply to
So basically I can't use this method to import the Danish section, and I'm stuck?
Re: [dane] foreign characters: danish dmoz import In reply to
You can; it's just a question of how accurate the characters will be. If you are more worried about content than search results, it will work fine.

As I said, I'm going to start work on a new RDF parser sometime next week... probably end up putting it in as part of my DMOZ_Wizard plugin... but I'm not sure on that one yet.

Either way, I'll keep you updated :)

Cheers

Andy (mod)
Re: [Andy] foreign characters: danish dmoz import In reply to
Thanks, bro. I'll be standing by.
Re: [dane] foreign characters: danish dmoz import In reply to
Any news on this subject?
Re: [dane] foreign characters: danish dmoz import In reply to
Still working on it. It's a bit weird though, because even with my own RDF parser it seems to be translating these characters incorrectly, which is doing my head in :( I'm also a bit bogged down with all my financial stuff (having $1000+ stolen)... which has sidetracked me quite a bit :(

Cheers

Andy (mod)
Re: [Andy] foreign characters: danish dmoz import In reply to
In Reply To:
(having $1000+ stolen)...


How did that happen?

Bent
Re: [bannerzone] foreign characters: danish dmoz import In reply to
Long story: http://www.gossamer-threads.com/...i?post=255962#255962

:(

Cheers

Andy (mod)