Gossamer Forum

"Foreign characters" in directory and file names

This puzzles me ...

It seems to be generally accepted in the GT forums that non-English characters, e.g. Éßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ, cannot be used in the names of directories and files.

Nevertheless, I have used these characters in a testbed directory using flatfile Links. Provided they are URL-encoded, there is no detectable problem building under FreeBSD.

Also I notice that the Open Directory uses them -- check out http://dmoz.org/...f5es_de_Professores/

Should I avoid using them, even if they seem to work in practice?
Is there something I don't know about?
Is this OS-dependent (e.g. OK under Unix, not OK under Windows)?

By the way, the urlencode sub is:
Code:
sub urlencode {
    # --------------------------------------------------------
    # Escapes a string to make it suitable for printing as a URL.
    #
    my ($toencode) = shift;
    # Percent-encode everything except letters, digits, "_", "-" and ".".
    $toencode =~ s/([^a-zA-Z0-9_\-.])/sprintf("%%%02X", ord($1))/eg;
    # Restore path separators so directory paths stay intact.
    $toencode =~ s/%2F/\//g;
    return $toencode;
}
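To make the effect of the sub concrete, here is a minimal sketch that runs it on a made-up path. The sample string is purely illustrative (it is not taken from any real directory), and it assumes the name is stored as single-byte ISO-8859-1 characters.

```perl
use strict;
use warnings;

# Copy of the urlencode sub from the post, repeated here so the
# example runs on its own.
sub urlencode {
    my ($toencode) = shift;
    $toencode =~ s/([^a-zA-Z0-9_\-.])/uc sprintf("%%%02x", ord($1))/eg;
    $toencode =~ s/\%2F/\//g;    # keep "/" as a path separator
    return $toencode;
}

# Hypothetical category path: \xE9 is "e acute", \xE3 is "a tilde" in ISO-8859-1.
my $name = "Caf\xE9s/S\xE3o Paulo";

print urlencode($name), "\n";    # prints Caf%E9s/S%E3o%20Paulo
```

Note that the space is escaped as well, while the slash survives because of the second substitution.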


Last edited by: YoYoYoYo, Oct 12, 2001, 1:19 AM
Re: [YoYoYoYo] "Foreign characters" in directory and file names
Hi,

No, it's not recommended, as different operating systems have different limits on which characters are valid in directory names.

DMoz doesn't use them directly; they URL-escape them. If you look at the source, you'll see the URL is:

/World/Portugu%eas/Educa%e7%e3o/

IE just happens to decode the escapes for display, which makes them transparent.
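As a rough sketch of what the browser is doing, the escapes in that URL can be turned back into raw bytes as follows. The urldecode helper is hypothetical (it is not part of Links), and reading the resulting bytes as ê, ç and ã assumes ISO-8859-1.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each %XX escape back into a single byte.
sub urldecode {
    my ($encoded) = @_;
    $encoded =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $encoded;
}

my $path = '/World/Portugu%eas/Educa%e7%e3o/';
my $raw  = urldecode($path);

# List the non-ASCII bytes that the escapes stood for.
my @high = map { sprintf '%02X', ord($_) } grep { ord($_) > 127 } split //, $raw;
print "@high\n";    # prints EA E7 E3
```

The link itself only ever contains plain ASCII; the accented characters appear only once something decodes it.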

Cheers,

Alex
--
Gossamer Threads Inc.
Re: [Alex] "Foreign characters" in directory and file names
> they URL escape them ...

Does that mean they use something like sub urlencode?

Is there any reason why we cannot also urlencode directory names if our OS does not object?

We are using ID numbers for directory names because just two of the category names contain non-English characters. Is there a way of using build_directory_field conditionally? That is, use the category name unless there is a string in build_directory_field.


Re: [YoYoYoYo] "Foreign characters" in directory and file names
Hi,

Yes, that's how build_directory_field works by default. If the field is blank for a category, it falls back to the old method of using the category name.
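The fallback can be pictured with a small sketch like the one below. This only illustrates the behaviour described and is not the actual Links code: the directory_name_for sub, the Dir_Name column and the sample categories are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sketch: use the custom directory field when it holds a
# value, otherwise fall back to the category name.
sub directory_name_for {
    my ($category, $field) = @_;
    my $custom = (defined $field && length $field) ? $category->{$field} : undef;
    return $custom if defined $custom && length $custom;
    return $category->{Name};
}

# Made-up categories; only the second one needs an ASCII-only override.
my @categories = (
    { ID => 1, Name => 'Education',      Dir_Name => '' },
    { ID => 2, Name => "Educa\xE7\xE3o", Dir_Name => 'Educacao' },
);

print directory_name_for($_, 'Dir_Name'), "\n" for @categories;
# prints Education, then Educacao
```

With this arrangement only the categories whose names contain awkward characters need a value in the extra field.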

Cheers,

Alex
--
Gossamer Threads Inc.
Re: [Alex] "Foreign characters" in directory and file names
So it does.