
aagero at eunetnorge
Jan 28, 2003, 9:55 AM
Post #2 of 3
Re: % characters in filenames - no standard!?
On Mon, Jan 27, 2003 at 11:31:12PM +1000, Paul L Daniels wrote:
| I have an interesting situation - There are a few mailpacks I have here which have
| attachments of which the filename contains what looks like hex encoded octals, ie:
|
|
| Content-Disposition: attachment;
|     filename="dresdnererkl%C3%A4rung.txt"
| Content-Type: text/plain;
|     name="dresdnererkl%C3%A4rung.txt"
|
|
| Sure, initially it seems easy enough, we'll just run over and decode those % values and
| convert into native 8-bit... however! I tried something locally, I sent an attachment
| with a filename of 'foobar%10.txt' and this is what I got in the headers:
|
| Content-Type: text/plain;
|     name="foobar%10.txt"
| Content-Disposition: attachment;
|     filename="foobar%10.txt"
| Content-Transfer-Encoding: 7bit
|
|
| Thus, there seems to be no clear indicative method of knowing if one is to decode the %XX sequences or not in a filename.
|
| Anyone got any ideas?

According to RFC 2045, the value in an "attribute=value" pair of the Content-Type field must be either a "token" or a "quoted-string". That suggests that if decoding a %XX sequence would yield a CTL, the sequence must be interpreted literally.

Another problem one might encounter when decoding hex-encoded characters is sequences like %00, %47 and %20. Perhaps all such encoded characters should be left untouched.

--
Åge Røbekk, EUnet Norge AS
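As a rough illustration of the heuristic suggested above, here is a minimal Python sketch that decodes %XX sequences only when the result is valid UTF-8 and contains no control characters or path separators, and otherwise returns the name untouched. The function name `maybe_decode_filename`, the assumption that escapes carry UTF-8, and the set of characters treated as unsafe are all illustrative choices, not something either RFC or the original posts prescribe.

```python
import re
from urllib.parse import unquote_to_bytes

# Assumption: characters we refuse to produce by decoding (path separators).
# Control characters are rejected separately below.
_UNSAFE_CHARS = {"/", "\\"}

_PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def maybe_decode_filename(name: str) -> str:
    """Decode %XX escapes only if the result looks like a safe filename.

    Hypothetical helper: returns the input unchanged whenever decoding
    would be ambiguous or would yield a problematic character.
    """
    if not _PERCENT_ESCAPE.search(name):
        return name  # nothing to decode

    try:
        # Assumption: the escapes encode UTF-8 bytes, as in %C3%A4 for "ä".
        decoded = unquote_to_bytes(name).decode("utf-8")
    except UnicodeDecodeError:
        return name  # not valid UTF-8, so treat the % sequences literally

    for ch in decoded:
        code = ord(ch)
        if code < 0x20 or code == 0x7F or ch in _UNSAFE_CHARS:
            return name  # decoding would introduce a CTL or a separator

    return decoded


if __name__ == "__main__":
    for sample in ("dresdnererkl%C3%A4rung.txt", "foobar%10.txt", "a%2Fb.txt"):
        print(sample, "->", maybe_decode_filename(sample))
```

With this policy the first filename from the quoted message is decoded, while 'foobar%10.txt' is kept literally because %10 would decode to a control character. Whether a space (%20) should also be left encoded is a policy decision this sketch does not settle.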