
Mailing List Archive: ripMIME: general

% characters in filenames - no standard!?

 

 



pldaniels at pldaniels

Jan 27, 2003, 6:31 AM

Post #1 of 3
% characters in filenames - no standard!?

I have an interesting situation: a few mailpacks I have here contain attachments whose filenames include what look like hex-encoded octets, i.e.:


Content-Disposition: attachment;
    filename="dresdnererkl%C3%A4rung.txt"
Content-Type: text/plain;
    name="dresdnererkl%C3%A4rung.txt"


At first it seems easy enough: just run over the filename, decode those %XX values and convert them into native 8-bit characters. However, I tried something locally: I sent an attachment with a filename of 'foobar%10.txt', and this is what I got in the headers:

Content-Type: text/plain;
    name="foobar%10.txt"
Content-Disposition: attachment;
    filename="foobar%10.txt"
Content-Transfer-Encoding: 7bit


Thus, there seems to be no clear way of knowing whether the %XX sequences in a filename are meant to be decoded or taken literally.
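
To make the ambiguity concrete, here is a minimal Python sketch that blindly decodes every %XX sequence in the two filenames above. The helper name naive_decode is made up for illustration and is not part of ripMIME.

```python
from urllib.parse import unquote_to_bytes


def naive_decode(filename):
    """Blindly turn every %XX sequence into the byte it names."""
    return unquote_to_bytes(filename)


# The first filename decodes to valid UTF-8 containing a non-ASCII letter.
first = naive_decode("dresdnererkl%C3%A4rung.txt")
print(first.decode("utf-8"))

# The second filename really contained a literal '%10'; blind decoding
# silently replaces it with the control byte 0x10.
second = naive_decode("foobar%10.txt")
print(second)
```

Nothing in either header says which of the two interpretations the sender intended, so a blind decoder gets one of these cases wrong.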

Anyone got any ideas?


--
Paul L Daniels http://www.pldaniels.com
Linux/Unix systems Internet Development
ICQ#103642862,AOL:cinflex,IRC:inflex
A.B.N. 19 500 721 806


aagero at eunetnorge

Jan 28, 2003, 9:55 AM

Post #2 of 3
Re: % characters in filenames - no standard!?

On Mon, Jan 27, 2003 at 11:31:12PM +1000, Paul L Daniels wrote:
| I have an interesting situation - There are a few mailpacks I have here which have
| attachments of which the filename contains what looks like hex encoded octets, ie:
|
|
| Content-Disposition: attachment;
| filename="dresdnererkl%C3%A4rung.txt"
| Content-Type: text/plain;
| name="dresdnererkl%C3%A4rung.txt"
|
|
| Sure, initially it seems easy enough, we'll just run over and decode those % values and
| convert into native 8-bit... however! I tried something locally, I sent an attachment
| with a filename of 'foobar%10.txt' and this is what I got in the headers:
|
| Content-Type: text/plain;
| name="foobar%10.txt"
| Content-Disposition: attachment;
| filename="foobar%10.txt"
| Content-Transfer-Encoding: 7bit
|
|
| Thus, there seems to be no clear indicative method of knowing if one is to decode the %XX sequences or not in a filename.
|
| Anyone got any ideas?

According to RFC 2045, the value in an "attribute=value" pair in the Content-Type
field must be either a "token" or a "quoted-string", which suggests that if the
resulting character is a CTL, the %XX sequence must be interpreted literally.

Another problem one might encounter when decoding hex-encoded characters is sequences like
%00, %47, %20; perhaps all such encoded characters should be left untouched.
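
One possible middle ground between decoding everything and decoding nothing can be sketched in Python as below. This is only a heuristic under stated assumptions, not anything a standard prescribes: the function name cautious_decode is hypothetical, and the rule it applies (decode a run of %XX sequences only when every decoded byte is non-ASCII and the run is valid UTF-8) is an assumption chosen for illustration.

```python
import re
from urllib.parse import unquote_to_bytes

# One or more consecutive %XX sequences, e.g. "%C3%A4".
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def cautious_decode(filename):
    """Decode %XX runs only when they look like non-ASCII UTF-8 text."""

    def replace(match):
        raw = unquote_to_bytes(match.group(0))
        # Any ASCII byte in the run (NUL, other controls, space, letters)
        # means the whole run is left exactly as it was.
        if any(byte < 0x80 for byte in raw):
            return match.group(0)
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8, so keep the literal text.
            return match.group(0)

    return _PCT_RUN.sub(replace, filename)


print(cautious_decode("dresdnererkl%C3%A4rung.txt"))
print(cautious_decode("foobar%10.txt"))
```

With this rule %00, %47, %20 and %10 all stay literal, while %C3%A4 becomes a single character. It can still guess wrong for a filename that genuinely contains the literal text "%C3%A4".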

--
Åge Røbekk, EUnet Norge AS


pldaniels at pldaniels

Jan 29, 2003, 4:25 AM

Post #3 of 3
Re: % characters in filenames - no standard!?

>
> Another problem one might encounter with decoding hex-encoded characters
> is characters like %00, %47, %20, perhaps all such encoded characters
> should be left untouched.

This is a good point. I'll look over RFC 2231 a bit more regarding this matter. It certainly seems you can use %XX encoding if your parameter is of the form:

foo*0*=something%20else

Note the *0* before the equals sign. I suspect that whoever wrote the software which created the original emails I have failed to realise that they weren't supposed to encode filenames like this. The other possibility I can think of is that they might have taken the filename from a URL.
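
The distinction above can be sketched in Python as follows, assuming the RFC 2231 convention that a parameter name ending in * marks a percent-encoded value and that the first (or only) segment starts with charset'language'. The function decode_param is a hypothetical helper for illustration; it handles a single segment only and does not join numbered continuations.

```python
from urllib.parse import unquote_to_bytes


def decode_param(name, value):
    """Return (base_name, text) for one MIME parameter segment."""
    if not name.endswith("*"):
        # Plain parameter: '%' has no special meaning, keep it literally.
        return name, value
    # "foo*" and "foo*0*" both reduce to the base name "foo".
    base = name.split("*", 1)[0]
    # Extended value layout: charset'language'percent-encoded-octets
    charset, _language, encoded = value.split("'", 2)
    text = unquote_to_bytes(encoded).decode(charset or "us-ascii")
    return base, text


print(decode_param("filename", "dresdnererkl%C3%A4rung.txt"))
print(decode_param("filename*", "utf-8''dresdnererkl%C3%A4rung.txt"))
print(decode_param("foo*0*", "us-ascii'en'something%20else"))
```

Under this reading, the filenames in the original emails use the plain form, so their %XX sequences would be left as literal text.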

Regards.


--
Paul L Daniels http://www.pldaniels.com
Linux/Unix systems Internet Development
ICQ#103642862,AOL:cinflex,IRC:inflex
A.B.N. 19 500 721 806
