sometimes email sent from dbman contains weird characters caused by punctuation encoding, character sets etc -- sorry i don't know the exact lingo. i read that i can specify utf-8 encoding when the email is sent and it will fix the problem. is that something i can do with the print "MAIL" sections in dbman? if so, what is the code? thanks.
Feb 13, 2009, 1:31 PM
Veteran (1141 posts)
Post #2 of 9
Views: 11653
http://www.gossamer-threads.com/forum/General_C8/Perl_Programming_F14/Sendmail_Charset_P277356/
See if the above thread helps any...
I think you want to add "charset=utf-8" as suggested by Andy in the second post. There seem to be some issues around this though, as a lot of people feel it solves one problem but potentially creates another (?)
Here is the print section I use in dbman (starting with the subject line):
Code:
print MAIL "Subject: New Order - $rec{'f00_bLast'}, $rec{'f00_bFirst'}\n";
print MAIL "MIME-Version: 1.0\n";
print MAIL "Content-Type: text/html; charset=us-ascii\n";
print MAIL "Content-Transfer-Encoding: 7bit\n";
print MAIL qq| blah blah html mark up etc |;
Actually I'm embarrassed to say that I haven't really paid any attention to the encoding and all. I saw a great tutorial on it somewhere and if I remember where I'll post a link here.
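For anyone landing on this thread later, here is a minimal sketch of the same header block with the charset switched to utf-8 and an empty line between the headers and the body. The helper name build_html_mail is made up for illustration and is not part of dbman. It assumes the body is a Perl character string and the subject is plain ASCII. Encode ships with Perl.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns headers, an empty line, and the body
# as one byte string ready to be printed to the mail handle.
sub build_html_mail {
    my ($subject, $body_html) = @_;
    my $headers = join '',
        "Subject: $subject\n",      # assumed to be plain ASCII
        "MIME-Version: 1.0\n",
        "Content-Type: text/html; charset=utf-8\n",
        "Content-Transfer-Encoding: 8bit\n";
    # The empty line marks the end of the headers.
    return $headers . "\n" . encode('UTF-8', $body_html);
}

# Usage, assuming MAIL is already open as in the snippet above:
#   print MAIL build_html_mail('New Order', $html);
```

8bit is declared here because UTF-8 text is not 7-bit clean. Whether every mail server along the way copes with 8bit is worth testing before relying on it.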
Feb 13, 2009, 2:13 PM
Veteran (1141 posts)
Post #3 of 9
Views: 11653
After looking at Wikipedia regarding character sets my eyes started to glaze over... here is a plain-English tutorial instead:
http://www.joelonsoftware.com/articles/Unicode.html
After reading that I can see that I should probably change my print statement to use utf-8 as well.
Feb 13, 2009, 3:03 PM
Enthusiast (661 posts)
Post #4 of 9
Views: 11642
I am trying your suggestions, so far without success. it was my understanding that emails were supposed to have a blank line between subject and body but i see that the code you're using doesn't have two \n before you start the body. i'm trying yours now but added an extra \n and took out the extra one on my subject line. the one i copied from the thread you referenced did not work, i.e. the emails still had weird characters. to test this i'm just copying and pasting from a word document, which is what my users will be doing. the document contains an em dash and curly quotes. will let you know if i get one to work!
i have tried several variations of yours and it doesn't fix the curly quotes. in addition, it's taking out the line breaks i had in the email. to prevent hackers and spammers, i use a fixed subject. i have the following after the subject in dbman:
print MAIL "-" x 75 . "\n\n";
print MAIL "Message from $in{'email'}\n\n"; #2/11/2008
print MAIL "$in{'subject'}\n\n";
print MAIL "$in{'emailmessage'}";
in my testing, this removed the blank lines after the row of dashes, after the "Message from" line, and after the $in{'subject'} line. maybe it's because i'm sending plain text messages, not html??
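On the line breaks: a mail client that renders text/html collapses runs of whitespace, so "\n\n" in the body will not show up as blank lines. One option is to declare the message as text/plain instead. The other is to convert the text before printing it. Here is a rough sketch of the second option, where text_to_html is a hypothetical helper and not something dbman provides:

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters HTML treats specially,
# then turn each line break into a <br> so it survives rendering.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;        # must run first, or later entities get re-escaped
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/\r?\n/<br>\n/g;   # handles both Unix and Windows line endings
    return $text;
}

# Usage, assuming MAIL is open and the headers say text/html:
#   print MAIL text_to_html("$in{'subject'}\n\n$in{'emailmessage'}");
```

Escaping also stops anything a visitor types into the form from being interpreted as markup in the message.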
Feb 15, 2009, 10:07 AM
Veteran / Moderator (3034 posts)
Post #6 of 9
Views: 11635
Any time you copy and paste from a Word document you are going to get characters that are not recognized as HTML coding.
I also suggest to people that if they are going to be copying information from an MS Word doc, they convert and save the document as a .txt file and then open that version before doing the copy.
I don't see where Word provides the exact codes they use, for instance, for quotes ... to be able to add the replacements into dbman.
Is this causing problems with your database in general, having those MS Word codes in there? I would think it would cause more problems than just in the sending of those characters in emails.
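On the question of which codes Word uses: when the pasted text arrives as Windows-1252 bytes (an assumption, since it depends on the encoding the form page is served in), the "smart" punctuation sits at a handful of fixed byte values. Below is a sketch of swapping those bytes for plain ASCII. The names %WORD_PUNCT and ascii_punct are made up for this example.

```perl
use strict;
use warnings;

# Windows-1252 byte values for the punctuation Word substitutes
# automatically, mapped to plain ASCII stand-ins.
my %WORD_PUNCT = (
    "\x85" => '...',   # ellipsis
    "\x91" => "'",     # left single quote
    "\x92" => "'",     # right single quote / apostrophe
    "\x93" => '"',     # left double quote
    "\x94" => '"',     # right double quote
    "\x96" => '-',     # en dash
    "\x97" => '--',    # em dash
);

# Hypothetical helper: expects a raw Windows-1252 byte string.
sub ascii_punct {
    my ($bytes) = @_;
    $bytes =~ s/([\x85\x91-\x94\x96\x97])/$WORD_PUNCT{$1}/g;
    return $bytes;
}
```

This should not be run over UTF-8 data: the same byte values occur inside multi-byte UTF-8 sequences and would be mangled.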
Unofficial DBMan FAQ
http://creativecomputingweb.com/dbman/index.shtml/
Feb 16, 2009, 1:51 PM
Veteran (1141 posts)
Post #7 of 9
Views: 11627
I think Lois is right. I'm sure there's a way to tell the form what encoding to use so that the data comes from the form all "cleaned up" and properly formatted.
In the meantime I found a work-around, but it's not pretty.
Open Word and type a quote " (in Word it will be curly).
Save the page as txt but don't close the file. Instead, highlight the quote, copy it, and paste it into TextPad (or at least that's what I use).
You will hopefully get something like a "fat pipe" character or a square box; then you can do something like this, with the pasted character between the brackets:
$var =~ s/[]/"/g;
Code:
#!/usr/local/bin/perl
use CGI::Carp qw/fatalsToBrowser/;
$var = 'βmeβ';
$var =~ s/β/"/g;
print "Content-Type: text/html;\n\n";
print qq|<HTML><HEAD><TITLE></TITLE></HEAD>
<BODY BGCOLOR="#FFFFFF">
$var
</BODY>
</HTML>
|;
It probably doesn't show correctly here but it works for me - where you see backward quotes in the text file is actually a black square. I uploaded it to the webserver and it actually worked!
Of course I'm sure this is totally politically incorrect. Instead you definitely want to look into "form encoding"; I think that will solve your problem.
Feb 19, 2009, 2:32 PM
Veteran (1141 posts)
Post #8 of 9
Views: 11551
Also try adding the following attribute to your form tag:
<form name="blah" action="blah" accept-charset="UTF-8">
and see if that makes a difference.
Your issue is not uncommon; see this discussion: http://www.intertwingly.net/...oding-and-HTML-Forms.
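Even if the form does submit UTF-8, the script still has to treat the incoming bytes as UTF-8. Here is a sketch of one way to cope when it's unclear what actually arrives: try strict UTF-8 first and fall back to Windows-1252. The fallback heuristic and the name decode_form_value are assumptions for illustration, not dbman code.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw form bytes into a Perl character string.
# Strict UTF-8 is tried first; input that is not valid UTF-8 is treated
# as Windows-1252, which is what a Word paste often turns out to be.
sub decode_form_value {
    my ($bytes) = @_;
    my $text = eval {
        # LEAVE_SRC keeps $bytes intact so the fallback can reuse it.
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    $text = decode('cp1252', $bytes) unless defined $text;
    return $text;
}
```

The result is a character string, so it needs to be encoded again (for example to UTF-8) before it is printed into a mail whose header declares that charset.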
Feb 22, 2009, 2:10 PM
Enthusiast (661 posts)
Post #9 of 9
Views: 11483
finally getting back to this -- thanks to both of you for your suggestions. i don't remember the copy and paste causing any problems in the database itself, just in the email form. i just tested adding a record and copying & pasting something from word. it's ok in the record.
i always save as text, close the file, open the file in notepad, and then copy & paste. but my CLIENTS aren't going to do that.
i tried adding the utf tag to the form as you suggested, but it had no effect.
i have lots of code in sub parse_form now that i got from demoroniser http://www.fourmilab.ch/webtools/demoroniser/ but it doesn't fix the problem with email.