
cliff at genwax
Feb 15, 2000, 12:15 PM
Post #1 of 2
[Fwd: CGI.pm and the untrusted-URL problem]
It wasn't clear whether they made Lincoln aware of this problem, and I didn't see it pop up on our list, so I thought I'd forward it just in case. It's not a mod_perl problem in particular, but I know that Embperl uses CGI.pm for a lot of its processing, and I am sure some other mod_perl modules do as well.

cliff rayman
genwax.com

Kragen Sitaker wrote:

> Description of the Problem
> --------------------------
>
> CGI.pm contains a method self_url which returns the URL with which the
> script was called, including all of the data fields submitted ---
> except for the .submit= field added by CGI.pm.
>
> Normally, this is used something like this:
>
>     my $self = self_url;
>     print qq(<a href="$self#Section2">Section 2</a>\n);
>
> If CGI.pm is running on Apache 1.3.6, probably other versions of
> Apache, and possibly other Web servers, it is possible for a client to
> cause self_url to include arbitrary sequences of characters at its
> beginning, such as
>
>     "><script language="JavaScript">evil_code()</script><a href="
>
> which, if used in the manner described above, leads to the problem
> described in CERT Advisory CA-2000-02, "Malicious HTML Tags Embedded in
> Client Web Requests".
>
> Apparently, anything following an unencoded space in the URL used to
> invoke the script ends up being inserted, unencoded but converted to
> lower case, at the beginning of self_url's return value.
>
> Unencoded spaces are, of course, illegal in URLs. Most web browsers
> accept them anyway in HREF attributes, and don't bother to %-encode
> them when they send them in a GET request.
>
> Netscape 4.6, MSIE 3.0, Mozilla M12, and Lynx 2.8.1rel.2 at least,
> allow HREF attribute values to be delimited by ' single-quotes instead
> of " double-quotes, which allows insertion of unencoded " double-quotes
> into the URL --- which is crucial to exploiting this problem. Lynx
> 2.8.1rel.2, however, strips the spaces from the URL found in HTML,
> preventing it from being exploited via <A HREF=''>.
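
A script that embeds self_url's return value in HTML can also guard itself by HTML-escaping the value first. This is not something the quoted message proposes; it is a minimal sketch of an application-side precaution, and the escape_html helper, the sample URL, and the injected string are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): replace the characters that can
# end an attribute value or open a tag with their HTML entities.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Stand-in for a self_url return value that begins with injected markup.
my $self = q{"><b>injected http://www.example.com/cgi-bin/demo.cgi?x=1};

# Escape before interpolating into the attribute.
my $safe = escape_html($self);
print qq(<a href="$safe#Section2">Section 2</a>\n);
```

The resulting link still points at a nonsensical URL, but the injected characters stay inside the href attribute instead of being parsed as markup.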
>
> Diagnosis
> ---------
>
> It appears that this happens because the unencoded space is interpreted
> by the HTTP server (Apache 1.3.6 in my tests) as separating the URL
> from the protocol name. So the environment variable SERVER_PROTOCOL
> gets set to everything following the space, followed by a space and the
> actual protocol, such as "HTTP/1.0".
>
> Three of the four tested browsers (Netscape 4.6, MSIE 3.0, and Mozilla
> M12) send the unencoded space in the request URL, which generates an
> illegal HTTP Request-Line.
>
> CGI.pm simply takes that environment variable, chops off everything
> from the slash onwards, lowercases it, and returns the result as the
> URL scheme.
>
> Suggested fixes
> ---------------
>
> RFC 1738 and RFC 2068 say that only a-z, 0-9, "+", ".",
> and "-" are allowed in scheme names. Accordingly, I suggest the
> following change to CGI.pm:
>
> *** /usr/local/lib/perl5/5.00503/CGI.pm Tue May 18 00:04:20 1999
> --- /home/kragen/lib/perl5/site_perl/5.005//CGI.pm Mon Feb 14 12:07:37 2000
> ***************
> *** 2594,2600 ****
>       return 'https' if $self->server_port == 443;
>       my $prot = $self->server_protocol;
>       my($protocol,$version) = split('/',$prot);
> !     return "\L$protocol\E";
>   }
>   END_OF_FUNC
>
> --- 2594,2602 ----
>       return 'https' if $self->server_port == 443;
>       my $prot = $self->server_protocol;
>       my($protocol,$version) = split('/',$prot);
> !     $protocol = lc $protocol;
> !     $protocol =~ tr/-+.a-z0-9//cd;
> !     return $protocol;
>   }
>   END_OF_FUNC
>
> (Sorry --- I'm using Solaris diff, which doesn't have unified diff
> capability.)
>
> This prevents the exploit, but of course the resulting URL is
> incorrect. It won't affect responses to well-formed HTTP requests,
> which should never have anything other than HTTP for the $protocol to
> begin with.
>
> It might be smarter to always return 'http' when not returning 'https';
> I'm not presently aware of any protocols other than HTTP and SSL HTTP used with
> CGI.
> The current draft CGI spec says:
>
>     Note that the scheme and the protocol are not identical; for
>     instance, a resource accessed via an SSL mechanism may have a
>     Client-URI with a scheme of "https" rather than "http".
>     CGI/1.1 provides no means for the script to reconstruct this,
>     and therefore the Script-URI includes the base protocol used.
>
> . . . in other words, implementing self_url in a way that is guaranteed
> to be correct for future non-HTTP CGI implementations is not possible.
>
> The successful exploit requires a remarkable chain of extreme forgiveness:
> 1- The web browser must accept an illegal URL from (possibly valid,
>    although very unusual) HTML.
> 2- The web browser must send an illegal HTTP request with the illegal
>    URL, without %-encoding the URL to make it legal.
> 3- The HTTP server must accept the illegal HTTP request.
> 4- The HTTP server must invoke the CGI script with a nonsensical
>    SERVER_PROTOCOL.
> 5- The CGI script must accept the nonsensical SERVER_PROTOCOL and use it to
>    produce an illegal URL, which it must then embed in HTML it outputs.
> 6- The web browser must then trust the output of the CGI script in some
>    fashion inappropriate to the supplier of the original URL.
>
> Netscape 4.6, MSIE 3.0, and Mozilla M12 (and, I would guess, most Web
> browsers) will happily perform steps 1 and 2; Apache 1.3.6 (and, I
> would guess, most Web servers) will happily perform steps 3 and 4; any
> program using CGI.pm and embedding self_url's return value in their
> outputs will perform step 5; and as CERT advisory CA-2000-02 documents,
> there are a wide variety of situations that can cause step 6 to
> happen.
>
> My patch above breaks the chain at step 5. It would be nice to break
> it at other steps as well.
>
> The HTTP requests used in this exploit are broken --- i.e. by having a
> Request-Line that has a protocol name that not only fails to be "HTTP",
> but actually fails to be a valid protocol name at all.
> Perhaps Apache and other web servers should respond to such egregious
> protocol violations with error messages, rather than passing the bogus
> data on to CGI scripts.
>
> I have not sent copies of this mail to other web-server teams, because
> I do not have the facilities or inclination to properly verify that
> they are equally lenient. Preliminary testing suggests that they are
> not:
>
> - IIS 5.0 responds, "The parameter is incorrect".
> - Netscape-Enterprise/3.6 responds, "Your browser sent a
>   message this server could not understand."
> - Zeus 3.3 responds with a 400 Bad Request error.
> - thttpd 2.15 responds with a 400 Bad Request error.
>
> I also believe that Web browsers should take some steps to avoid
> sending illegal HTTP requests; since the problem here happens only when
> both the server and browser are trusted --- perhaps due to some earlier
> authentication exchange between them --- while the URL is untrusted,
> the browser should validate the URL, at least to the point of not
> sending illegal requests to the server.
>
> References
> ----------
>
> http://www.w3.org/CGI/ --- information about CGI
> http://Web.Golux.Com/coar/cgi/draft-coar-cgi-v11-03-clean.html --- current
>     draft specification for CGI
> http://www.cert.org/advisories/CA-2000-02.html --- CERT advisory CA-2000-02,
>     "Malicious HTML Tags Embedded in Client Web Requests"
> RFC 1738, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc1738.txt ---
>     "Uniform Resource Locators (URL)" --- in particular, section 2.1,
>     which defines the syntax of scheme names
> RFC 2068, http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2068.txt ---
>     "Hypertext Transfer Protocol -- HTTP/1.1"
>     --- in particular, section 3.2.1, which defines the syntax of
>     URI scheme names identically to RFC 1738, but including
>     uppercase US-ASCII letters.
>     --- and section 5.1, which defines the syntax of HTTP Request-Lines,
>     indicating (together with the sections defining URI syntax and
>     section 3.1, defining HTTP-Version syntax) that they must
>     contain exactly two spaces.
> http://stein.cshl.org/WWW/CGI/ --- documentation for CGI.pm
> http://www.apache.org/info/css-security/apache_specific.html --- changes made
>     to Apache in response to CA-2000-02
> http://www.netcraft.co.uk/survey/ --- Netcraft Web Server Survey,
>     which lists the most popular web server software
>
> --
> <kragen [at] pobox> Kragen Sitaker <http://www.pobox.com/~kragen/>
> The Internet stock bubble didn't burst on 1999-11-08. Hurrah!
> <URL:http://www.pobox.com/~kragen/bubble.html>
> The power didn't go out on 2000-01-01 either. :)
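
For anyone who wants to see the effect of the patch without touching CGI.pm, here is a rough standalone sketch of the two code paths from the diff. The function names scheme_unpatched and scheme_patched and the sample SERVER_PROTOCOL values are made up for illustration, and the port-443 check is left out, so treat it as a demonstration of the filtering step only, not as CGI.pm's actual method.

```perl
use strict;
use warnings;

# Unpatched logic from the diff's context lines: keep everything before the
# first '/' and lowercase it.
sub scheme_unpatched {
    my ($server_protocol) = @_;
    my ($protocol, $version) = split('/', $server_protocol);
    return "\L$protocol\E";
}

# Patched logic: lowercase, then delete every character that is not one of
# - + . a-z 0-9 (tr with /c complements the set, /d deletes the matches).
sub scheme_patched {
    my ($server_protocol) = @_;
    my ($protocol, $version) = split('/', $server_protocol);
    $protocol = lc $protocol;
    $protocol =~ tr/-+.a-z0-9//cd;
    return $protocol;
}

# Assumed SERVER_PROTOCOL values: a well-formed one, and one shaped the way
# the Diagnosis section describes (injected text, a space, the real protocol).
for my $value ('HTTP/1.0', q{"><b>INJECTED HTTP/1.0}) {
    printf "%-26s unpatched=[%s] patched=[%s]\n",
        $value, scheme_unpatched($value), scheme_patched($value);
}
```

With the well-formed value both versions return "http". With the malformed value the unpatched version passes the quote and angle brackets through, while the patched version returns a harmless (though, as the message notes, incorrect) scheme.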