
Mailing List Archive: NANOG: users

Regular Expression for IPv6 addresses

 

 



Richard.E.Brown at dartware

Feb 4, 2010, 2:50 PM

Post #1 of 14
Regular Expression for IPv6 addresses

Folks,

My company, Dartware, have derived a regex for testing whether an IPv6 address
is correct. I've posted it in my blog:

http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6

This has links to the regular expression, a (Perl) program that tests various
correct and malformed addresses, and a Ruby implementation of the same.

Hope it's useful.

Rich Brown richard.e.brown [at] dartware
Dartware, LLC http://www.dartware.com
66-7 Benning Street Telephone: 603-643-9600
West Lebanon, NH 03784-3407 Fax: 603-643-2289


jeroen at unfix

Feb 4, 2010, 4:31 PM

Post #2 of 14
Re: Regular Expression for IPv6 addresses

Richard E. Brown wrote:
> Folks,
>
> My company, Dartware, have derived a regex for testing whether an IPv6
> address is correct. I've posted it in my blog:
>
> http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6
>
>
> This has links to the regular expression, a (Perl) program that tests
> various correct and malformed addresses, and a Ruby implementation of
> the same.

You know, link local addresses (fe80::/10) are quite useless without
specifying the zone of that address. See section 11 of RFC4007.
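
To make the zone point concrete, here is a minimal sketch (not part of the regex under discussion). It assumes Python 3's standard ipaddress module and the "address%zone" text form from RFC 4007 section 11. The helper names split_zone and needs_zone are made up for this example.

```python
import ipaddress

def split_zone(text):
    """Split 'address%zone' into (IPv6Address, zone-or-None).

    Raises ValueError if the address part is not valid IPv6, or if a '%'
    is present but the zone after it is empty.
    """
    addr_text, sep, zone = text.partition("%")
    if sep and not zone:
        raise ValueError("empty zone after '%'")
    addr = ipaddress.IPv6Address(addr_text)
    return addr, (zone or None)

def needs_zone(text):
    """True for a link-local address (fe80::/10) written without a zone."""
    addr, zone = split_zone(text)
    return addr.is_link_local and zone is None
```

A validator built this way can accept "fe80::1%eth0" and flag a bare "fe80::1" as incomplete instead of treating the two as equally usable. Whether the zone name refers to a real interface is a separate check that only the local system can answer.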

The only proper way of "testing" if an address is a valid IPv6 address
is to feed it to getaddrinfo() and then use it through that API.
Yes, you can make some assumptions, but experience has shown that people
who assumed that everything would stay under 2001::/16 also got it wrong at
some point. Thus just feed it to getaddrinfo() if you really need it.
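
A minimal sketch of that approach, assuming Python 3 and its socket wrapper around getaddrinfo(). The function name is_numeric_ipv6 is made up here. AI_NUMERICHOST asks the resolver to parse the literal only and never do a DNS lookup.

```python
import socket

def is_numeric_ipv6(text):
    """Return True if the platform resolver accepts text as an IPv6 literal."""
    try:
        socket.getaddrinfo(text, None, socket.AF_INET6, socket.SOCK_STREAM,
                           0, socket.AI_NUMERICHOST)
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver rejected the literal.
        # UnicodeError: Python could not encode the host string at all.
        return False
    return True
```

The answer comes from the local C library, so edge cases (zone identifiers, embedded IPv4 forms) may differ between platforms. That is the trade-off against a regex, which gives the same answer everywhere but may disagree with the stack that will actually use the address.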

Greets,
Jeroen


marka at isc

Feb 4, 2010, 5:02 PM

Post #3 of 14
Re: Regular Expression for IPv6 addresses

In message <4B6B66FF.50108 [at] spaghetti>, Jeroen Massar writes:
> Richard E. Brown wrote:
> > Folks,
> >
> > My company, Dartware, have derived a regex for testing whether an IPv6
> > address is correct. I've posted it in my blog:
> >
> > http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6
> >
> >
> > This has links to the regular expression, a (Perl) program that tests
> > various correct and malformed addresses, and a Ruby implementation of
> > the same.
>
> You know, link local addresses (fe80::/10) are quite useless without
> specifying the zone of that address. See section 11 of RFC4007.
>
> The only proper way of "testing" if an address is a valid IPv6 address
> is to feed it to getaddrinfo() and then use it through that API.
> Yes, you can make some assumptions, but experience has shown that people
> who assumed that everything would stay under 2001::/16 also got it wrong at
> some point. Thus just feed it to getaddrinfo() if you really need it.
>
> Greets,
> Jeroen

And now for the trick question. Is ::ffff:077.077.077.077 a legal
mapped address and, if it is, does it match 077.077.077.077?

Mark
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka [at] isc


jeroen at unfix

Feb 4, 2010, 5:16 PM

Post #4 of 14
Re: Regular Expression for IPv6 addresses

Mark Andrews wrote:
[..]
> And now for the trick question. Is ::ffff:077.077.077.077 a legal
> mapped address and, if it is, does it match 077.077.077.077?

::ffff:0:0/96 should never ever be shown to a user, as it is
confusing (is it IPv6 or IPv4?) and does not make sense at all.
As such, whatever one thinks of it, it is "illegal" in that context.

Inside a program, though, using a 128-bit sequence of memory is
of course a great way to store both IPv6 and IPv4 addresses in one
structure, and that is what the ::ffff:0:0/96 format is very
useful and intended for. Of course, the representation to the user
of addresses stored that way would still be 77.77.77.77 (and thus an IPv4
address and not IPv6), even though internally it is written as an IPv6
address.
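
A minimal sketch of that storage scheme, assuming Python 3's standard ipaddress module. The names MAPPED_PREFIX, to_internal and to_display are made up for this example.

```python
import ipaddress

MAPPED_PREFIX = 0xFFFF << 32  # the ::ffff:0:0/96 prefix as a 128-bit integer

def to_internal(text):
    """Parse an IPv4 or IPv6 literal into a single 128-bit integer."""
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        return MAPPED_PREFIX | int(addr)
    return int(addr)

def to_display(value):
    """Render a stored 128-bit integer, showing mapped addresses as plain IPv4."""
    addr = ipaddress.IPv6Address(value)
    if addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)
    return str(addr)
```

With this split, the mapped form exists only in storage: to_display(to_internal("77.77.77.77")) gives back "77.77.77.77", and native IPv6 addresses round-trip unchanged.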

As that usage is internal, you don't need any validation of the format
as the input will be either an IPv6 or IPv4 address without any of the
compatibility stuff, thus one does not need to handle it anyway.

Of course, there should be only limited places where a user can enter or
see IP addresses in the first place. There is this great thing called
DNS which is what most people should be using.

Greets,
Jeroen


marka at isc

Feb 4, 2010, 5:53 PM

Post #5 of 14
Re: Regular Expression for IPv6 addresses

In message <4B6B7185.2080708 [at] spaghetti>, Jeroen Massar writes:
> Mark Andrews wrote:
> [..]
> > And now for the trick question. Is ::ffff:077.077.077.077 a legal
> > mapped address and, if it is, does it match 077.077.077.077?
>
> ::ffff:0:0/96 should never ever be shown to a user, as it is
> confusing (is it IPv6 or IPv4?) and does not make sense at all.
> As such, whatever one thinks of it, it is "illegal" in that context.
>
> Inside a program, though, using a 128-bit sequence of memory is
> of course a great way to store both IPv6 and IPv4 addresses in one
> structure, and that is what the ::ffff:0:0/96 format is very
> useful and intended for. Of course, the representation to the user
> of addresses stored that way would still be 77.77.77.77 (and thus an IPv4
> address and not IPv6), even though internally it is written as an IPv6
> address.

You missed the point: 077 is octal, so 077.077.077.077 is 63.63.63.63
as an IPv4 address, whereas it would be decimal dotted quad in a mapped
address *if* zero-padded decimal dotted quad is legal in an IPv6
text form.
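
The two readings can be worked out in a few lines of Python (an illustration of the arithmetic only; the helper name octets is made up):

```python
def octets(text, base):
    """Read each dot-separated field of text as an integer in the given base."""
    return [int(field, base) for field in text.split(".")]

# Leading zero taken as an octal marker, as a classic IPv4 parser would do.
as_octal = octets("077.077.077.077", 8)
# Leading zero taken as decimal padding.
as_decimal = octets("077.077.077.077", 10)

print(as_octal)    # [63, 63, 63, 63]
print(as_decimal)  # [77, 77, 77, 77]
```

Same text, two different hosts, depending only on which convention the parser follows.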

> As that usage is internal, you don't need any validation of the format
> as the input will be either an IPv6 or IPv4 address without any of the
> compatibility stuff, thus one does not need to handle it anyway.
>
> Of course, there should be only limited places where a user can enter or
> see IP addresses in the first place. There is this great thing called
> DNS which is what most people should be using.
>
> Greets,
> Jeroen
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka [at] isc


sthaug at nethelp

Feb 4, 2010, 10:15 PM

Post #6 of 14
Re: Regular Expression for IPv6 addresses

> > And now for the trick question. Is ::ffff:077.077.077.077 a legal
> > mapped address and, if it is, does it match 077.077.077.077?
>
> ::ffff:0:0/96 should never ever be shown to a user, as it is
> confusing (is it IPv6 or IPv4?) and does not make sense at all.
> As such, whatever one thinks of it, it is "illegal" in that context.

Define "user"? Both Cisco and Juniper use these addresses for IPv6
L3VPNs, and the addresses are definitely visible. Cisco and Juniper
examples:

B   2001:abcd:60:3::/64
      [200/0] via ::ffff:172.16.101.204 (nexthop in vrf default), 4d10h
B   2001:abcd:60:4::/64
      [200/0] via ::ffff:172.16.101.205 (nexthop in vrf default), 4d10h
B   2001:abcd:60:7::/64
      [200/0] via ::ffff:172.16.1.7 (nexthop in vrf default), 6d13h


::ffff:172.16.1.1/128
      *[LDP/6] 4d 11:01:30, metric 1
      > to 172.16.102.201 via ge-0/3/0.0, Push 313008
::ffff:172.16.1.2/128
      *[LDP/6] 1w0d 20:27:12, metric 1
      > to 172.16.102.201 via ge-0/3/0.0, Push 312240
::ffff:172.16.1.3/128
      *[LDP/6] 4d 11:01:30, metric 1
      > to 172.16.102.201 via ge-0/3/0.0, Push 313024

Steinar Haug, Nethelp consulting, sthaug [at] nethelp


isabeldias1 at yahoo

Feb 5, 2010, 2:37 AM

Post #7 of 14
Re: Regular Expression for IPv6 addresses

I Just Don't Know What To Do With Myself


----- Original Message ----
From: Jeroen Massar <jeroen [at] unfix>
To: Mark Andrews <marka [at] isc>
Cc: nanog [at] nanog; Richard E. Brown <Richard.E.Brown [at] dartware>
Sent: Fri, February 5, 2010 1:16:53 AM
Subject: Re: Regular Expression for IPv6 addresses

Mark Andrews wrote:
[..]
> And now for the trick question. Is ::ffff:077.077.077.077 a legal
> mapped address and, if it is, does it match 077.077.077.077?

::ffff:0:0/96 should never ever be shown to a user, as it is
confusing (is it IPv6 or IPv4?) and does not make sense at all.
As such, whatever one thinks of it, it is "illegal" in that context.

Inside a program, though, using a 128-bit sequence of memory is
of course a great way to store both IPv6 and IPv4 addresses in one
structure, and that is what the ::ffff:0:0/96 format is very
useful and intended for. Of course, the representation to the user
of addresses stored that way would still be 77.77.77.77 (and thus an IPv4
address and not IPv6), even though internally it is written as an IPv6
address.

As that usage is internal, you don't need any validation of the format
as the input will be either an IPv6 or IPv4 address without any of the
compatibility stuff, thus one does not need to handle it anyway.

Of course, there should be only limited places where a user can enter or
see IP addresses in the first place. There is this great thing called
DNS which is what most people should be using.

Greets,
Jeroen


Richard.E.Brown at DARTWARE

Feb 6, 2010, 2:16 PM

Post #8 of 14
Re: Regular Expression for IPv6 addresses

Folks,

Thanks for all the comments on the IPv6 regex...

-----
> Jeroen Massar <jeroen [at] unfix> wrote:
>
> The only proper way of "testing" if an address is a valid IPv6 address
> is to feed it to getaddrinfo() and then use it through that API.

Good point. One of the reasons to do this was for environments where getaddrinfo()
might not be available (for example, in JavaScript; see the page on the InterMapper
site: http://intermapper.com/ipv6validator ).

> Of course, there should be only limited places where a user can enter or
> see IP addresses in the first place. There is this great thing called
> DNS which is what most people should be using.

Another good point. But look at Seiichi's note (below)...

-----
> Seiichi Kawamura <kawamucho [at] mesh> also wrote:

> This might be of some interest to you.
>
> http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04

We believe the regex correctly recognizes all the cases specified by RFC4291. But it's
a great idea to update the JavaScript page above to reformat to this recommendation.

-----
> Carsten Bormann <cabo [at] tzi>
>
> I was looking at this regexp for my Ruby course as a great example of "how not
> to use Regexps".
>
> But thank you for this textbook example!
> And, of course, an error did creep in.

You're quite welcome! :-)

But seriously... Do you have an example of an address that causes the RE to fail?

> If you really need an RE, the right approach here is to write a small program
> to generate the RE.

Absolutely. The Perl program cited includes code much like your example.

Rich Brown richard.e.brown [at] dartware
Dartware, LLC http://www.dartware.com
66-7 Benning Street Telephone: 603-643-9600
West Lebanon, NH 03784-3407 Fax: 603-643-2289

PS Dartware is sponsoring a table displaying our InterMapper network monitoring
software at the NANOG48 Beer 'n Gear. Please come by to introduce yourself and
take a look...


mysidia at gmail

Feb 6, 2010, 2:52 PM

Post #9 of 14
Re: Regular Expression for IPv6 addresses

On Fri, Feb 5, 2010 at 12:15 AM, <sthaug [at] nethelp> wrote:
>> > And now for the trick question.  Is ::ffff:077.077.077.077 a legal
>> > mapped address and, if it is, does it match 077.077.077.077?

Wasn't there an internet draft on that subject, recently?
http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04

077.077.077.077 is equivalent to 77.77.77.77, if valid at all.
RFC 4038 is very clear that the text representation of a mapped IPv4
address is Base 10. http://tools.ietf.org/html/rfc4038#section-5.1

This is a bit like asking if "::ffff:10.1.2" is a valid IP
address though.
And is it the same as the ip address "10.1.2" ?

(Which of course expands to 10.1.0.2, on common implementations of
inet_pton, inet_aton, and getaddrinfo) Or ::ffff:0xA010002

I would say these are perfectly valid _shorthands_ and
abbreviations for entering an IP address, which may be provided by
some systems, but that they are non-canonical text representations
for displaying, publishing, or sharing IP addresses.
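
For readers who have not met these shorthands, here is a minimal pure-Python sketch of the traditional inet_aton()-style rules as commonly described: one to four fields, each field in C notation (0x for hex, a leading 0 for octal), with the last field filling all remaining bytes. It is an illustration under those assumptions and not a copy of any particular libc. The names _parse_field and classic_aton are made up here.

```python
def _parse_field(field):
    """Parse one field in C notation: 0x prefix is hex, leading 0 is octal."""
    if not (field.isascii() and field.isalnum()):
        raise ValueError(f"bad field: {field!r}")
    if field[:2].lower() == "0x":
        return int(field[2:], 16)
    if len(field) > 1 and field[0] == "0":
        return int(field, 8)  # raises ValueError for digits 8 or 9
    return int(field, 10)

def classic_aton(text):
    """Return the 32-bit value for a 1- to 4-field IPv4 shorthand."""
    fields = text.split(".")
    if not 1 <= len(fields) <= 4:
        raise ValueError("expected one to four fields")
    *head, last = [_parse_field(f) for f in fields]
    if any(v > 0xFF for v in head):
        raise ValueError("leading field out of range")
    last_bits = 8 * (4 - len(head))  # the last field covers the remaining bytes
    if last >= 1 << last_bits:
        raise ValueError("last field out of range")
    value = 0
    for v in head:
        value = (value << 8) | v
    return (value << last_bits) | last
```

Under these rules classic_aton("10.1.2") and classic_aton("0xA010002") both give 0x0A010002, which is 10.1.0.2, and "077.077.077.077" comes out as 63.63.63.63.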

--
-J


marka at isc

Feb 6, 2010, 4:43 PM

Post #10 of 14
Re: Regular Expression for IPv6 addresses

In message <6eb799ab1002061452s51f9cf61p303d36130291301 [at] mail>, James
Hess writes:
> On Fri, Feb 5, 2010 at 12:15 AM, <sthaug [at] nethelp> wrote:
> >> > And now for the trick question. Is ::ffff:077.077.077.077 a legal
> >> > mapped address and, if it is, does it match 077.077.077.077?
>
> Wasn't there an internet draft on that subject, recently?
> http://tools.ietf.org/html/draft-ietf-6man-text-addr-representation-04
>
> 077.077.077.077 is equivalent to 77.77.77.77 if valid at all
> RFC 4038 is very clear that the text representation of a mapped IPv4
> address is Base 10. http://tools.ietf.org/html/rfc4038#section-5.1

But 077.077.077.077 is octal dotted quad. Decimal dotted quad does
*not* have leading zeros. The point of allowing for dotted quad
is to allow for easy mapping between IPv4 representation and IPv6
with encoded IPv4 representations. Accepting an octal representation
as decimal is a bad thing and leads to non-obvious failures.

% ping 077.077.077.077
PING 077.077.077.077 (63.63.63.63): 56 data bytes
^C
--- 077.077.077.077 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
%

"ping ::ffff:077.077.077.077" would not get to same box if my ping
accepted that as an address literal, which luckily it doesn't.
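
One way to avoid the ambiguity entirely is a parser that refuses leading zeros instead of guessing. A minimal sketch in Python, with the made-up name parse_strict_dotted_quad:

```python
def parse_strict_dotted_quad(text):
    """Parse exactly four decimal fields, 0-255 each, with no leading zeros.

    Returns the address as a 32-bit integer; raises ValueError otherwise.
    """
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError("expected exactly four fields")
    value = 0
    for field in fields:
        if not (field.isascii() and field.isdigit()):
            raise ValueError(f"not a decimal field: {field!r}")
        if len(field) > 1 and field[0] == "0":
            raise ValueError(f"leading zero is ambiguous: {field!r}")
        octet = int(field)
        if octet > 255:
            raise ValueError(f"field out of range: {field!r}")
        value = (value << 8) | octet
    return value
```

This accepts 77.77.77.77 and rejects both 077.077.077.077 and the three-field 10.1.2, so no input can silently mean two different hosts.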

> This is a bit like asking if "::ffff:10.1.2" is a valid IP
> address though.

Except it clearly isn't as there are not 4 components.

> And is it the same as the ip address "10.1.2" ?

> (Which of course expands to 10.1.0.2, on common implementations of
> inet_pton, inet_aton, and getaddrinfo) Or ::ffff:0xA010002

inet_pton() did not accept 10.1.2 when it was originally written.
This was a *deliberate* decision. Some vendors have changed it to
accept it but they are wrong. I can say that because I was involved
in making that decision.

> I would say these are perfectly valid _shorthands_ and
> abbreviations for entering an IP address, which may be provided by
> some systems, but that they are non-canonical text representations
> for displaying, publishing, or sharing IP addresses.


> --
> -J
>
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka [at] isc


thomas at habets

Feb 9, 2010, 7:01 AM

Post #11 of 14
Re: Regular Expression for IPv6 addresses

On Fri, 5 Feb 2010, Mark Andrews wrote:
> And now for the trick question. Is ::ffff:077.077.077.077 a legal
> mapped address and, if it is, does it match 077.077.077.077?

Forget IPv6. The first question is does 077.077.077.077 match
077.077.077.077 in IPv4?

The answer is a long one full of different answers depending on
who's doing the parsing (gethostbyname(), inet_aton(),
inet_net_pton(), etc..) and on what OS. And also on many bugs.

And don't count on the documentation being right either, or parsers
respecting standards (single unix or RFCs, or which one when they
conflict). And don't expect an error code if you feed 080.080.080.080
into a parser, even one that *does* read it as octal.

Don't prefix IP (v4) address octets with zero, whether you expect it to be
treated as octal or not. Just don't. World of hurt and all that.

E.g.:
http://kerneltrap.org/mailarchive/openbsd-bugs/2009/6/6/5882713/thread

We should all do like one vendor I've seen where you enter the IP (v4)
address in binary... and then pad with zeroes to whatever size the HTML form
wanted. Yes, this decade.

---------
typedef struct me_s {
char name[] = { "Thomas Habets" };
char email[] = { "thomas [at] habets" };
char kernel[] = { "Linux" };
char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" };
char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" };
char coolcmd[] = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;


marka at isc

Feb 9, 2010, 2:12 PM

Post #12 of 14
Re: Regular Expression for IPv6 addresses

In message <alpine.DEB.1.10.1002091548170.25663 [at] red>, Thomas
Habets writes:
> On Fri, 5 Feb 2010, Mark Andrews wrote:
> > And now for the trick question. Is ::ffff:077.077.077.077 a legal
> > mapped address and, if it is, does it match 077.077.077.077?
>
> Forget IPv6. The first question is does 077.077.077.077 match
> 077.077.077.077 in IPv4?

I think you meant "does 077.077.077.077 match 77.77.77.77 in IPv4".

> The answer is a long one full of different answers depending on
> who's doing the parsing (gethostbyname(), inet_aton(),
> inet_net_pton(), etc..) and on what OS. And also on many bugs.

Indeed. It's a minefield out there for application developers that
want consistency. Even when you develop your own, some OS vendor will
go and stuff it up on you.

> And don't count on the documentation being right either, or parsers
> respecting standards (single unix or RFCs, or which one when they
> conflict). And don't expect an error code if you feed 080.080.080.080
> into a parser, even one that *does* read it as octal.
>
> Don't prefix IP (v4) address octets with zero, whether you expect it to be
> treated as octal or not. Just don't. World of hurt and all that.
>
> E.g.:
> http://kerneltrap.org/mailarchive/openbsd-bugs/2009/6/6/5882713/thread
>
> We should all do like one vendor I've seen where you enter the IP (v4)
> address in binary... and then pad with zeroes to whatever size the HTML form
> wanted. Yes, this decade.
>
> ---------
> typedef struct me_s {
> char name[] = { "Thomas Habets" };
> char email[] = { "thomas [at] habets" };
> char kernel[] = { "Linux" };
> char *pgpKey[] = { "http://www.habets.pp.se/pubkey.txt" };
> char pgp[] = { "A8A3 D1DD 4AE0 8467 7FDE 0945 286A E90A AD48 E854" };
> char coolcmd[] = { "echo '. ./_&. ./_'>_;. ./_" };
> } me_t;
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: marka [at] isc


Valdis.Kletnieks at vt

Feb 9, 2010, 2:25 PM

Post #13 of 14
Re: Regular Expression for IPv6 addresses

On Wed, 10 Feb 2010 09:12:11 +1100, Mark Andrews said:
> In message <alpine.DEB.1.10.1002091548170.25663 [at] red>, Thomas
> Habets writes:
> > On Fri, 5 Feb 2010, Mark Andrews wrote:
> > > And now for the trick question. Is ::ffff:077.077.077.077 a legal
> > > mapped address and, if it is, does it match 077.077.077.077?
> >
> > Forget IPv6. The first question is does 077.077.077.077 match
> > 077.077.077.077 in IPv4?
>
> I think you meant "does 077.077.077.077 match 77.77.77.77 in IPv4".

No, he had it right, because...

> > The answer is a long one full of different answers depending on
> > who's doing the parsing (gethostbyname(), inet_aton(),
> > inet_net_pton(), etc..) and on what OS. And also on many bugs.
>
> Indeed. It's a minefield out there for application developers that
> want consistency. Even when you develop your own, some OS vendor will
> go and stuff it up on you.

There's no guarantee that 2 different binaries on the same box will resolve
077.077.077.077 to the same 32-bit sequence, so it's in fact possible that
it's not even equal to itself, much less 77.77.77.77.


phil.pennock at spodhuis

Feb 19, 2010, 5:08 PM

Post #14 of 14
Re: Regular Expression for IPv6 addresses

On 2010-02-04 at 17:50 -0500, Richard E. Brown wrote:
> My company, Dartware, have derived a regex for testing whether an IPv6 address
> is correct. I've posted it in my blog:
>
> http://intermapper.ning.com/profiles/blogs/a-regular-expression-for-ipv6
>
> This has links to the regular expression, a (Perl) program that tests various
> correct and malformed addresses, and a Ruby implementation of the same.

There's a full grammar in RFC 3986 (URI Generic Syntax) already, which
can be translated straight. It too handles the embedded IPv4 addresses.

While your code is written in a more condensed manner, those who want to
be able to cross-check against the RFC might want to take a look at this
one, which emits a PCRE regexp:
http://people.spodhuis.org/phil.pennock/software/emit_ipv6_regexp-0.304
http://people.spodhuis.org/phil.pennock/software/emit_ipv6_regexp-0.304.asc

(Version numbers for repository, not for that one script :) ).

FWIW, the ability to grab a shell variable which contains an RE for IPv6
addresses, which can be used in:
pcregrep "$ipv6_regex" log_file
has proven very useful, especially when debugging newly-added IPv6
support for an app. This is also the most coherent justification I've
come up with so far for using a regexp instead of a dedicated parser,
other than "because I could".
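
As an illustration of translating the grammar directly (this is not the script linked above, only a sketch in the same spirit), the IPv6address rule from RFC 3986 Appendix A can be turned into a pattern by a short Python program. The names H16, LS32, build_ipv6_regex and IPV6_RE are made up for this example. The pattern is built for Python's re module; it uses only constructs that PCRE also understands, though that has not been verified against pcregrep here.

```python
import re

# dec-octet: 0-255 with no leading zeros, as in RFC 3986.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4 = rf"(?:{DEC_OCTET}\.){{3}}{DEC_OCTET}"
H16 = r"[0-9A-Fa-f]{1,4}"                 # h16 = 1*4HEXDIG
LS32 = rf"(?:{H16}:{H16}|{IPV4})"         # ls32 = ( h16 ":" h16 ) / IPv4address

def build_ipv6_regex():
    """Return a regex string with one alternative per line of the ABNF rule."""
    alts = [
        rf"(?:{H16}:){{6}}{LS32}",        # 6( h16 ":" ) ls32
        rf"::(?:{H16}:){{5}}{LS32}",      # "::" 5( h16 ":" ) ls32
    ]
    # [ *n( h16 ":" ) h16 ] "::" (4-n)( h16 ":" ) ls32, for n = 0..4
    for n in range(5):
        lead = rf"(?:(?:{H16}:){{0,{n}}}{H16})?"
        alts.append(rf"{lead}::(?:{H16}:){{{4 - n}}}{LS32}")
    alts.append(rf"(?:(?:{H16}:){{0,5}}{H16})?::{H16}")  # ... "::" h16
    alts.append(rf"(?:(?:{H16}:){{0,6}}{H16})?::")       # ... "::"
    return "(?:" + "|".join(alts) + ")"

IPV6_RE = re.compile(build_ipv6_regex())
```

IPV6_RE.fullmatch("::ffff:172.16.1.1") succeeds, while "::ffff:077.077.077.077" does not match, because dec-octet has no production for a leading zero. Zone identifiers such as "%eth0" are outside this grammar rule and are not matched either.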

Regards,
-Phil
