Everything about nothing: ipv6

Showing posts with label ipv6. Show all posts

Saturday, January 2, 2016

Processing RA messages in NetworkManager

The goal of this post is to analyze processing of RA messages through the NetworkManager code, starting with the initial reception all the way through the assignment of parameters to a device through which the RA was received. As a special case will also take a look what happens when RS is sent, i.e. what is different in comparison to unsolicited RAs. But first, we'll take a look at the relevant code organization and initialization process.

Code to process RAs and initialization phase

For receiving RA and sending RA NetworkManager uses libndp library. This library is used in class NM_TYPE_LNDP_RDISC (defined in the file src/rdisc/nm-lndp-rdisc.c) which is a platform specific class tailored for the Linux OS. This class inherits from class NM_TYPE_RDISC (defined in the file src/rdisc/nm-rdisc.c) which is a platform independent base class. It contains functionality that is platform independent so theoretically NetworkManager can be more easily ported to, e.g. FreeBSD.

To create a new object of the type NM_TYPE_LNDP_RDISC it is necessary to call function nm_lndp_rdisc_new(). This is, for example, done by the class NM_DEVICE_TYPE in function addrconf6_start() for each device for which IPv6 configuration is started.

Now, if NetworkManager will use RAs or not depends on the configuration setting for IPv6 that the user defines. If you go to configuration dialog for some network interface there is a setting for IPv6 configuration which might be ON or OFF. In case it is OFF, no IPv6 configuration will be done. If IPv6 configuration is enabled (switch placed in ON state) then the specific configuration methods should be selected. The selected option is checked in the function src/devices/nm-device.c:act_stage3_ip6_config_start(), where, depending on the option selected, a specific initialization is started:

Automatic (method NM_SETTING_IP6_CONFIG_METHOD_AUTO)

Start full IPv6 configuration by calling src/devices/nm-device.c:addrconf6_start() function.
Automatic, DHCP only (method NM_SETTING_IP6_CONFIG_METHOD_DHCP)

Only configuration based on DHCP parameters received will be done. This type of configuration is initiated by calling function src/devices/nm-device.c:dhcp_start().
Manual (method NM_SETTING_IP6_CONFIG_METHOD_MANUAL)

Manual configuration using parameters specified in the configuration dialog and nothing else. The configuration of this type is initiated by calling function nm_ip6_config_new() which returns appropriate IPv6 configuration object.
Link-local Only (method NM_SETTING_IP6_CONFIG_METHOD_LINK_LOCAL)

Initiate only a link-local address configuration by calling function src/devices/nm-device.c:linklocal6_start().

Since in this post we are concerned with RA processing than we are obviously interested only in Automatic configuration type, the one that calls addrconf6_start() function. This function, in turn, calls function src/nm-device.c:linklocal6_start() to ensure that link local configuration is present. It might happen that link local address isn't configured and so RA configuration must wait, or link local configuration is still present. In either case, when link local configuration is present RA processing can start. RA processing is kicked off by calling src/nm-device.c:addrconf6_start_with_link_ready() which in turn calls src/nm-rdisc.c:nm_rdisc_start() to kick off RA configuration.

nm_rdisc_start() is called with a pointer to NM_LNDP_RDISC class (defined in src/rdisc/nm_lndp_rdisc.c). Note that a method (nm_rdisc_start()) from a base class (NM_RDISC_TYPE, defined in src/rdisc/nm_rdisc.c) is called with a pointer to a subclass of a NM_RDISC_TYPE! Method in a base class does the following:

Checks that there is a subclass which defined virtual method start() (gassert(klass->start)).
Initializes timeout for the configuration process. If timeout fires, then rdisc_ra_timeout_cb() will be called that emits NM_RDISC_RA_TIMEOUT signal.
Invokes a method start() from a subclass. Subclass is, as already said, NM_LNDP_RDISC and the given method registers a callback src/rdisc/nm-lndp-rdisc.c:receive_ra() with libndp library. The callback is called by libndp library whenever RA is received.
Starts solicit process by invoking solicit() method. This method schedules RS to be sent in certain amount of time (variable next) by send_rs() method. This method, actually, invokes send_rs() method from a subclass (src/nm-rdisc/nm-rdisc-linux.c:send_rs()) which sends RS using libndp library. Note that the number of RSes sent is bounded and after certain amount of them sent the process is stopped under the assumption that there is no IPv6 capable router on the network.
After RA has been received and processed the application of configuration parameters is done in src/device/nm-device.c:rdisc_config_changed() method. This callback is achieved by registering to NM_RDISC_CONFIG_CHANGED signal that is emitted by src/rdisc/nm-rdisc.c class whenever IPv6 configuration changes.

So, in conclusion, when link local configuration is finished, RA processing is started. The RA processing consists of waiting for RA in src/rdisc/nm-lndp-rdisc.c:receive_ra(). If RA doesn't arrive is certain amount of time then RS is sent in function src/nm-rdisc/nm-rdisc-linux.c:send_rs().

RA processing

When RA is received it is processed by the function src/rdisc/nm-lndp-rdisc.c:receive_ra(). The following configuration options are processed from RA by the given function:

DHCP level.
Default gateway.
Addresses and routes.
DNS information (RDNSS option).
DNS search list (DNSSL option).
Hop limit.
MTU.

All the options that were parsed are stored (or removed from) a private attributes of the base object (NMRDisc defined in src/rdisc/nm-rdisc.h).

Finally, the method src/nm-rdisc.c:nm_rdisc_ra_received() is called to cancel all the timeouts. It will also emit signal NM_RDISC_CONFIG_CHANGED that will trigger application of received configuration parameters to a networking device.

Processing RS/RA

The RS/RA processing differs only by the fact that RS is sent after certain amount of time has passed and RA wasn't received, as described in the Code to process RAs and initialization phase section. After RS is sent, the RA processing is the same as it would be without RS being sent.

Applying IPv6 configuration data

Application of received IPv6 configuration data is done in the method src/device/nm-device.c:rdisc_config_changed(). IPv6 configuration is stored in IPv6 configuration object NM_TYPE_IP6_CONFIG defined in src/nm-ip6-config.c.

Note that this isn't the real application of configuration data, but only that the configuration data is stored in the appropriate object.

The function that really applies configuration data is src/devices/nm-device.c:ip6_config_merge_and_apply().

Sunday, June 29, 2014

Private addresses in IPv6 protocol

It is almost a common wisdom that 172.16.0.0/12, 192.168.0.0/16, and 10.0.0.0/8 are private network addresses that should be used when you don't have assigned address, or you don't intend to connect to the Internet (at least not directly). With IPv6 being ever more popular, and necessary, the question is which addresses are used for private networks in that protocol. In this post I'll try to answer that question.

The truth is that in IPv6 there are two types of private addresses, link local and unique local addresses. Link local IPv6 addresses, as the name suggests, are valid only on a single link. For example, on a single wireless network. You'll recognize those addresses by their prefix, which is fe80::/10, and they are automatically configured by appending interface's unique ID. IPv4 also has link local address, though it is not so frequently used. Still, maybe you noticed it when your DHCP didn't work and suddenly you had address that starts with 169.254.0.0./16. This was a link local IPv4 address configured. The problem with link local addresses is that they can not be used in case you try to connect two or more networks. They are only valid on a single network, and packets having those addresses are not routable! So, we need something else.

Unique local addresses (ULA), defined in RFC4193, are closer to IPv4 private addresses. That RFC defines ULA format and how to generate them. Basically, those are addresses with the prefix FC00::/7. These addresses are treated as normal, global, addresses, but are only valid inside some restricted area and can not be used on the global Internet. This is the same as saying that 10.0.0.0/8 addresses can be used within some private networks, but are not allowed on a global Internet. You choose how this conglomerate of networks will be connected, what prefixes used, etc.

There is difference, though. Namely, it is expected that ULA will be unique in the world. You might ask why is that important, when those addresses are not allowed on the Internet anyway. But, that is important. Did it ever happened to you that you had to connect two private IPv4 networks (directly via router, via VPN, etc.), and coincidentally, both used, e.g. 192.168.1.0/24 prefix? Such situations are a pain to debug, and require renumbering or some nasty tricks to make them work. So, being unique is an important feature.

So, the mentioned RFC, actually specifies how to generate ULA with /48 prefix and a high probability of the prefix being unique. Let's first see the exact format of ULA:

| 7 bits |1| 40 bits | 16 bits | 64 bits |
+--------+-+------------+-----------+----------------------------+
| Prefix |L| Global ID | Subnet ID | Interface ID |
+--------+-+------------+-----------+----------------------------+

First 8 bits have a fixed value 0xFD. As you can see, prefix is 7 bit, but L bit must be set to 1 if the address is specified according to the RFC4193. So, first 8 bits are fixed to the value 0xFD. Note that L bit set to 0 isn't specified, it is something left for the future. Now, the main part is Global ID, whose length is 40 bits. That one must be generated in such a way to be unique with high probability. This is done in the following way:

Obtain current time in a 64-bit format as specified in the NTP specification.
Obtain identifier of a system running this algorithm (EUI-64, MAC, serial number).
Concatenate the previous two and hash the concatenated result using SHA-1.
Take the low order 40 bits as a Global ID.

The prefix obtained can be used now for a site. Subnet ID can be further used for multiple subnets within a site. There are Web based implementations of the algorithm you can use to either get a feeling of the generated addresses, or to generate prefix for your concrete situation.

Occasionally you'll stumble upon so called site local addresses. Those addresses were defined starting with the initial IPv6 addressing architecture in RFC1884 and were also defined in subsequent revisions of addressing architecture (RFC2373, RFC3513) but were finally deprecated in RFC3879. Since they were defined for so long (8 years) you might stumble upon them in some legacy applications. They are recognizable by their prefix FEC0::/10. You shouldn't use them any more, but use ULA instead.

Thursday, January 31, 2013

IPV6 in enterprise best practices/white papers

From time to time I look what's going on on a Nanog mailing list. It is a very interesting mailing list in which quite often something very interesting pops up. You don't have to sign to this mailing list in order to see posts, there are publicly available archives, which might be a better option for those sporadically looking at this list. This time my eye caught a thread with the subject line as in the title of this post. So, since IPv6 is hot topic these days, or at least it seems so, I decided to read through this thread and make summary along with pointers to materials that were linked to.

The thread was started on January 26th, 2013. by Pavel Dimow who asked for a real world example of IPv6 deployment in enterprise. More specifically, he said that he thinks that the procedure to introduce IPv6 is:

Create address plan.
Implement security on routers/switches and then hosts.
Create AAAA and PTR records in DNS.
Configure DCHPv6.
Test IPv6 in LAN.
Configure BGP with ISP.

He also wondered how to maintain PTR records in case SLAAC or DHCPv6 is used and should he use DDNS for that purpose. Finally, he asked weather to use SLAAC or DHCPv6.

The general consensus of repliers was that first IPv6 connectivity to the Internet should be established. The reason is that operating systems prefer IPv6 over IPv4 and if there is AAAA record, along with localy assigned IPv6 address, then IPv6 connection will be first established. Since, if you configure Internet connectivity as a last step, there is no path to destination, timeouts will have to expire in order to detect missing IPv6 connectivity and in the end users will experience delays. This scenario actually happened in one network I used. Namely, intranet Web server was given IPv6 address to test that IPv6 worked. Since all operating systems today have IPv6 enabled by default clients on a local network tried to connect to Web server using IPv6 which wasn't possible since only a small part of intranet got IPv6 connectivity. Still, it turns out that it is possible to configure address preferences in an OS (though, I don't know which ones yet). And, there is a draft that defines how address preferences can be distributed via DHCPv6.

After obtaining addresses from ISP and making address plan the next step would be to configure network equipment, preferably not everything, something for testing. Very important is to get at least some experience with IPv6 before deploying it in a production environment. To get experience there are tunnel broker services that are free and very good. HE.net apparetnly also allows free IPv6 BGP connectivity via tunnels.

Here is more specific series of steps to introduce IPv6. This one was written by a person doing an actual deployment. Note that deployer had its own ASN:

get a /48 PI from the local LIR
configure the border routers to announce the prefix and do connectivity tests (ping Google/Facebook addresses using an IPv6 address from our own /48 - loopback on the router)
configure IPv6 addresses on internal router and do connectivity tests again
configure firewall interfaces with IPv6 addresses and again connectivity tests
configure IPv6 firewall rules (mostly a mirror of the IPv4 rulesets)
configure IPv6 address on DMZ servers (actually the first one configured were the DNS servers)
do connectivity tests again
publish IPv6 records for the DNS servers and for the domain and run ping/telnet 80 tests from another ipv6 enabled network to check that everything is OK.
publish AAAA records for all the hosts in the DMZ and making sure all the services available on IPv4 were also available on IPv6
did the same for the servers in the "Server network"
last step was to enable IPv6 on the network that served the users using RA with the stateful configuration bit set on the firewall and DHCPv6 to serve up DNS servers for IPv6

Security is very important aspect in any network, so it is in IPv6, too. Some of the IPv4 security mechanisms translate to IPv6 security, e.g. DHCP snooping, but there are some IPv6 specific things to be aware of, like RAs.

Scalability is other very important aspect of any network. There was subthread about snooping MLD, or lack of snooping. Namely, there are high density VM deployments in which even high end switches don't have enough processing/storage power. In that case, multicasting degrades to broadcasting. In one post a poster asked about some figures from real world switches, e.g. maximum number of multicast groups, but unfortunately, there was no answer.

Finally, very good source of different documentation about IPv6 deployment is Internet Society's Deploy360 pages. There are documents that describes how to develop address plan and Aaron Hughes presentation from NANOG.

Monday, November 12, 2012

Do you want to connect to IPv6 Internet in a minute or so?

Well, I just learned a very quick way to connect to IPv6 Internet that really works! That is, if you have Fedora 17, but probably for other distributions it is equally easy. Here are two commands to execute that will enable IPv6 network connectivity to you personal computer:

yum -y install gogoc
systemctl start gogoc.service

First command installs package gogoc, while the second one starts it. Next time you'll need only start command. After the start command apparently nothing will happen but in a minute or so you'll have working IPv6 connection. Check it out:

# ping6 www.google.com
PING www.google.com(muc03s02-in-x13.1e100.net) 56 data bytes
64 bytes from muc03s02-in-x13.1e100.net: icmp_seq=1 ttl=54 time=54.0 ms
64 bytes from muc03s02-in-x13.1e100.net: icmp_seq=2 ttl=54 time=55.0 ms
^C
--- www.google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 54.023/54.551/55.080/0.577 ms

As you can see, Google is reachable on IPv6 addresses. You can also try traceroute6:

# traceroute6 www.google.com
traceroute to www.google.com (2a00:1450:4016:801::1011), 30 hops max, 80 byte packets
1 2001:5c0:1400:a::722 (2001:5c0:1400:a::722) 40.097 ms 42.346 ms 45.937 ms
2 ve8.ipv6.colo-rx4.eweka.nl (2001:4de0:1000:a22::1) 47.548 ms 49.498 ms 51.760 ms
3 9-1.ipv6.r2.am.hwng.net (2001:4de0:a::1) 55.613 ms 56.808 ms 60.062 ms
4 2-1.ipv6.r3.am.hwng.net (2001:4de0:1000:34::1) 62.570 ms 65.224 ms 66.864 ms
5 1-3.ipv6.r5.am.hwng.net (2001:4de0:1000:38::2) 72.339 ms 74.596 ms 77.970 ms
6 amsix-router.google.com (2001:7f8:1::a501:5169:1) 80.598 ms 38.902 ms 39.548 ms
7 2001:4860::1:0:4b3 (2001:4860::1:0:4b3) 41.833 ms 2001:4860::1:0:8 (2001:4860::1:0:8) 46.500 ms 2001:4860::1:0:4b3 (2001:4860::1:0:4b3) 48.142 ms
8 2001:4860::8:0:2db0 (2001:4860::8:0:2db0) 51.250 ms 54.204 ms 57.569 ms
9 2001:4860::8:0:3016 (2001:4860::8:0:3016) 64.727 ms 67.339 ms 69.540 ms
10 2001:4860::1:0:336d (2001:4860::1:0:336d) 80.203 ms 82.302 ms 85.290 ms
11 2001:4860:0:1::537 (2001:4860:0:1::537) 87.769 ms 91.180 ms 92.931 ms
12 2a00:1450:8000:1f::c (2a00:1450:8000:1f::c) 61.213 ms 54.156 ms 55.931 ms

It simply can not be easier that that. Using ip command you can check address you were given:

# ip -6 addr sh
1: lo: mtu 16436
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: wlan0: mtu 1500 qlen 1000
inet6 fe80::f27b:cbff:fe9f:a33b/64 scope link
valid_lft forever preferred_lft forever
5: tun: mtu 1280 qlen 500
inet6 2001:5c0:1400:a::723/128 scope global
valid_lft forever preferred_lft forever

And also routes:

# ip -6 ro sh
2001:5c0:1400:a::722 via 2001:5c0:1400:a::722 dev tun metric 0
cache
2001:5c0:1400:a::723 dev tun proto kernel metric 256 mtu 1280
2a00:1450:4008:c01::bf via 2a00:1450:4008:c01::bf dev tun metric 0
cache
2a00:1450:400d:803::1005 via 2a00:1450:400d:803::1005 dev tun metric 0
cache
2a00:1450:4013:c00::78 via 2a00:1450:4013:c00::78 dev tun metric 0
cache
2a03:2880:2110:cf01:face:b00c:: via 2a03:2880:2110:cf01:face:b00c:: dev tun metric 0
cache
2000::/3 dev tun metric 1
unreachable fe80::/64 dev lo proto kernel metric 256 error -101
fe80::/64 dev vmnet1 proto kernel metric 256
fe80::/64 dev vmnet8 proto kernel metric 256
fe80::/64 dev wlan0 proto kernel metric 256
fe80::/64 dev tun proto kernel metric 256
default dev tun metric 1

Probably I don't have to mention that if you open Google in a Web browser you'll be using IPv6. :) In case you don't believe me, try using tcpdump (or wireshark) on tun interface.
You can stop IPv6 network by issuing the following command:

systemctl stop gogoc.service

If you try ping6 and traceroute6 commands after that, you'll receive Network unreachable messages, meaning Google servers can not be reached via their IPv6 address.

Friday, June 29, 2012

BIND and network unreachable messages...

Sometimes you'll see messages like the following ones in your log file (messages are slightly obfuscated to protect innocent :)):

Jun 29 14:32:11 someserver named[1459]: error (network unreachable) resolving 'www.eolprocess.com/A/IN': 2001:503:a83e::2:30#53
Jun 29 14:32:11 someserver named[1459]: error (network unreachable) resolving 'www.eolprocess.com/A/IN': 2001:503:231d::2:30#53

What these messages say is that network that contains address 2001:503:231d::2:30 is unreachable. So, what's happening?

The problem is that all modern operating systems support IPv6 out of the box. The same is for growing number of software packages, among them is BIND too. So, operating system configures IPv6 address on interface and application thinks that IPv6 works and configures it. But, IPv6 doesn't work outside of the local network (there is no IPv6 capable router) so, IPv6 addresses, unless in local networks, are unreachable.

So, you might ask now: but everything otherwise works, why is this case special! Well, the problem is that some DNS servers, anywhere in hierarchy, support IPv6, but not all. And when our resolver gets IPv6 address in response, it defaults to it and ignores IPv4. It obviously can not reach it so it logs a message and then tries IPv4. Once again, note that this IPv6 address can pop up anywhere in hierarchy, it isn't necessary to be on the last DNS server. In this concrete case name server for eolprocess.com doesn't support IPv6, but some name server for the top level com domain do support it!

To prevent those messages from appearing add option -4 to bind during startup. On CentOS (Fedora/RHEL) add or modify the line OPTIONS in /etc/sysconfig/named so that it includes option -4, i.e.

OPTIONS="-4"

Everything about nothing