Everything about nothing

Thursday, February 14, 2013

Hotspot JIT output disassembly on Fedora 18

I was thrilled when I saw that it is possible to output the assembly code produced by HotSpot. The problem is that this isn't enabled by default, at least not on Fedora 18. You have to compile the disassembler plugin (hsdis) before you can try it. To make things worse, the build process assumes that you don't have binutils already installed, so it tries to compile that too. In the end, I managed to get it working, and here is how.

First, you need to download the OpenJDK source. Note that there is a source package in Fedora's binary repository, but it contains only the source of the Java API packages. So, you have to download the real source, either from java.net or from the appropriate SRPM. In both cases, be careful to download the source that matches the OpenJDK version installed on your machine.

Next, unpack the source and go to the directory openjdk/hotspot/src/share/tools/hsdis. Open the hsdis.c file and replace the following line:

#include <sysdep.h>

with the following lines:

#include <string.h>
#include <errno.h>

Now, compile the source using the following command:

gcc -o hsdis-amd64.so -DLIBARCH_amd64 -DLIBARCH="amd64" \
-DLIB_EXT=".so" -m64 -fPIC -O hsdis.c -shared \
-ldl -lopcodes

The compilation will fail unless you have the binutils-devel package installed, so take care of that first. If the compilation was successful, you'll have the file hsdis-amd64.so, which is a dynamic library. Note that I'm using the 64-bit AMD/Intel architecture. If you are using the 32-bit version, replace amd64 with i386 and -m64 with -m32. For any other architecture you'll have to find out the name yourself.

Now you need a Java class to run that will produce assembly output. The main point to keep in mind is that the code has to do enough work to trigger the JIT compiler. Otherwise, you won't get any assembly output. I used the following simple class:

import java.math.BigInteger;

class Multiply
{
    public static void main(String[] args)
    {
        BigInteger a = BigInteger.ONE;

        for (int i = 0; i < 10000; i++)
            a = a.multiply(BigInteger.valueOf(2));
        System.out.println(a);
    }
}

After compiling it, run it using the following command:

LD_LIBRARY_PATH=. java -XX:+UnlockDiagnosticVMOptions \
-XX:+PrintAssembly -XX:PrintAssemblyOptions=intel \
Multiply

Note that I'm using LD_LIBRARY_PATH to tell the JVM where the disassembler (hsdis) is. In my case everything is in the current directory. Also note that in the previous command I specified that I want Intel assembly syntax; the default is AT&T.
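Whether the JIT kicks in depends on how often a piece of code runs. As a minimal sketch (the class name HotLoop, the method name and the iteration counts are my own assumptions, not anything HotSpot requires), a small method called many times in a loop is another easy way to get compiled code to look at:

```java
// Hypothetical example class; names and iteration counts are illustrative assumptions.
class HotLoop
{
    // Small method that gets called often enough to become "hot".
    static long sumOfSquares(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
            sum += (long) i * i;
        return sum;
    }

    public static void main(String[] args)
    {
        long total = 0;
        // Repeated invocations are what make the method a JIT candidate.
        for (int round = 0; round < 100000; round++)
            total += sumOfSquares(100);
        // Using the result keeps the work from being trivially discarded.
        System.out.println(total);
    }
}
```

Compile it and run it with the same options as above, replacing Multiply with HotLoop.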

Tuesday, February 5, 2013

Fun when mail server receives SERVFAIL instead of NXDOMAIN...

Ok, my log files were flooded with error messages like this one:

Feb 5 11:01:35 mail named[994]: error (host unreachable) resolving 'sbs-music.com/NS/IN': 50.56.243.69#53

In essence, the name server for this domain (50.56.243.69) is unreachable from the DNS server used by the mail server. Querying the server manually, I get:

$ host -t ns sbs-music.com
Host sbs-music.com not found: 2(SERVFAIL)

Note the status: it's SERVFAIL. As a result, the mail server thinks it is a temporary error and retries later, with the same result. Trying this on another host (one that uses another DNS server) I get:

$ host -t ns sbs-music.com
Host sbs-music.com not found: 3(NXDOMAIN)

Well, this time it tells me that there is no such domain. An error like this would tell the mail server to give up and return an error response.
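The difference between the two status codes is what decides the mail server's behavior. The following is only a rough sketch of that decision, not how any particular mail server implements it: the class and enum names are hypothetical, and treating every code other than NOERROR and NXDOMAIN as temporary is my simplifying assumption. The numeric codes are the ones printed by host above (2 for SERVFAIL, 3 for NXDOMAIN).

```java
// Hypothetical sketch; real mail servers have more nuanced logic.
class DnsOutcome
{
    enum Action { PROCEED, RETRY_LATER, BOUNCE }

    // Maps a DNS response code to what a mail server would roughly do next.
    static Action actionFor(int rcode)
    {
        switch (rcode) {
            case 0:
                return Action.PROCEED;      // NOERROR: the lookup worked
            case 3:
                return Action.BOUNCE;       // NXDOMAIN: permanent, give up
            default:
                return Action.RETRY_LATER;  // SERVFAIL (2) etc.: assumed temporary
        }
    }

    public static void main(String[] args)
    {
        System.out.println("SERVFAIL -> " + actionFor(2));
        System.out.println("NXDOMAIN -> " + actionFor(3));
    }
}
```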

So, why is there a discrepancy between the two? Using tcpdump in the first case (i.e. when we get SERVFAIL), the following requests and responses are exchanged (slightly edited for readability):

192.168.x.y.51892 > 206.72.97.238.53: 39504 [1au] NS? sbs-music.com. (42)
206.72.97.238.53 > 192.168.x.y.51892: 39504 4/0/5 NS ns4.shepherdhosting.com., NS ns1.shepherdhosting.com., NS ns2.shepherdhosting.com., NS ns3.shepherdhosting.com. (194)
192.168.x.y.10749 > 50.56.243.69.53: 40104 [1au] NS? sbs-music.com. (42)
timeout
192.168.x.y.63081 > 206.72.97.238.53: 56636 [1au] NS? sbs-music.com. (42)
206.72.97.238.53 > 192.168.x.y.63081: 56636 4/0/5 NS ns1.shepherdhosting.com., NS ns2.shepherdhosting.com., NS ns3.shepherdhosting.com., NS ns4.shepherdhosting.com. (194)
192.168.x.y.31948 > 50.56.243.69.53: 27220 [1au] NS? sbs-music.com. (42)
timeout

So, let me interpret this trace. The first query goes to IP address 206.72.97.238 and asks for the name servers of the domain sbs-music.com. Doing a reverse DNS query, we get:

# host 206.72.97.238
238.97.72.206.in-addr.arpa domain name pointer sh214.shepherdhosting.com.

So, it's some hosting provider. What we get in response is that the name servers for that domain are ns1.shepherdhosting.com through ns4.shepherdhosting.com. Our DNS server chose ns4.shepherdhosting.com, with IP address 50.56.243.69, and queried it for the sbs-music.com domain. This time the query timed out. So, our DNS server decided to query 206.72.97.238 for the name servers again. It received the same list and again queried ns4, which again didn't answer.

Let us manually try some other server. So, querying ns1.shepherdhosting.com we get:

# host -t ns sbs-music.com. 206.72.97.238
Using domain server:
Name: 206.72.97.238
Address: 206.72.97.238#53
Aliases:

sbs-music.com name server ns2.shepherdhosting.com.
sbs-music.com name server ns3.shepherdhosting.com.
sbs-music.com name server ns4.shepherdhosting.com.
sbs-music.com name server ns1.shepherdhosting.com.

This is the same response we saw in the first part of the trace. Trying ns2:

$ host -t ns sbs-music.com. 206.72.100.134
;; connection timed out; trying next origin
;; connection timed out; no servers could be reached

Well, ns2 doesn't respond. Neither does ns3 nor, as was already obvious, ns4. Here is what the trace looks like (again, slightly edited):

12:20:22.532877 IP 192.168.x.y.34364 > 206.72.100.134.53: 31539+ NS? sbs-music.com. (31)
12:20:27.532864 IP 192.168.x.y.34364 > 206.72.100.134.53: 31539+ NS? sbs-music.com. (31)
12:20:32.533445 IP 192.168.x.y.45322 > 206.72.100.134.53: 5827+ NS? sbs-music.com. (31)
12:20:52.733389 IP 206.72.100.134.53 > 192.168.x.y.34364: 31539 ServFail 0/0/0 (31)
12:20:52.733410 IP 192.168.x.y > 206.72.100.134: ICMP 192.168.x.y udp port 34364 unreachable, length 67
12:20:52.734042 IP 206.72.100.134.53 > 192.168.x.y.45322: 5827 ServFail 0/0/0 (31)
12:20:52.734053 IP 192.168.x.y > 206.72.100.134: ICMP 192.168.x.y udp port 45322 unreachable, length 67

Now, we have an interesting situation here. First, the DNS server for sbs-music.com takes a significant time to answer (about 30 seconds in this trace), and by the time it answers our local DNS isn't listening any more (hence the ICMP port unreachable messages). But in the end the local DNS correctly concludes that something is wrong with the name servers for that domain.

The final piece of the puzzle comes from querying a com name server for the sbs-music.com domain:

$ host -t ns sbs-music.com l.gtld-servers.net.
Using domain server:
Name: l.gtld-servers.net.
Address: 192.41.162.30#53
Aliases:

sbs-music.com has no NS record

Obviously, this domain was removed. It is now clear that the sbs-music.com domain existed for some time, and that the DNS server producing the error cached the name server records for it. If it were to query the com name servers again, it would receive an NXDOMAIN error and properly notify the mail server.

To see the currently cached entries of a BIND name server, use the following command:

rndc dumpdb

Then look for the file cache_dump.db in /var/named/data (or in /var/named/chroot/var/named/data if you are running BIND in a chroot). It is a text file that you can inspect with a text editor, less or something similar. In my case it contained the following lines:

; glue
sbs-music.com. 172669 NS ns1.shepherdhosting.com.
172669 NS ns2.shepherdhosting.com.
172669 NS ns3.shepherdhosting.com.
172669 NS ns4.shepherdhosting.com.
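The dump can get large, so it may be handy to pull out the NS records for a single name programmatically. The following is a minimal sketch under my own assumptions about the line layout (owner name, TTL, optional class IN, record type, data, with the owner name omitted on continuation lines, as in the excerpt above); the class and method names are hypothetical and this is not a full zone file parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes the simple line layout seen in the excerpt.
class CacheDumpScan
{
    // Returns the NS targets cached for the given owner name (with trailing dot).
    static List<String> nsRecordsFor(String owner, List<String> lines)
    {
        List<String> result = new ArrayList<String>();
        String currentOwner = null;
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines, comments and directives such as $DATE.
            if (trimmed.isEmpty() || trimmed.startsWith(";") || trimmed.startsWith("$"))
                continue;
            String[] f = trimmed.split("\\s+");
            int i = 0;
            // A line that starts with a TTL continues the previous owner name.
            if (!f[0].matches("\\d+"))
                currentOwner = f[i++];
            if (i < f.length && f[i].matches("\\d+"))
                i++;                                    // skip the TTL
            if (i < f.length && f[i].equals("IN"))
                i++;                                    // skip the optional class
            if (i + 1 < f.length && f[i].equals("NS") && owner.equalsIgnoreCase(currentOwner))
                result.add(f[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) throws IOException
    {
        // args[0]: path to cache_dump.db, args[1]: owner name, e.g. sbs-music.com.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (String ns : nsRecordsFor(args[1], lines))
            System.out.println(ns);
    }
}
```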

To flush a single entry, use the following command:

rndc flushname sbs-music.com internal

This removes sbs-music.com from the cache of the internal view (I configured views so that the server behaves differently depending on who is asking). Yet, this didn't help. Then I tried to flush everything in the internal view using:

rndc flush internal

This helped, but didn't actually solve the problem. Namely, looking at the packet trace, it turns out that BIND receives a response from b.gtld-servers.net. saying that the given domain doesn't exist, and yet from somewhere it pulls the old IP address 206.72.97.238?!

So, I finally decided to look into the log, and it contains a lot of messages like this:

error (FORMERR) resolving 'sbs-music.com/NS/IN': 206.72.97.238#53

along with the one that triggered all this. Ok, I also tried upgrading BIND, but no luck, still SERVFAIL errors.

Now, it's time for the heavy artillery: Wireshark. So I saved the packet trace and loaded it into Wireshark. Guess what! Wireshark crashed on the requests sent by the local DNS server!?

Ok, after a bit more fiddling I realised that some com name servers do know about this domain. But note the difference between the output of the host command (which I used previously) and the nslookup command:

$ nslookup -type=ns sbs-music.com 192.5.6.30
Server: 192.5.6.30
Address: 192.5.6.30#53

Non-authoritative answer:
*** Can't find sbs-music.com: No answer

Authoritative answers can be found from:
sbs-music.com nameserver = ns1.shepherdhosting.com.
sbs-music.com nameserver = ns2.shepherdhosting.com.
sbs-music.com nameserver = ns3.shepherdhosting.com.
sbs-music.com nameserver = ns4.shepherdhosting.com.
ns1.shepherdhosting.com internet address = 206.72.97.238
ns2.shepherdhosting.com internet address = 206.72.100.134
ns3.shepherdhosting.com internet address = 206.72.97.237
ns4.shepherdhosting.com internet address = 50.56.243.69

Ok, if this com server tells me that there is no such domain, why is it then pointing me to those name servers? The dig command is a bit more informative:

$ dig @192.5.6.30 sbs-music.com

; <<>> DiG 9.9.2-P1-RedHat-9.9.2-6.P1.fc18 <<>> @192.5.6.30 sbs-music.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64382
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 4, ADDITIONAL: 5
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sbs-music.com. IN A

;; AUTHORITY SECTION:
sbs-music.com. 172800 IN NS ns1.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns2.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns3.shepherdhosting.com.
sbs-music.com. 172800 IN NS ns4.shepherdhosting.com.

;; ADDITIONAL SECTION:
ns1.shepherdhosting.com. 172800 IN A 206.72.97.238
ns2.shepherdhosting.com. 172800 IN A 206.72.100.134
ns3.shepherdhosting.com. 172800 IN A 206.72.97.237
ns4.shepherdhosting.com. 172800 IN A 50.56.243.69
;; Query time: 149 msec
;; SERVER: 192.5.6.30#53(192.5.6.30)
;; WHEN: Tue Feb 5 13:49:22 2013
;; MSG SIZE rcvd: 194

What a mess?!

Then, I googled a bit to find out why BIND is returning SERVFAIL instead of NXDOMAIN. This is something interesting that I found:

BIND could have returned SERVFAIL instead of NXDOMAIN responses for nonexistent resource records from the unsigned child zone if the parent zone was signed. (BZ#643012)

Trying to look up that bug in Red Hat's Bugzilla gives me a big red box telling me that I'm not allowed to see it (despite being logged in), so it's obviously some security issue!?

Looking at how different BIND versions behave, I get the following results:

bind-9.3.6-20.P1.el5_8.6 and bind-9.9.2-6.P1.fc18.x86_64 return NXDOMAIN.
bind-9.8.2-0.10.rc1.el6_3.6.x86_64 and bind-9.8.2-0.10.rc1.el6_3.5.x86_64 return SERVFAIL.

Could it be somehow related to DNSSEC?

Ok, let me conclude. The problem is that the name servers for the domain sbs-music.com aren't correctly configured, while the domain itself is still registered with the com name servers. This triggers different behavior in different BIND versions. So, there are two possibilities from here:

Somehow persuade BIND to return NXDOMAIN instead of SERVFAIL.
Find what is causing queries for this domain in the first place.

Stay tuned... :)

Thursday, January 31, 2013

IPV6 in enterprise best practices/white papers

From time to time I look at what's going on on the NANOG mailing list. It is a very interesting list on which something interesting pops up quite often. You don't have to subscribe to the list in order to read the posts; there are publicly available archives, which might be a better option for those who look at the list only sporadically. This time a thread with the same subject line as the title of this post caught my eye. Since IPv6 is a hot topic these days, or at least it seems so, I decided to read through the thread and write a summary along with pointers to the materials that were linked.

The thread was started on January 26th, 2013 by Pavel Dimow, who asked for a real-world example of IPv6 deployment in an enterprise. More specifically, he said that he thinks the procedure for introducing IPv6 is:

Create address plan.
Implement security on routers/switches and then hosts.
Create AAAA and PTR records in DNS.
Configure DHCPv6.
Test IPv6 in LAN.
Configure BGP with ISP.

He also wondered how to maintain PTR records when SLAAC or DHCPv6 is used, and whether he should use DDNS for that purpose. Finally, he asked whether to use SLAAC or DHCPv6.
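Part of what makes PTR records for IPv6 awkward to maintain by hand is the shape of the reverse names: every one of the 32 hex digits of the address becomes its own label under ip6.arpa, in reverse order. A minimal sketch of that conversion follows; the class and method names are hypothetical, and the address used is from the 2001:db8::/32 documentation range.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper that builds the ip6.arpa owner name for a PTR record.
class ReverseName
{
    // Expects an IPv6 literal such as "2001:db8::1".
    static String ip6Arpa(String address)
    {
        byte[] addr;
        try {
            addr = InetAddress.getByName(address).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + address, e);
        }
        if (addr.length != 16)
            throw new IllegalArgumentException("not an IPv6 address: " + address);

        StringBuilder sb = new StringBuilder();
        // Walk the bytes from last to first, low nibble before high nibble.
        for (int i = addr.length - 1; i >= 0; i--) {
            int b = addr[i] & 0xFF;
            sb.append(Integer.toHexString(b & 0x0F)).append('.');
            sb.append(Integer.toHexString(b >> 4)).append('.');
        }
        return sb.append("ip6.arpa.").toString();
    }

    public static void main(String[] args)
    {
        System.out.println(ip6Arpa("2001:db8::1"));
    }
}
```

With SLAAC the low 64 bits are chosen by the hosts themselves, which is why the question of generating such names automatically (e.g. via DDNS) comes up at all.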

The general consensus among the repliers was that IPv6 connectivity to the Internet should be established first. The reason is that operating systems prefer IPv6 over IPv4: if a destination has an AAAA record and the host has a locally assigned IPv6 address, the host tries an IPv6 connection first. If Internet connectivity is configured as the last step, there is no path to the destination, so timeouts have to expire before the missing IPv6 connectivity is detected, and users experience delays. This scenario actually happened in one network I used. An intranet Web server was given an IPv6 address to test that IPv6 worked. Since all operating systems today have IPv6 enabled by default, clients on the local network tried to connect to the Web server using IPv6, which wasn't possible since only a small part of the intranet had IPv6 connectivity. Still, it turns out that it is possible to configure address preferences in an OS (though I don't know in which ones yet). There is also a draft that defines how address preferences can be distributed via DHCPv6.
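To see where the delay comes from, here is a rough sketch of a client that tries IPv6 addresses first and falls back to IPv4 only after each IPv6 attempt has failed or timed out. It is an illustration under my own assumptions (class and method names are hypothetical, and the caller picks the timeout), not how any particular operating system orders or tries its addresses.

```java
import java.io.IOException;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "prefer IPv6, fall back to IPv4".
class FallbackConnect
{
    // IPv6 addresses first, then the rest, keeping the order within each family.
    static List<InetAddress> ipv6First(InetAddress[] all)
    {
        List<InetAddress> ordered = new ArrayList<InetAddress>();
        for (InetAddress a : all)
            if (a instanceof Inet6Address)
                ordered.add(a);
        for (InetAddress a : all)
            if (!(a instanceof Inet6Address))
                ordered.add(a);
        return ordered;
    }

    // Tries every address in turn; each dead IPv6 path costs up to timeoutMs.
    static Socket connect(String host, int port, int timeoutMs) throws IOException
    {
        IOException last = null;
        for (InetAddress a : ipv6First(InetAddress.getAllByName(host))) {
            Socket s = new Socket();
            try {
                s.connect(new InetSocketAddress(a, port), timeoutMs);
                return s;
            } catch (IOException e) {
                last = e;
                s.close();
            }
        }
        throw last != null ? last : new IOException("no addresses for " + host);
    }
}
```

With a long timeout and no working IPv6 path, every connection to a host that has an AAAA record pays that timeout before IPv4 is even tried, which is the delay users notice.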

After obtaining addresses from the ISP and making an address plan, the next step would be to configure network equipment, preferably not everything at once, just something for testing. It is very important to get at least some experience with IPv6 before deploying it in a production environment. To get that experience, there are tunnel broker services that are free and very good. HE.net apparently also allows free IPv6 BGP connectivity via tunnels.

Here is a more specific series of steps for introducing IPv6. This one was written by a person doing an actual deployment. Note that the deployer had their own ASN:

get a /48 PI from the local LIR
configure the border routers to announce the prefix and do connectivity tests (ping Google/Facebook addresses using an IPv6 address from our own /48 - loopback on the router)
configure IPv6 addresses on internal router and do connectivity tests again
configure firewall interfaces with IPv6 addresses and again connectivity tests
configure IPv6 firewall rules (mostly a mirror of the IPv4 rulesets)
configure IPv6 address on DMZ servers (actually the first one configured were the DNS servers)
do connectivity tests again
publish IPv6 records for the DNS servers and for the domain and run ping/telnet 80 tests from another ipv6 enabled network to check that everything is OK.
publish AAAA records for all the hosts in the DMZ and making sure all the services available on IPv4 were also available on IPv6
did the same for the servers in the "Server network"
last step was to enable IPv6 on the network that served the users using RA with the stateful configuration bit set on the firewall and DHCPv6 to serve up DNS servers for IPv6
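The steps above start from a /48, which leaves 16 bits for numbering /64 subnets, and that is the core of an address plan. As a small sketch (the class and method names are hypothetical, and the prefix is from the 2001:db8::/32 documentation range rather than anyone's real allocation), the /64 with a given subnet number can be computed like this:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for carving /64 subnets out of a /48.
class AddressPlan
{
    // prefix48 is an IPv6 literal such as "2001:db8:1234::", subnetId is 0..65535.
    static String subnet64(String prefix48, int subnetId)
    {
        if (subnetId < 0 || subnetId > 0xFFFF)
            throw new IllegalArgumentException("subnet id must fit in 16 bits");
        byte[] addr;
        try {
            addr = InetAddress.getByName(prefix48).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + prefix48, e);
        }
        if (addr.length != 16)
            throw new IllegalArgumentException("not an IPv6 address: " + prefix48);

        // Bytes 6 and 7 hold the 16 bits between the /48 and the /64 boundary.
        addr[6] = (byte) (subnetId >> 8);
        addr[7] = (byte) subnetId;

        // Print the first four 16-bit groups; the rest of a /64 prefix is zero.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i += 2) {
            if (i > 0)
                sb.append(':');
            sb.append(Integer.toHexString(((addr[i] & 0xFF) << 8) | (addr[i + 1] & 0xFF)));
        }
        return sb.append("::/64").toString();
    }

    public static void main(String[] args)
    {
        // For example: one /64 each for the first few network segments.
        for (int id = 0; id < 4; id++)
            System.out.println(subnet64("2001:db8:1234::", id));
    }
}
```

Which subnet numbers go to the DMZ, the server network and the user networks is a planning decision; the code only shows the arithmetic.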

Security is a very important aspect of any network, and IPv6 is no exception. Some IPv4 security mechanisms translate to IPv6, e.g. DHCP snooping, but there are also some IPv6-specific things to be aware of, like RAs.

Scalability is another very important aspect of any network. There was a subthread about MLD snooping, or the lack of it. Namely, there are high-density VM deployments in which even high-end switches don't have enough processing power or storage. In that case, multicast degrades to broadcast. One poster asked for figures from real-world switches, e.g. the maximum number of multicast groups, but unfortunately there was no answer.

Finally, a very good source of documentation about IPv6 deployment is the Internet Society's Deploy360 pages. There are documents that describe how to develop an address plan, as well as Aaron Hughes's presentation from NANOG.