Thursday, August 29, 2013

Problem with automatic VMWare upgrade from 9.0.1 to 9.0.2

VMWare Workstation checks upon startup if there is a new version of the software, and if there is, it asks you whether you want to upgrade. In case automatic checking is disabled, there is an option under the Help menu that allows manual triggering of that process. For some time now I was getting a notification about a free upgrade from 9.0.1 to 9.0.2, which I would accept, but the upgrade never finished, for an unknown reason. Instead of trying to figure out what was wrong, I decided to do it manually. In the end, it turned out to be a semi-manual process. Namely, when I received an error message about being unable to upgrade, I looked at where the downloaded files are kept. It turned out that they are stored in the /tmp directory, but under UUID-like names, i.e.:
$ ls -l
total 381768
-rw-r--r--.  1 sgros zemris  66211840 Kol 29 13:28 06dd4484-5c02-4c5d-992c-ff705703e6cb
-rw-r--r--.  1 sgros zemris  11253760 Kol 29 13:29 1cf5e724-f7d4-4f24-b484-30ebb16d593e
-rw-r--r--.  1 sgros zemris  61777920 Kol 29 13:29 2007e14d-3593-416f-b60e-08c4cd18693a
-rw-r--r--.  1 sgros zemris 232693760 Kol 29 13:31 364c998b-8b3b-4cfd-a2dc-67352a3eb082
-rw-r--r--.  1 sgros zemris  13096960 Kol 29 13:31 4b7424a6-e114-4832-be21-f0a3acf8c24b
-rw-r--r--.  1 sgros zemris     81920 Kol 29 13:31 8a8d105f-fd3d-404a-afc9-28411d6566fe
-rw-r--r--.  1 sgros zemris   5795840 Kol 29 13:31 d969898b-30bb-4c6d-bf45-5a7d52918359
Now, using the file command it was easy to determine that they are actually tar archives. So, I created a new directory and unpacked all those files there.
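If you want to repeat this, here is a minimal sketch of those steps (the target directory name is arbitrary, and the glob simply matches the UUID-like file names):
$ cd /tmp
$ file 06dd4484-5c02-4c5d-992c-ff705703e6cb    # should report something like "POSIX tar archive"
$ mkdir vmware-upgrade && cd vmware-upgrade
$ for f in /tmp/*-*-*-*-*; do tar -xf "$f"; done
What I've got was one file with the extension bundle and several files with the extension component: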
# ls -l
total 390904
-rw-r--r--. 1   201    201      1161 Vel 26  2013 descriptor.xml
-rw-r--r--. 1   201    201  15207519 Vel 26  2013 vmware-tools-freebsd-9.2.3-1031769.x86_64.component
-rw-r--r--. 1   201    201  66202162 Vel 26  2013 vmware-tools-linux-9.2.3-1031769.x86_64.component
-rw-r--r--. 1   201    201     76615 Vel 26  2013 vmware-tools-netware-9.2.3-1031769.x86_64.component
-rw-r--r--. 1   201    201  13088243 Vel 26  2013 vmware-tools-solaris-9.2.3-1031769.x86_64.component
-rw-r--r--. 1   201    201  61771010 Vel 26  2013 vmware-tools-windows-9.2.3-1031769.x86_64.component
-rw-r--r--. 1   201    201  11247429 Vel 26  2013 vmware-tools-winPre2k-9.2.3-1031769.x86_64.component
-rwxr-xr-x. 1 sgros zemris 232680125 Vel 26  2013 VMware-Workstation-9.0.2-1031769.x86_64.bundle
The bundle file is actually an installer for the new version of VMWare, so I ran it as the root user and it installed the new version. The component files, on the other hand, have to be installed using the vmware-installer tool, e.g. to install the vmware-tools-windows-9.2.3-1031769.x86_64.component file, execute the following command as the root user:
vmware-installer --install-component=vmware-tools-windows-9.2.3-1031769.x86_64.component
The same command has to be repeated for the other files too. But note that those files are optional, depending on what you're running. You can check which components you have installed using the vmware-installer command with the -t option, i.e.:
vmware-installer -t
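If you want to install all the component files in one go, a simple loop will do (a sketch; run it as root from the directory where the components were unpacked):
# for c in *.component; do vmware-installer --install-component="$c"; done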
And that was it. Of course, before being able to run VMWare I had to patch it again, but that was all.

Tuesday, August 27, 2013

VMWare Workstation and kernel 3.10 (again)

A change in the kernel version has again broken VMWare Workstation. It would definitely be best for VMWare to integrate their modules into the kernel; that way the maintenance of those modules would be bound to the kernel and a lot of people would have a lot less annoyance. But that's not the case, and so such problems happen. In this case, the solution is again relatively easy. Just run the following commands, as is, and everything should work:
cd /tmp
curl -O http://pkgbuild.com/git/aur-mirror.git/plain/vmware-patch/vmblock-9.0.2-5.0.2-3.10.patch
curl -O http://pkgbuild.com/git/aur-mirror.git/plain/vmware-patch/vmnet-9.0.2-5.0.2-3.10.patch
cd /usr/lib/vmware/modules/source
tar -xvf vmblock.tar
tar -xvf vmnet.tar
patch -p0 -i /tmp/vmblock-9.0.2-5.0.2-3.10.patch
patch -p0 -i /tmp/vmnet-9.0.2-5.0.2-3.10.patch
tar -cf vmblock.tar vmblock-only
tar -cf vmnet.tar vmnet-only
rm -rf vmblock-only
rm -rf vmnet-only
vmware-modconfig --console --install-all
The commands were taken from this link. I tried this with VMWare Workstation 9.0.1 and kernel 3.10.9 and it worked flawlessly.
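As a quick sanity check (not part of the original instructions), you can verify that the rebuilt modules actually loaded:
$ lsmod | grep -E 'vmnet|vmblock|vmmon'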

Saturday, August 10, 2013

Getting Libreswan to connect to Cisco ASA 5500

Here are some notes about problems I had while trying to make Libreswan connect to a Cisco ASA. Note that I'm using certificate-based authentication in a roadwarrior configuration. The setup is done on Fedora 19.

The procedure, at least in theory, is quite simple: edit the /etc/ipsec.conf file, modify the /etc/ipsec.d/ipsec.secrets file, import certificates into the NSS database and then start the Libreswan daemon. Finally, activate the connection. Well, it turned out that theory is very far from practice. So, here we go.

Preparation steps

First, here is the ipsec.conf file I used when I started to test the connection to the ASA:
version 2.0     # conforms to second version of ipsec.conf specification
# basic configuration
config setup
    nat_traversal=yes
    nhelpers=1
    protostack=netkey
    interfaces=%defaultroute

conn VPN
    # Left side is RoadWarrior
    left=%defaultroute
    leftrsasigkey=%cert
    leftcert=SGROS
    leftca=ROOTCA
    leftid=%fromcert
    leftsendcert=always
    # Right side is Cisco
    right=1.1.1.1 # IP address of Cisco VPN
    rightrsasigkey=%cert
    rightcert=CISCO
    rightca=%same
    rightid=%fromcert
    # config
    type=tunnel
    keyingtries=2
    disablearrivalcheck=no
    authby=rsasig
    auth=esp
    keyexchange=ike
    auto=route
    remote_peer_type=cisco
    pfs=no
Note a few things about this configuration. First, my client machine (the roadwarrior) is the left node, while the Cisco is the right one. Next, I didn't arrive at this configuration immediately; I had to experiment with the values of the interfaces and left statements. The reason is that I'm assigned a dynamic, NATed address, so those settings cause Libreswan to automatically select the appropriate values (interface and IP address) at the time I'm connecting to the VPN. The certificate-related settings also took me some time to figure out.

Into the ipsec.secrets file I added the following line:
: RSA SGROS
this will cause Libreswan to use an RSA key for authentication. Finally, I had to import the certificates and keys into the NSS database. Note that the NSS database is already precreated in the /etc/ipsec.d directory. More specifically, the database consists of the files cert8.db, key3.db and secmod.db. To see the imported certificates (if there are any), use the following command:
# certutil -L -d /etc/ipsec.d/
Certificate Nickname               Trust Attributes
                                   SSL,S/MIME,JAR/XPI
ROOTCA                             CT,C,C
SGROS                              u,u,u
CISCO                              P,P,P
In my case there are three certificates in the database: mine (SGROS), the VPN's (CISCO) and the CA that signed them (ROOTCA). Note that I'm referencing those certificates in the ipsec.conf file. If you are configuring this database for the first time, it will be empty and you'll have to import all the certificates.
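As a side note, if the database happens to be missing, it can be created from scratch with certutil as well (this wasn't necessary in my case; you'll be asked for a database password):
# certutil -N -d /etc/ipsec.d/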

To import the CA into your database, use the following command:
certutil -A -i rootca.pem -n ROOTCA -t "TC,TC,TC" -d /etc/ipsec.d/
Note that I'm assuming you have the PEM version of the certificate stored in the current directory (the argument to the -i option). For the rest of the options and their meaning, please consult the man page.
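If you only have the certificate in DER (binary) form, it can be converted to PEM first with openssl (the file names here are hypothetical):
$ openssl x509 -inform DER -in rootca.der -out rootca.pem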

To import your certificate, with the private key, use pk12util, since certutil can not import PKCS#12 files:
# pk12util -i certkey.pfx -d /etc/ipsec.d/
Note that this time the certificate and the private key are stored in a PKCS#12 file (named certkey.pfx); you'll be prompted for the file's password.
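To check that the private key really ended up in the database, you can list the keys (certutil will ask for the database password, if one is set):
# certutil -K -d /etc/ipsec.d/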

Finally, to import the certificate of the Cisco ASA (here assumed to be stored in cisco.pem), use the following command:
certutil -A -i cisco.pem -n CISCO -t "P,P,P" -d /etc/ipsec.d/
Note that the command is very similar to the one used to import ROOTCA, but the trust attributes (option -t) are different. Namely, you don't want the ASA's certificate to act as a CA, i.e. to be able to issue new certificates.

Starting and debugging

To start the Libreswan daemon, I used the following command:
ipsec pluto --stderrlog --config /etc/ipsec.conf --nofork
That way I forced it not to go into the background (--nofork) and to log to stderr (--stderrlog). Then, in another terminal, I would trigger VPN establishment using the following command:
ipsec auto --up VPN
The first problem was that Libreswan said it could not determine which end of the connection it is, i.e. I was receiving the following error message:
022 "VPN": We cannot identify ourselves with either end of this connection.
That error message took me some time to resolve. I tried everything possible to let Libreswan know whether it is the left or the right side of the configuration, which included changing roles several times, changing different parameters and other stuff. In the end, it turned out that this had nothing to do with the configuration file; the problem actually was a missing kernel module!? NETKEY wasn't loaded, and Libreswan couldn't access the IPsec stack within the kernel. To be honest, it could be inferred from the log by noting the following lines:
No Kernel XFRM/NETKEY interface detected
No Kernel KLIPS interface detected
No Kernel MASTKLIPS interface detected
Using 'no_kernel' interface code on 3.10.4-300.fc19.x86_64
But then again, a more informative error message would help a lot! In the end, the following command solved the problem:
modprobe af_key
and then, in the logs I saw the following line:
Using Linux XFRM/NETKEY IPsec interface code on 3.10.4-300.fc19.x86_64
which confirmed that NETKEY was now accessible, and the error message about not knowing which end of the connection it is disappeared.
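To avoid repeating this after every reboot, the module can be listed for automatic loading at boot time; on a systemd-based Fedora, something like this should work:
# echo af_key > /etc/modules-load.d/af_key.conf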

Next, I had a problem with the peer's public key. The error message I received was:
003 "VPN" #1: no RSA public key known for 'C=HR, L=Zagreb, O=Some ORG, CN=Cisco ASA'
Again, I lost a lot of time trying to figure out why it could not access the public key even though it is in the certificate. I also tried to extract the public key and write it directly into the configuration file. Nothing helped. Then, when I turned on debugging of x509 in ipsec.conf, I found some suspicious messages, like the following ones:
added connection description "VPN"
| processing connection VPN
|   trusted_ca called with a=CN=ROOTCA b=
|   trusted_ca returning with failed
|   trusted_ca called with a=CN=ROOTCA b=\001\200\255\373
|   trusted_ca returning with failed
Note the garbage passed as the second argument of the function trusted_ca!? Googling around for something about this didn't reveal anything useful. But then, out of desperation, I tried removing the leftca and rightca parameters from ipsec.conf, and guess what: everything started to work. Checking the logging output again, I saw that the b parameter now has the same value as a.
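For reference, the x509 debugging mentioned above is controlled by the plutodebug statement in the config setup section; something along these lines is what I mean (check the ipsec.conf man page for the debug classes your version supports):
config setup
    plutodebug="x509"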

Yet, it still didn't work, and after some tinkering I suspected that XAuth was enabled and required on the Cisco side. I concluded this from the log output, where Libreswan shows what it received from Cisco:
"VPN" #1: received Vendor ID payload [Cisco-Unity]
"VPN" #1: received Vendor ID payload [XAUTH]
"VPN" #1: ignoring unknown Vendor ID payload [...]
"VPN" #1: ignoring Vendor ID payload [Cisco VPN 3000 Series]
At first, I thought that Libreswan would support XAuth out of the box, but obviously, if it is not configured, Libreswan can not use it. Also, the manual page says that XAuth is disabled by default. So, after adding the following statements into the ipsec.conf file:
leftxauthclient=yes
leftxauthusername=sgros
and adding the appropriate line into the ipsec.secrets file:
@sgros : XAUTH "mypassword"
I managed to get further. Yet, it still didn't work. Looking again at the log output, I realised that something was wrong with the client configuration. I also got segfaults there, which I didn't report upstream for the simple fear that I might send some secret information. But, after adding the following statements into ipsec.conf, the segmentation fault was gone:
modecfgpull=yes
leftmodecfgclient=yes
In the logging output of Libreswan I saw that the configuration parameters were properly obtained, i.e.:
"VPN" #1: modecfg: Sending IP request (MODECFG_I1)
"VPN" #1: received mode cfg reply
"VPN" #1: setting client address to 192.168.2.33/32
"VPN" #1: setting ip source address to 192.168.2.33/32
"VPN" #1: Received subnet 192.168.0.0/16, maskbits 16
"VPN" #1: transition from state STATE_MODE_CFG_I1 to state STATE_MAIN_I4
Now it seemed like everything was connected, but ICMP probes were not going through. Using the setkey command I checked that the policies and associations were correctly installed into the kernel, which they were. I quickly realised that the problem was that Libreswan didn't assign an IP address to my local interface, nor did it install any routes. That was easy to check by just listing the interface's IP addresses. To see if this really was the problem, I manually assigned the address and route:
# ip addr add 192.168.2.33/32 dev wlp3s0
# ip ro add 192.168.0.0/22 src 192.168.2.33 via 192.168.1.1
and after that I was able to reach addresses within the destination network. Note that the IP address given as the argument to the via keyword (in my case 192.168.1.1) isn't important, since XFRM will change it anyway. So, the question was why this address wasn't added in the first place.

After some poking around I found that the script /usr/libexec/ipsec/_updown.netkey is called to set up all the parameters, and, looking into that script, I found that it didn't do anything when pluto calls it with the up-client parameter! So, no wonder nothing happened. I also found a post on the Internet about that problem. The fix is simple, as shown in the post I linked, but it messes something up with the routes. After some further investigation I discovered that, when adding the locally assigned IP address, the script messes up the netmask. To cut a long story short, I changed the following line:
-it="ip addr add ${PLUTO_MY_SOURCEIP%/*}/${PLUTO_PEER_CLIENT##*/} dev ${PLUTO_INTERFACE%:*}"
+it="ip addr add ${PLUTO_MY_CLIENT} dev ${PLUTO_INTERFACE%:*}"
and I also changed the corresponding line for IP address removal:
-it="ip addr del ${PLUTO_MY_SOURCEIP%/*}/${PLUTO_PEER_CLIENT##*/} dev ${PLUTO_INTERFACE%:*}"
+it="ip addr del ${PLUTO_MY_CLIENT} dev ${PLUTO_INTERFACE%:*}"
You can get the complete patch here.
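After patching the script and re-establishing the connection, it is easy to verify that the address and the route are now installed automatically (the interface name and addresses shown earlier are from my setup):
# ip addr show dev wlp3s0
# ip route show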

Some random notes

It might happen that Libreswan suddenly stops working and that you can not access the network, i.e. you can only ping your local address, but not the local router. In that case, try to clear the XFRM policies using the setkey command:
setkey -FP
You can also check if there is anything left with:
setkey -DP

Monday, August 5, 2013

TCP client self connect...

This is so cool and unexpected, yet nothing out of spec, that I had to reblog it. Namely, if you run the following snippet of Bourne shell code:
while true
do
   telnet 127.0.0.1 50000
done
you'll constantly receive the message 'Connection refused', but at some point the connection will be established and whatever you type will be echoed back:
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
test1
test1
test2
test2
Note that you didn't start any server and there is no process listening on port 50000 on localhost, and yet it connected! Looking at the output of the netstat command, we see that there really is an established connection:
$ netstat -tn | grep 50000
tcp    0   0 127.0.0.1:50000  127.0.0.1:50000  ESTABLISHED
and, if we monitor the traffic using tcpdump, we observe a three-way handshake:
21:31:02.327307 IP 127.0.0.1.50000 > 127.0.0.1.50000: Flags [S], seq 2707282816, win 43690, options [mss 65495,sackOK,TS val 41197287 ecr 0,nop,wscale 7], length 0
21:31:02.327318 IP 127.0.0.1.50000 > 127.0.0.1.50000: Flags [S.], seq 2707282816, ack 2707282817, win 43690, options [mss 65495,sackOK,TS val 41197287 ecr 41197287,nop,wscale 7], length 0
21:31:02.327324 IP 127.0.0.1.50000 > 127.0.0.1.50000: Flags [.], ack 1, win 342, options [nop,nop,TS val 41197287 ecr 41197287], length 0
What happened? In short, the client connected to itself. :) A somewhat longer explanation follows...

Let's start with the fact that when a client (in this case the telnet application) creates a socket and tries to connect to a server, the kernel assigns it a random source port number. This is because each TCP connection is uniquely identified by a 4-tuple:
(source IP, source port, destination IP, destination port)
Of those, three parameters are predetermined, i.e. source IP, destination IP and destination port; what's left is the source port, which has to be somehow arbitrarily assigned, and usually applications leave that to the kernel, which takes it from the range of ephemeral ports. Applications can choose a source port using the bind(2) system call, but that is very rarely done. Now, in what range do these ephemeral ports live? They are high ports, and you can take a look into the /proc file system to see the specific values for your Linux machine, e.g.:
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768 61000
In this case, ephemeral ports are taken from the range 32768 to 61000.
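As an aside, if you want to reproduce the self connect quickly, you can temporarily shrink this range so that the colliding port comes up much sooner (as root; note the original values first and restore them afterwards):
# cat /proc/sys/net/ipv4/ip_local_port_range
# echo "50000 50100" > /proc/sys/net/ipv4/ip_local_port_range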

Now, back to our example with the telnet application. When telnet is started, the kernel selects some free port from the given range of ephemeral ports and tries to connect to localhost (destination IP 127.0.0.1), port 50000. Since usually no process listens on ephemeral ports, an RST response is sent back and the telnet client prints the error message Connection refused. This exchange can be seen on the network by using the tcpdump tool:
# tcpdump -nni lo port 50000
21:31:02.326447 IP 127.0.0.1.49999 > 127.0.0.1.50000: Flags [S], seq 1951433652, win 43690, options [mss 65495,sackOK,TS val 41197286 ecr 0,nop,wscale 7], length 0
21:31:02.326455 IP 127.0.0.1.50000 > 127.0.0.1.49999: Flags [R.], seq 0, ack 387395547, win 0, length 0
It is interesting to note that Linux chooses ephemeral ports sequentially, not randomly. This allows easy guessing of the ports and might be a security problem, but further investigation is necessary to confirm this.

Anyway, during the many unsuccessful connection attempts, at one iteration the telnet client is assigned source port 50000, and so the SYN request is sent to port 50000, i.e. to itself. So, it establishes a connection with itself! This is actually fully according to the TCP specification, which supports a feature called simultaneous open, illustrated in Figure 8 of RFC793 (note, there is an errata to this example in RFC1122).
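By the way, you don't have to wait for the loop to hit the right port; the self connect can be triggered deterministically by pinning the source port, e.g. with a netcat variant whose -p option sets the source port:
$ nc -p 50000 127.0.0.1 50000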

Yet, the example from RFC793 assumes that there are two independent endpoints trying to connect at the same time, while in our case there is only one side, so there is a small deviation from the prescribed behavior. Let's take a look, following the TCP state machine (see, for example, the state diagram on the Wikipedia page about TCP).

When the telnet client starts, source port 50000 is assigned and a state machine is instantiated, which is immediately initialized into the CLOSED state. Then telnet tries to connect to a server, which means a SYN is sent and the TCP state machine of the source port goes into the SYN SENT state. Now, this same source port, i.e. the same state machine, receives this SYN and because of it goes into the SYN RECEIVED state (the transition marked with SYN/SYN+ACK in the diagram). While transiting to the new state, a SYN+ACK is emitted, which is again received by the state machine. Now we come to a bit of a mystery, namely: how does the state machine transition to the ESTABLISHED state, and when is an ACK emitted to finish the three-way handshake?

To answer that, we'll have to dig a bit into the kernel's source code. First, note that there is an explicit case for self connect, which is also commented. This case is triggered in the TCP_SYN_SENT state. Then the socket is placed into the TCP_SYN_RECV state and a SYN+ACK is sent back. This SYN+ACK is immediately looped back and processed in the function tcp_rcv_state_process(), which calls the function tcp_validate_incoming(). That function, finally, after a few checks, calls the function tcp_send_challenge_ack() that sends the ACK. The state of the TCP connection (i.e. the socket) is changed to ESTABLISHED in the function tcp_rcv_state_process(), within the part that processes the ACK flag. And that concludes the description of what actually happens and what is seen on the network.

The self connect scenario described in this post is quite specific and requires specific preconditions. First, obviously, you need to (ab)use ephemeral ports for listening servers, so that your clients try to connect to ephemeral ports. Next, the client and the server have to run on the same IP address, otherwise the client will not be able to self connect. Finally, this can only happen during the initial handshake phase; if you find some client already using an ephemeral port and try to connect to it, you'll be refused. So, the conclusion is: don't use ephemeral ports for servers! Otherwise, you risk very interesting behavior that is nondeterministic and hard to debug.
