Showing posts with label linux.

Wednesday, September 1, 2021

Getting a unique Docker container name from within the container

I had a use case in which I needed to obtain a unique identifier of a container from within the container itself. The problem is that the container is started with docker-compose using the scale option, so a number of identical containers are started. Googling around gave me some options, like examining /proc/self/cgroup, but in my case this file didn't contain anything useful. Another solution was to pass an environment variable from the outside, which also looked like a mess. In the end, I came up with a not-so-elegant solution, but one that works. I use Python, but the concept could be applied in other programming languages as well.

First, the HOSTNAME environment variable contains the image name, e.g. crawler. So, you can use the following Python code to obtain this name:

import os
hostname = os.environ['HOSTNAME']

The variable hostname will contain crawler in our example. Next, resolve this name to an IP address:

import socket
ip = socket.gethostbyname(hostname)

This will give you the IP address of the container, e.g. 172.25.0.12. Now you should do a reverse lookup. BUT, you cannot use gethostbyaddr, since it will return the same hostname as the one you obtained from the HOSTNAME environment variable. The reason is that gethostbyname and gethostbyaddr consult the /etc/hosts file first. Instead, you have to query DNS directly. Unfortunately, this is where an external library is needed. You can install dnspython and then do a reverse lookup:

from dns import resolver, reversename

addr = reversename.from_address(ip)
name = str(resolver.resolve(addr, "PTR")[0]).split('.')[0]

The reason for the mess in the last line is that resolver.resolve returns an answer set with the FQDN in the zero-th element, and the FQDN then has to be split on dots to keep the hostname only.
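Putting the pieces together, the whole dance can be wrapped into one small function. This is only a sketch: it assumes dnspython is installed and that the code runs inside a compose-managed container where HOSTNAME is set. The helper hostname_from_fqdn is a name I made up; it's a pure function, so it can be tested without a container.

```python
import os
import socket


def hostname_from_fqdn(fqdn):
    """Keep only the first label of an FQDN, e.g. 'crawler_2.proj_default.' -> 'crawler_2'."""
    return fqdn.split('.')[0]


def unique_container_name():
    """Forward-resolve our own hostname, then reverse-resolve the IP via DNS."""
    from dns import resolver, reversename  # external dependency: dnspython

    ip = socket.gethostbyname(os.environ['HOSTNAME'])
    addr = reversename.from_address(ip)
    fqdn = str(resolver.resolve(addr, "PTR")[0])
    return hostname_from_fqdn(fqdn)
```

Calling unique_container_name() inside one of the scaled containers should then give you the per-instance name rather than the shared hostname.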

Sunday, June 3, 2018

Emulating Amstrad PC1512

My first computer was an Amstrad Schneider PC1512SD, so it's understandable that I'm attached to that machine. I own two of them, but since lately I don't have enough time to play with them, I started to search for emulators so I can try the old software and games I used from time to time. Since I lost some time figuring out how to emulate the Amstrad, I decided to document everything in this blog post. This should be useful to me when I decide I want to play with it again in the future, but it can also help anyone else following in my footsteps.

First, I needed to find a PC XT emulator. Modern-day emulators/virtualization solutions like QEMU, VirtualBox, VMware, and even Bochs do not support anything older than a Pentium. So, some other emulator has to be used. After spending some time searching for XT emulators I found the following candidates:
  1. MAME
  2. PCem
  3. PCjs
It turns out MAME and PCem support the Amstrad PC1512 while PCjs doesn't. So I decided to go with PCem and MAME. After some trying I didn't manage to get anywhere with PCem: after starting the emulation the screen was completely garbled, so I switched to MAME. I spent some time figuring out how to emulate the Amstrad using MAME. Here is the essence of it on Fedora 27:
  1. First, you need to install the mame package. This package is present in the Fedora repository, so apart from 'dnf install mame' no additional configuration is necessary.
  2. Next, you need to obtain ROM images from the Amstrad. After some (actually a lot of) Googling I managed to find them. If I remember correctly, I obtained them via a MAME ROMs package.
The ROMs you should have are any of the following three pairs:
  1. Version 1 ROMs: 40044.ic132 (8kB, SHA1: 7781d4717917262805d514b331ba113b1e05a247) and 40043.ic129 (8kB, SHA1: 74002f5cc542df442eec9e2e7a18db3598d8c482)
  2. Version 2 ROMs: 40044v2.ic132 (8kB, SHA1: b12fd73cfc35a240ed6da4dcc4b6c9910be611e0) and 40043v2.ic129 (8kB, SHA1: c376fd1ad23025081ae16c7949e88eea7f56e1bb)
  3. Version 3 ROMs: 40044-2.ic132 (8kB, SHA1: b77fa44767a71a0b321a88bb0a394f1125b7c220) and 40043-2.ic129 (8kB, SHA1: 18a17b710f9eb079d9d7216d07807030f904ceda).
The names are as expected by MAME. In addition, there are some other ROMs too:
  1. 40045.ic127 (8kB, SHA1: 7d858bbb2e8d6143aa67ab712edf5f753c2788a7)
  2. 40078.ic127 (8kB, SHA1: bc8dc4dcedeea5bc1c04986b1f105ad93cb2ebcd)
  3. wdbios.rom (8kB, SHA1: 601d7ceab282394ebab50763c267e915a6a2166a)
The first two are, I believe, fonts, while the third one is necessary only if you want to emulate the HD version of the PC1512.
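Since it's easy to end up with mismatched dumps, the SHA1 sums listed above can be checked with a short Python script before handing the ROMs to MAME. This is just a convenience sketch; the dictionary below uses the version 3 filenames, so adjust it for whichever ROM version you have.

```python
import hashlib
import os

# SHA1 sums for the version 3 ROMs listed above.
EXPECTED = {
    "40044-2.ic132": "b77fa44767a71a0b321a88bb0a394f1125b7c220",
    "40043-2.ic129": "18a17b710f9eb079d9d7216d07807030f904ceda",
}


def sha1_of(path):
    """Compute the SHA1 digest of a file, reading it in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_roms(expected=EXPECTED):
    """Print OK/BAD/MISSING for each expected ROM in the current directory."""
    for name, want in expected.items():
        if not os.path.exists(name):
            print(f"{name}: MISSING")
        else:
            print(f"{name}: {'OK' if sha1_of(name) == want else 'BAD'}")
```

Run verify_roms() from inside the pc1512 folder to see at a glance which dumps are usable.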

Create a folder named pc1512 in the current (working) directory and place the selected ROMs into it.

We can now start the emulator. Use the following command line:
$ mame pc1512 -rompath . -window -uimodekey DEL_PAD
The first argument to MAME is the machine that should be emulated, in our case pc1512. The option -rompath instructs MAME to search for ROMs in the current directory; it'll look for a folder named pc1512 and within it for ROMs named as given above. The option -window prevents MAME from going fullscreen (I had some problems exiting fullscreen mode). Finally, the option -uimodekey defines the key used to access the internal menu.

This will give the famous "Please wait..." message from the BIOS and then the "Insert SYSTEM disk into drive A" message. Now we are at the point of providing boot disks to the emulated machine. For that it is necessary to obtain images of the Amstrad PC1512 disks. You can find them here. The problem is that disks 1 and 4 are given in CFI format (Compressed Floppy Image, created by the tool FDCOPY.COM), while disks 2 and 3 are archives.

So, after unpacking disk 1 (46001.Zip) and disk 4 (46004.Zip) you are presented with the files 46001.CFI and 46004.CFI, which are not recognized by MAME. To convert them into an appropriate format use the following command:
dsktrans -itype cfi 46001.cfi 46001.dsk
dsktrans is a tool that is part of the libdsk-tools package, also available in the Fedora repository, so 'dnf install libdsk-tools' should be enough. After converting the 1st disk, convert the 4th disk the same way.

Now we are ready to start MAME with the system disk provided. One way to do that is to open the internal MAME menu after starting the PC1512 and then attach the disk image. The other way is to use the command line:
mame pc1512 -rompath . -window -uimodekey DEL_PAD -flop 46001.dsk
The new option is -flop, which defines the image to be used in the floppy drive. By the way, to find out which media types are supported you can use the following command:
mame pc1512 -listmedia
and take note of the (brief) column.

What happens now is that you are presented with an MS-DOS command prompt in the emulated machine.

The next step is to start GEM, but before that I have to find out how to create a floppy disk image. Note that the imgtool utility that is part of MAME gives a segmentation fault on almost any command you try. Anyway, stay tuned for GEM...


Monday, April 3, 2017

How to run Firefox in a separate network name space

In this post I'll explain how to run Firefox (or any other application) in a separate network namespace. If you wonder why you would do that, here are some reasons:
  1. You connect to a VPN and want a single application to connect via the VPN. All the other applications should access the network as usual.
  2. You want to know which network resources a specific application accesses. For example, there is a JavaScript application that runs within the Web browser and you want to monitor it at the network level.
  3. You want to temporarily use another IP address, but at the same time keep the existing network configuration, because some applications use it and they wouldn't react well to the change.
  4. You have an alternative connection to the Internet (e.g. one via a wired interface, and another via LTE) and you want some applications to use LTE, the default being the wired interface. This is actually a variation of cases 1 and 3, but obviously it's not the same.
Probably there are other reasons too, but I think this is enough to persuade you of the advantages of using network namespaces on Linux. And note that you can run two instances of Firefox at the same time: one "normal" in the "normal" network namespace, and another in a new and potentially restricted network namespace. More on that later in the post.

So, here is how to create a new network namespace with network interface(s). Note that there are several different cases, depending on how you connect to the Internet and what you want to achieve, so there will be several subcases. But first, create a new network namespace using the following command (as the root user):
# ip netns add <NSNAME>
NSNAME will be the name of the network namespace. You should use something short and meaningful, i.e. something that reminds you of what the network namespace is used for. You can check that the network namespace exists using the following command:
# ip netns list
From this point on we have two subcases:
  • You are connected using wired Ethernet interface and you can attach new machines to the Ethernet network.
  • You are connected to the Internet using wireless Ethernet interface or you are connected to the wired Ethernet interface and are not allowed to connect new machines.
All those cases are described in the following subsections.

Wired Ethernet interface

This is the easiest case, and there are several options you can use. We'll use the macvlan interface type, which creates a clone of an existing wired Ethernet interface that appears on the physical network with its own parameters. This is, in effect, like attaching a new host to the local network. Note that if you are not allowed to connect devices to the network, you should use the routing method described for the wireless interface.

The first step is to create the new interface:
# ip link add link <ETHIF> name <IFNAME> type macvlan
The parameters are: ETHIF is your existing Ethernet interface, while IFNAME is the new interface that will be created. You should then move the interface into the target network namespace (we assume here that you want to move it to NSNAME):
# ip link set <IFNAME> netns <NSNAME>
and then you have to activate it:
# ip netns exec <NSNAME> ip link set <IFNAME> up
Note that the activation has to be done using "ip netns exec", since to access a network interface you have to switch to the network namespace where the interface is! What is left is to assign it an IP address. This can be done statically or via DHCP.

Now that the network part is ready, skip to the section Starting Firefox.

Wireless LAN

In case you are connected to a wireless LAN, the macvlan link type will not work, so another mechanism is necessary. There are two options: bridging and routing. The problem with bridging is that you have to turn off the wireless interface before enslaving it into a bridge. That creates two problems. The first is that all current TCP connections will break, and the second is that it doesn't play nicely with NetworkManager and similar software. Thus, I'll describe the routing case.

First, create a pair of virtual Ethernet interfaces like this:
# ip link add type veth
This will create two new interfaces, veth0 and veth1. These interfaces are actually the two ends of a single link. We'll move one of them into the other network namespace:
# ip link set veth1 netns <NSNAME>
Next, we'll configure the interfaces with IP addresses. I'll use 10.255.255.1/24 for the interface that's left in the main network namespace (veth0) and 10.255.255.2/24 for the interface in the NSNAME network namespace (veth1):
# ip addr add 10.255.255.1/24 dev veth0
# ip link set dev veth0 up
# ip netns exec <NSNAME> ip addr add 10.255.255.2/24 dev veth1
# ip netns exec <NSNAME> ip link set dev veth1 up
# ip netns exec <NSNAME> ip ro add default via 10.255.255.1
We also need to configure NAT, because the network 10.255.255.0/24 is used only for communication between the two network namespaces and should not be seen outside the host computer:
# iptables -A POSTROUTING -t nat -o wlp3s0 -s 10.255.255.2 -j MASQUERADE
Replace wlp3s0 with the name of your wireless interface. Take note of two things in case it doesn't work:
  1. Forwarding has to be enabled. This is checked/enabled via the sysctl /proc/sys/net/ipv4/ip_forward (it should contain 1).
  2. Maybe your host has a firewall that blocks the traffic. To check whether that's the problem, temporarily disable the firewall and try again. Note that disabling the firewall will most likely remove the iptables rule you added, so you'll have to add it again.
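For reference, the routed setup above can be generated from a little helper. This is just a sketch that builds the exact command strings from the steps described (the namespace name and the wireless interface are placeholders you pass in); running them, as root, is left to you or to something like subprocess.

```python
def veth_netns_commands(ns, wan_if, host_ip="10.255.255.1", ns_ip="10.255.255.2"):
    """Build the command sequence for the routed (veth + NAT) setup; run as root."""
    in_ns = f"ip netns exec {ns} "
    return [
        f"ip netns add {ns}",                       # create the namespace
        "ip link add type veth",                    # create the veth pair
        f"ip link set veth1 netns {ns}",            # move one end into the namespace
        f"ip addr add {host_ip}/24 dev veth0",      # configure the host end
        "ip link set dev veth0 up",
        in_ns + f"ip addr add {ns_ip}/24 dev veth1",  # configure the namespace end
        in_ns + "ip link set dev veth1 up",
        in_ns + f"ip ro add default via {host_ip}",   # default route via the host
        f"iptables -A POSTROUTING -t nat -o {wan_if} -s {ns_ip} -j MASQUERADE",
    ]
```

Printing the list gives you a script you can inspect before executing, which is handy given that a typo in these commands (like the exec one) fails silently or confusingly.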

Starting Firefox

Now that you have created the interface within the new network namespace, to start Firefox (or any other application) in it you should first switch into the new network namespace. Do this in the following way:
# ip netns exec <NSNAME> bash
Note that it is important to do it that way in order to preserve the environment variables, i.e. if you do "su -" or something similar, you'll reset the environment and you won't be able to start graphical applications. After you get a bash shell as root, switch back to a "normal" user:
# su <userid>
Again, it is very important to preserve the network namespace, so you have to use the su command as shown. Obviously, substitute userid with the user logged into the graphical interface. Next, start Firefox:
$ firefox &
In case you already have a running instance of Firefox that, for whatever reason, you don't want to stop, you can start a new instance like this:
$ firefox -P -no-remote &
This will start a new instance even though there is a running Firefox process (-no-remote) and present you with a dialog box to choose a profile to run with. You cannot use the existing profile, which means you have to create a new one specially for this purpose. The drawback is that your bookmarks, cookies and other things won't be visible in the new instance.

Friday, January 6, 2017

Few thoughts about systemd and human behavior

I was just reading comments on a Hacker News post about systemd. Systemd, as you might know, is a replacement for the venerable init system. Reading the comments felt like reading the same story over and over again: there are those strongly pro and those strongly contra systemd, in some cases based on arguments (valid or not) and in other cases based on feelings. In this post I won't go into technical details about systemd but will concentrate on the human behavior, which is the most interesting part to me. And yes, if you think I'm pro systemd, you're right, I am!

Now, what I think is the key characteristic of people is that they become too religious about something and thus unable to critically evaluate that particular thing. It has happened many times; in some cases the transition from controversy to dogma was short, in other cases it took several or more generations. Take as an example the Christian religion! It also started as something controversial, but ended as a dogma that isn't allowed to be questioned. Or something more technical, the ISO/OSI 7-layer model. It started as a controversy: how many layers, 5, 6, or 7? The result of this controversy we know, and after a short period of time it turned into a dogma, i.e. that 7 is some magical number of layers that isn't to be questioned. Luckily, I don't think that is the case any more, that is, it is now clear that 7 layers were too many. Anyway, I could list such cases on and on, almost ad infinitum. Note that I don't claim that every controversial change succeeded in the end; some were abandoned and that's (probably) OK.

I should also mention one other interesting thing called customs (as in norms). People's lives are interwoven with customs. We have a tendency to do something our elders did just because, i.e. without knowing why. I don't think that's bad per se; after all, it probably helped us survive. The problem with customs is that they assume slow development and slow change in the environment. Under those conditions they are a very valuable tool to collect and pass experience from generation to generation. But when the speed of development and change reaches some tipping point, customs become a problem rather than an advantage, and they stall adjustment to new circumstances. So, my personal opinion about customs is that we should respect them, but never forget to analyze whether they are applicable and useful in a given situation.

Finally, there is one more characteristic of human beings, and that is inertia. We are used to doing things in a certain way, and that's hard to change. Actually, I think inertia is related to religion and customs, and there might be something else behind all of them. But I won't go into that, at least not in this post.

So, what does all this have to do with systemd? Well, there is a principle, or philosophy, in Unix development that states that whatever you program/create in Unix should do one thing and do it well. For example, a tool to search for files should do that well, but nothing else. My opinion is that this philosophy turned into a custom and a religion at the same time. Just go through the comments related to systemd and read them a bit. A substantial number of arguments is based on the premise that there is a principle and it should be obeyed at any cost and in any circumstance. But all those who bring this argument forget to justify why the principle is applicable in the particular scenario.

And the state of computing has drastically changed between the time when this philosophy was (supposedly) defined (i.e. the 1970s) and today. Let me mention just a few big differences. Machines at the time Unix was created were multiuser and stationary, with limited resources and capabilities, and they were used for much narrower application domains than today. Today, machines are powerful and inexpensive, used primarily by a single user. They do a lot more than they used to do 40 years ago, and they offer users a lot more. Finally, users' expectations are much higher than they used to be.

One advantage of doing one thing and doing it well is that it reduces complexity. In a world where programming was done in C or assembler, this was very important. But it also has a drawback: you lose the ability to see above the simple things. This, in turn, costs you performance but also functionality, i.e. what you can do. Take for example pipes in Unix. They are great for data stored as text organized in records consisting of lines. But what about JSON, XML and other complex structures? In other words, being simple means you can do only simple things.

This tension between simple and manageable versus complex and more capable is a theme that occurs in other areas, too. For example, in networking you have layers, but because cross-layer communication is restricted, you have problems with modern networks. Or take programming and organizing software into simple modules/objects: the more primitive the base system is, the more problems you have achieving complex behavior, in terms of performance, complexity, and so on.

A few more things to add to the mix about Unix development. First, Unix is definitely a success. But that doesn't mean that everything Unix did is a success. There are things that are good, bad, and ugly. Nothing is perfect, nor will ever be. So, we have to keep in mind that Unix can be better. The next thing to keep in mind is that each one of us has a particular view of the world, a.k.a. Unix, and our view is not necessarily the right view, or the view of the majority. This fact should influence the way we express ourselves in comments. So, do not overgeneralize your single personal use case. Yet, there are people whose opinion is more relevant, and those are the ones who maintain init/systemd and similar systems, as well as those who write scripts and modules for them.

Anyway, I'll stop here. My general opinion is that we are in the 21st century and we have to re-evaluate everything we think should be done a certain way (customs) and, in due course, not be religious about it.

P.S. Systemd is not a single process/binary but a set of them, so it's not monolithic. Yet, some argue it is monolithic! What is the definition of "monolithic"? With that line of reasoning GNU coreutils is monolithic software and thus not according to the Unix philosophy!

Tuesday, November 1, 2016

A bit about RSS feed readers on Linux

I'm monitoring a lot of sites using RSS, so having a good RSS feed reader is mandatory for me. Once upon a time I used Liferea, but since I have a lot of RSS feeds with lots of posts I want to keep around, it turned out that Liferea wasn't designed with scalability in mind. So, I decided to find another one. Web-based readers are out of the question, because I prefer desktop applications. Not to mention that locally I have a lot of disk storage I don't have to pay for, while I would have to pay for storage in the cloud due to my heavy use of it.

After a search I settled on QuiteRSS. In the process I tried RSSOwl, but I wasn't able to start it due to a different XULRunner version on my Fedora. Besides, it turns out the last version of RSSOwl was released in December 2013 and it isn't maintained any more. QuiteRSS was very good, but a bug in WebKit started to annoy me. So, I started to explore RSS feed readers again. Note that I have the following requirements:
  • No Web application! I want a desktop RSS feed reader with a GUI. It would be nice, though, if I could synchronize it with a reader on a mobile phone!
  • I have a large number of feeds and keep a lot of new (that is, unread :D) posts around. So, scalability is of paramount importance.
  • And last, but not least, a nice-looking and usable GUI.
This brought me to three candidates: QuiteRSS, FeedReader and RSS Guard. I'll describe each of them in a bit more detail below. But before that, note that this is a live post, i.e. I'll keep trying all the mentioned readers and update it with new experience. Also, I would like to hear your comments/suggestions, so if you have any, please leave a comment.

QuiteRSS

QuiteRSS is quite good and I'm using it all the time. There is a homepage and a GitHub development page. It has the ability to tag posts, mark them as read, etc.

It is interesting to look at the QuiteRSS GitHub page. From there, the following conclusions can be drawn:
  1. QuiteRSS is quite popular: 33 watches, 180 stars and 28 forks.
  2. QuiteRSS has basically been in maintenance mode, since there has been no substantial activity since 2014. From 2012 to 2014 development was very intensive.
  3. There are 212 open issues and 719 closed ones. That seems like a lot of open issues, but more thorough statistics would be needed to know for certain.
The problems are the following, from the most to the least important:
  • You have to disable JavaScript because QuiteRSS often freezes on some feeds while loading. It still freezes on some RSS feeds, and when that happens some history is lost (read feeds, marked/tagged feeds, etc.).
  • If you accidentally click on a link to a PDF file, QuiteRSS freezes!
  • Once I mistakenly selected the option "Mark all news read", which is irreversible. There is no confirmation dialog for such cases.
  • Some posts on GitHub are in Russian. That's a problem because not everyone speaks Russian. ;)
  • It depends on Qt4 and WebKit4, which are not maintained any more.

FeedReader

FeedReader is interesting because it has two components, a daemon and a front end. This makes it unique among these readers, which bundle those functions together into a single binary. You can read more about this reader on its homepage, and there is also a GitHub development page. Looking at the Web page, it has lots of features, but I'm using only a few, if any, at this stage. Take this into account while reading this review. Looking at the GitHub page of FeedReader, the following conclusions can be drawn:
  1. FeedReader is somewhat less popular than QuiteRSS. It has 26 watches (against 33 for QuiteRSS), 152 stars (against 180) and 6 forks (against 28).
  2. FeedReader is in active development, with all the activity concentrated in 2016 and some additional work in 2015.
  3. There are 27 open issues and 197 closed ones. This is a better ratio than for QuiteRSS, but again, more research has to be done!
The first problem I had was with removing feeds. It was painful because FeedReader doesn't allow selecting multiple feeds or feed groups at once.

The next problem was that only a two-level hierarchy is supported, while in QuiteRSS I have three levels. So, importing an OPML file with more levels will flatten everything into two.

While removing certain feed folders, some of them kept coming back! Maybe the problem was that I right-clicked on a feed and selected delete, while it was necessary to first left-click and then right-click. Who knows...

RSS Guard

RSS Guard, like all the other feed readers mentioned above, has its GitHub development page. As for the homepage, it uses the Wiki on GitHub. Again, by looking at the GitHub page, the following conclusions can be made:
  1. RSS Guard has 6 watches, 21 stars and 6 forks. This makes it the least popular of the three RSS readers reviewed here.
  2. RSS Guard has been in development since 2013 with evenly spread development effort. This probably means it isn't going to be finished soon.
  3. It has 11 open issues and 51 closed, which isn't that bad.
Now, some shortcomings from personal experience. It is a bit non-intuitive. It took me some time to realize that in order to import an OPML file, I first have to create an account. Another non-intuitive part was the import process itself: when you select an OPML file and all the feeds appear, you click OK, but then you have to click Close. The first time I clicked OK twice and got all the feeds imported twice!

It supports multilevel feed organization. At first it seemed that feed groups could not be folded, i.e. that they are always unfolded, but I finally realized folding works: you just need to double-click a folder to fold or unfold it. This isn't particularly intuitive, nor visible; there is no indication that a folder is folded, nor that it can be folded at all.

When I clicked the "Update all items" button in the toolbar, I expected all feeds to be updated. But for some reason, that didn't happen.

Conclusion

Comparing the development of the reviewed readers, it turns out that each of them basically depends on a single developer and has its own pros and cons. In the end, I think that despite its shortcomings, QuiteRSS is still the best feed reader, closely followed by FeedReader. If the development activity of FeedReader continues with the same intensity, expect it to become the best RSS reader among the three.

ChangeLog

  • 20161101 - Initial version


Wednesday, June 29, 2016

Quick note about Fedora 24 and VMWare Workstation 12.1

I just updated Fedora 24 from the updates-testing repository and that pulled in Linux kernel 4.6. Well, as usual, VMware Workstation needed some patching in order to work. Luckily, I quickly found a fix on the VMware forums. Note that at the end of the thread there is a script you can use to automatically patch the necessary files. But be careful, I didn't try it!

Anyway, after patching, run:

vmware-modconfig --console --install-all

and that should be it!

Just as a side note, it turns out the same info can be found on the Arch Wiki page devoted to VMware. That page is full of information, so you should bookmark it and check there whenever you have a problem with VMware.

Edit 20160707: The script mentioned above had some errors in it. Here is the fixed version.

Wednesday, March 16, 2016

Network namespaces and NetworkManager

This post documents the process of implementing support for network namespaces in NetworkManager. The code described in the post can be found on GitHub. While my personal motivation for adding namespace support to NetworkManager was to support provisioning domains as specified by the IETF MIF WG, it also brings benefits to existing users by allowing isolation of different applications into different network namespaces. The best example is VPN connection isolation: when network namespaces are used, certain applications can be started in a namespace whose only network connectivity is via the VPN. Those applications can access VPN resources, while all the other applications, being in different network namespaces, will not see the VPN connection and thus cannot use it. An additional benefit would come from using multiple connections, as described by the MIF Architecture RFC.

Note that after I started to work on this, Thomas Haller implemented basic support for namespaces in NetworkManager that was a bit different in some implementation details from mine.

The idea


The intention of implementing support for network namespaces was to allow applications to be isolated so that they use specific network connections. Network namespaces in the Linux kernel have some properties that anyone who wants to use them must be aware of. This is documented in a separate post.

The most important one: an application cannot be moved between network namespaces by some other application, i.e. only the application itself can change its network namespace, and only if it has the appropriate permissions.

So, the idea is the following. An application is started in some network namespace. This can be done easily (e.g. see the 'ip netns exec' command). Then, this network namespace is manipulated either by the application itself, if it is aware of NM's support for network namespaces, or by a third-party application (which could be nmcli). The manipulation means that requests are sent to NetworkManager via D-Bus to make changes to the network namespace. The changes can be activation and deactivation of certain connections. NetworkManager, based on those requests and on the specifics of the connections and of the devices those connections are bound to, determines what to do. For example, it can create a virtual device in the network namespace, or it can move a physical device there. Basically, this part isn't important to the application itself; the only thing that matters is that the application is assigned the requested connections.

Implementation


The following changes were made to NetworkManager in order to introduce support for network namespaces:

  1. A new object, NMNetnsController, was created whose purpose is to allow management of all network namespaces controlled by NetworkManager. Via the interface org.freedesktop.NetworkManager.NetworkNamespacesController it is possible to create a new network namespace or to remove an existing one. It is also possible to obtain a list of existing network namespaces.

  2. A new object, NMNetns, was created that represents a single network namespace. When a new network namespace is created, a new NMNetns object is created and exposed on D-Bus. This object allows manipulation of the network namespace via the interface org.freedesktop.NetworkManager.NetNsInstance: it is possible to get a list of all devices within the network namespace, take a certain device from some other network namespace, and activate a connection.

  3. NMSettings is now a singleton object. This wasn't a significant change because there was only one object of this type before, but now it is exposed as such more explicitly.

  4. NMPlatform, NMDefaultRouteManager and NMRouteManager aren't singleton objects any more. They are now instantiated for each new network namespace that is created.


VPN isolation


VPN isolation was implemented as the first user of the network namespaces implementation. It was easier than other connection types because the assumption was that a VPN connection should live in a single network namespace and should be the only connection available there.

At the beginning, there was a doubt about where to place the knowledge of the network namespace; the two candidates were the NMActiveConnection and NMVPNConnection classes (NMActiveConnection is actually the base class of NMVPNConnection). Modifying NMVPNConnection turned out to be the better approach, because the idea was to introduce new parameters in the configuration file of a VPN connection that specify that isolation is necessary, along with some additional behaviors:
  • netns-isolate

    Boolean parameter (yes/no) which defines whether the VPN connection should be isolated within a network namespace or not (the option exists for backwards compatibility reasons).
     
  • netns-persistent

    Should the network namespace be persistent (yes) or not (no). A persistent namespace is retained when the VPN connection is terminated, while a non-persistent one is removed.
     
  • netns-name

    Network namespace name. The special value uuid means the connection's UUID should be used, and the special value name requests that the connection's name be used. Any other string is taken as-is and used as the network namespace name.
     
  • netns-timeout

    How much time to wait (in milliseconds) for the device to appear in the target namespace.
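Putting the four parameters together, a VPN connection's configuration might look something like the fragment below. This is only an illustrative sketch: the key names come from the descriptions above, but the exact section they belong to and the file layout are my assumptions, not taken from the actual implementation.

```ini
[vpn]
; hypothetical placement; the keys are the ones described above
netns-isolate=yes        ; isolate this VPN in its own network namespace
netns-persistent=no      ; remove the namespace when the VPN terminates
netns-name=uuid          ; name the namespace after the connection's UUID
netns-timeout=5000       ; wait up to 5 s for the device to appear
```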
Basically, the implementation is such that when the device appears in the root network namespace it is taken from there (using the TakeDevice method, called directly instead of via D-Bus). When the device appears in the target network namespace, network parameters are assigned to the interface. This was tested with the OpenVPN type of VPN.

The implementation has two problems. First, it doesn't handle VPN connections that don't create virtual devices but instead just modify packet processing rules in the Linux kernel (i.e. XFRM). Second, hostname and name resolution parameters aren't assigned because the infrastructure is lacking in that respect.

Conclusion


The initial goal of having network namespaces support in NetworkManager was achieved. Some functionality is still missing, like isolation of arbitrary connections, hostname handling and DNS resolution handling. Those are things that will have to be resolved in the future.

Tuesday, December 29, 2015

NetworkManager and OpenVPN - How it works?

I spent a lot of time trying to figure out how NetworkManager works. Very early in the process I came to the conclusion that NM is a very complex piece of software while at the same time its documentation is lacking. What adds to the complexity is the GObject mechanism that tries to fit object-oriented programming into the C programming language. So, in the end, I decided to write down everything I managed to learn. First, that way I'm leaving notes for my future self. In addition, I hope I'll help someone and thus save someone's time.

NetworkManager


As a first step in understanding NetworkManager I set out to understand how NM manages VPN connections. The following sequence of steps is the result of that research. Note that the flow isn't complete, but it is enough to understand the mechanics of VPN establishment. In addition, note that all the given file pathnames are relative to the NetworkManager git repository.

So, everything starts when the user activates a VPN. At that moment a message is sent via DBus by nm-applet, nmcli or some other mechanism, and the following happens:
  1. Message to activate VPN sent via the DBus ends up in the function src/nm-manager.c:impl_manager_activate_connection().
  2. The function src/nm-manager.c:impl_manager_activate_connection() calls function src/nm-manager.c:_new_active_connection().
  3. The function _new_active_connection() creates a new object of type NM_TYPE_ACTIVE_CONNECTION, i.e. a new active connection. In this particular case the method src/vpn-manager/nm-vpn-connection.c:nm_vpn_connection_new() is used to create the object. This triggers a chain of initialization events described later.
  4. After the object is created a control returns to impl_manager_activate_connection() where asynchronous authorization check is initiated. Callback function to call when authorization check finishes is src/nm-manager.c:_activation_auth_done().
  5. After authorization is done the callback function _activation_auth_done() is called which in turn, if everything is OK, calls function src/nm-manager.c:_internal_activate_generic() which in turn calls function src/nm-manager.c:_internal_activate_vpn().
  6. The function _internal_activate_vpn() calls the function src/vpn-manager/nm-vpn-manager.c:nm_vpn_manager_activate_connection().
  7. The function src/vpn-manager/nm-vpn-manager.c:nm_vpn_manager_activate_connection() calls function src/vpn-manager/nm-vpn-connection.c:nm_vpn_connection_activate().
  8. The function nm_vpn_connection_activate() connects asynchronously to DBus and when the connection is made a callback function on_proxy_acquired() is called.
  9. The function on_proxy_acquired() connects to signal "notify::g-name-owner" and then calls src/vpn-manager/nm-vpn-connection.c:nm_vpn_service_daemon_exec(). The purpose of the function nm_vpn_service_daemon_exec() is to start plugin binary (nm-openvpn-service).

    The goal of connecting to the signal "notify::g-name-owner" is to receive notification when the service appears, i.e. when it is initialized and available over the DBus so that appropriate signals can be registered. The most important registered signals, in our case, are Ip4Config and Ip6Config. When the function on_proxy_acquired() connects callback functions to DBus signals the process of establishing VPN can be continued so it also calls the function src/vpn-manager/nm-vpn-connection.c:get_secrets().
  10. The function get_secrets() calls function nm_settings_connection_get_secrets() and registers a callback src/vpn-manager/nm-vpn-connection.c:get_secrets_cb().
  11. The function src/vpn-manager/nm-vpn-connection.c:get_secrets_cb() sends secrets to VPN plugin via DBus. The send process is asynchronous and when finished callback src/vpn-manager/nm-vpn-connection.c:plugin_need_secrets_cb() is called.
  12. The callback function src/vpn-manager/nm-vpn-connection.c:plugin_need_secrets_cb() calls function src/vpn-manager/nm-vpn-connection.c:really_activate().
  13. The function really_activate() calls the Connect method on the VPN plugin via its DBus interface. Since the DBus call is asynchronous, a callback function is registered, src/vpn-manager/nm-vpn-connection.c:connect_cb(). The callback just calls src/vpn-manager/nm-vpn-connection.c:connect_success(), which clears all timers.
The call to the Connect method mentioned in the last step above initiates the chain of events in the VPN plugin described in the next section and, in the end, results in the real connection establishment. When the VPN connection is established, the VPN plugin will send three signals: Config, Ip4Config and Ip6Config. Those signals are caught by the callbacks config_cb(), ip4_config_cb() and ip6_config_cb() respectively, in nm-vpn-connection.c (look at step 9 in the previous list).


OpenVPN plugin


VPN plugins for NetworkManager are written so that they inherit base classes from NetworkManager and only have to implement code specific to a particular VPN, while the generic parts are implemented by NetworkManager and placed in the base class. For example, the DBus interface that each plugin has to implement is common to all of them and thus implemented in the base class. This is the classic OO programming paradigm. But in this case the OO paradigm is emulated in C, so when you start to study the code of some specific plugin, at first sight nothing will make sense and it will be hard to grasp what happens, where and when. So, before I describe the sequence of events, I'll describe the code structure first, as it aids a lot in understanding the code.

Static code structure and plugin initialization


The main part of the NetworkManager OpenVPN plugin is in the file src/nm-openvpn-service.c. This file inherits a class defined in libnm/nm-vpn-service-plugin.c from NetworkManager. Additionally, VPN DBus interface is defined in NetworkManager source in file introspection/nm-vpn-plugin.xml.

So, when you build the OpenVPN plugin, a new binary is created, nm-openvpn-service. When NetworkManager executes this plugin, its main method is invoked. In the main method, the most important line is the following one:
plugin = nm_openvpn_plugin_new (bus_name);
That line causes a new object of type NM_TYPE_OPENVPN_PLUGIN to be instantiated. Looking at the code of the function nm_openvpn_plugin_new(), nothing special can be seen at first glance. Basically, there is only the following line:
plugin = (NMOpenvpnPlugin *)
    g_initable_new (NM_TYPE_OPENVPN_PLUGIN,
                    NULL, &error,
                    NM_VPN_SERVICE_PLUGIN_DBUS_SERVICE_NAME,
                    bus_name,
                    NULL);
But, because the GObject type system is in the background, a lot of things actually happen. First, the initialization methods/constructors of the OpenVPN plugin class and objects are called (nm_openvpn_plugin_init() and nm_openvpn_plugin_class_init()). Also, the initialization methods/constructors of the base class are called (nm_vpn_service_plugin_class_init() and nm_vpn_service_plugin_init()).

In addition, base class nm-vpn-service-plugin defines an interface that is initialized separately from class and object. The mechanism used is described here.

Note that after the plugin is initialized, NetworkManager receives information about this via the signal notify::g-name-owner in src/vpn-manager/nm-vpn-connection.c. This causes src/vpn-manager/nm-vpn-connection.c to connect to the DBus interface and start the sequence of steps described in the following subsection.

VPN activation sequence of steps


In the following text, when I write base class I mean the code in the file libnm/nm-vpn-service-plugin.c in NetworkManager. When I write OpenVPN class or OpenVPN service, I mean the file src/nm-openvpn-service.c in the network-manager-openvpn code.

The following sequence of steps is initiated by calling method Connect on VPN plugin via DBus:
  1. The DBus call invokes the function impl_vpn_service_plugin_connect() in the base class. This function is registered as the handler for the DBus Connect method serviced by the plugin, in the function init_sync() in the base class. If you look at its code you'll notice that it only transfers control to the function _connect_generic() in the base class.
  2. The function _connect_generic() does some checks and transfers control to the OpenVPN class, i.e. the method real_connect() in the file src/nm-openvpn-service.c is called.
  3. The method real_connect() just redirects control to the function _connect_common().
  4. The method _connect_common() does some sanity checks, obtains some parameters, and calls the method nm_openvpn_start_openvpn_binary().
  5. The method nm_openvpn_start_openvpn_binary() searches for the openvpn binary on the local file system, constructs command line options based on preferences set for the VPN connection and on environment variables, and finally starts the openvpn binary (a call to the function g_spawn_async()). One important part of the command line construction is that the openvpn binary is told not to assign network parameters itself, but to call a helper script and pass all the parameters to it. This helper script is nm-openvpn-service-openvpn-helper, contained in the src directory of the network-manager-openvpn plugin. So, when the openvpn binary establishes a VPN connection, it calls the helper script.

    It also registers two callbacks. The first one monitors the process using the g_child_watch_source_new() function; its callback function is openvpn_watch_cb() which, if called, assumes an error occurred or the openvpn binary just finished. The second callback is a timer that, when it fires, calls the function nm_openvpn_connect_timer_cb(), which only tries to connect to OpenVPN's management socket. So, in other words, notification of successful VPN establishment to the OpenVPN plugin isn't done using the GLib or DBus notification system, but just by waiting and checking. As a final note, the timer isn't used in every case. Actually, it seems that it is used rarely, only when the authentication type is static key.
  6. The nm-openvpn-service-openvpn-helper script, in its main function, collects all network related data from the environment (set by openvpn binary) and sends it via DBus to nm-openvpn-service using methods SetConfig, SetIp4Config and SetIp6Config.
  7. The configuration is caught by the functions impl_vpn_service_plugin_set_config(), impl_vpn_service_plugin_set_ip4_config(), and impl_vpn_service_plugin_set_ip6_config() in the VPN plugin base class. Those functions, in turn, call the functions nm_vpn_service_plugin_set_config(), nm_vpn_service_plugin_set_ip4_config() and nm_vpn_service_plugin_set_ip6_config(). They also finish the DBus method invocations so that the helper script can continue executing after each call.
  8. Each of the functions nm_vpn_service_plugin_set_config(), nm_vpn_service_plugin_set_ip4_config() and nm_vpn_service_plugin_set_ip6_config() does two things: it emits a signal ("config", "ip4-config" or "ip6-config" respectively) from the base class, and from the OpenVPN class. [Note: I don't fully understand this mechanism yet]

Literature

It seems to me that it is very hard to come by good documentation describing this topic. So, here are some of the better references I used:

  1. For signals the best reference I could find was from Maemo 4 documentation, available on the following link: http://maemo.org/maemo_training_material/maemo4.x/html/maemo_Platform_Development_Chinook/Chapter_04_Implementing_and_using_DBus_signals.html

Monday, October 26, 2015

Intercepting and redirecting traffic from VMWare Workstation

After some time I decided to try sslstrip again (note that there is an enhanced version of sslstrip on GitHub). First, I tried to do everything on a single host, i.e. sslstrip running on the same machine as the browser whose traffic I wanted to intercept. It turned out that this isn't so easy, because there is no iptables rule that would allow me to differentiate the traffic of Firefox from that of sslstrip. Because of that, if I were to redirect traffic destined for port 80 on the Internet so that Firefox's traffic goes to sslstrip, I would also redirect traffic from sslstrip to itself, creating an infinite loop. For some reason filtering based on PID, a.k.a. process identifier, isn't possible. It used to be possible, but then it was removed. There are two solutions to overcome the problem of running everything within a single OS:
  1. Run SSLStrip as a separate user (or Firefox as a separate user). IPTables allows match on UID, or
  2. Run Firefox or SSLStrip in a separate network namespace.
Instead, I decided to use VMWare Workstation (I had version 11.1.2 when I wrote this post) and to intercept its traffic. The reason I chose VMWare was that I already had one virtual machine running at hand. But it turned out that it isn't so easy either. In the end, the problem was caused by my lack of understanding of how VMWare Workstation works. Additional problems were caused by delays that happen after changing network parameters, i.e. it takes time for the changes to take effect and thus become observable.

So, the main problem was that traffic generated by the virtual machine isn't intercepted by standard Netfilter rules when the default configuration is used along with NAT networking. To see why, look at the following figure which shows the logical network topology:



The figure shows that traffic from the Guest OS goes to the virtual switch (vswitch) and then to the external network, bypassing the Host's netfilter hooks. So, even though there is a vmnet8 interface in the host OS and traffic can be seen on it, the traffic doesn't go through the standard netfilter hooks. Actually, vmnet8 is the Host's interface to the virtual switch. Also, if you take a look at how the network is configured within the Guest OS, you'll note that the gateway address is set to x.y.z.2 while the IP address of vmnet8 is x.y.z.1.

The behavior when the gateway is x.y.z.2 is:
  1. If you try to filter packets going from the Guest OS to the Internet using iptables, you won't succeed. I tried to put a DENY rule in the filter table, in all chains, and it didn't work. Actually, nothing will work, as the netfilter hooks aren't called at all.
So, the trick, in the end, is to change the default GW on the Guest OS so that traffic is routed to the Host OS where you can then manipulate it. In that case you'll also have to:
  1. Enable IP forwarding (/proc/sys/net/ipv4/ip_forward) on the Host OS.
  2. Activate NAT or masquerading for outgoing traffic if you want the guest machine to communicate with the outside world!
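The two steps above boil down to something like the following commands on the Host OS. This is a sketch, not a definitive recipe: eth0 as the host's uplink interface is an assumption, and both commands need root privileges.

```shell
# 1. Enable IP forwarding on the Host OS
echo 1 > /proc/sys/net/ipv4/ip_forward

# 2. Masquerade guest traffic leaving through the host's uplink
#    (eth0 is an assumed interface name; adjust to your setup)
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
```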
Note that you may observe some delay between setting some parameter/behavior (like adding a NAT rule) and the time the behavior becomes observable, i.e. starts to work. Anyway, at this point you can redirect network traffic, in other words, you can use the netfilter hooks.

So, to finish this post, you should start SSLStrip and add the following rule on Host OS:
iptables -A PREROUTING -t nat -i vmnet8 -p tcp \
        --dport 80 -j REDIRECT --to-port 10000
And then try to access, e.g., http://www.facebook.com. Be careful that you didn't previously access Facebook, because in that case the browser will know it must use https and it won't try http, even if you typed it in the URL bar. If this happens, then just clear the complete browser history and try again.

Friday, January 9, 2015

Getting free disk space in Linux

While working on a script to keep full Zimbra backups for as many days in the past as possible, I was trying to automatically remove old backups based on the free space value. Basically, the idea was to remove directory by directory until free space reached some threshold. Finding out the free space on a disk is easy using the df(1) command. Basically, it looks like this:
$ df -k /
Filesystem 1K-blocks     Used Available Use% Mounted on
/dev/sda1   56267084 39311864  16938836  70% /
The problem is that some postprocessing is necessary in order to obtain the desired value, i.e. the 4th column. The cut(1) command, in this case, is a bit problematic because in general you cannot expect the output to be so nicely formatted, nor is it fixed. For example, the first column is automatically resized based on the width of the widest device node in it. That in turn means the number of whitespace characters varies, and you end up being forced to use something other than cut(1). Probably the most appropriate tool is awk(1), since awk(1) can properly parse fields separated by a variable number of whitespace characters. In addition, you need to get rid of the first line. That can be done using head(1)/tail(1), but it is more efficient to use awk(1) itself. So, you end up with the following construct:
$ df -k / | awk 'NR==2 {print $4}'
16938836
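For comparison, cut(1) can be coerced into working by squeezing the repeated spaces first, but that costs two more processes in the pipeline (the -P option of df keeps each filesystem's output on a single line, so the fields don't shift):

```shell
# tail(1) drops the header line; tr(1) squeezes runs of spaces into one
# so that cut(1) sees a single-space delimiter; field 4 is "Available".
df -kP / | tail -n 1 | tr -s ' ' | cut -d ' ' -f 4
```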
But, for some reason, I wasn't satisfied with that solution because I thought I was using too complex tools for something that should be simpler. So, I started to search for some other way to obtain the free space of a partition. It turned out that the stat(1) command is able to do that, but it's rarely used for that purpose. It is usually used to find out data about files or directories, not file systems. Yet, there is an option, -f, that tells stat(1) we are querying a file system, and there is also an option --format which accepts format sequences in the style of the date(1) command. So, to get the free space on the root file system you can use it as follows:
$ stat -f --format "%f" /
4238805
The stat(1) command without the --format option prints all the data about the file system it can find out:
$ stat -f /
  File: "/"
    ID: b8a4e1f0a2aefb22 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 14066771   Free: 4238805    Available: 4234709
Inodes: Total: 3588096    Free: 2151591
This makes it in some way analogous to the df(1) command. But we are getting values in blocks instead of kilobytes! You can get the block size using the %S format sequence, but that's it. So, some additional trickery is needed. One solution is to output an arithmetic expression and evaluate it using the bc(1) command, like this:
$ stat -f --format "%f * %S" / | bc
17362145280
Alternatively, it is also possible to use shell's arithmetic evaluation like this:
$ echo $((`stat -f --format "%f * %S" /`))
17362145280
But in both cases we are starting two processes. In the first case the processes are stat(1) and bc(1), and in the second case a new subshell (for the backticks) and stat(1). Note that this is the same as the solution with awk(1), but in the awk(1) case we are starting two more complex tools, of which one, df(1), is targeted more at displaying values to a user than at being used in scripts. One additional advantage of the awk(1) method might be portability, i.e. the df(1)/awk(1) combination is probably more common than the stat(1)/bc(1) combination.
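One small refinement along the same lines: both values can be fetched with a single stat(1) call and multiplied by the shell itself, so neither bc(1) nor a backtick subshell for the expression is needed:

```shell
# Fetch free block count (%f) and block size (%S) in one stat(1) call,
# then let the shell's arithmetic expansion do the multiplication.
set -- $(stat -f --format "%f %S" /)
echo $(( $1 * $2 ))
```

This still starts one external process (stat), but the arithmetic itself happens in the already-running shell.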

Anyway, the difference probably isn't so big with respect to performance, but obviously there is another way to do it, and it was interesting to pursue an alternative. 

Tuesday, September 17, 2013

DHCPNAK messages in log file

While checking log files I spotted the following strange log entries:
Sep  7 11:32:20 srv dhcpd: DHCPREQUEST for 1.1.1.151 from 00:40:5a:18:83:56 via eth0
Sep  7 11:32:20 srv dhcpd: DHCPACK on 1.1.1.151 to 0:4:5:1:8:5 via eth0
Sep  7 11:32:20 srv dhcpd: DHCPREQUEST for 1.1.1.151 from 0:4:5:1:8:5 via 1.1.1.10
Sep  7 11:32:20 srv dhcpd: DHCPACK on 1.1.1.151 to 0:4:5:1:8:5 via 1.1.1.10
Sep  7 11:32:20 srv dhcpd: DHCPREQUEST for 1.1.1.151 from 0:4:5:1:8:5 via 1.1.0.10: wrong network.
Sep  7 11:32:20 srv dhcpd: DHCPNAK on 1.1.1.151 to 0:4:5:1:8:5 via 1.1.0.10
The problem is that the DHCP request is received three times; two of them get a positive answer (DHCPACK) while one receives a negative response (DHCPNAK) and dhcpd logs the error message 'wrong network'.

The important thing is the network configuration in this specific scenario, which looks as follows:
  +----+            +-----+              +----+
  |    |------------|     |--------------|    |
  +----+            +-----+              +----+
  Client      Firewall/DHCP relay      DHCP server
1.1.1.151    1.1.1.10     1.1.0.10       1.1.0.4
Looking at the log entries, not much can be inferred. The only thing that can be seen is that the third DHCPREQUEST came from 1.1.0.10, which isn't on the same network as the client requesting the IP address. Sniffing the network gave a bit more information on what's happening. Analyzing the network trace led to the following conclusions:

  1. There are three DHCPREQUEST messages with the same transaction ID, the same destination (1.1.0.4, i.e. DHCP server) and also client IP address field within DHCP request is set to 1.1.1.151.
  2. The first DHCPREQUEST comes directly from the client. It has source IP 1.1.1.151, and there is no relay field (i.e. the value is 0.0.0.0). Also, client MAC address field within DHCP request has MAC address of a given client. 
  3. The second DHCP request comes from the DHCP relay on the firewall. It has the source set to 1.1.0.10, and the relay field is properly set to 1.1.1.10, i.e. the IP address from the client's network.
  4. The third DHCP request also comes from DHCP relay on the firewall, but this time relay field is set to 1.1.0.10. This contradicts client's IP address and DHCP server rejects this request.
So, the conclusion is that the client sends a request to 1.1.0.4. This request is forwarded by the firewall to the server, but it is also intercepted by the DHCP relay on the firewall, which creates two proxy requests and sends them to the DHCP server too, one of which is rejected.

The interesting thing, not visible in the logs, is that the DHCP relay, upon receiving the NAK from the DHCP server, generates a new NAK that is broadcast on the network where the DHCP server lives.

So, the conclusion is that the firewall is wrongly configured. It should not forward DHCP requests if there is a relay agent running. Furthermore, those NAKs aren't seen by the client, only by the DHCP relay, which reflects them back to the DHCP server.

Thursday, September 12, 2013

Adding Zimbra disclaimer using shell scripts...

While Zimbra 8 (and 7, too) has domain-wide disclaimer support built in, there are two shortcomings that forced me to fall back to the old way of doing it:
  1. There is no support for skipping the disclaimer if one already exists, and
  2. There is no support for excluding some addresses from getting the disclaimer.
The second problem I managed to solve by patching the Amavis script. That approach adds extra maintenance effort (primarily during upgrades), but it works. Solving the first problem the same way was more work than I was prepared to invest, so I had to abandon the domain-wide disclaimer provided by Zimbra. There was also a third problem. Namely, for all mail messages sent from Outlook, Zimbra added two extra characters at the end of the HTML disclaimer, namely the characters "= A". Why, I don't have the slightest clue. I suspect it has something to do with encoding and decoding messages while going through the mail system, but the exact reasons are unknown to me.

So, I went to solve all those problems. First I tried the old way, namely modifying the postfix subsystem. It turned out that it didn't work; just for reference, I describe what I did at the end of this post. The next option was modifying amavis, but that turned out to be too complicated and error prone, as I said in the introductory paragraph. Finally, I decided to put a proxy script in front of altermime that will be called by amavis and will check whether there already is a disclaimer. If there isn't, it calls altermime. Note that in this way there was no need to change amavis, and that means a lot from the maintenance perspective. So, here is what I did.

First, I created the following simple script in /opt/zimbra/altermime directory:
#!/bin/bash
echo "`date +%Y%m%d%H%M%S` $@" >> /tmp/altermime-args
exec /opt/zimbra/altermime-0.3.10/bin/altermime-bin "$@"
What it does is log how it was called and then call altermime. Note one more important thing here: in order to be able to put this script before altermime, I had to call it altermime, and I renamed the altermime binary to altermime-bin. If you are doing this on a live system, be very careful how you do this switch. I suggest that you first create a script called altermime.sh, check that it works, and then use the following command to make the switch:
mv altermime altermime-bin && mv altermime.sh altermime
OK, in this way I was able to find out how altermime is actually called. This is what I saw in the /tmp/altermime-args file:
20130912100915 --input=/opt/zimbra/data/amavisd/tmp/amavis-20130912T100229-30384-pc8afS_K/email-repl.txt --verbose --disclaimer=/opt/zimbra/data/altermime/global-default.txt --disclaimer-html=/opt/zimbra/data/altermime/global-default.html
That's just one line of the output. As can be seen, the first argument specifies the file with the mail message, and the rest specify the disclaimer to be added. So, in order not to add a disclaimer if there is already one, I modified the altermime.sh script to have the following content:
#!/bin/bash
# run altermime only if the message doesn't already contain a disclaimer
if ! grep -q "DISCLAIMER:" "${1#--input=}"; then
    exec /opt/zimbra/altermime-0.3.10/bin/altermime-bin "$@"
fi
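Before wiring the wrapper in, the skip logic can be sanity-checked in isolation, for example like this (the marker string "DISCLAIMER:" is whatever text your disclaimer actually contains):

```shell
# Two test messages: one already carrying the disclaimer marker, one without.
with=$(mktemp); printf 'body\nDISCLAIMER: legal text\n' > "$with"
without=$(mktemp); printf 'body only\n' > "$without"

for f in "$with" "$without"; do
    if grep -q "DISCLAIMER:" "$f"; then
        echo "skip"    # disclaimer already present, altermime wouldn't run
    else
        echo "add"     # no disclaimer, altermime would be exec'ed
    fi
done
rm -f "$with" "$without"
```

For the first file this prints "skip", and for the second "add", mirroring the two branches of the wrapper.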
Again, be careful if you are modifying this script on a live system.

Now, in order to control where the disclaimer is added, you can modify this simple shell script. One more thing you should be aware of: this approach impacts performance since, instead of running one process, it now runs at least three per mail message, and there are a few extra file accesses.

Finally, as a side note, I managed to get rid of those strange characters added to Outlook's email messages. I just edited the HTML file that contains the disclaimer a little bit, and those characters were gone. That's definitely a bug somewhere, but who knows where...

The old way that didn't work

As I said, the first approach I tried was the procedure from the Wiki, but it didn't work. Anyway, for reference, here is what I tried to do. Note that, as Zimbra already ships with altermime, there is no need to install it. The altermime binary is in the /opt/zimbra/altermime/bin directory and you can safely use it. OK, now to the changes:

First, change the line in master.cf.in that reads
smtp    inet  n       -       n       -       -       smtpd
into
smtp    inet  n       -       n       -       -       smtpd        -o content_filter=dfilt:
and also add the following two lines:
dfilt   unix  -       n       n       -       -       pipe
        flags=Rq user=filter argv=/opt/zimbra/postfix/conf/disclaimer.sh -f ${sender} -- ${recipient}
Note that this last line specifies that your script is called disclaimer.sh and that it is placed in the /opt/zimbra/postfix/conf directory. This script, when run, is run with the privileges of the user filter. Also, be careful where you put those lines: namely, put them after the following three lines:
%%uncomment SERVICE:opendkim%%  -o content_filter=scan:[%%zimbraLocalBindAddress%%]:10030
%%uncomment LOCAL:postjournal_enabled%% -o smtpd_proxy_filter=[%%zimbraLocalBindAddress%%]:10027
%%uncomment LOCAL:postjournal_enabled%% -o smtpd_proxy_options=speed_adjust
The reason is that those lines logically belong to the first smtp line, and if you add dfilt in front of them, you'll mess things up, probably very badly, depending on your luck!

If you had Zimbra's domain-wide disclaimer enabled, disable it using:
zmprov mcf zimbraDomainMandatoryMailSignatureEnabled FALSE
as a zimbra user, and then restart amavis:
zmamavisdctl restart
still as a zimbra user.

Finally, to activate the custom disclaimer script, run the following command as the zimbra user:
zmmtactl restart
After I did all that, it didn't work. :D But then I realized that there are two content_filter options for smtp, which might be why, and so I resorted to proxying altermime.

Saturday, August 10, 2013

Getting Libreswan to connect to Cisco ASA 5500

Here are some notes about problems I had while trying to make Libreswan connect to a Cisco ASA. Note that I'm using certificate-based authentication in a roadwarrior configuration. The setup was done on Fedora 19.

The procedure, at least in theory, is quite simple: edit the /etc/ipsec.conf file, modify the /etc/ipsec.d/ipsec.secrets file, import certificates into the NSS database and then start the libreswan daemon. Finally, activate the connection. Well, it turned out that the theory is very far from the practice. So, here we go.

Preparation steps

First, I used the following ipsec.conf file when I started to test connection to the ASA:
version 2.0     # conforms to second version of ipsec.conf specification
# basic configuration
config setup
    nat_traversal=yes
    nhelpers=1
    protostack=netkey
    interfaces=%defaultroute

conn VPN
    # Left side is RoadWarrior
    left=%defaultroute
    leftrsasigkey=%cert
    leftcert=SGROS
    leftca=ROOTCA
    leftid=%fromcert
    leftsendcert=always
    # Right side is Cisco
    right=1.1.1.1 # IP address of Cisco VPN
    rightrsasigkey=%cert
    rightcert=CISCO
    rightca=%same
    rightid=%fromcert
    # config
    type=tunnel
    keyingtries=2
    disablearrivalcheck=no
    authby=rsasig
    auth=esp
    keyexchange=ike
    auto=route
    remote_peer_type=cisco
    pfs=no
Note a few things about this configuration. First, my client machine (the roadwarrior) is the left node, while the Cisco is the right one. Next, I didn't arrive at this configuration immediately; I had to experiment with the values of the interfaces and left statements. The reason is that I'm assigned a dynamic NATed address, so those settings cause libreswan to automatically select the appropriate values (interface and IP address) at the time I'm connecting to the VPN. The certificate-related stuff also took me some time to figure out.

Into ipsec.secrets file I added the following line:
: RSA SGROS
This will cause libreswan to use an RSA key for authentication. Finally, I had to import the certificates and keys into the NSS database. Note that the NSS database is already precreated in the /etc/ipsec.d directory. More specifically, the database consists of the files cert8.db, key3.db and secmod.db. To see the imported certificates (if there are any), use the following command:
# certutil -L -d /etc/ipsec.d/
Certificate Nickname               Trust Attributes
                                   SSL,S/MIME,JAR/XPI
ROOTCA                             CT,C,C
SGROS                              u,u,u
CISCO                              P,P,P
In my case there are three certificates in the database: mine (SGROS), the VPN gateway's (CISCO) and the CA that signed them (ROOTCA). Note that I'm referencing those certificates in the ipsec.conf file. If you are configuring this database for the first time, it will be empty and you'll have to import all the certificates.

To import the CA into your database, use the following command:
certutil -A -i rootca.pem -n ROOTCA -t "TC,TC,TC" -d /etc/ipsec.d/
Note that I'm assuming you have the PEM version of the certificate stored in the current directory (the argument to the -i option). For the rest of the options and their meaning, please consult the man page.
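If the CA certificate was handed to you in binary DER form instead of PEM, convert it first. A sketch with openssl (the file names are my assumption):

```shell
# Convert a DER encoded CA certificate to PEM so certutil can import it:
openssl x509 -inform der -in rootca.der -outform pem -out rootca.pem

# Sanity check before importing: print the subject and validity period
openssl x509 -in rootca.pem -noout -subject -dates
```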

To import your certificate, with a private key, use pk12util instead, since certutil cannot import PKCS#12 files:
pk12util -i certkey.pfx -d /etc/ipsec.d/
Note that this time the certificate and private key are stored in a PKCS#12 file (named certkey.pfx); a certificate imported together with its private key gets the u,u,u trust attributes shown above.
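If you have the certificate and the private key as two separate PEM files, you can produce the PKCS#12 bundle yourself with openssl. A sketch (the input file names are my assumption, the nickname and output name match the example above):

```shell
# Bundle a PEM certificate and its private key into a PKCS#12 file;
# -name sets the nickname the key pair will carry into the NSS database:
openssl pkcs12 -export -in sgros-cert.pem -inkey sgros-key.pem \
        -name SGROS -out certkey.pfx
```

You will be prompted for an export password, which the import step then asks for.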

Finally, to import the certificate of the Cisco ASA (again assuming a PEM file, here named cisco.pem), use the following command:
certutil -A -i cisco.pem -n CISCO -t "P,P,P" -d /etc/ipsec.d/
Note that the command is very similar to the one used to import the ROOTCA, but the nickname and the trust attributes (option -t) are different. Namely, you don't want the ASA's certificate to act as a CA, i.e. to be able to issue new certificates.

Starting and debugging

To start the Libreswan daemon, I used the following command:
ipsec pluto --stderrlog --config /etc/ipsec.conf --nofork
That way I forced it not to go into the background (--nofork) and to log to stderr (--stderrlog). Then, in another terminal, I would trigger VPN establishment using the following command:
ipsec auto --up VPN
The first problem was that Libreswan said it could not determine which end of the connection it is, i.e. I was receiving the following error message:
022 "VPN": We cannot identify ourselves with either end of this connection.
That error message took me some time to resolve. I tried everything possible to let Libreswan know whether it is the left or the right part of the configuration, which included changing roles several times, changing different parameters and other stuff. In the end, it turned out that this had nothing to do with the configuration file; the problem was actually a missing kernel module!? NETKEY wasn't loaded, so Libreswan couldn't access the IPsec stack within the kernel. To be honest, it could be inferred from the log by noting the following lines:
No Kernel XFRM/NETKEY interface detected
No Kernel KLIPS interface detected
No Kernel MASTKLIPS interface detected
Using 'no_kernel' interface code on 3.10.4-300.fc19.x86_64
But then again, a more informative error message would have helped a lot! In the end, the following command solved that problem:
modprobe af_key
and then, in the logs, I saw the following line:
Using Linux XFRM/NETKEY IPsec interface code on 3.10.4-300.fc19.x86_64
which confirmed NETKEY was now accessible, and the error message about not knowing which end of the connection it is disappeared.
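Since modprobe only loads the module until the next reboot, it makes sense to have it loaded automatically. A sketch for a systemd based distribution such as Fedora (the file name under modules-load.d is my choice):

```shell
# Check that the module is currently loaded:
lsmod | grep af_key

# Have systemd load it on every boot:
echo af_key > /etc/modules-load.d/ipsec.conf
```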

Next, I had a problem with the peer's public key. The error message I received was:
003 "VPN" #1: no RSA public key known for 'C=HR, L=Zagreb, O=Some ORG, CN=Cisco ASA'
Again, I lost a lot of time trying to figure out why it could not access the public key even though it is in the certificate. I even tried to extract the public key and write it directly into the configuration file. Nothing helped. Then, when I turned on debugging of x509 in ipsec.conf, I found some suspicious messages, like the following ones:
added connection description "VPN"
| processing connection VPN
|   trusted_ca called with a=CN=ROOTCA b=
|   trusted_ca returning with failed
|   trusted_ca called with a=CN=ROOTCA b=\001\200\255\373
|   trusted_ca returning with failed
Note the garbage as the second argument of the function trusted_ca!? Googling around for this didn't reveal anything useful. But then, out of desperation, I tried removing the leftca and rightca parameters from ipsec.conf and, guess what, everything started to work. Checking the logging output again, I saw that the b parameter now had the same value as a.

Yet, it still didn't work, and after some tinkering I suspected that XAuth is enabled and required on the Cisco side. I concluded this from the log output where Libreswan shows what it received from the Cisco:
"VPN" #1: received Vendor ID payload [Cisco-Unity]
"VPN" #1: received Vendor ID payload [XAUTH]
"VPN" #1: ignoring unknown Vendor ID payload [...]
"VPN" #1: ignoring Vendor ID payload [Cisco VPN 3000 Series]
At first, I thought that Libreswan would support XAuth out of the box but, obviously, if it is not configured Libreswan cannot use it. The manual page also says that XAuth is disabled by default. So, after adding the following statements to the ipsec.conf file:
leftxauthclient=yes
leftxauthusername=sgros
and adding the appropriate line to the ipsec.secrets file:
@sgros : XAUTH "mypassword"
I managed to get further. Yet, it still didn't work. Looking again at the log output, I realised that something was wrong with the client configuration. I also got segfaults there that I didn't report upstream for fear that I might leak some secret information. But, after adding the following statements to ipsec.conf, the segmentation fault was solved:
modecfgpull=yes
leftmodecfgclient=yes
In the logging output of Libreswan I saw that the configuration parameters were properly obtained, i.e.:
"VPN" #1: modecfg: Sending IP request (MODECFG_I1)
"VPN" #1: received mode cfg reply
"VPN" #1: setting client address to 192.168.2.33/32
"VPN" #1: setting ip source address to 192.168.2.33/32
"VPN" #1: Received subnet 192.168.0.0/16, maskbits 16
"VPN" #1: transition from state STATE_MODE_CFG_I1 to state STATE_MAIN_I4
Now it seemed like everything was connected, but ICMP probes were not going through. Using the setkey command, I checked that the policies and associations were correctly installed into the kernel, which they were. I quickly realised that the problem was that Libreswan hadn't assigned an IP address to my local interface, nor had it installed the routes. That was easy to check by just listing the interface's IP addresses. To see if this was really the problem, I manually assigned the address and route:
# ip addr add 192.168.2.33/32 dev wlp3s0
# ip ro add 192.168.0.0/22 src 192.168.2.33 via 192.168.1.1
and after that I was able to reach addresses within the destination network. Note that the IP address given as the argument to the via keyword (in my case 192.168.1.1) isn't important since XFRM will change it anyway. So, the question was why this address wasn't added in the first place.
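Whether the address and route really ended up on the interface is easy to verify (the interface name and addresses below are from my setup, substitute your own):

```shell
# The /32 address should now be on the wireless interface:
ip addr show dev wlp3s0 | grep 192.168.2.33

# And the kernel should pick a route for an address in the remote network:
ip route get 192.168.0.5
```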

After some poking around I found that the script /usr/libexec/ipsec/_updown.netkey is called to set up all the parameters and, looking into that script, I found that it didn't do anything when pluto calls it with the up-client parameter! So, no wonder nothing happened. I also found a post on the Internet about that problem. The fix is simple, as shown in the post I linked, but it messes something up with the routes. After some further investigation I discovered that, when adding the locally assigned IP address, the script messes up the netmask. To cut the story short, I changed the following line:
-it="ip addr add ${PLUTO_MY_SOURCEIP%/*}/${PLUTO_PEER_CLIENT##*/} dev ${PLUTO_INTERFACE%:*}"
+it="ip addr add ${PLUTO_MY_CLIENT} dev ${PLUTO_INTERFACE%:*}"
and also, I changed the following lines for IP address removal:
-it="ip addr del ${PLUTO_MY_SOURCEIP%/*}/${PLUTO_PEER_CLIENT##*/} dev ${PLUTO_INTERFACE%:*}"
+it="ip addr del ${PLUTO_MY_CLIENT} dev ${PLUTO_INTERFACE%:*}"
You can get the complete patch here.
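To apply such a change, I'd patch the script in place. A sketch, assuming the diff is saved in a file named _updown.netkey.patch (that file name is my assumption):

```shell
# Keep a copy of the distribution version before touching it:
cp /usr/libexec/ipsec/_updown.netkey /usr/libexec/ipsec/_updown.netkey.orig

# Apply the one-line fixes shown in the diff above:
patch /usr/libexec/ipsec/_updown.netkey < _updown.netkey.patch
```

Keep in mind that a package update may overwrite the patched script, so you might have to reapply it.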

Some random notes

It might happen that Libreswan suddenly stops working and you cannot access the network, i.e. you can only ping your local address, but not the local router. In that case, try to clear the XFRM policy using the setkey command:
setkey -FP
You can also check if there is anything left with:
setkey -DP
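The setkey utility comes from the separate ipsec-tools package. If it isn't installed, the same kernel state can be inspected and flushed with iproute2; these are rough equivalents, not identical in output:

```shell
# iproute2 equivalents of the setkey commands above:
ip xfrm policy          # list installed policies (like setkey -DP)
ip xfrm state           # list security associations (like setkey -D)
ip xfrm policy flush    # flush all policies (like setkey -FP)
```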
