Monday, December 15, 2014

Incremental backup with rsync and cp with low disk demand

I just created a very simple incremental backup solution for my Zimbra installation that uses disk space efficiently. The idea is simple: first, using rsync, I make a copy of the existing Zimbra installation to another disk:
rsync --delete --delete-excluded -a \
        --exclude zimbra/data/amavisd/ \
        --exclude zimbra/data/clamav/ \
        --exclude zimbra/data/tmp \
        --exclude zimbra/data/mailboxd/imap-inactive-session-cache.data \
        --exclude zimbra/log \
        --exclude zimbra/zmstat \
        /opt/zimbra ${DSTDIR}/
I excluded from synchronization some directories that are not necessary for restoring Zimbra. Then, using cp, I create a copy of this directory that consists only of hard links to the original files; the content itself isn't copied:
cd ${DSTDIR}
cp -al zimbra zimbra.`date +%Y%m%d%H%M`
Note the option -l that tells cp to hard-link files instead of copying them. Also note that the copy is named so that it contains the timestamp of when it was created. Here is the content of the directory:
$ ls -l ${DSTDIR}
total 16
drwx------ 7 root   root    4096 Pro  9 15:31 zimbra
drwx------ 7 root   root    4096 Pro  9 15:31 zimbra.201412131551
drwx------ 7 root   root    4096 Pro  9 15:31 zimbra.201412140326
drwx------ 7 root   root    4096 Pro  9 15:31 zimbra.201412150325
The next time rsync runs, it will delete files that don't exist any more, and for changed files it will create a new copy and then remove the old one. Removing the old one means unlinking it, which in essence leaves the old version saved in the directory made by cp. This way you allocate space only for new and changed files, while unchanged ones share disk space.

This system uses only the space it needs. Now, it is interesting to note the du command's behavior in the presence of hard links. Here is an example:
# du -sh zimbra*
132G      zimbra
3.4G      zimbra.201412131551
3.2G      zimbra.201412140326
114M      zimbra.201412150325
# du -sh zimbra.201412131551
132G      zimbra.201412131551
# du -sh zimbra.201412150325
132G      zimbra.201412150325
In the first case it reports how much space is used by the main directory, zimbra, and then only the additional usage of the other directories, e.g. zimbra uses 132G and zimbra.201412131551 uses 3.4G of additional (changed) data. But when we give du a specific directory on its own, it reports how much that directory uses by itself, so we see that all the files in zimbra.201412131551 indeed amount to 132G.

And that's basically it. These two commands (rsync and cp) are placed in a script with some additional boilerplate code and everything is run from cron.
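
For illustration, the whole thing might look roughly like the minimal script below. Treat it as a sketch: the destination directory, the exact exclude list and the script path are assumptions you would adapt to your own setup.
#!/bin/bash
# Minimal sketch of the backup script described above; DSTDIR is an assumption.
DSTDIR=/backup/zimbra

rsync --delete --delete-excluded -a \
        --exclude zimbra/data/amavisd/ \
        --exclude zimbra/data/clamav/ \
        --exclude zimbra/data/tmp \
        --exclude zimbra/data/mailboxd/imap-inactive-session-cache.data \
        --exclude zimbra/log \
        --exclude zimbra/zmstat \
        /opt/zimbra ${DSTDIR}/ || exit 1

# Create a hard-linked, timestamped snapshot of the fresh copy (cp -l links instead of copying).
cd ${DSTDIR} || exit 1
cp -al zimbra zimbra.$(date +%Y%m%d%H%M)
A crontab entry such as 0 3 * * * /usr/local/sbin/zimbra-backup.sh (the path is, again, hypothetical) then runs it nightly.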

Thursday, December 11, 2014

How to determine if some blob is encrypted or not

If you ever wondered how hard it is to differentiate between an encrypted file and a regular binary file, then wonder no more, because the answer is: very easy, at least in principle. Namely, by looking at the distribution of octets in a file you can tell whether it is encrypted or not. The point is that after encryption the file must look like a random sequence of bytes, so every byte value, from 0 to 255, will occur approximately the same number of times in the file. On the other hand, text files, images, and other files will have some bytes occurring more frequently than others; for example, in text files the space character (0x20) usually occurs most frequently. So, the procedure is very easy: just count how many times each octet occurs in the file and then look at the differences. You can do it by hand(!), write your own application, or use some existing tool.
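
As a quick illustration of that counting, a byte histogram can be produced with standard tools; somefile is a placeholder for whatever file you're inspecting:
od -An -v -tu1 somefile | tr -s ' ' '\n' | grep -v '^$' | sort -n | uniq -c | sort -rn | head
The first column is the number of occurrences and the second is the byte value; for a text file the value 32 (space) will typically dominate, while for an encrypted file the counts will be nearly uniform.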

Of course, looking at the whole distribution isn't so convenient, so for a quick check it is possible to use entropy as a measure of randomness. Basically, low entropy means that the file is not encrypted, while high entropy means it is. Entropy, in the case of a file that consists of a stream of octets, is in the range [0,8], with 0 being no entropy at all and 8 being the maximum. I found a relatively good explanation of entropy that also has links to code for calculating the entropy of a file. Now, in case you don't have a tool for calculating entropy at hand, a compression tool is useful too, because random data cannot be compressed. That is also the reason why compression is always done before encryption: encrypted data looks random and is thus incompressible.
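
If you just need a rough number and don't want to compile anything, the same entropy calculation can be sketched with od and awk (somefile is again a placeholder, and there is no error handling):
od -An -v -tu1 somefile | tr -s ' ' '\n' | \
        awk 'NF { c[$1]++; n++ } END { for (b in c) { p = c[b] / n; H -= p * log(p) / log(2) }; printf "%.2f bits per byte\n", H }'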

Let me demonstrate all this with an example. Take, for example, this file; it is the file linked to by the blog post I mentioned previously, in which entropy is explained. It is a simple C file you can download to your machine and compile into an executable called entropy1:
$ gcc -Wall -o entropy1 entropy1.c -lm
So, let us see what is the entropy of the C file itself:
$ ./entropy1 entropy1.c
4.95 bits per byte
That is relatively low entropy, meaning there isn't much information content relative to the size. OK, let's encrypt it now using the openssl tool:
$ openssl enc -aes-128-cbc \
        -in entropy1.c -out entropy1.c.aes \
        -K 000102030405060708090a0b0c0d0e0f \
        -iv 000102030405060708090a0b0c0d0e0f
The given command encrypts the input file (entropy1.c) in AES-CBC mode using the 128-bit key 000102030405060708090a0b0c0d0e0f and the initialization vector 000102030405060708090a0b0c0d0e0f. The output is written to a file named entropy1.c.aes. Let's see what the entropy is now:
$ ./entropy1 entropy1.c.aes
7.86 bits per byte
That's very high entropy, in line with what we've said about encrypted files. Let's check how compressible the original and the encrypted file are:
$ zip entropy1.zip entropy1.c
  adding: entropy1.c (deflated 48%)
$ zip entropy2.zip entropy1.c.aes
  adding: entropy1.c.aes (stored 0%)
As can be seen, the encrypted file isn't compressible, while the plain text is. What about the entropy of those compressed files?
$ ./entropy1 entropy1.zip
7.53 bits per byte
$ ./entropy1 entropy2.zip
7.79 bits per byte
As you can see, they both have high entropy, which means that entropy alone cannot be used to differentiate between compressed and encrypted files. So how do we tell them apart? In that case statistical tests of randomness have to be performed. A good tool for that purpose is ent; among other things, it performs a chi-square test and a Monte Carlo pi approximation. I found a good blog post about using that tool. At its end there is a list of rules of thumb that can be used to determine whether a file is encrypted or compressed:
  • Large deviations in the chi square distribution, or large percentages of error in the Monte Carlo approximation are sure signs of compression.
  • Very accurate pi calculations (< .01% error) are sure signs of encryption.
  • Lower chi values (< 300) with higher pi error (> .03%) are indicative of compression.
  • Higher chi values (> 300) with lower pi errors (< .03%) are indicative of encryption.
Take those numbers only as indicative, because we'll see different values in the example later; nevertheless, they are a good hint at what to look for. Also, let me explain where the value 300 for chi-square comes from. If you watched the video linked from that blog post, you'll know that there are two important parameters in a chi-squared test: the number of degrees of freedom and the critical value. In our case there are 256 possible byte values, which translates into 255 degrees of freedom. Next, if we select p=0.05, i.e. we want to determine whether the stream of bytes is random with 95% certainty, then looking into a chi-square table we obtain the critical value 293.24, which is roughly 300. When the chi-square statistic is below that value we accept the null hypothesis, i.e. the data is random; otherwise we reject the null hypothesis, i.e. the data isn't random.

Here is ent's output for the encrypted file:
$ ./ent entropy1.c.aes
Entropy = 7.858737 bits per byte.
Optimum compression would reduce the size
of this 1376 byte file by 1 percent.
Chi square distribution for 1376 samples is 253.40, and randomly
would exceed this value 51.66 percent of the times.
Arithmetic mean value of data bytes is 129.5407 (127.5 = random).
Monte Carlo value for Pi is 3.091703057 (error 1.59 percent).
Serial correlation coefficient is 0.040458 (totally uncorrelated = 0.0).
Then, here it is for the file that is only compressed:
$ ./ent entropy1.zip
Entropy = 7.530339 bits per byte.
Optimum compression would reduce the size
of this 883 byte file by 5 percent.
Chi square distribution for 883 samples is 1203.56, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 111.6512 (127.5 = random).
Monte Carlo value for Pi is 3.374149660 (error 7.40 percent).
Serial correlation coefficient is 0.183855 (totally uncorrelated = 0.0).
And finally, for the encrypted and then compressed file:
$ ./ent entropy2.zip
Entropy = 7.788071 bits per byte.
Optimum compression would reduce the size
of this 1554 byte file by 2 percent.
Chi square distribution for 1554 samples is 733.20, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 121.0187 (127.5 = random).
Monte Carlo value for Pi is 3.166023166 (error 0.78 percent).
Serial correlation coefficient is 0.162305 (totally uncorrelated = 0.0).
First, note that the error in the pi calculation is highest for the compressed-only file (7.40 percent vs. 1.59 and 0.78 percent). Next, the chi-square statistic is below 300 in the first case, which indicates encryption (i.e. the data looks random), while in the other two cases it is well above 300, meaning the data is not so random!

Finally, here is a link to an interesting blog post describing file type identification based on byte distribution and its application to reversing XOR encryption.

Thursday, December 4, 2014

Lenovo W540 and Fedora 21

At the end of November 2014 I got a new laptop, a Lenovo W540, and I immediately started taking notes about this machine and my impressions. I'm a long-time user of Lenovo's W series, and I think those notebooks are very good machines, albeit a bit more expensive and thus probably not for the average user. Anyway, this post has been in the making for some time now, and I'll update it in due course. In it I'll write about my impressions, as well as about installing and using Fedora 21 on this machine. Note that when I started writing this post Fedora 21 was in beta, so everything I say here might change in the final release (in case I don't update the post).

First, let me start by positive observations about the machine:
  • The new machine is thinner than the W530.
  • The power adapter seems to be smaller than the one for the W530.
  • It is very easy to access the RAM slots and the hard disk bay in case you want to upgrade the RAM and/or put in another disk.
Well, true, that's a very short list. So, here are some negative ones:
  • They changed the power adapter connector again. In other words, none of the adapters you already have (and I have quite a lot!) will work with this machine.
  • There is no lock that holds the lid closed.
  • At first I thought that there are no LEDs that show the state of the laptop when the lid is closed. This matters because otherwise you can't tell whether the laptop is in sleep mode while the lid is closed. But later I realised there is one: the dot over the letter i in the ThinkPad logo in the lower right corner of the lid (looking from above).
  • There is a numeric keypad that, honestly, I don't need. The space for it was gained by not having speakers on both sides of the keyboard as on the W530. Later I realised even more how cumbersome this part of the keyboard is: I hold this machine on my lap a lot, and because of the numeric keypad the laptop cannot be centered while I'm holding it there.
  • There are no longer separate buttons on the touchpad; the touchpad itself is a button. But I managed to get used to it by getting rid of the reflex to click with a separate button.
  • The function keys are overloaded with additional functionality. For example, the F1 key is now mute and it even has an LED indicator! Furthermore, by default the keys perform the alternative functions that you got on previous models by holding the Fn key; now it is the opposite, and you get the regular behaviour of those keys by pressing the Fn key in the lower left corner. This is weird! Only later did I find out that there is a small LED on the Fn key: if you press Fn+Esc it turns on, meaning that the keys now act as plain function keys, F1, F2, etc.
To be honest, I don't like those changes and probably it'll take some time until I get used to them.

Installing OS

OK, now about the Fedora 21 installation. First, I switched the laptop to use UEFI boot exclusively. I don't know if this is good or bad, but in the end I did it. Note that there is a hybrid mode, i.e. both legacy BIOS and UEFI are tried during the boot process (whichever one manages to boot the machine), but I didn't use it. Anyway, since I removed the CD-ROM drive I had to boot the installer some other way. First I tried PXE boot, but had no luck with UEFI; I did manage to boot the machine in legacy BIOS mode, which means I had configured network boot properly for legacy BIOS but not for UEFI. Since I wanted UEFI boot, I gave up on this option.

Since I managed to obtain a USB stick I decided to go that route. First I dd'ed the efidisk.img file to it; that booted the laptop, but it couldn't find anything to install from, let alone start the installation. So I downloaded the live Fedora 21 Workstation image and dd'ed that to the USB stick instead. That worked.
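
For reference, writing the live image to the stick boils down to something like the following; the image file name and the target device /dev/sdX are assumptions, and you should triple-check the device because dd will happily overwrite the wrong disk:
dd if=Fedora-Live-Workstation-x86_64-21.iso of=/dev/sdX bs=4M
sync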

For some strange reason, I decided to use the Btrfs filesystem. Actually, the reason is that I can have separate root and home mount points that share the same pool of free space; that way I can't end up with low free space on one partition and plenty on another. But I didn't notice that encryption is selected for the whole volume, not per file system, i.e. per mount point. Since no reasonable person installs an OS these days without encryption, I redid the installation several times until I got past that problem.
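
To illustrate the shared-pool idea outside the installer, the layout amounts to roughly the following; the device name and subvolume names are assumptions, not what Anaconda literally runs:
mkfs.btrfs /dev/mapper/luks-root        # encrypted container, hypothetical name
mount /dev/mapper/luks-root /mnt
btrfs subvolume create /mnt/root        # will be mounted as /
btrfs subvolume create /mnt/home        # will be mounted as /home, shares free space with root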

While working with the new OS, what frustrated me a lot was the touchpad. It didn't register a click when I tried to left-click, it scrolled randomly, and I couldn't figure out how to middle-click or right-click. Another problem was pressing a mouse button and then scrolling while the button was held down; that is also somewhat problematic on this touchpad.

Here are some additional links I found:
That's it for now. Stay tuned for more...

Monday, November 24, 2014

How to experiment and learn about BIOS malware

While trying to make VMWare Workstation work with the new kernel in Fedora 20, the page where I found the solution had a section about extracting the BIOS, with a subsection showing how to use a custom BIOS for a virtual machine. Because lately I'm deep into malware analysis, it occurred to me that this is actually a great opportunity to experiment with BIOS malware for educational and research purposes. Using real hardware for that purpose would be very problematic because it's not easy to modify the BIOS just like that. So, in essence, what we would like to do is:
  1. Extract BIOS used by VMWare.
  2. Decompile it.
  3. Modify.
  4. Compile.
  5. Install and use.
So, while searching for how to do that I stumbled on a PHRACK magazine article that describes just that: how to infect a BIOS. It also describes how to instruct VMWare to stop in the BIOS and allow gdb to be attached for BIOS debugging! In the end, it turned out that this topic is already well studied. Here are some interesting resources I found:

Lately, UEFI is much more interesting to experiment with because manufacturers are gradually switching from the old BIOS to a new boot method with additional protections. It turns out that VMWare Workstation, starting with version 8, supports UEFI boot too. All that is necessary is to add the following line to the vmx configuration file of a virtual machine:
firmware="efi"
So, this is a great research and learning opportunity. Yet, it is very hard to find information on how to manipulate UEFI BIOS. One reason might be that it is relatively new and not many people know what it does and how it works.

While searching for information on how to infect and manipulate UEFI, I found the following URLs to be interesting:
  1. http://www.projectosx.com/forum/index.php?showtopic=3018
  2. http://wiki.osdev.org/UEFI
  3. http://uefi.org/learning_center/presentationsandvideos
  4. http://linuxplumbers.ubicast.tv/videos/uefi-tutorial-part-1/
  5. http://tianocore.sourceforge.net/wiki/Welcome
  6. http://vzimmer.blogspot.com/2012/12/accessing-uefi-form-operating-system.html

Sunday, November 9, 2014

Fedora 20 update to kernel 3.17.2-200 and VMWare Workstation

When I updated to VMWare Workstation 10.0.5 at the end of January 2015, things broke again. Returning to this post, I found that the link in it now points to something that has changed, with no patch or instructions on what to do. So I had to google again, and this time I've put the complete instructions in this post so that next time I don't have to.

It turns out that a single line has to be changed in the vmnet module for VMWare to be runnable again. So, here are the steps to patch it:
  1. Create a temporary directory, e.g. /tmp/vmware, and change into it.
  2. Create a file named vmnet.patch in it with the following content:

    diff -ur vmnet-only.a/netif.c vmnet-only/netif.c
    --- vmnet-only.a/netif.c    2014-10-10 03:23:08.585920012 +0300
    +++ vmnet-only/netif.c  2014-10-10 03:23:09.245920008 +0300
    @@ -149,7 +149,7 @@
        memcpy(deviceName, devName, sizeof deviceName);
        NULL_TERMINATE_STRING(deviceName);
    
    -   dev = alloc_netdev(sizeof *netIf, deviceName, VNetNetIfSetup);
    +   dev = alloc_netdev(sizeof *netIf, deviceName, NET_NAME_UNKNOWN, VNetNetIfSetup);
        if (!dev) {
           retval = -ENOMEM;
           goto out;
    

  3. Unpack /usr/lib/vmware/modules/source/vmnet.tar in the current directory (/tmp/vmware):

    tar xf /usr/lib/vmware/modules/source/vmnet.tar
    

  4. Patch the module:

    cd vmnet-only; patch -p1 < ../vmnet.patch; cd ..
    

  5. Make a copy of the old, unpatched archive:

    mv /usr/lib/vmware/modules/source/vmnet.tar /usr/lib/vmware/modules/source/vmnet.tar.SAVED
    

  6. Create a new archive:

    tar cf /usr/lib/vmware/modules/source/vmnet.tar vmnet-only
    

  7. Start the vmware module configuration process:

    vmware-modconfig --console --install-all
    
Hopefully, that should be it.

Old instructions (not valid any more!)

Well, here we go again. After a recent update that brought kernel 3.17 to Fedora 20, VMWare Workstation 10.0.4 had problems with its kernel modules. Luckily, after some short googling I found a solution, and it works. There are two things that might confuse you though:
  1. After the cd command and before the for loop you have to switch to the root account (indicated by the prompt changing from $ to #).
  2. The substring kernel-version in the patch command should be replaced with the string "3.17". That is actually the name you gave to the file when executing the curl command at the beginning of the process.
Anyway, that's it.

Wednesday, September 24, 2014

Anonymous paper reviews and threat of a legal action

I just stumbled on a news story in which a scientist claims that his career was severely damaged by anonymous comments about some of his work published on PubPeer. This is a very interesting story to follow, for several reasons.

For a start, PubPeer is a site for post-publication review. I strongly support such a practice because I believe that everything has to be scrutinized and tested; it helps authors, who can get the best possible feedback, and it also helps society in general, because there is an ever-increasing problem with scientific ethics. As a side note, I was, and still am, a big proponent of doing the review process in public, which in my opinion significantly increases transparency. Anyway, PubPeer fulfils my wishes, but unfortunately for me it is only concerned with papers from medicine, chemistry and related fields, not computer science.

In this particular case, the problem is that the author was offered a job at the University of Mississippi, with quite a large annual salary, and he quit his existing job to take it. The university then revoked the offer, so he lost both the new job and the old one. Now he claims that the reason for this was some anonymous negative comments on PubPeer, and he is threatening a lawsuit to obtain the identities of those who made them.

While, as I said, it is very good to have such a site, it doesn't mean that everything should be allowed, more specifically:
  1. Any claims made have to be justified. Unfortunately, anonymity also allows people to make damaging or unjustified claims by being certain that there will be no repercussions.
  2. Unfortunately, a negative claim, even if not justified, casts doubt, and that might be a problem.
  3. In this particular case it is also unknown why the author didn't respond to the claims about problems in his paper; PubPeer says they invite the first and last authors to respond to comments.
  4. Finally, no one should take lightly claims that a paper is invalid, poor, etc. In this particular case, I hope the University of Mississippi verified the negative claims rather than simply accepting what some anonymous commenters said.
In any case, we'll see what will happen with this particular case.

Saturday, August 23, 2014

Memory access latencies

Once I saw a table in which memory latencies are scaled so that a CPU cycle is defined to be 1 second; L1 cache latency then becomes several seconds, L2 cache even more, and so on up to SCSI command timeouts and system reboots. This was very interesting because I have a much better developed sense for seconds and larger time units than for nanoseconds, microseconds, etc. A few days ago I remembered that table and wanted to see it again, but couldn't find it, and I couldn't remember which book it was from. So I started to google for it and finally, after an hour or so, I managed to find this picture. It turns out it is from the book Systems Performance by Brendan Gregg. So, I decided to replicate it here for future reference:


Table 2.2: Example Time Scale of System Latencies
Event                                        Latency      Scaled
1 CPU Cycle                                  0.3 ns       1 s
Level 1 cache access                         0.9 ns       3 s
Level 2 cache access                         2.8 ns       9 s
Level 3 cache access                         12.9 ns      43 s
Main memory access (DRAM, from CPU)          120 ns       6 min
Solid-state disk I/O (flash memory)          50-150 us    2-6 days
Rotational disk I/O                          1-10 ms      1-12 months
Internet: San Francisco to New York          40 ms        4 years
Internet: San Francisco to United Kingdom    81 ms        8 years
Internet: San Francisco to Australia         183 ms       19 years
TCP packet retransmit                        1-3 s        105-317 years
OS virtualization system reboot              4 s          423 years
SCSI command timeout                         30 s         3 millennia
Hardware (HW) virtualization system reboot   40 s         4 millennia
Physical system reboot                       5 min        32 millennia

It's actually impressive how fast the CPU is compared with other components. It is also a very good argument for multitasking, i.e. assigning the CPU to some other task while waiting for, e.g., the disk or something from the network.

One additional impressive observation accompanies the table in the book: multiply the CPU cycle time by the speed of light (c) and you get 0.3 ns × 3×10^8 m/s, which is only about 9 cm. In other words, light travels less than the width of this page while the CPU completes one cycle. That's really impressive. :)

That's it for this post. Finally, while I was searching for this table, I stumbled on some additional interesting links:



Sunday, June 29, 2014

Private addresses in IPv6 protocol

It is almost common wisdom that 172.16.0.0/12, 192.168.0.0/16, and 10.0.0.0/8 are private network ranges that should be used when you don't have an assigned address, or you don't intend to connect to the Internet (at least not directly). With IPv6 becoming ever more popular, and necessary, the question is which addresses are used for private networks in that protocol. In this post I'll try to answer that question.

The truth is that in IPv6 there are two types of private addresses: link-local and unique local addresses. Link-local IPv6 addresses, as the name suggests, are valid only on a single link, for example on a single wireless network. You'll recognize them by their prefix, fe80::/10, and they are configured automatically by appending the interface's unique ID. IPv4 also has link-local addresses, though they are not used as frequently; still, maybe you noticed one when your DHCP didn't work and you suddenly had an address from 169.254.0.0/16. That was an automatically configured link-local IPv4 address. The problem with link-local addresses is that they cannot be used when you want to connect two or more networks: they are only valid on a single network, and packets with those addresses are not routable! So, we need something else.

Unique local addresses (ULA), defined in RFC4193, are closer to IPv4 private addresses. That RFC defines the ULA format and how to generate them. Basically, those are addresses with the prefix FC00::/7. They are treated as normal, global-scope addresses, but are only valid inside some restricted area and cannot be used on the global Internet. This is the same as saying that 10.0.0.0/8 addresses can be used within private networks but are not allowed on the global Internet. You choose how this conglomerate of networks is connected, which prefixes are used, etc.

There is a difference, though: a ULA prefix is expected to be unique in the world. You might ask why that matters when those addresses are not allowed on the Internet anyway, but it does. Did it ever happen to you that you had to connect two private IPv4 networks (directly via a router, via VPN, etc.) and, coincidentally, both used e.g. the 192.168.1.0/24 prefix? Such situations are a pain to debug and require renumbering or some nasty tricks to make them work. So, uniqueness is an important feature.

So, the mentioned RFC specifies how to generate a ULA /48 prefix with a high probability of it being unique. Let's first see the exact format of a ULA:
| 7 bits |1|  40 bits   |  16 bits  |          64 bits           |
+--------+-+------------+-----------+----------------------------+
| Prefix |L| Global ID  | Subnet ID |        Interface ID        |
+--------+-+------------+-----------+----------------------------+
As you can see, the prefix is 7 bits, but the L bit must be set to 1 for addresses generated according to RFC4193, so the first 8 bits are fixed to the value 0xFD. An L bit of 0 isn't specified; it is left for future use. Now, the main part is the Global ID, which is 40 bits long and must be generated so that it is unique with high probability. This is done in the following way:
  1. Obtain current time in a 64-bit format as specified in the NTP specification.
  2. Obtain identifier of a system running this algorithm (EUI-64, MAC, serial number).
  3. Concatenate the previous two and hash the concatenated result using SHA-1.
  4. Take the low order 40 bits as a Global ID.
The obtained prefix can now be used for a site, and the Subnet ID can be used for multiple subnets within the site. There are Web-based implementations of the algorithm you can use either to get a feeling for the generated addresses or to generate a prefix for your concrete situation.
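
Just to make the algorithm concrete, here is a rough shell sketch of it. It assumes an eth0 interface and uses the system clock instead of the exact 64-bit NTP timestamp, so treat it as an illustration rather than a strict RFC4193 implementation:
ntp_time=$(date +%s%N)                          # stand-in for the 64-bit NTP timestamp
machine_id=$(cat /sys/class/net/eth0/address)   # MAC address as the system identifier
digest=$(printf '%s%s' "$ntp_time" "$machine_id" | sha1sum | cut -d' ' -f1)
global_id=${digest: -10}                        # low-order 40 bits = last 10 hex digits
printf 'fd%s:%s:%s::/48\n' "${global_id:0:2}" "${global_id:2:4}" "${global_id:6:4}"
The last line prints a prefix of the form fdXX:XXXX:XXXX::/48, with the 40-bit Global ID spread over the first three groups after the fixed fd byte.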

Occasionally you'll stumble upon so-called site-local addresses. They were defined in the initial IPv6 addressing architecture in RFC1884 and kept in subsequent revisions (RFC2373, RFC3513), but were finally deprecated in RFC3879. Since they were defined for so long (8 years), you might still encounter them in some legacy applications. They are recognizable by their prefix, FEC0::/10. You shouldn't use them any more; use ULAs instead.

Friday, June 27, 2014

Detecting which directory is changing...

Suppose you have a directory with a lot of subdirectories. One of them is changing in size, while all the others stay constant. The question is: how do you detect which subdirectory that is?

This happened to me while I was downloading mail archives from the IETF. The lftp client I'm using shows only the file it is currently downloading, not the directory it is in, i.e. the output looks something like this:
lftp ftp.ietf.org:/> mirror ietf-mail-archive
`2010-04.mail' at 518040 (50%) 120.1K/s eta:4s [Receiving data]
                             
Searching for the file by name won't work because a file with that name exists in almost every directory.

The solution I used was the following shell command:
$ ( du -sk *; sleep 5; du -sk * ) | sort | uniq -u
36204 mmusic
36848 mmusic
This command has to be executed inside ietf-mail-archive directory. It works as follows:
  1. The first du -sk * command lists all directory sizes.
  2. Then it sleeps for five seconds (sleep 5), giving the directory that is currently changing time to change its size.
  3. We get all the directory sizes again using the second du -sk command.
  4. The parentheses around all three commands make them run in a subshell, so the output of both du commands is fed into the pipeline.
  5. Then we sort the output. The directories that didn't change produce identical lines that end up next to each other, while the one that changed does not.
  6. Finally, we use uniq -u to filter out all duplicate lines, meaning only the directory that changed is passed to the output.

Monday, May 5, 2014

uClibc versus eglibc

This is a post about the differences between the uClibc and eglibc libraries. OpenWRT can be built against either of the two, with uClibc being the default, so the question one might ask is what the difference between them is and why uClibc. I have to say that I'm not affiliated with either of them, and what I write here is purely my personal opinion based on the information I managed to find on the Internet. I assume you know what a C standard library is, what its purpose is, and that the default C library on desktop and server Linux is glibc.

First, both eglibc and uClibc were developed with the intention of having a small footprint and thus being suitable for use in embedded devices; glibc is a huge library that isn't well suited for that purpose. Yet, the two libraries differ in the way they try to achieve that goal.

eglibc


eglibc, or embedded glibc, was developed with the intention of being source and binary compatible with glibc. That means it isn't necessary to recompile existing binary applications built for glibc in order to run them on eglibc, and since it is source compatible with glibc, sources can be recompiled without any modifications. Yet, according to the FAQ on the eglibc page, eglibc development has stopped and all embedded development will be done directly in the glibc tree. This was announced on July 20th, 2013. It also means that almost all patches from eglibc will be ported to glibc; you can find more information here about the patches that are not going to be ported back.

What might be confusing is that the newest eglibc release is based on glibc 2.19, which was released in April 2014, almost a year after the announcement that eglibc development would stop. But this is according to the plan for phasing out the separate eglibc tree; also according to that plan, this is the last branch. All branches will be maintained as long as the base glibc versions are maintained.

In June 2014 Debian announced that it is going to switch back from eglibc to glibc due to changes in the governing structure of the glibc project after Ulrich Drepper left Red Hat.

uClibc


uClibc, on the other hand, was developed with the intention of being source compatible only, i.e. there is no binary compatibility, and thus binary programs compiled for glibc (or eglibc) have to be recompiled. uClibc is actively maintained even though the latest release is from 2012.

Conclusion


So, what is the conclusion? If you don't need binary compatibility, you should use uClibc on OpenWRT; after all, all the binary packages on OpenWRT's site were compiled against that library. If binary compatibility is important to you, then glibc is the way to go, and since OpenWRT offers eglibc rather than glibc, you have to go the eglibc route. Note that this also means you'll have to recompile all the sources for OpenWRT, since you won't be able to use the precompiled binary packages!

Sunday, March 23, 2014

Two or more applications listening on the same (IP, port) socket

It might happen that you need two applications to listen on the same (IP, port) UDP socket, with the idea that the applications themselves know how to differentiate between the packets intended for each of them. If this is the case, you'll have to do something special, because the kernel doesn't allow two or more applications to bind in such a way. As a side note, starting with kernel 3.9 there is a socket option, SO_REUSEPORT, that allows multiple applications to bind to a single port (under certain preconditions), but it doesn't work the way described here.

One solution is to have some kind of demultiplexer application that binds to the given port, receives packets, and then forwards them to the appropriate application. This works, but it wasn't appropriate for my situation. So the solution is that one application binds to the given port, while the other uses a PF_PACKET socket with an appropriate filter so that it also receives the UDP packets of interest. Note that this works only for UDP, not for TCP or other connection-oriented protocols!

So, what you have to do is:

  1. Open an appropriate socket with the socket() system call.
  2. Bind to the interface using the bind() system call.
  3. Attach a filter to the socket using setsockopt().
  4. Receive packets.

If you want an example of how this is done, take a look into busybox, more specifically its udhcpc client.

Now, there are two problems with this approach that you need to be aware of. The first is that if you send via this socket you are bypassing the kernel's routing code! In other words, packets might be sent in the wrong direction. How this can be solved, and whether it really needs solving, depends on the specific scenario you are trying to achieve.

The second problem is that if there is no application listening on the given port, the kernel will send an ICMP port unreachable error message for each received UDP datagram. I found a lot of questions on the Internet about this issue, but no real answers. So I took a look at where this error message is generated and whether anything can prevent it from happening.

UDP packets are received in the function __udp4_lib_rcv(), which, when there is no application listening on the given port, sends an ICMP destination port unreachable message. As it turns out, the only case in which this message is not sent is when the destination is a multicast or broadcast address. So, your options are, from most to least preferred:

  1. Be certain that you always have an application listening on the given port.
  2. Use iptables to block the ICMP error messages (be careful not to block too much!); see the example after this list.
  3. The application on the other end ignores those error messages.
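
For option 2, a minimal rule on the receiving host might look like the following; it drops only outgoing ICMP port-unreachable messages, and you probably want to narrow it further (e.g. by destination address or port) for your setup:
iptables -A OUTPUT -p icmp --icmp-type port-unreachable -j DROP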

Wednesday, March 19, 2014

Installing OSSIM community edition in QEMU

Since OSSIM is based on Debian and it is a nightmare to compile it for something else (ehm, CentOS), I decided to run it in a headless QEMU virtual machine. To test the whole process, I first decided to do a regular installation of OSSIM, with a display. But I ran into a lot of obstacles while trying to install OSSIM community edition in QEMU. It is even more interesting that when you google for ossim and qemu there are almost no posts about it.

In the end, everything worked flawlessly, but only when using the text-based installation. To get the text-based installer, edit the boot command line (press TAB at the initial boot screen) and add the following at the end:
DEBIAN_FRONTEND=text
That gave me the text-based installation. Basically, AlienVault uses Debian's installer, so anything that can be configured for Debian can be configured for OSSIM too. Take a look at the manual for further information.

A few things to be aware of when doing this:
  1. Don't use too small a disk, because the installation will get stuck without any notification of what happened.
  2. I had problems with the GUI-based installation and its ncurses fallback. The installation would get stuck somewhere (e.g. in the GUI after entering the IP address, something would go wrong in the package installation process, MySQL wasn't properly installed and there were errors saying it failed to start, Apache wasn't properly installed and the Web console wasn't accessible, etc.).

CentOS 6

On CentOS there is no qemu-kvm command like in Fedora; instead, you have to go through libvirt. Before continuing, make sure libvirt is installed, i.e. that the virt-install and libvirt packages are present. Additionally, the libvirtd daemon must be started.

So, first create a file for the disk image. You can do this using dd, but it is even better to use the fallocate(1) command (an example is given a bit further below). Also, fetch the OSSIM ISO image. Now, to start the installation process, use the following command:
virt-install -r 2560 --accelerate -n OSSIM \
        --cdrom /tmp/AlienVault_OSSIM_64Bits_4.3.4.iso \
        --os-variant=debiansqueeze --disk path=./sda.img \
        -w bridge --graphics vnc,password=replaceme
In the previous command I'm giving OSSIM 2.5G of RAM (option -r), the machine name will be OSSIM, the disk image is in the current directory (relative to where virt-install is run), and I'm using bridged networking. Finally, the console will be available via VNC and the password for access is replaceme.
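
As for the disk image itself, preallocating, say, a 20 GB file with fallocate boils down to the following; the size is an assumption, pick whatever fits your setup:
fallocate -l 20G sda.img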

There are several error messages you might receive when trying to start the installation process:
ERROR    Error with storage parameters: size is required for non-existent disk '/etc/sysconfig/network-scripts/sda.img'
Well, this error message occurred because I was trying to start the installation process in the wrong directory, i.e. one that didn't contain the file for the hard disk image.

The following error:
ERROR    Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
means that the libvirtd daemon isn't started. Start it using:
service libvirtd start
and don't forget to make it start every time you boot your machine:
chkconfig libvirtd on
The next error:
Starting install...
ERROR    internal error Process exited while reading console log output: char device redirected to /dev/pts/1
qemu-kvm: -drive file=/root/AlienVault_OSSIM_64Bits_4.3.4.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw: could not open disk image /root/AlienVault_OSSIM_64Bits_4.3.4.iso: Permission denied
means that I placed the ISO image in a directory that libvirt cannot access. Move the image to, e.g., the /tmp directory and try again.

After you manage to start the installation process, connect to it using the vncviewer application. virt-install binds VNC to localhost, so you won't be able to access it directly from a remote host. This is actually OK and you shouldn't change it unless you know very well what you are doing. So, to connect to the console, open a terminal window and execute the following command:
ssh -L 5900:127.0.0.1:5900 host_where_installation_is_started
Now, in another terminal on the local machine (i.e. the one where you started the previous ssh command) run the following command:
vncviewer localhost
And that should be it. What happened is that with ssh you created a tunnel between your local machine and the remote host where the virtual machine is being installed. So, don't stop ssh while your VNC session is running!

Thursday, March 13, 2014

Installing Snort 2.9.6.0 on CentOS 6.5 64-bit

Some time ago I wrote a post about installing Snort 2.9.1 on CentOS 6. In the meantime I decided it was time to upgrade, so the idea of this post is to document what has changed with respect to that older post. In short, binary packages for CentOS 6 are now provided on Snort's download page, so you only need to download and install them (or install directly from the URL). Yet, there is a problem with the libdnet dependency (I don't know which one was used during compilation, but it certainly wasn't the one in EPEL).

Compiling and installing

In case you want to rebuild them, the process is now almost problem-free. In the following text I'll assume that you start from a minimal CentOS installation with the following packages installed (and their dependencies, of course): gcc, make, bison, flex, autoconf, automake, rpm-build.

First, download the daq source RPM file. Before rebuilding it, you should install libpcap-devel; this is actually something the rpmbuild tool will warn you about. Once you've installed it, rebuild daq:
rpmbuild --rebuild daq-2.0.2-1.src.rpm
then, install it:
yum localinstall ~/rpmbuild/RPMS/x86_64/daq-2.0.2-1.x86_64.rpm
Next, for Snort you'll need the libdnet library, which is in EPEL. So first install the EPEL repository:
yum install http://mirrors.neterra.net/epel/6/i386/epel-release-6-8.noarch.rpm
Then, install necessary packages:
yum install libdnet-devel zlib-devel
Those two aren't listed as build dependencies in Snort's SRPM file, so without them you'll get a cryptic error message. Now, download Snort's SRPM file and rebuild it using:
rpmbuild --rebuild snort-2.9.6.0-1.src.rpm
Now, install it using:
yum localinstall ~/rpmbuild/RPMS/x86_64/snort-2.9.6.0-1.x86_64.rpm
That's all there is for installation.

Configuring and running

I'll assume that you are installing a fresh instance, i.e. there is no previous configuration. In case there is a previous installation, be careful not to overwrite the existing configuration. To configure Snort you'll have to download the snortrules archive. Then, unpack it:
mkdir ~/snort
tar xzf snortrules-snapshot-2960.tar.gz -C ~/snort
chown -R root.root ~/snort
Next you have to move the files into place. First, move the basic configuration files:
mv -f snort/etc/* /etc/snort/
Note that I'm using the force option of the move command to overwrite existing files. Next, move the rules into place:
mv -i snort/rules snort/preproc_rules snort/so_rules /etc/snort/
Now, if you are using SELinux you should change the context of the files you moved into the /etc/snort directory. Do it using the following commands:
chcon -R system_u:object_r:snort_etc_t:s0 /etc/snort
chcon -R system_u:object_r:lib_t:s0 /etc/snort/so_rules/precompiled/RHEL-6-0/
You should now modify the configuration file. Here is a diff of the changes I made:
--- snort.conf.orig 2014-03-13 11:25:53.889609831 +0100
+++ snort.conf 2014-03-13 11:37:32.419292894 +0100
@@ -42,16 +42,16 @@
 ###################################################

 # Setup the network addresses you are protecting
-ipvar HOME_NET any
+ipvar HOME_NET 192.168.1.0/24

 # Set up the external network addresses. Leave as "any" in most situations
 ipvar EXTERNAL_NET any

 # List of DNS servers on your network
-ipvar DNS_SERVERS $HOME_NET
+ipvar DNS_SERVERS 192.168.1.8,192.168.1.9

 # List of SMTP servers on your network
-ipvar SMTP_SERVERS $HOME_NET
+ipvar SMTP_SERVERS 192.168.1.20

 # List of web servers on your network
 ipvar HTTP_SERVERS $HOME_NET
@@ -101,13 +101,13 @@
 # Path to your rules files (this can be a relative path)
 # Note for Windows users:  You are advised to make this an absolute path,
 # such as:  c:\snort\rules
-var RULE_PATH ../rules
-var SO_RULE_PATH ../so_rules
-var PREPROC_RULE_PATH ../preproc_rules
+var RULE_PATH rules
+var SO_RULE_PATH so_rules
+var PREPROC_RULE_PATH preproc_rules

 # If you are using reputation preprocessor set these
-var WHITE_LIST_PATH ../rules
-var BLACK_LIST_PATH ../rules
+var WHITE_LIST_PATH rules
+var BLACK_LIST_PATH rules

 ###################################################
 # Step #2: Configure the decoder.  For more information, see README.decode
@@ -240,13 +240,13 @@
 ###################################################

 # path to dynamic preprocessor libraries
-dynamicpreprocessor directory /usr/local/lib/snort_dynamicpreprocessor/
+dynamicpreprocessor directory /usr/lib64/snort-2.9.6.0_dynamicpreprocessor/

 # path to base preprocessor engine
-dynamicengine /usr/local/lib/snort_dynamicengine/libsf_engine.so
+dynamicengine /usr/lib64/snort-2.9.6.0_dynamicengine/libsf_engine.so.0

 # path to dynamic rules libraries
-dynamicdetection directory /usr/local/lib/snort_dynamicrules
+dynamicdetection directory /etc/snort/so_rules/precompiled/RHEL-6-0/x86-64/2.9.6.0/

 ###################################################
 # Step #5: Configure preprocessors
You can also download the complete snort.conf file that worked for me. Be careful: you need to change the IP addresses in the configuration file to match your environment.

Finally, create two empty files, /etc/snort/rules/white_list.rules and /etc/snort/rules/black_list.rules.
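
For example, the two list files can be created with touch, and the resulting configuration can then be sanity-checked with Snort's test mode (-T) before starting the service:
touch /etc/snort/rules/white_list.rules /etc/snort/rules/black_list.rules
snort -T -c /etc/snort/snort.conf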

Now, you should be able to start Snort, i.e.
# /etc/init.d/snortd start
Starting snort: Spawning daemon child...
My daemon child 1904 lives...
Daemon parent exiting (0)                         [  OK  ]

Tuesday, March 11, 2014

Compiling OVALDI 5.10.1.6 on CentOS 6.5

Some time ago I wrote about compiling Ovaldi on CentOS 6. Now I tried to compile it again and found out that some things have changed. Most importantly, there is no need to compile the old Xalan/Xerces libraries any more. But there are still problems with the RPM. To make a long story short, I managed to compile it and create an RPM. Here are the files:
  • patch you need to be able to compile ovaldi
  • SRPM file you can use to recompile ovaldi; it contains patch
  • RPM file if you don't want to compile it yourself (and you trust me ;))
Note that I didn't do any testing at all! So it might happen that the RPM-based stuff doesn't work. If that's the case, leave a comment and I'll take a look when I find time.

Thursday, February 27, 2014

Encouraging people to leave Croatia - sawing off the branch we're sitting on...

I often hear someone encouraging people to apply to job ads outside Croatia and additionally pushing them toward that step of leaving. I have to admit that this largely worries me, even horrifies me, and to a small degree also amuses me. The latest example, which finally pushed me to write this post, is the following:


I must point out right away that this particular case concerns computer experts, but all of this can, and must, be applied to other highly educated professionals as well.

So, why does this worry me, and also amuse me? Well, to understand that, it is first necessary to clarify the different perspectives from which this can be viewed, or in other words, the different stakeholders. What is true depends on the perspective, and as is immediately clear, there are obviously multiple truths. At the same time, it is an indisputable fact that each stakeholder must be guided by their own interests; that is, after all, the only rational behaviour.

So, the stakeholders, i.e. the perspectives, are:
  1. The person who is leaving.
  2. The people who stay in Croatia.
  3. The person who encourages people to leave. This category can be further divided into those who encourage and are already abroad, and those who encourage and are in Croatia.
  4. The company abroad that is looking for employees.
The rest of the text is therefore organized around each of these stakeholders, looking at the situation from the perspective of each of them. The key part is the second point, printed in bold; even if you don't read this post in much detail, I would still recommend paying attention to that point, because it contains the essence of this post.

Note that my classification is not perfect, because there is some overlap. However, I have neither the will nor the intention to make it perfect. What matters is the point I want to make, and I think this classification will be quite adequate for that.

The person who is leaving

In this case, I have nothing to comment on. Since everyone looks after their own interests, every person can and must decide for themselves, weigh the best options, and act accordingly.

Perhaps, as a small digression, I could say that the problem is that the people who leave are the ones who are doing well in Croatia, while those who are doing badly in Croatia stay. Of course, this is not an absolute rule, but I think that on average it reflects what I said very well. And that fact matters, but only in the context of the next point...

The people who stay in Croatia

Here one could make a further breakdown and a more detailed analysis, but there is no need, because the matter is very simple. I would best describe this group as a crew sitting on a branch while a few people saw that branch off, and the group acts as if it doesn't care. In fact, it doesn't care, because it has no clue.

These are the reasons why I see this as so tragic:

  1. Taxpayers' money was spent on educating these people, and not a small amount. So taxpayers should be worried about where their money is being thrown away!
  2. Skilled, educated people are what drives modern economies. With such people leaving, Croatia is left with people with secondary-school qualifications and is therefore doomed to failure.
  3. With such an outflow we should be worried about who will educate future generations, and also who will treat us in the future, or who will earn our pensions.
  4. Capable people create an environment that advances and pulls everything along with it. If there are no capable people, there is no progress for the environment either.
I think these are very strong reasons why it shouldn't occur to any sensible person to encourage people to leave.

As an interesting example, I'll mention my hometown, Virovitica. A great many high-school graduates from Virovitica left to study in Zagreb, Osijek, or Rijeka, but very few came back. That has been the case at least since I left for university. The result is that Virovitica, like the vast majority of Croatia, is stagnating and is very likely at a stage where recovery is no longer possible. And that is what will happen to Croatia as a whole.

People who encourage leaving and are not in Croatia

Here I have nothing to add, except that it makes no difference to these people whether someone leaves some country and thereby impoverishes it. What the motives of that crowd are is hard to say.

People who encourage leaving and are in Croatia

About them I can say everything I said about those who stay in Croatia, except that these are the ones sawing the branch and/or cheering for the branch to be sawn off as soon as possible. So, while the silent majority does damage through its inactivity and lack of understanding, these people do the greatest damage and should be publicly called out in the newspapers. Then again, while we're at it, journalists don't understand much either.

Companies abroad that recruit

They benefit greatly from the people who come, because they usually cannot find enough qualified staff in their own area. That is certainly the case for Silicon Valley and Ireland, but also for a number of other countries/regions where some industry is concentrated and the local area cannot provide enough human resources.

Sunday, February 23, 2014

Kernel upgrade to 3.13.3-201 and VMWare Workstation...

I just started VMWare for the first time after the upgrade and reboot into kernel 3.13.3-201, and it didn't work. Well, I'm already used to that. Anyway, the fix was very quick: I found a patch in a post on the VMWare forums, downloaded it and applied it. Applying it is a bit more manual than with the previous patches (a consolidated command sketch follows the list below). You need to:

  1. Switch to the root user.
  2. Go to the /usr/lib/vmware/modules/source directory.
  3. Unpack vmnet.tar using the tar command.
  4. Enter the newly created vmnet-only directory.
  5. Apply the patch (patch -p1 < path_and_name_of_unpacked_patch_file). The patch shouldn't make any noise; only the patched file name is displayed.
  6. Go one level up (exit the vmnet-only directory).
  7. Rename the existing vmnet.tar to something else, just in case.
  8. Create a new vmnet.tar (tar cf vmnet.tar vmnet-only).
  9. Start vmware.
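Put together as a shell session (run as root; the location and name of the downloaded patch file are assumptions), the above amounts to roughly:
cd /usr/lib/vmware/modules/source
tar xf vmnet.tar
cd vmnet-only
patch -p1 < /tmp/vmnet-3.13.patch    # hypothetical path to the downloaded patch
cd ..
mv vmnet.tar vmnet.tar.orig
tar cf vmnet.tar vmnet-only
vmware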
And that's it...

Saturday, February 22, 2014

Messing up Zimbra upgrade, a.k.a. dangerous --platform-override option...

You probably know that Zimbra's installer has a --platform-override option that enables you to install Zimbra on unsupported platforms, like CentOS. Well, it also allows you to mess things up. Namely, I downloaded Zimbra for RHEL6 but was upgrading an existing installation on CentOS 5, and I didn't notice anything unusual until the upgrade process failed! It failed very early, with an error message from the loader saying that it couldn't start the perl binary. The error message I got was similar to:
/usr/bin/perl: symbol lookup error: /opt/zimbra/zimbramon/lib/i386-linux-thread-multi/auto/IO/IO.so: undefined symbol: Perl_Tstack_sp_ptr
It wasn't exactly that one, but similar! I was surprised, because Zimbra's upgrade process is good and robust and I had never had problems with it, but since I had some local customizations in my installation I suspected those were the cause. So I ended up copying CentOS's Perl installation into the Zimbra tree. This wasn't actually easy to do, which should have warned me that something was terribly wrong, but I continued nevertheless. When I managed to get past that point, the next one finally made me realize the mistake: the mysql binary required glibc 2.7, while CentOS 5 ships an earlier version. Now I knew what mistake I had made.

This is also the point at which I realised I was facing a long night of trying to recover the Zimbra installation!

So, I downloaded the correct version of Zimbra, the one I intended to upgrade to, manually installed it using the rpm tool, and then restarted the upgrade process with:
/opt/zimbra/libexec/zmsetup.pl
And here my problems continued. The upgrade script correctly figured out that I was trying to upgrade, and it also correctly determined the old and the new versions, but it got stuck starting the mysql server. After some time I managed to figure out the problem: the existing mysql database has a password, while the installation process assumes there isn't one, so it couldn't determine whether mysql had started, and everything stalled. I lost some time trying to make the database not ask for the password and even made some progress, but luckily, in the meantime, I stumbled on the following Wiki page:
Recovering from wrong platform upgrade
I made it large because it really got me out of trouble. It is written for Zimbra version 5, but it works on 7 too. In a nutshell, you install the old Zimbra version you had and then start from there, with the idea of making a rollback. BUT, BIG WARNING: this is possible only if your localconfig.xml file is intact. In my case I had erased it, but the upgrade process saved a copy which I was able to recover! I had some additional problems while testing the installation as suggested by the Wiki page, but all of them were related to the tweaks I had made to my installation of Zimbra, i.e. binding Zimbra to a single IP address.

What have I learned

Well, first and foremost, do a backup before doing an upgrade. Yes, I know it was a stupid, beginner's mistake, but as I've said, Zimbra's upgrade process had always been really good, at least for me. The real reason I didn't do a backup is that it takes several hours. If I had had a copy, I would have reverted to it and would only have needed to fix the RPM database. So, to avoid the same thing happening again, I now have a backup process that rsyncs the whole Zimbra tree to an alternative location every day.

The next thing is that the localconfig.xml file is extremely important, so I made an additional backup of that file, too. Note that all the passwords are in that file, so it's also critical from a security perspective!

Also, I would suggest that Zimbra add checks to the installation script to guard against mistakes like the one I made; after some googling, it turns out there are plenty of other posts describing similar problems. This could easily be done for CentOS/RHEL with appropriate requirements in the RPM packages.

And finally, here is another story about a failed Zimbra upgrade that I find interesting.

Friday, February 21, 2014

GnuCash - Credit card expenses

Tracking expenses made with a credit card turned out to be the most demanding part. There are several ways it can be done, but in the end I opted for the more detailed approach, i.e. the one that allows more detailed expense tracking. The complication is that it is not possible to track expenses in different currencies within a single account (or item, as I have sometimes written), or at least I don't know how that would be done. For that reason, I decided to keep separate records for kuna expenses and expenses in euros.

Creating the accounts

So, the first step is to create accounts (entries) for the credit cards. That is relatively simple: right-click somewhere inside the window with the accounts/entries, select the New Account... option, and in the window that appears fill in the fields as shown in the following image:


The name entered is Kreditne kartice (credit cards), and the Placeholder option is checked, which means this account only contains other accounts; no transactions can be entered into it directly. Under Account Type, Credit Card is selected, and finally, under Parent Account, it is marked as a new top-level account. Now the individual accounts for kuna expenses and euro expenses need to be created. Again, right-click, select the New Account... option, and then fill in the data as shown in the following image:



This creates the account for the MasterCard that is charged in kuna. Note that the account type Credit Card is selected and that the account is placed under the Kreditne kartice entry. It is similar for euros:



except that in this case the currency EUR (Euro) is selected.

The final state is shown in the following image:



Entering expenses

Suppose we paid for fuel with the credit card; it cost 400 kn and was bought on 29.11.2013. Additionally, on 4.12.2013 we went shopping and bought food for 200 kn. These two items would be entered under the account MasterCard - Kunski troškovi (kuna expenses) as follows:



Note that the Transfer column records what the expense refers to. So, when I bought fuel and paid for it with the credit card, I marked it as a fuel expense. This way of entering things follows from my decision to track credit card expenses in detail, i.e. WHAT the money was spent on, not just how much was spent. Finally, note that the expense is entered in the Charge column.

Entering foreign-currency expenses is almost identical, except that the amounts are entered in euros instead of kuna.

Settling the expenses

When the bank finally calculates the charges and the payment is made, you need to carry out the "reconciliation" procedure (or whatever it is called in banking circles). The money is taken from your current account (or you pay it in at the bank), and that transaction needs to be recorded. This is done as follows.

First, right-click the credit card account for which the charge has arrived. Let's assume it is the kuna MasterCard. On right-click, a menu opens in which you need to select the Reconcile option:




When you select that option, a new window opens. In it you need to set the date on which the payment (or statement) is made and the amount spent up to that point:


In my case, I chose the date 20.12.2013 because that is the day the charge arrived, and GnuCash itself determined that I had spent 600 kn by that point. At this point it should be said that there is a difference between the date the statement is generated (for me, that is the 10th of the month) and the date the payment is made (that is the 20th of the month). If you enter the payment date under "Statement Date" and there were expenses between the statement date and the payment date, they will mistakenly be included in the settlement as well. So watch out for that. It is probably best to enter the statement date rather than, as I did, the payment date.

After you confirm the statement date and the final amount by pressing OK, a new window opens in which you need to select which items are being paid:


What you need to do is select, in the right-hand part, the items that are being charged. Note that each item has an R column at the end. These are checkboxes that need to be ticked. You have to do this so that in the end you cover the whole amount being charged, i.e. all 600 kn. The difference, i.e. how much is still left, is shown in the bottom right corner, in the "Difference" row. In our case all items should be selected; however, if there were new expenses after the statement was generated, you will of course not select those new ones. When you have made the selection, the state is as follows:


Note two things. First, the difference is 0 because we have covered the whole amount being charged, and second, the green button in the toolbar is now enabled. That green button needs to be clicked to continue to the next, and last, step:


In the last step you need to choose where the credit card expenses are covered from. In my case, the expenses are covered from the current account, so in the "Transfer From" field I selected the current account, while in the "Transfer To" field I selected the kuna credit card account. Clicking OK performs the transaction, and the whole procedure is finished. The situation now looks like this:


In other words, there are no outstanding expenses on the kuna MasterCard, while my current account is 600 kn "thinner".

Tuesday, January 14, 2014

Modifying mail passing through Zimbra

I had a request to attach an image to (almost) every mail message that passes through the Zimbra mail server. Basically, what the owner wanted is that mail sent by internal users carries an advertisement image. The additional requirements were:
  1. The image should be added only once!
  2. If there is a new image and there is already one in the mail, the old one should be replaced!
  3. The image has to be at an exact spot within the message.
There were some additional requirements from my POV:
  1. Not every mail should have the image attached, e.g. automatically generated internal messages!
  2. I should be careful about the impact on performance.
  3. Mail messages that are not modified should not carry any noticeable marks of the image that wasn't added (this will be clearer later).
  4. The solution should allow only certain senders' mails to be modified.
  5. It has to have a DRY RUN mode so that it can easily be disabled.
I immediately knew that there was no way to place the image somewhere on the Internet and put a link into the mail message. Although that is the most elegant solution, the problem is that many mail clients don't show remote images by default, and that's that. So, the image has to be within the mail message itself. A bit of research turned up a potential solution: embed the image data within the IMG tag itself. Mail messages are already altered by adding a disclaimer, which is HTML and a perfect place to add that IMG tag, so why not reuse the disclaimer for the same purpose? In favor of that solution was also my intention not to add new scripts into the mail processing chain, out of fear that it might impact performance.

Unfortunately, that solution didn't work. The most important shortcoming is that Outlook and GMail don't handle an IMG tag with embedded image data (a data: URI); in other words, the image isn't shown. To solve that, the image has to be embedded as a MIME part within the mail. Additionally, the other requirements aren't easy to achieve that way, especially replacing an old image with a new one. So, in the end I had to resort to writing a script.

I already wrote about how I managed to implement more complex disclaimer requirements than Zimbra's built-in functionality allows. So, that was the natural place for me to add this additional processing. I called the script that adds the image altermail.py, and the altermime script now has the following form:
#!/bin/bash
# ${1#--input=} strips the --input= prefix to get the message file path.
# Add the disclaimer only if the message doesn't already contain one.
grep "DISCLAIMER:" ${1#--input=} > /dev/null 2>&1
if [ ! "$?" = 0 ]; then
    /opt/zimbra/altermime-0.3.10/bin/altermime-bin "$@"
fi
#echo "`date +%Y%m%d%H%M%S` $@" >> /tmp/altermime-args
# Then let altermail.py add or replace the image embedded in the disclaimer.
/opt/zimbra/altermime-0.3.10/bin/altermail.py "$@" >> /tmp/altermail.log 2>&1
I'm calling altermail.py after altermime because the image placeholder is within the disclaimer! Also, I removed the exec keyword before the altermime-bin call so that the script continues and altermail.py actually gets executed.

Additionally, note that altermail.py accepts the same arguments as altermime! This is in order to simplify things a bit.

Obviously, I chose Python as the programming language. I could have written all of this in Perl too, but since I've lately been working a lot more with Python, Python was the way to go. Both languages have very good support for mail processing (MIME messages in particular).
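To give an idea of the shared argument handling mentioned above, here is a minimal sketch of how the --input= file could be picked out; this is illustrative only, with made-up names, not the actual code from the script:
# Illustrative sketch only: extract the message file from altermime-style
# arguments (e.g. --input=/path/to/message); the real altermail.py may differ.
import sys

def input_path(argv):
    """Return the value of the --input= argument, or None if it is missing."""
    for arg in argv:
        if arg.startswith('--input='):
            return arg[len('--input='):]
    return None

if __name__ == '__main__':
    path = input_path(sys.argv[1:])
    if path is None:
        sys.exit(0)   # nothing to do if no input file was handed over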

The script is on GitHub and you can fetch it there.

How the script works

First of all, the script doesn't work for mail messages that aren't MIME. So, after loading a message, the first check is whether it is a multipart message. If not, the script just exits.
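As a rough sketch (assuming the message path was already extracted from the arguments, as above), that check could look like this; it is not the exact code from the script:
# Sketch of the initial check: load the message and skip non-multipart mail.
import email
import sys

def load_message(path):
    with open(path) as f:
        return email.message_from_file(f)

if __name__ == '__main__':
    msg = load_message(sys.argv[1])   # illustrative; the real script uses --input=
    if not msg.is_multipart():
        sys.exit(0)                   # plain, non-multipart messages are left untouched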

Next, the white and black lists are checked.
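A hypothetical version of that check, with made-up list contents and matching rules, might look like this:
# Hypothetical sender filtering; the real lists and matching rules may differ.
SENDER_BLACKLIST = {'noreply@example.com'}   # automated senders are never touched
SENDER_WHITELIST = {'@example.com'}          # only these senders get the image

def should_process(msg):
    """Decide whether the advertisement image should be handled for this mail."""
    sender = (msg.get('From') or '').lower()
    if any(entry in sender for entry in SENDER_BLACKLIST):
        return False
    return any(entry in sender for entry in SENDER_WHITELIST)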

There are two passes over the mail message. In the first pass, the script searches through the message to see if an image is already attached. If so, it additionally checks whether it is an older version of the image. If it is, it replaces the image, but in both cases it doesn't do anything more and finishes execution.
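The sketch below shows one way that first pass could work, assuming the advertisement image is marked with a versioned Content-ID such as <ad-image-v2@example.com>; the marker and the helper names are my own, not necessarily what altermail.py uses:
# Sketch of the first pass: find an already attached ad image and, if it is an
# older version, swap its payload for the new one. Content-ID values are made up.
from email import encoders

CURRENT_IMAGE_CID = '<ad-image-v2@example.com>'

def find_ad_image(msg):
    """Return the first image part whose Content-ID marks it as the ad image."""
    for part in msg.walk():
        cid = part.get('Content-ID', '')
        if part.get_content_maintype() == 'image' and cid.startswith('<ad-image'):
            return part
    return None

def refresh_ad_image(msg, new_image_bytes):
    """Return True if an ad image was already present (possibly after replacing it)."""
    part = find_ad_image(msg)
    if part is None:
        return False                          # no image yet, the second pass will add it
    if part.get('Content-ID') != CURRENT_IMAGE_CID:
        del part['Content-Transfer-Encoding']
        part.set_payload(new_image_bytes)     # older version found: replace the payload
        encoders.encode_base64(part)          # re-encode and set the CTE header
        part.replace_header('Content-ID', CURRENT_IMAGE_CID)
    return True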

The second pass is done when there is no image in the mail message and it has to be added. When adding the image, it has to be combined with the HTML as multipart/related so that both are shown. If you add it as multipart/alternative, only one of them will be shown!
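Here is a rough sketch of that idea, again with illustrative names and Content-ID rather than the exact code from the script: the HTML body and the image are wrapped into a multipart/related part, and the HTML references the image via cid:.
# Sketch: combine the HTML body and the ad image into multipart/related so
# that mail clients render both; names and Content-ID are illustrative only.
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

def build_related_part(html_body, image_bytes):
    related = MIMEMultipart('related')
    related.attach(MIMEText(html_body, 'html', 'utf-8'))

    img = MIMEImage(image_bytes)              # image subtype is guessed from the data
    img.add_header('Content-ID', '<ad-image-v2@example.com>')
    img.add_header('Content-Disposition', 'inline')
    related.attach(img)
    return related

# The disclaimer's HTML then references the image by its Content-ID, e.g.:
#   <img src="cid:ad-image-v2@example.com">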

All the configuration options are embedded within the script itself. I chose not to have a configuration file in order to reduce the number of disk accesses, which is already high (a lot of modules have to be loaded).
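For example (illustrative only, not the real option names), the configuration could be a handful of module-level constants, including the DRY RUN switch mentioned among the requirements:
# Hypothetical module-level configuration; option names and values are made up.
DRY_RUN = True                                   # log what would be done, change nothing
IMAGE_PATH = '/opt/zimbra/altermime-0.3.10/ad.png'
IMAGE_CID = '<ad-image-v2@example.com>'
SENDER_WHITELIST = {'@example.com'}
LOG_FILE = '/tmp/altermail.log'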

