Everything about nothing: January 2015

While working on a script that keeps full Zimbra backups for as many days into the past as possible, I wanted to remove old backups automatically based on the amount of free space. The idea was to remove directory after directory until the free space reached some threshold. Finding the free space on a disk is easy: use the df(1) command. Its output looks like this:

$ df -k /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 56267084 39311864 16938836 70% /

The problem is that some postprocessing is necessary to obtain the desired value, i.e. the 4th or 5th column. The cut(1) command is a bit problematic here because, in general, you cannot expect the output to be so nicely formatted, nor is the format fixed. For example, the first column is automatically resized based on the width of the widest device node. That in turn means the number of whitespace characters varies, and you are forced to use something other than cut(1). Probably the most appropriate tool is awk(1), since it properly parses fields separated by a variable amount of whitespace. In addition, you need to get rid of the first line. That can be done using head(1)/tail(1), but it is more efficient to let awk(1) do it. So, you end up with the following construct:

$ df -k / | awk 'NR==2 {print $4}'
16938836
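
To tie this back to the original goal, here is a minimal sketch of the pruning loop built on the df(1)/awk(1) construct. The function names free_kb and prune_backups are made up for illustration, and the sketch assumes that backup directory names sort chronologically (e.g. 2015-01-09) so that the first glob match is the oldest one. The -P flag asks df(1) for the POSIX output format, which keeps every file system on a single line even when the device name is long.

```shell
#!/bin/sh

# Hypothetical helper: free space, in KiB, on the file system holding "$1".
# -P selects the POSIX output format (one line per file system).
free_kb() {
    df -kP "$1" | awk 'NR==2 {print $4}'
}

# Hypothetical helper: remove the oldest subdirectory of "$1" until at least
# "$2" KiB are free, or until there are no subdirectories left.
# Assumption: directory names sort chronologically.
prune_backups() {
    dir=$1
    min_kb=$2
    while [ "$(free_kb "$dir")" -lt "$min_kb" ]; do
        oldest=
        for d in "$dir"/*/; do
            # Globs expand in sorted order, so the first directory is the oldest.
            if [ -d "$d" ]; then
                oldest=${d%/}
                break
            fi
        done
        [ -n "$oldest" ] || break   # nothing left to remove
        rm -rf -- "$oldest"
    done
}
```

It could be called as prune_backups /path/to/backups 10485760 to keep roughly 10 GiB free; the path and the threshold are placeholders.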

But, for some reason, I wasn't satisfied with this solution, because I felt I was using overly complex tools for something that should be simpler. So I started searching for another way to obtain the free space of a partition. It turned out that the stat(1) command can do that, although it's rarely used for that purpose. It is normally used to find out data about files or directories, not file systems. Yet there is an option, -f, that tells stat(1) we are querying a file system, and there is also an option, --format, which accepts format sequences in the style of the date(1) command. So, to get the free space on the root file system you can use it as follows:

$ stat -f --format "%f" /
4238805

Without the --format option, stat(1) prints all the data about the file system it can find:

$ stat -f /
File: "/"
ID: b8a4e1f0a2aefb22 Namelen: 255 Type: ext2/ext3
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 14066771 Free: 4238805 Available: 4234709
Inodes: Total: 3588096 Free: 2151591

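This makes it somewhat analogous to the df(1) command. But we are getting values in blocks instead of kilobytes! You can get the block size using the %S format sequence, but that's it; stat(1) will not do the multiplication for you. Note also that %f counts all free blocks, including those reserved for the superuser, while the Available column of df(1) corresponds to %a, which is why the figures here are slightly larger than the df(1) value above. So, some additional trickery is needed. One solution is to output an arithmetic expression and evaluate it using the bc(1) command, like this: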

$ stat -f --format "%f * %S" / | bc
17362145280

Alternatively, it is also possible to use shell's arithmetic evaluation like this:

$ echo $((`stat -f --format "%f * %S" /`))
17362145280
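
For use in a script, the same arithmetic can be wrapped in a small function. This is only a sketch: the name avail_kb is made up, and it deliberately uses %a instead of %f so that the result corresponds to what df -k shows in its Available column (blocks usable by unprivileged users). It assumes GNU stat(1).

```shell
#!/bin/sh

# Hypothetical helper: space available on the file system holding "$1", in KiB.
# %a = free blocks available to unprivileged users
# %S = fundamental block size, the unit in which block counts are expressed
avail_kb() {
    echo $(( $(stat -f --format '%a * %S' "$1") / 1024 ))
}

# Example: warn when less than about 1 GiB is available on /
if [ "$(avail_kb /)" -lt 1048576 ]; then
    echo "low disk space on /"
fi
```

With the numbers shown earlier, 4234709 available blocks of 4096 bytes give 16938836 KiB, the same value df -k printed.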

But in both cases we are starting two processes. In the first case the processes are stat(1) and bc(1), and in the second case they are a new subshell (for the backticks) and stat(1). Note that this is the same number as in the solution with awk(1). But in the case of awk(1) we are starting two more complex tools, one of which, df(1), is aimed more at displaying values to a user than at being used in scripts. One advantage of the awk(1) method might be portability, i.e. the df(1)/awk(1) combination is probably more common than the stat(1)/bc(1) combination.

Anyway, the difference probably isn't so big with respect to performance, but obviously there is another way to do it, and it was interesting to pursue an alternative.

Friday, January 9, 2015

Getting free disk space in Linux