Friday, January 9, 2015

Getting free disk space in Linux

While working on a script to have full Zimbra backups as many days in the past as possible, I was trying to automatically remove old backups based on the free space value. Basically, the idea was to remove directory by directory until free space reached some threshold. To find out free space on a disk is easy, use df(1) command. Basically, it looks like this:
$ df -k /
Filesystem 1K-blocks     Used Available Use% Mounted on
/dev/sda1   56267084 39311864  16938836  70% /
The problem is that it is necessary to use some postprocessing in order to obtain desired value, i.e. 5th or 5th column. cut(1) command, in this case, is a bit problematic because in general you can not expect that the output is so nicely formatted, nor it is fixed. For example, based on the width of the widest device node in the first column, it is automatically resized. That in turn means number of whitespaces varies, and you end up being forced to use something else than cut(1). Probably, the most appropriate tool is awk(1), since awk(1) can properly parse fields separated with variable number of whitespaces. In addition, you need to get rid of first line. That can be done using head(1)/tail(1), but it is more efficient to use awk(1) itself. So, you end up with the following construct:
$ df -k / | awk 'NR==2 {print $4}'
16938836
But, for some reason, I wasn't satisfied with the given solution because I thought I'm using too complex tools for something that should be simpler than that. So, I started to search is there some other way to obtain free space of some partition. It turned out that stat(1) command is able to do that, but it's rarely used for that purpose. It is used to find out data about files, or directories, but not file systems. Yet, there is an option, -f, that tells stat(1) we are querying file system, and also there is an option --format which accepts format sequences in a style of date(1) command. So, to get the free space on root file system you can use it as follows:
$ stat -f --format "%f" /
4238805
stat(1) command without --format option prints all the data about file system it can find out:
$ stat -f /
  File: "/"
    ID: b8a4e1f0a2aefb22 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 14066771   Free: 4238805    Available: 4234709
Inodes: Total: 3588096    Free: 2151591
This makes it in some way analogous to df(1) command. But, we are getting values in blocks, instead of kilobytes! You can get block size using %S format sequence, but that's it. So, some additional trickery is needed. One solution is to output arithmetic expression and evaluate it using bc(1) command, like this:
$ stat -f --format "%f * %S" / | bc
17362145280
Alternatively, it is also possible to use shell's arithmetic evaluation like this:
$ echo $((`stat -f --format "%f * %S" /`))17362145280
But, in both cases we are starting two process. In a first case the processes are stat(1) and bc(1), and in the second case it is a new subshell (for backtick) and stat(1). Note that this is the same as the solution with awk(1). But in case of awk(1) we are starting two more complex tools of which one, df(1), is more targeted to display value to a user than to be used in scripts. One additional advantage of a method using awk(1) might be portability, i.e. I'm df(1)/awk(1) combination is probably more common than stat(1)/bc(1) combination.

Anyway, the difference probably isn't so big with respect to performance, but obviously there is another way to do it, and it was interesting to pursue an alternative. 

No comments:

About Me

scientist, consultant, security specialist, networking guy, system administrator, philosopher ;)