Saturday, August 23, 2014

Memory access latencies

Once, I saw a table in which all the memory latencies were scaled so that one CPU cycle is defined to be 1 second; L1 cache latency is then several seconds, L2 cache even more, and so on up to SCSI command timeouts and system reboots. This was very interesting because I have a much better developed sense for seconds and larger time units than for nanoseconds, microseconds, etc. A few days ago I remembered that table and wanted to see it again, but couldn't find it. It was from a book whose name I couldn't remember. So I started to google for it and, after an hour or so of googling, I finally managed to find this picture. It turns out that it is from the book Systems Performance by Brendan Gregg. I decided to replicate it here for future reference:


Table 2.2: Example Time Scale of System Latencies
+--------------------------------------------+-----------+---------------+
| Event                                      | Latency   | Scaled        |
+--------------------------------------------+-----------+---------------+
| 1 CPU cycle                                | 0.3 ns    | 1 s           |
| Level 1 cache access                       | 0.9 ns    | 3 s           |
| Level 2 cache access                       | 2.8 ns    | 9 s           |
| Level 3 cache access                       | 12.9 ns   | 43 s          |
| Main memory access (DRAM, from CPU)        | 120 ns    | 6 min         |
| Solid-state disk I/O (flash memory)        | 50-150 µs | 2-6 days      |
| Rotational disk I/O                        | 1-10 ms   | 1-12 months   |
| Internet: San Francisco to New York        | 40 ms     | 4 years       |
| Internet: San Francisco to United Kingdom  | 81 ms     | 8 years       |
| Internet: San Francisco to Australia       | 183 ms    | 19 years      |
| TCP packet retransmit                      | 1-3 s     | 105-317 years |
| OS virtualization system reboot            | 4 s       | 423 years     |
| SCSI command timeout                       | 30 s      | 3 millennia   |
| Hardware (HW) virtualization system reboot | 40 s      | 4 millennia   |
| Physical system reboot                     | 5 min     | 32 millennia  |
+--------------------------------------------+-----------+---------------+
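
The Scaled column is simply each latency divided by the 0.3 ns cycle time. Here is a minimal sketch of that arithmetic; scale_latency is a made-up helper name, and the only input taken from the table is the 0.3 ns cycle:

```shell
# scale_latency is a hypothetical helper: it converts a latency given in
# nanoseconds to "scaled" seconds, assuming one CPU cycle = 0.3 ns = 1 s.
scale_latency() {
    awk -v ns="$1" 'BEGIN { printf "%.0f\n", ns / 0.3 }'
}

scale_latency 0.9         # level 1 cache access
scale_latency 120         # main memory access
scale_latency 40000000    # 40 ms, San Francisco to New York
```

The second call prints 400, i.e. a bit under 7 minutes, which the table shows as 6 min, so the scaled values in the table are clearly rounded.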

It's actually impressive how fast the CPU is compared to the other components. It is also a very good argument for multitasking, i.e. assigning the CPU to some other task while waiting for, e.g., the disk or something from the network.

One additional impressive thing is written below the table in the book. Namely, if you multiply the CPU cycle (0.3 ns) by the speed of light (c), you can see that light travels only about 9 cm during one CPU cycle; to travel 0.5 m it needs about 1.7 ns, which is five to six cycles. That's really impressive. :)

That's it for this post. Finally, while I was searching for this table, I stumbled upon some additional interesting links:



Sunday, June 29, 2014

Private addresses in IPv6 protocol

It is almost common wisdom that 172.16.0.0/12, 192.168.0.0/16, and 10.0.0.0/8 are private network addresses that should be used when you don't have an assigned address, or you don't intend to connect to the Internet (at least not directly). With IPv6 becoming ever more popular, and necessary, the question is which addresses are used for private networks in that protocol. In this post I'll try to answer that question.

The truth is that in IPv6 there are two types of private addresses: link local and unique local addresses. Link local IPv6 addresses, as the name suggests, are valid only on a single link, for example on a single wireless network. You'll recognize those addresses by their prefix, which is fe80::/10, and they are configured automatically by appending the interface's unique ID to that prefix. IPv4 also has link local addresses, though they are not so frequently used. Still, maybe you noticed one when your DHCP didn't work and you suddenly had an address from the 169.254.0.0/16 range. That was an automatically configured link local IPv4 address. The problem with link local addresses is that they cannot be used when you try to connect two or more networks. They are valid only on a single network, and packets carrying those addresses are not routable! So, we need something else.
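
To get a feeling for how a link local address is built from an interface ID, here is a small sketch. It assumes the classic modified EUI-64 scheme, in which the ID is derived from the interface's MAC address by flipping one bit and inserting ff:fe in the middle. Many current systems use random or otherwise stable IDs instead, so what you see on a real interface may differ. mac_to_link_local is a made-up name and the MAC address is just an example value:

```shell
# mac_to_link_local is a hypothetical helper that derives a link local address
# from a MAC address, assuming the modified EUI-64 scheme.
# The body runs in a subshell so that the IFS change stays local.
mac_to_link_local() (
    IFS=:
    set -- $1                                    # split the MAC at the colons
    first=$(printf '%02x' $(( 0x$1 ^ 0x02 )))    # flip the universal/local bit
    # insert ff:fe between the two halves of the MAC and prepend fe80::
    printf 'fe80::%s%s:%sff:fe%s:%s%s\n' "$first" "$2" "$3" "$4" "$5" "$6"
)

mac_to_link_local 00:11:22:33:44:55    # example MAC, prints fe80::0211:22ff:fe33:4455
```

Leading zeros within a group are normally dropped when the address is displayed, so tools would show this one as fe80::211:22ff:fe33:4455.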

Unique local addresses (ULA), defined in RFC4193, are closer to IPv4 private addresses. That RFC defines the ULA format and how to generate the addresses. Basically, those are addresses with the prefix FC00::/7. They are treated as normal, global addresses, but are valid only inside some restricted area and cannot be used on the global Internet. This is the same as saying that 10.0.0.0/8 addresses can be used within private networks, but are not allowed on the global Internet. You choose how this conglomerate of networks will be connected, which prefixes are used, etc.

There is a difference, though. Namely, it is expected that a ULA will be unique in the world. You might ask why that is important when those addresses are not allowed on the Internet anyway. But it is important. Did it ever happen to you that you had to connect two private IPv4 networks (directly via a router, via VPN, etc.) and, coincidentally, both used, e.g., the 192.168.1.0/24 prefix? Such situations are a pain to debug, and they require renumbering or some nasty tricks to make them work. So, being unique is an important feature.

So, the mentioned RFC actually specifies how to generate a ULA with a /48 prefix and a high probability of that prefix being unique. Let's first see the exact format of a ULA:
| 7 bits |1|  40 bits   |  16 bits  |          64 bits           |
+--------+-+------------+-----------+----------------------------+
| Prefix |L| Global ID  | Subnet ID |        Interface ID        |
+--------+-+------------+-----------+----------------------------+
As you can see, the prefix is 7 bits long, but the L bit must be set to 1 if the address is assigned according to RFC4193, so the first 8 bits have the fixed value 0xFD. Note that the L bit set to 0 isn't specified; it is left for the future. The main part is the Global ID, whose length is 40 bits. It must be generated in such a way that it is unique with high probability. This is done in the following way:
  1. Obtain the current time in the 64-bit format specified by the NTP specification.
  2. Obtain an identifier of the system running this algorithm (EUI-64, MAC, serial number).
  3. Concatenate the previous two and hash the concatenated result using SHA-1.
  4. Take the low order 40 bits of the hash as the Global ID.
The prefix obtained can now be used for a site. The Subnet ID can further be used for multiple subnets within the site. There are web-based implementations of the algorithm that you can use either to get a feeling for the generated addresses, or to generate a prefix for your concrete situation.
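
The four steps can also be sketched in a few lines of shell. This is only an illustration under some assumptions: gen_ula_prefix and hex_to_bytes are made-up names, sha1sum from coreutils is assumed to be installed, the fractional part of the NTP timestamp is simply left at zero, and the EUI-64 in the usage line is an example value, not a real identifier.

```shell
# Sketch only: gen_ula_prefix and hex_to_bytes are hypothetical helper names.

# Write the bytes described by a hex string (e.g. "00ff") to stdout.
hex_to_bytes() {
    hex=$1
    while [ "${#hex}" -ge 2 ]; do
        rest=${hex#??}
        byte=${hex%"$rest"}                      # first two hex digits
        hex=$rest
        printf "\\$(printf '%03o' "0x$byte")"    # emit the byte via an octal escape
    done
}

# $1: 64-bit NTP timestamp, $2: EUI-64 identifier, both as 16 hex digits.
gen_ula_prefix() {
    # Steps 3 and 4: SHA-1 over the concatenation, then keep the low order
    # 40 bits, i.e. the last 10 of the 40 hex digits of the digest.
    gid=$(hex_to_bytes "$1$2" | sha1sum | cut -c31-40)
    # Put 0xfd (prefix FC00::/7 with the L bit set) in front to get the /48 prefix.
    printf '%s\n' "$gid" | sed 's|^\(..\)\(....\)\(....\)$|fd\1:\2:\3::/48|'
}

# Step 1, simplified: seconds since 1900 in the upper 32 bits, fraction set to zero.
ntp_hex=$(printf '%08x00000000' "$(( ($(date +%s) + 2208988800) & 0xFFFFFFFF ))")
# Step 2: an example EUI-64, used here purely for illustration.
gen_ula_prefix "$ntp_hex" 021122fffe334455
```

Each run prints a different prefix of the form fdxx:xxxx:xxxx::/48, because the timestamp changes. For a real site you would generate the prefix once and then keep it.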

Occasionally you'll stumble upon so-called site local addresses. Those addresses were defined in the initial IPv6 addressing architecture in RFC1884, and again in the subsequent revisions of the addressing architecture (RFC2373, RFC3513), but were finally deprecated in RFC3879. Since they were defined for so long (almost 9 years), you might stumble upon them in some legacy applications. They are recognizable by their prefix FEC0::/10. You shouldn't use them any more; use ULA instead.

Friday, June 27, 2014

Detecting which directory is changing...

Suppose that you have a directory with a lot of subdirectories. Of all those subdirectories, one is changing in size, while all the others stay at a constant size. The question is, how do you detect which subdirectory that is?

This happened to me while I was downloading mail archives from the IETF. The lftp client that I'm using shows only the file it is currently downloading, not the directory that file is in, i.e. the output looks something like this:
lftp ftp.ietf.org:/> mirror ietf-mail-archive
`2010-04.mail' at 518040 (50%) 120.1K/s eta:4s [Receiving data]
                             
Searching for the given file name won't work, because that particular file name exists in almost every directory.

The solution I used was the following shell command:
$ ( du -sk *; sleep 5; du -sk * ) | sort | uniq -u
36204 mmusic
36848 mmusic
This command has to be executed inside the ietf-mail-archive directory. It works as follows:
  1. The first 'du -sk *' command lists the sizes of all directories.
  2. Then the shell sleeps for five seconds (sleep 5), giving the directory that is currently changing time to change its size.
  3. The second 'du -sk *' command lists all the directory sizes again.
  4. The parentheses around the three commands make them execute within a single subshell, so the output of both du commands goes into the same pipe.
  5. Then we sort the output. Directories that didn't change produce two identical lines, which end up next to each other, while the directory that changed produces two different lines.
  6. Finally, uniq -u prints only the lines that are not repeated, so only the directory that changed reaches the output, once with its old size and once with its new one.
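
If you would rather see how much each directory changed than the two raw sizes, the same idea can be wrapped into a small function. This is just a sketch: dir_growth is a made-up name, and awk is used in place of sort and uniq to compare the two snapshots.

```shell
# dir_growth is a hypothetical helper built on the same two-snapshot idea.
# $1: directory to watch (default: current directory)
# $2: seconds to wait between the two snapshots (default: 5)
# Prints "<change in KB><TAB><name>" for every entry whose size changed.
dir_growth() {
    (
        cd "${1:-.}" || exit 1
        du -sk -- *
        sleep "${2:-5}"
        du -sk -- *
    ) | awk -F'\t' '
        # first time a name is seen: remember its size from the first snapshot
        !($2 in first) { first[$2] = $1; next }
        # second time: report the difference if the size changed
        $1 != first[$2] { printf "%d\t%s\n", $1 - first[$2], $2 }
    '
}

# Example call (the directory name is taken from the post):
#   dir_growth ietf-mail-archive 5
```

For the two mmusic lines shown above, this would report a change of 644 KB. Entries that appear only in the second snapshot are not reported.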
