Sunday, December 9, 2012

A few remarks on CS144...

I have been teaching computer networks for the past 10 or so years, and during that time I got used to a certain approach to teaching this subject. But as I already noted in the post about the e2e design principle and middleboxes, I'm watching the course Introduction to Computer Networks (CS144), given at Stanford University. The main reason is that I wanted to see how others are doing it. While I was watching part of the first lecture, What is Internet - 4 layers, I had some comments, so I decided to write a post about it. But then I decided to comment on the whole course, not just a single lecture. At least that is my intention at this moment.

One very important thing before I start. Note that every course has to simplify things and remove as many details as possible in order to make things "learnable". So, sometimes lecturers don't tell the complete truth, or they even say something that isn't true. This is acceptable as long as they correct themselves eventually. But it also means that there are many approaches to teaching something, potentially very different ones, and I'm looking from the viewpoint of one specific approach, namely the one I'm using. This, in turn, might mean that some readers, or even the majority, will not agree with my comments on CS144 in this post. That's perfectly OK, but anyone reading this post should bear that in mind and not take things for granted!

What is Internet - 4 Layers

The purpose of this lecture is to teach you about layering in networks. This is a very important concept and mandatory knowledge for anyone doing anything that touches networking.

But there are a few things that I don't like in the approach taken by CS144. The first, more a correctness problem than something major, is where the lecturer notes where particular layers are implemented. He doesn't give complete information in this case, because he says that everything below the application layer is implemented in the operating system. But the truth is that parts of the "link layer" are implemented in hardware, which definitely isn't an operating system, and in firmware, which also isn't part of an operating system. Also, where the line between hardware/firmware and the operating system lies varies greatly. There are, for example, hardware accelerators for TCP, and in that case hardware/firmware reaches almost up to the application layer.

Next, the ISO/OSI RM is mentioned only briefly, and the comment was "... that was widely used." It was introduced because the network layer is frequently called Layer 3, while in the model used in this course it is layer 2. Before continuing, let me just note that the term Layer 2 is also frequently used, and Layer 7 (L7) isn't so rarely used, either. Anyway, first, ISO/OSI has never been widely used for the purpose it was created for, unless you count the bureaucratic work done within OSI, which is a lot of (bureaucratic) work! On the other hand, it is widely used as a reference model, i.e. it is used to compare different networks. It is also widely used when we try to be general and not tied to a specific network. After all, the Internet is only one instance among many networks, past and future. Now, I agree that it is good policy these days to stick to the Internet when teaching the basics of networking. But it should be clear that the Internet isn't the only network around. If OSI did something right (well, truth be told, they did several things right), then it is the stuff around the network model (or architecture). Note that there are some things that are not right (e.g. the number of layers), but in general it is a very well thought out subject. By the way, the physical layer has much more to it than only wires and connectors, if nothing else because there are three main ways of communication (wireless, wired and optical), and then there are countless variations within each of those.

Now, when the lecturer compares the 4 layer model he uses with ISO, he says that TCP covers the transport and session layers of the ISO/OSI RM. This is the first time ever that I have heard that TCP covers the session layer. This is based on his premise that the purpose of the session layer is connection establishment. But that's simply not true. The purpose of the session layer is management of multiple connections, which can degenerate into a single connection, and in that case the session layer is very thin in terms of functionality. On the other hand, connection establishment for a single connection is part of the specific protocol within the transport layer (there are, of course, those that don't have connection establishment). Take for example the OSI transport protocol TP4, which has connection establishment, transfer and disconnect phases, just like TCP, and OSI definitely places it in the transport layer, not the session layer!

Finally, the lecture implies that layers are the same thing as protocols, i.e. that the transport layer is TCP. But a layer is just a concept, while TCP is an entity, an implementation, that logically belongs to a certain layer.

What is Internet - IP Service Model

This lecture is about the IP service, which, as the lecturer says at the beginning, is what IP offers to the layer above and what it expects from the layer below. But I think that this lecture actually mixes the services expected from the layers below and above with the inner workings of IP that are invisible to higher/lower layers:
  • Packet fragmentation isn't visible outside of the IP protocol, because it is the task of IP itself to reassemble fragmented packets before handing data to the protocol in the layer above. Also, when IP fragments packets, the protocols in lower layers don't know, nor do they care, whether those are fragments or not. They are treated as opaque data by the lower layer protocols.
  • The feature that prevents packets from looping forever is also a mechanism internal to the IP protocol, and not something that higher or lower layers should know or care about. True, there is an ICMP message that informs the sender that this happened, but as I said, it is not intended for other layers. If nothing else, because those layers don't determine the value of the TTL field. It is at the sole discretion of the IP protocol itself.
  • The checksum in the IP packet isn't there to prevent an IP packet from being delivered to the wrong destination. Let me cite RFC791, which says that:
    This checksum at the internet level is intended to protect the internet header fields from transmission errors.
    So it is intended to protect the header from errors, not to prevent delivery of an IP packet to the wrong destination. True, it might happen that the error occurs in the destination address and that, in that case, delivery is prevented, but this is only a special case, a consequence, not something specifically targeted.
    Furthermore, while I'm at the checksum, it uses simple addition and thus it is a very weak protection mechanism. Actually, it was of so little use, and it was also slowing down routers, that it was removed in IPv6. By the way, the same checksum algorithm is used in TCP, where it is just as weak.
  • Options within IPv4 are, again, specific to the IPv4 protocol and not something offered as a service to higher layers.
I have to admit that the bullet "Allows for a new versions of IP" totally confused me.
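To make the remark about the weakness of the checksum a bit more concrete, here is a minimal sketch of a 16-bit ones' complement checksum of the kind used by IP and TCP. The function name and the sample bytes are my own, chosen only for illustration. Because the checksum is just a sum, swapping two 16-bit words in the data goes undetected, while a single flipped bit is caught.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement of the ones' complement sum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


# Hypothetical sample data: three 16-bit words.
original = bytes.fromhex("123456789abc")
# The same words, with the first two swapped.
reordered = bytes.fromhex("567812349abc")
# The original with one bit flipped in the second byte.
bit_flipped = bytes.fromhex("123556789abc")

# Addition is commutative, so reordering words is not detected.
print(internet_checksum(original) == internet_checksum(reordered))
# A single-bit error changes the sum, so it is detected.
print(internet_checksum(original) != internet_checksum(bit_flipped))
```

A CRC, by contrast, depends on the position of each bit, which is why link layers such as Ethernet can catch this kind of reordering.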

Next, the definition of connectionless service is that no state is established in the network. That is true, but the point is that it is not a feature of the service but of the protocol operation, and thus the protocols above (i.e. in higher layers) simply don't care about that. It is possible for some protocol to offer a connection oriented service while operating over a connectionless "subnetwork" (e.g. TCP over IP), just as it is possible to offer a connectionless service over a connection oriented "subnetwork" (e.g. IP over ATM). You can read more about connectionless vs. connection oriented in my other post.

Note, the term IP layer is somewhat wrong, or at least debatable. Namely, there is no IP layer, but there is a network layer in which one of the protocols is the IP protocol. Now, I'm aware that many say IP layer so, if we assume that the majority is right, then I'm wrong. :)

Also, to end this part, it was interesting to see the mixed use of the terms datagram and packet. I'm almost always using the term packet, rarely datagram, but I'll have to take a look at this more closely.

Anyway, it could be that the lecturers of this course and I have different views on what a "service model" is, but I didn't notice that they defined what they mean by it; they just started to explain the service models of different protocols.

Now, while solving quizzes the following questions surprised me:
  • An Internet router is allowed to drop packets when it has insufficient resources -- this is the idea of "best effort" service. There can also be cases when resources are available (e.g., link capacity) but the router drops the packet anyways. Which of the following are examples of scenarios where a router drops a packet even when it has sufficient resources?

    I thought that the answer was a, c and d (corrupted packet). But, d was rejected.
  • In an alternative to the Internet Protocol called "ATM" proposed in the 1990s, the source and destination address is replaced by a unique "flow" identifier that is dynamically created for each new end-to-end communication. Before the communication starts, the flow identifier is added to each router along the path, to indicate the next hop, and then removed when the communication is over. What are the consequences of this design choice?

    Here, I thought that the answers were a and c. But apparently, a and d were accepted. Now, c says that there is a need for a control entity to manage flow labels. It might be that I misunderstood "control entity", and that it actually means something centralized. In that case I'm probably wrong. And d says there is no more need for a transport layer. I would like to hear some arguments for that. Anyway, I'll have to read up a bit more on the details of ATM, after all.

What is internet - TCP UDP

This video starts with the introduction in which the following sentence is stated: ... two different transport layer services, one of them is TCP and the other is UDP. The problem is that TCP and UDP are not services but protocols that offer some service.

"TCP is an example of transport layer". As I said, TCP is a protocol, not a layer!

I wouldn't say that the property "stream of bytes" means that the bytes will be delivered in order. That's more a property of reliability. What "stream of bytes" means, in the case of TCP, is that there is no concept of a message or of message boundaries. So, if the application sends 500 octets twice, they can be delivered on the other end in one go of 1000 octets, in three rounds, etc.
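The lack of message boundaries can be sketched in a few lines of Python. This is only an illustration under my own assumptions: I use socket.socketpair(), which gives a pair of connected stream sockets on the local machine, as a stand-in for a real TCP connection, and the helper names are made up. The sender writes 500 octets twice; the receiver only knows that 1000 octets will arrive in total, not how they will be split across recv() calls.

```python
import socket


def receive_exactly(sock: socket.socket, n: int) -> list:
    """Read exactly n bytes and return the chunks as they arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than requested
        if not chunk:
            raise ConnectionError("peer closed before all data arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return chunks


def demo() -> list:
    # A connected pair of stream sockets, standing in for a TCP connection.
    sender, receiver = socket.socketpair()
    with sender, receiver:
        sender.sendall(b"A" * 500)  # first "message"
        sender.sendall(b"B" * 500)  # second "message"
        return receive_exactly(receiver, 1000)


if __name__ == "__main__":
    chunks = demo()
    # The number and sizes of the chunks are not guaranteed; only the
    # byte order and the total length are.
    print([len(c) for c in chunks])
```

If the application needs messages, it has to add its own framing on top of the stream, for example a length prefix or a delimiter.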

The source port isn't only used so that TCP knows where to send data back, but also so that the receiving entity knows how to demultiplex an incoming TCP segment. Namely, every connection is uniquely identified by a four tuple (IP src addr, src port, IP dst addr, dst port), and so the source port is used for demultiplexing.
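A rough sketch of this demultiplexing, with a plain dictionary keyed by the four tuple. The class, the addresses and the connection labels are all hypothetical; a real TCP implementation keeps much more state per connection and also handles listening sockets, which I ignore here.

```python
from typing import Dict, Optional, Tuple

# (IP src addr, src port, IP dst addr, dst port)
FourTuple = Tuple[str, int, str, int]


class ConnectionTable:
    """Toy lookup table mapping a four tuple to a connection label."""

    def __init__(self) -> None:
        self._connections: Dict[FourTuple, str] = {}

    def add(self, src_ip: str, src_port: int,
            dst_ip: str, dst_port: int, label: str) -> None:
        self._connections[(src_ip, src_port, dst_ip, dst_port)] = label

    def demultiplex(self, src_ip: str, src_port: int,
                    dst_ip: str, dst_port: int) -> Optional[str]:
        """Return the connection an incoming segment belongs to, if any."""
        return self._connections.get((src_ip, src_port, dst_ip, dst_port))


table = ConnectionTable()
# Two connections from the same client host to the same server port;
# they differ only in the source port.
table.add("192.0.2.10", 50001, "198.51.100.5", 80, "connection A")
table.add("192.0.2.10", 50002, "198.51.100.5", 80, "connection B")

print(table.demultiplex("192.0.2.10", 50001, "198.51.100.5", 80))
print(table.demultiplex("192.0.2.10", 50002, "198.51.100.5", 80))
```

Without the source port in the key, the two connections above would be indistinguishable at the server.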

The checksum in TCP is quite weak, as I already argued, so it is not a particularly good mechanism for detecting errors.

It is possible for a TCP connection to be closed in three exchanges, but it could be that this will be explained later.

What is the Internet - ICMP

I have to admit that placing ICMP in the transport layer is quite a novel approach to layering Internet protocols. The lecturer says that, strictly speaking, it uses IP and thus it belongs to the transport layer. The truth is that it is far from clear where this protocol belongs, but the point is that when you place protocols in different layers, what matters is not only what the protocol uses, but also what it offers and what it is used for, with respect to layer functionality. So, when we talk about ICMP, it doesn't offer services to the layer above the transport layer, which would be the application layer, but it doesn't offer services to the transport layer, either. Also, the transport layer offers end-to-end communication services to the application layer. Note that ICMP, on the other hand, allows communication of network layer entities (IP protocols) between any two nodes within the network. It is produced and consumed by IP protocol implementations.

Two additional things have to be clarified, since someone might bring them up as counterarguments. First, there are applications that use ICMP: ping and traceroute. The truth is that ICMP was never designed to be used by applications, neither ping nor traceroute (especially not traceroute; search for the word "jealous" on this page, it's an interesting story). It just turned out that something can be used for a purpose not intended initially, and so we now have those applications. But I think that ping and traceroute directly access the network layer, that is, ICMP.

The second thing that someone might use to say that ICMP isn't in the network layer is OSPF. Namely, OSPF directly uses IP as a transfer service, not UDP or TCP. So, someone might say that by placing ICMP in the network layer I'm placing OSPF in the network layer too. There are those who think that OSPF is there. But I think that OSPF is in the application layer, along with other routing protocols. And that is for two reasons:
  1. Routing protocols communicate end-to-end. It doesn't matter that an "end" in this case might be, and is, a network router somewhere within the network; the point is that the OSPF application treats those as intended destinations, ends. With ICMP, any node might, for example, drop a packet and generate a Time Exceeded message. Note that the node generating the error message isn't an end point of the communication!
  2. The functionality of the protocols is vastly different. And not only that, but so is who consumes the packets. ICMP is consumed, and generated, by the IP protocol (minus ping/traceroute, which, as I already said, are special cases). OSPF, on the other hand, is quite a complex protocol, and the IP protocol directly hands data to the OSPF application process. IP doesn't consume those messages, nor does it produce them.
So, I think OSPF is in the application layer, while ICMP is in the top part of the network layer.

Additionally, let me return to the lecture slides. Slide number 3 shows data for ICMP coming from the application. That's not true; the data comes from the network layer itself, and ping and traceroute are misusing layering.

On slide 5, ICMP is treated as a network protocol in the same sense as IP is. But I think that's misleading. This actually leads me to one more argument why ICMP belongs to the network layer. Namely, ICMP doesn't have any separate implementation; there is no ICMP module within an operating system. There is an IP module (protocol implementation) that produces and consumes ICMP messages.

Ok, so much about that lecture. Finally, when I was trying to solve the quizzes, I had a problem with the first question: Which of the following statements are true about the ICMP service model? The offered answers were:
  1. ICMP messages are typically used to diagnose network problems. This is true, but it's not a service model.
  2. Some routers would prioritize ICMP messages over other packets. This one isn't true. Routers treat ICMP messages like any other message (unless specifically configured to prioritize them).
  3. ICMP messages are useless, since they do not transport actual data. ICMP is definitely not useless.
  4. ICMP messages can be maliciously used to scan a network and identify network devices. Yes, they can, but that's not a service model, which is what this question asks about.
  5. ICMP messages are reliably transmitted over the Internet. They are transferred in IP, which is unreliable.
After trial and error it turned out that b (the second statement) is also true!? But then again, I can say that I made a mistake because I didn't read that "some would" prioritize, which could be true, and "would" doesn't mean it is necessarily so. Huh, I hate when someone plays with words.

Ok, I'll stop here because this post has been brewing for too long, and as I have a lot of other work to do, it will take time until I watch all the lectures. Not to mention that the post is becoming quite large. So, I decided to publish this, and you can expect new posts eventually...

1 comment:

ssynhtn said...

Nice article, as a self learner I thought I was really dumb when I fail to pass a fair amount of these quiz problems in first several tries, some of the choices seem really vague to me, the lecturer does not provide enough information for me to figure out which is right and why is that.
