Everything about nothing: networks

Showing posts with label networks. Show all posts

Saturday, October 13, 2012

Connection vs. connectionless vs. protocol vs. service vs. ...

I was searching for an example of a connection oriented unreliable service and associated, or implementing, protocol and I stumbled on a post written by someone who claims to be CCIE that doesn't distinguish between terms service and a protocol, or at least the post was written in such a way that there is no distinction. Now, there are pages on Wikipedia that explain those terms but I was compelled to write my own post about those terms and to make distinction and characteristics clear. Also, because connection oriented service is frequently associated with TCP, and connectionless with UDP, then the characteristics of the protocols are often attributed to the connection oriented and connectionless services as well. But this is wrong, and let me also explain what is wrong and why.

Network layers and service

To understand the difference between service and a protocol you have to know that network functionality is divided into independent layers, stacked on one another. This is, obviously, true for all layers except for the first and the last one. This division is necessary because, for example apparently simple operation of opening Web page is actually very complex and includes a lot of functionality at the bottom of which is problem of sending and receiving bits using wireless communications, copper communication and/or fiber communication. Those areas alone are so complex that people specialize not only in, e.g. wireless communications, but in more specific parts like, e.g., antenna design. Anyway, the main purpose of each layer is to encapsulate some functionality and provide service to higher layer (note that I'm referring to layer immediately above) without higher layer being aware of what's happening in the layer below, or knowing what is the number of layers below. This is the same principle used in software design where applications are divided into modules to make them manageable. This process of using concepts of layers and services is iterative (or recursive, depending how you look at it) meaning that the layer that offers service to higher layer in the same time uses service of a lower layer to accomplish its goals. Again, first and last layer are somewhat specific, but I won't go into that.

Now, we came to the important fact that each layer provides service to a higher layer and uses services from the lower layer. So, service is just that, some functionality offered to a higher layer in which higher layer doesn't know how the service is implemented. Note that the layer that offers service is also called service provider, while the one that uses service from a lower layer is called service user.

Actually, this is enough knowledge of layering in networks to understand distinction between service and a protocol, but for completeness I'll mention few more things about layering. First, there is (almost) infinite number of ways this layering could be done. Not only with respect to the exact number of layers there are, but also with respect to specific functionality placed in layers. The most popular layering model is ISO/OSI Reference Model which has exactly 7 layers with each layer having prescribed functionality. It is called reference model because it is almost exclusively used as a reference for all other possible models. In other words, concrete networks like e.g. Internet, or even Ethernet, have different number of layers and/or functionalities in layers so they are frequently mapped to ISO/OSI Reference Model for a purpose of discussions and better understanding.

One final, very important thing. Layers don't implement functionalities, they are abstract concept so they don't exist as something material. What implements functionalities and offers services are different entities that logically belong to a certain layer. In other words, there are software and hardware modules written by programmers, or designed by hardware engineers, that exist in computers which implement some functionality and which are connected to other software/hardware modules that use them or which they use. For those software/hardware modules, by looking how they are connected and what they do, we say that they belong to a certain network layer. And when I say that layer implements, what I actually mean is that some entity in a layer implements, also when I say that layer uses a service of a lower layer, what is actually meant is that entity in a layer uses a service of an entity in a lower layer. There is a bit of ambiguity in those statements, but is easier to write and I think that with this clarification it isn't so confusing.

Protocols

The fact that each layer has services of a lower layer on its disposal, and doesn't know how the lower layer works nor the lower layer knows how higher is working, means basically that the communication is implemented between the same layers in different machines (or within the some one, which is actually a special case). So, to establish communication I, as an entity in say 3rd layer, am communicating with entity in 3rd layer on some another machine and we exchange information in order to allow communication of users in 4th layer. The same goes for 4th and other layers, too. But now, we have a problem. Namely, communicating entity in one layer on one machine is programmed by one company (say Microsoft) while entity on another machine, in the same layer, is programmed by someone implementing the analogous entity in Linux. Clearly, the two programmers probably don't know each other, and possibly they will never know. So, how do we make sure that their software will work, i.e. talk to each other! The answer is: by defining protocol. Human language is actually a protocol, albeit a very complex and ambiguous one. But nevertheless, if two secretaries don't speak the same language and have the same set of concepts that the language is referring to, they will never be able to pass messages from their bosses (users)!

So, protocol allows two (or more) entities within the layer to exchange information and establish communication and transfer data between their users. Protocol thus implements service, or is used to implement a service! More about that later. So, the protocol includes the following elements:

Data units exchanged, called Protocol Data Unit, or PDU. For every data unit exchanged, format has to be rigorously defined!
Behavior, usually defined and implemented using state machines. Behavior is actually how entity responds to information it receives from its peer (other entity), from its users and also what it expects from lower layer and how it uses it.

Note that each entity actually has communication with three other entities. The first one is a user in a higher layer - service user, the second one is entity in lower layer whose services are used to transfer data - service provider, and finally, there is a peer with whom the communication is established.

Connection and connectionless services

We saw in the section Network layers and services what the service is. Now we can say that there are two primary types of services. The first one is modeled according to how telephones work and is called connection oriented service while the second one is modeled according how post office works and is called connectionless service. It is interesting to note that connectionless is actually older, i.e. the telegraph system is connectionless and was in use before telephone was invented, but connection oriented is more dominant and before advent of digital computers was basically the only type in use.

The key difference between the two is that entity that uses connection oriented service from a lower layer entity has to first establish connection, i.e. to say with whom it is going to communication on the other end but without transferring any data yet. This is called connection establishment phase. Also, when the user is finished with data transfer, or communication, it has to explicitly break communication channel with its peer entity on the other end. This is called connection teardown phase. In between those two phases, data is transferred. Because of this, the identifier (i.e. address) of the other end is transferred only once, during connection establishment phase.

If you think a bit about this, you'll immediately see the similarity between telephone call and this service. In telephone call you first establish connection by dialing your peer's number, then you talk (i.e. transfer data), and finally you hang-up. Also, during telephone call you are user and telephone company offers you a service in which you don't know what's happening within telephone system. You only know and care that you have established communication channel with your peer entity, i.e. the person on the other end. Now, maybe you spouse told you that you call you friends to dinner. In that case, your spouse is your user and you are providing service to him/her.

On the other hand, connectionless service has only data transfer phase, i.e. no connection establishment nor teardown, you just send data. Obviously, when sending data you have to tell to your service provider to whom data should be sent and it has to be done each time you send something. Again, we said that postal office works that way and letters are sent that way, i.e. each one of them has an address and all the letters you've sent are mutually independent!

Relation between connectedness and protocol

Note again that, while talking about types of service, we didn't once talked about how things work, only how it appears to work. And that's the main point. Namely, service is one thing, protocol is other, and service can be connection oriented or connectionless, but the protocol is, well, just protocol. Now, the terms connection oriented protocol and connectionless protocol are extensively used in the literature, but this connectedness attribute is actually bound to the service protocol implements, not to the protocol itself.

Let us, as an example, take protocols from the Internet, IP, TCP and UDP. TCP and UDP are transport layer protocols (meaning, they are part of the transport layer in ISO/OSI RM). IP on the other hand is network layer protocol and it is used for communication of network layer entities. In networking texts entities are almost exclusively called the same as protocol they use, so we have TCP entity that uses TCP protocol to communicate with other TCP entities, called simply TCP, UDP entity that uses UDP protocol to communicate with other UDP entities, called simply UDP. It is similarly for IP protocol/entity. This might be ambiguous sometimes, but from the context it should be clear if the authors are talking about entities or protocols.

Lets start with IP. IP offers connectionless service to its service user and uses connectionless service from its service provider. This means that each IP's protocol data unit (called datagram or packet or IP packet) carries destination address and data, and in order for two IP entities to communicate it is not necessary to establish connection. Actually, there is no way connection could be established wth IP protocol. Furthermore, entities using IP protocol offer connectionless service to users, in our case, TCP and UDP. And IP also uses connectionless service from lower layers. The reason for this is that connectionless is a least common denominator, it actually expects least from the network, and that's the one reason why IP is connectionless protocol. If the underlying network is connection oriented, like e.g. ATM is, than it has only to expose connectionless service that will be used by IP. And if in the implementation of these services it is necessary to establish and break connection for each packet, then so be it. It will work, though not particularly efficient.

The next entity is TCP. It offers connection oriented service to its users, and uses connectionless service from its service provider, the IP entity. But TCP's service is more that that, it is also reliable (more about that in the next section). Now, take a note that TCP uses IP for communication (more precisely it uses services provided by IP entity) which are connectionless! So, TCP offers connection oriented service on a top of connectionless service. This is actually very hard to achieve.

Finally, there is UDP entity that offers connectionless service to its users and it uses connectionless service from its service provider, IP entity. UDP is actually very thin layer in terms of functionality because it adds almost nothing to what IP already provides. In a way it only relays data.

Note that what each entity offers to its users (i.e. service) doesn't necessarily correspond to what it gets from its service provider.

Relation between connectedness and QoS

Ok, final thing to discuss is reliability, or more generally Qualit of Service (QoS) offered by service. As I said in the introduction, because connection oriented service is mostly associated with TCP, characteristics of TCP are associated with connection oriented service. Similarly goes for UDP. But that's not true. Connectedness of service and the guarantees it provides for a certain parameters of communicatoin (called QoS) don't have anything to do with each other. It is perfectly feasible to have connected oriented unreliable service as is to have connectionless reliable service.

Now, reliability is a bit of vague term here. In case of TCP it means that the service guarantees that all the data that was sent will arrive, in order sent, without duplicates. In case it couldn't fulfill those requirements, the service will be disconnected with an appropriate error indication. Note that fulfillment of those guarantees is part of the protocol operation, and there are different mechanisms to achieve that like sequence numbers, acknowledgments, timeouts, retransmissions, etc. Also note that one more important thing. There is no guarantee that there will be no errors in a stream, i.e. that some bit fill be accidentally flipped. TCP doesn't detect that. And if you think that errors might appear when data travels through the network, think about bugs in software and possible consequences on data...

Anyway, connectedness and QoS are separate things that can be combined in different ways.

Croatian terminology

This is actually note for Croatian readers. When I was thinking should I write this post I wasn't sure should I write it in Croatian or English. In the end, I decided to write it in English (obviously) but one of the reasons I was thinking about using Croatian is because of the terminology. I insist on using Croatian translations if they are available and I don't like when someone in Croatia is speaking half in Croatian and half in English. Even worse is when someone writes half English half Croatian. Ok, some level of mix is acceptable (especially in spoken language), but there are quite good translations and I don't see why it would be necessary to use English equivalents in talk.

So, I refer croatian readers to look at dictionary with all the translations.

Wednesday, July 18, 2012

ASLR to extreme

I was reading about Artificial Immune Systems (more about that in another post) and in one of the papers the statement was that biological systems increase resiliency by diversity. Furthermore, they give a contra example in computer networks in which Internet Explorer (at the time the paper was written) had 90% market share. It's obvious that when something hits IE, it hits almost the whole Internet. This isn't diversity by any standard.

I think that we have such problems with security in general that we need some new, radical solution. Probably, we are long way from that solution, but it occurred to me that this is exactly what is necessary, diversity that will disallow attackers from influencing single computers and thus large parts of the Internet. Still, it is hard to expect there will be N producers of operating systems, then N of browsers, etc. It's not easy to produce those, it takes long time and huge resources. Now, biological systems are much much older and theoretically it could be that in some distant future there will be such diversity. IMHO, this is questionable, and as I said it's theoretically in some distant future, which is why it is beyond the point. What we need is something that works now.

If you think a bit what we need is a mutation, that will change computer systems, from the bottom up in unpredictable ways. On the bottom I'm thinking about parts of a single application, while on the top I think of the complex systems consisting of computers and networks. Furthermore, this mutation has to be specific to each system so that there are hardly two similar systems in existence. So, for example, the computer you work on isn't similar to any other computer in use, and, as you use it, it evolves and mutates.

Now, why I mentioned Address Space Layout Randomization (ASLR) in the title? Because it seems to me to be a step in the direction of totally mutating everything. Namely, ASLR mutates address space of the process thus making it unpredictable for attackers and making each systems different. This mutation unfortunately, is restricted because it is too coarse grained, i.e. you move whole libraries, but not functions, of even blocks of the code from which functions are built.

Of course there are problems. For a start, similarity is a key to maintenance of systems. Companies having a large number of computers try hard to make them equal, just to lower maintenance costs. Not only that, developers count on similarity to be able to reproduce bugs, and consequently to correct them. So, those requirements should either be kept in a new system (which in part is contradictory) or new ways of achieving the same effect (i.e. maintainability).

Finally, mutation has to be dynamic. Namely, even if attacker gets into one system, or part of the system he needs time to discover other parts of the system. If mutation is quick enough, the knowledge that attacker obtains will be worthless before he manages to use it. Not only that, but potentially what he already achieved will evaporate soon.

Thursday, January 4, 2007

BGPlay and whois...

And here comes first "serious" post. :) It's about handy tool called BGPlay.

The purpose of this tool is to visualize AS connectivity in time from some point in the Internet to the given network/AS. Actually, it's Java applet, and you can find it here. I didn't saw link where source, or application, can be downloaded.

First of all, let me try to explain briefly what AS is. Internet consists of different networks connected together. But the real truth is that those networks are first grouped into autonomous systems (AS) and then those autonomous systems form the Internet. More precisely, autonomous system is a collection of (computer) networks that is under single administrative control. Usually, autonomous systems run single interior routing protocol, but not always.

So, armed with this, let's start the applet. It will present you the following dialog:

In field titled prefix, you write network address you are interested in. For example, we can put 161.53.0.0/16, it's prefix for Croatian Academic and Research Network (CARNet). Also, you may wish to change start date field into something earlier than offered by default. After clicking on OK, and some waitinig, you'll be presented with the following window:

This window has four parts. In the left, smaller, pane there is time axis with time running from bottom up. Values ploted on this axis represent number of changes in AS conectivity. In the right pane, the biggest one, there is drawn graph with shown connectivity of our network. Actually, it's not network shown, but AS to which out network belongs. This AS is marked red and it has number 2108. At the bottom of this window there are some controls that allow control of connectivity visualisation. Finally, at the top, you can see what's drawn at some particular moment.

So, start visualisation by pressing play button. What you'll see is how route change between different ASes, either because they are withdrawn, announced, or just changed. On the top pane in window is message that states what exactly happened. And, in the left pane, you can see arrow going up, meaning that the time is running.

So, nice, but there is something else that can be seen from this figure, and that is, with whom particular AS is connected. Looking at the figure, you see that two ASes domine in connectivity with our AS, AS1299 and AS3356. So, let's find out who's behind those ASes. For that purpose we'll use whois utility.

Run the following query in the Linux command line:

$ whois AS1299

After some short time you'll get output. There are few interesting peaces. The first one is the following:

aut-num: AS1299
as-name: TELIANET
descr: TeliaNet Global Network
descr: Telia International Carrier

This gives us idea who is behind this AS. It's TeliaNet, and after some simple googling, we find that it's internet provider from Sweden. Actually, googling reveals much more that this about TeliaNet, but that's unimportant for now.

The other important part of the output is:

remarks: 1299:210x Peers at VIX

which indicates us that CARNet is peering with TeliaNet at VIX peering exchange. Again, some small amount of googling, and we find it's Vienna Internet Exchange. Actually, search term was "VIX internet exchange", since searching only for VIX gives false results.

I'll leave searching for other AS to you.

There are also some caveats with this approach. You can not find out exact connectivity of some AS because peering arrangements are usually not propagated into BGP and thus, there is no way for this software to find out those connections.

Still, this is very interesting piece of software with only one shortcoming I found during this short period of it's use. There is no zoom button or anyhting like that!

Everything about nothing