TCP/IP Overview

Every computer on the Internet uses the Internet Protocol (IP). Two transport protocols are commonly layered on top of IP: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).

TCP is used when we need to send data reliably. For example, email uses TCP because every word in a letter is important. UDP is used when some data loss is acceptable and speed matters most. For example, audio phone calls use UDP because losing a few datagrams hurts sound quality only slightly. You can learn more about common applications, their protocols, and their ports by viewing /etc/services.
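To make the /etc/services format concrete, here is a minimal sketch of a parser. The SAMPLE text and the parse_services helper are hypothetical, written only for illustration; each entry in a real file generally follows the same "name port/protocol aliases" layout, with optional comments after a "#".

```python
# Hypothetical sample in the /etc/services layout:
# name  port/protocol  [aliases...]  # comment
SAMPLE = """\
smtp     25/tcp   mail     # Simple Mail Transfer
domain   53/udp            # Domain Name System
https    443/tcp
"""

def parse_services(text):
    """Return a dict mapping (name, protocol) to a port number."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, port_proto, *aliases = line.split()
        port, proto = port_proto.split("/")
        for n in (name, *aliases):
            table[(n, proto)] = int(port)
    return table

services = parse_services(SAMPLE)
print(services[("smtp", "tcp")])    # 25
print(services[("domain", "udp")])  # 53
```

On systems that ship a services database, Python's socket.getservbyname("smtp", "tcp") performs the same kind of lookup against the real file.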

IP is used to carry packets of information. Inside those packets, the data follows different protocols depending on the application. Common application protocols include SMTP for email, FTP for uploading/downloading files, SSH for remotely connecting to computers, HTTP for the world wide web, IRC for text messages and chat, and so on.

Most TCP/IP applications follow a client-server architecture. One side acts as the server, and the other side as the client. The client requests information, and the server delivers it. For example, if you use your phone to view a video, your phone is the client, and the server is in a data center. The server is often an expensive machine with a fast internet connection, but not always. Any computer running the right software can act as a server. For example, your home desktop PC could run an IRC server and serve chat messages which your phone could request. In fact, even your phone can run a web server and deliver web pages to desktop PCs, which would become clients. There is no physical definition of a server; the only definition is that the client requests information and the server responds to it.

IP Addresses

Each computer has at least one network interface to connect it with the global Internet. Each network interface will have at least one IP address, which will look something like 192.168.0.1. Each IP address has ports numbered from 1 to 65535 (port 0 is reserved). When you combine an IP address with a port, you have a socket. For example, 192.168.0.1:443 is the socket that your web server (openhttpd) listens on.

Every client must specify a socket and a protocol, and the server must listen on that exact same combination in order to respond. For example, suppose you send a request to a web server with IP 192.168.0.1 on port 443 using TCP. If the web server is instead listening on a different socket (IP 192.168.0.1 port 80 using TCP), the web server will not respond. Port 443 is not the same as port 80. Both the socket and protocol must be identical. If the IP address, port, or protocol type is wrong, the client-server connection will not work.
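The matching rule can be tried out on the loopback address. The following is a rough sketch, not a real web server: it listens on one TCP port chosen by the operating system, then attempts a connection to that port and to a second port where nothing is listening. The can_connect helper is a name made up for this example.

```python
import socket

# Server side: listen on the loopback address; port 0 asks the OS for a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
listening_port = server.getsockname()[1]

# Reserve a second port, but never call listen() on it.
unused = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
unused.bind(("127.0.0.1", 0))
wrong_port = unused.getsockname()[1]

def can_connect(ip, port):
    """Return True if a TCP connection to ip:port succeeds."""
    try:
        with socket.create_connection((ip, port), timeout=2):
            return True
    except OSError:
        return False

right = can_connect("127.0.0.1", listening_port)  # same IP, port, and protocol
wrong = can_connect("127.0.0.1", wrong_port)      # same IP, different port
print(right, wrong)

server.close()
unused.close()
```

The first attempt should succeed and the second should be refused, even though both target the same IP address.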

IP addresses are not physically bound to any hardware device. A hardware device can easily change its IP address. As a result, when routing packets, it becomes necessary to know which network interface an IP address actually refers to. This is where Ethernet MAC addresses often come in. A MAC address is a 48-bit identifier that uniquely identifies a network interface on a piece of hardware. MAC addresses are usually written in hexadecimal (such as 01:23:45:67:89:ab).
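As an illustration of the 48-bit layout, here is a small sketch that converts between the colon-separated hexadecimal form and a plain integer. The parse_mac and format_mac helpers are hypothetical names, not part of any standard library.

```python
def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into a 48-bit integer."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected six colon-separated bytes")
    value = 0
    for part in parts:
        byte = int(part, 16)
        if not 0 <= byte <= 0xFF:
            raise ValueError("each field must fit in one byte")
        value = (value << 8) | byte  # shift in one byte (8 bits) at a time
    return value

def format_mac(value):
    """Convert a 48-bit integer back into colon-separated hexadecimal."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))

mac = parse_mac("01:23:45:67:89:ab")
print(hex(mac))          # 0x123456789ab
print(format_mac(mac))   # 01:23:45:67:89:ab
```

Six bytes of 8 bits each give the 48 bits mentioned above.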

Servers sometimes ask you to bind to an IP address. This means that the server will listen and send packets using only that IP address. This can be important if your server has multiple IP addresses it can choose from; you may only want to use one for your server.
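A minimal sketch of binding, using Python's socket module on the loopback address so that it can run on any machine. Binding to a specific address restricts the socket to that address, while the wildcard address 0.0.0.0 accepts traffic on every local IPv4 address. Port 0 is used here only to let the operating system pick a free port.

```python
import socket

# Bind to one specific address: only traffic sent to 127.0.0.1 reaches this socket.
specific = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
specific.bind(("127.0.0.1", 0))
specific_ip, specific_port = specific.getsockname()

# Bind to the wildcard address: traffic to any local IPv4 address is accepted.
wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
wildcard.bind(("0.0.0.0", 0))
wildcard_ip, wildcard_port = wildcard.getsockname()

print(specific_ip)   # 127.0.0.1
print(wildcard_ip)   # 0.0.0.0

specific.close()
wildcard.close()
```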

Subnets

IPv4 addresses are written as four numbers separated by periods (dotted-quad notation). They are, however, stored on the computer using binary. For example, 192.168.0.1 in binary could be represented as 11000000.10101000.00000000.00000001. Each IPv4 address can be separated into a subnet identifier and a host identifier.
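The conversion from dotted-quad to binary can be sketched in a few lines. The to_binary helper is a hypothetical name used only for this illustration.

```python
def to_binary(dotted_quad):
    """Render each of the four numbers as 8 binary digits."""
    return ".".join(f"{int(octet):08b}" for octet in dotted_quad.split("."))

def to_int(dotted_quad):
    """Pack the four numbers into a single 32-bit integer."""
    value = 0
    for octet in dotted_quad.split("."):
        value = (value << 8) | int(octet)
    return value

print(to_binary("192.168.0.1"))  # 11000000.10101000.00000000.00000001
print(to_int("192.168.0.1"))     # 3232235521
```

The second function shows that the whole address is, in the end, just one 32-bit number.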

The Internet has billions of computers on it, so it is helpful to divide all these computers into smaller subnetworks, or subnets for short. Once the computers are grouped into subnets, it becomes much easier to first find a subnet and then find a computer inside that subnet. Routers rely on this when routing (delivering) packets to the right place.

Suppose the first 24 bits of the IPv4 address are part of the subnet identifier. We would indicate this by adding /24 at the end of the IPv4 address: 192.168.0.1/24. This tells us that the first 24 bits of the IPv4 address indicate what subnet the address is a part of; and that the last 8 bits indicate the device on the subnet.
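Python's standard ipaddress module understands this notation, so the split can be checked directly. This is only a sketch of how 192.168.0.1/24 breaks down.

```python
import ipaddress

iface = ipaddress.ip_interface("192.168.0.1/24")
network = iface.network

print(network)                 # 192.168.0.0/24
print(network.prefixlen)       # 24 bits identify the subnet
print(32 - network.prefixlen)  # 8 bits identify the device
print(network.num_addresses)   # 256 addresses fit in 8 bits

# The device part is whatever remains after the subnet bits are removed.
host_id = int(iface.ip) & int(network.hostmask)
print(host_id)                 # 1
```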

Computers often use bitmasks to quickly calculate the subnet identifier and the host identifier. The subnet mask is a 32-bit number in which the binary digits covering the subnet identifier are 1s, and the rest are 0s. A /24 subnet mask can be written in three ways. In binary, it is 11111111.11111111.11111111.00000000. In dotted-quad notation, it is 255.255.255.0. In hexadecimal, it is 0xffffff00. You will find this information very valuable later when you configure your network interface's subnet mask and default gateway.
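The bitmask arithmetic can be sketched with plain integers. The helper names below (prefix_to_mask, to_int, to_dotted) are hypothetical and exist only for this example.

```python
def prefix_to_mask(prefix_len):
    """Build a 32-bit mask with prefix_len leading 1 bits."""
    return (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF

def to_int(dotted_quad):
    value = 0
    for octet in dotted_quad.split("."):
        value = (value << 8) | int(octet)
    return value

def to_dotted(value):
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

mask = prefix_to_mask(24)
addr = to_int("192.168.0.1")

subnet_id = addr & mask               # keep only the subnet bits
host_id = addr & ~mask & 0xFFFFFFFF   # keep only the host bits

print(hex(mask))             # 0xffffff00
print(to_dotted(mask))       # 255.255.255.0
print(to_dotted(subnet_id))  # 192.168.0.0
print(host_id)               # 1
```

A single AND operation is enough to extract either part, which is why masks are so cheap for routers and operating systems to apply.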

Special Addresses

Some IPv4 addresses have special meaning. IP addresses from 127.0.0.0 to 127.255.255.255 are loopback addresses. These addresses refer to your own computer. Instead of using the network to connect to another computer, you use the network to connect to yourself. Using a loopback address helps you test a network service without having to use the Internet. For example, you might install a web server and then view it locally by visiting http://127.0.0.1.

Notice that we can rewrite this address range more compactly by using 127.0.0.0/8. Here, we are referring to the entire subnet where the first 8 bits are the same as in the IPv4 address 127.0.0.0.
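The equivalence between the /8 notation and the full range can be verified with Python's standard ipaddress module, as in this short sketch.

```python
import ipaddress

loopback = ipaddress.ip_network("127.0.0.0/8")

first, last = loopback[0], loopback[-1]
print(first, last)             # 127.0.0.0 127.255.255.255
print(loopback.num_addresses)  # 16777216, which is 2 ** 24

addr = ipaddress.ip_address("127.0.0.1")
print(addr in loopback)        # True
print(addr.is_loopback)        # True
```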

There are a few other reserved IP addresses you should also be aware of. For example, 192.168.0.0/16 is reserved for private networks, and addresses in this range are not routed on the public Internet. For this reason, an IP address like 192.168.0.1 can never be used for a public Internet service.
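As a sketch, the check below tests whether an address falls inside one of the three private ranges defined by RFC 1918 (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16). The is_rfc1918 helper is a hypothetical name used for illustration.

```python
import ipaddress

# The three private IPv4 ranges from RFC 1918.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(text):
    """Return True if the address lies in an RFC 1918 private range."""
    addr = ipaddress.ip_address(text)
    return any(addr in network for network in PRIVATE_RANGES)

print(is_rfc1918("192.168.0.1"))  # True
print(is_rfc1918("10.20.30.40"))  # True
print(is_rfc1918("172.32.0.1"))   # False, just outside 172.16.0.0/12
print(is_rfc1918("8.8.8.8"))      # False
```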