Application Layer

We have been working from the bottom to the top of our four-layer TCP/IP network model and we are finally at the top. The Application layer is where the networked software like web browsers, mail programs, video players, or networked video players operate. We as users interact with these applications and the applications interact with the network on our behalf.

The Application Layer.

The Application Layer.

Client and Server Applications

It is important to remember that two parts are required for a networked application to function. The architecture for a networked application is called “client/server”. The server portion of the application runs somewhere on the Internet and has the information that users want to view or interact with. The client portion of the application makes connections to the server application, retrieves information, and shows it to the user. These applications use the Transport layer on each of their computers to exchange data.

Client/Server Applications.

Client/Server Applications.

To browse a web address like www.codeahoy.com, you must have a web application running on your computer. When you type an address into your web browser, it connects to the appropriate web server, retrieves pages for you to view, and then shows you the pages.

The web browser on your computer sends a request to connect to www.codeahoy.com. Your computer looks up the domain name to find the corresponding IP address for the server and makes a transport connection to that IP address, then begins to request data from the server over that network connection. When the data is received, the web browser shows it to you. Sometimes web browsers display a small animated icon to let you know that the data is being retrieved across the network.

On the other end of the connection is another application called a “web server”. The web server program is always up and waiting for incoming connections. So when you want to see a web page, you are connecting to a server application that is already running and waiting for your connection.

In a sense, the Transport, Internetwork, and Link layers, along with the Domain Name System, are like a telephone network for networked applications. They “dial up” different server applications on the network and have “conversations” with those applications to exchange data.

Application Layer Protocols

Just like people talking on telephones, each pair of network applications needs a set of rules that govern the conversation. In most cultures, when your phone rings and you pick up the phone you say “Hello”. Normally the person who made the call (the client person) is silent until the person who picked up the phone (the server person) says “Hello”. If you have ever called someone who does not follow this simple rule, it can be quite confusing. You probably would assume that the connection was not working, hang up, and retry the call.

A set of rules that govern how we communicate is called a “protocol”. The definition of the word protocol is “a rule which describes how an activity should be performed, especially in the field of diplomacy.” The idea is that in formal situations, we should behave according to a precise set of rules. We use this word to describe networked applications, because without precise rules, applications have no way to establish and manage a conversation. Computers like precision.

Application Protocols.

Application Protocols.

There are many different networked applications and it is important that each networked application have a well-documented protocol so that all servers and clients can interoperate. Some of these protocols are intricate and complex.

The protocol that describes how a web browser communicates with a web server is described in a number of large documents starting with this one:

https://tools.ietf.org/html/rfc7230

The formal name of the protocol between web clients and web servers is the “HyperText Transport Protocol”, or HTTP for short. When you put “http:” or “https:” on the beginning of a URL that you type into the browser, you are indicating that you would like to retrieve a document using the HTTP protocol.

If you were to read the above document, and go to section 5.3.2 on page 41, you find the exact text of what a web client is supposed to send to a web server:

GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1

One of the reasons that HTTP is so successful is that it is relatively simple compared to most client/server protocols. Even though the basic use of HTTP is relatively simple, there is a lot of detail that allows web clients and servers communicate efficiently and transfer a wide range of information and data. Six different documents describe the HTTP protocol, in a total of 305 pages. That might seem like a lot of detail, but the key in designing protocols is to think through all possible uses of the protocol and describe each scenario carefully.

Exploring the HTTP Protocol

In this section we will manually exercise the HTTP protocol by pretending to be a web browser and sending HTTP commands to a web server to retrieve data. To play with the HTTP protocol, we will use one of the earliest Internet applications ever built.

The “telnet” application was first developed in 1968, and was developed according to one of the earliest standards for the Internet:

https://tools.ietf.org/html/rfc15

This standard is only five pages long and at this point, you probably can easily read and understand most of what is in the document. The telnet client application is so old that it is effectively a dinosaur, as it comes from “prehistoric” times in terms of the age of the Internet. The Internet was created in 1985 by the NSFNet project and the precursor to the NSFNet called the ARPANET was brought up in 1969. Telnet was designed and built even before the first TCP/IP network was in production.

Interestingly, the telnet application is still present in most modern operating systems. You can access telnet from the terminal (command line) in Macintosh and Linux. The telnet application was also present in Windows 95 through Windows XP, but is not included in later versions of Windows. If you have a later version of Windows, you can download and install a telnet client to do the examples in this section.

Telnet is a simple application. Run telnet from the command line (or terminal) and type the following command:

telnet www.dr-chuck.com 80

The first parameter is a domain name (an IP address would work here as well) and a port to connect to on that host. We use the port to indicate which application server we would like to connect to. Port 80 is where we typically expect to find an HTTP (web) server application on a host. If there is no web server listening on port 80, our connection will time out and fail. But if there is a web server, we will be connected to that web server and whatever we type on our keyboard will be sent directly to the server. At this point, we need to know the HTTP protocol and type the commands precisely as expected. If we don’t know the protocol, the web server will not be too friendly. Here is an example of things not going well:

telnet www.dr-chuck.com 80
Trying 198.251.66.43...
Connected to www.dr-chuck.com.
Escape character is '^]'.
HELP
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>501 Method Not Implemented</title>
...
</body></html>
Connection closed by foreign host.

We type “telnet” in the terminal requesting a connection to port 80 (the web server) on the host www.dr-chuck.com. You can see as our transport layer is looking up the domain name, finding that the actual address is “198.251.66.43”, and then making a successful connection to that server. Once we are connected, the server waits for us to type a command followed by the enter or return key. Since we don’t know the protocol, we type “HELP” and enter. The server is not pleased, gives us an error message, and then closes the connection. We do not get a second chance. If we do not know the protocol, the web server does not want to talk to us.

But let’s go back and read section 5.3.2 of the RFC-7230 document and try again to request a document using the correct syntax:

telnet www.dr-chuck.com 80
Trying 198.251.66.43...
Connected to www.dr-chuck.com.
Escape character is '^]'.
GET http://www.dr-chuck.com/page1.htm HTTP/1.0

HTTP/1.1 200 OK
Last-Modified: Sun, 19 Jan 2014 14:25:43 GMT
Content-Length: 131
Content-Type: text/html

<h1>The First Page</h1>
<p>
If you like, you can switch to the
<a href="http://www.dr-chuck.com/page2.htm">
Second Page</a>.
</p>
Connection closed by foreign host.

We make the connection to the web browser again using telnet, then we send a GET command that indicates which document we want to retrieve. We use version 1.0 of the HTTP protocol because it is simpler than HTTP 1.1. Then we send a blank line by pressing “return” or “enter” to indicate that we are done with our request.

Hacking HTTP By Hand.

Hacking HTTP By Hand.

Since we have sent the proper request, the host responds with a series of headers describing the document, followed by a blank line, then it sends the actual document.

The headers communicate metadata (data about data) about the document that we have asked to retrieve. For example, the first line contains a status code.

In this example, the status code of “200” means that things went well. A status code of “404” in the first line of the headers indicates that the requested document was not found. A status code of “301” indicates that the document has moved to a new location.

The status codes for HTTP are grouped into ranges: 2XX codes indicate success, 3XX codes are for redirecting, 4XX codes indicate that the client application did something wrong, and 5xx codes indicate that the server did something wrong.

This is a HyperText Markup Language (HTML) document, so it is marked up with tags like <h1> and <p>. When your browser receives the document in HTML format, it looks at the markup in the document, interprets it, and presents you a formatted version of the document.

The IMAP Protocol for Retrieving Mail

The HTTP protocol is only one of many client/server application protocols used on the Internet. Another common protocol is used so that a mail application running on your computer can retrieve mail from a central server. Since your personal computer might not be turned on at all times, when mail is sent to you it is sent to a server and stored on that server until you turn on your computer and retrieve any new email.

Like many application standards, the Internet Message Access Protocol (IMAP) is described in a series of Request For Comment (RFC) documents starting with this RFC:

https://tools.ietf.org/html/rfc3501

IMAP is a more complicated protocol than the web protocol, so we won’t be able to use the telnet command to fake the protocol. But if you were going to develop a mail reading application, you could carefully read this document and develop code to have a successful conversation with a standards-compliant IMAP server. Here is a simple example from section 6.3.1 of the above document showing what the client (C:) sends and how the server (S:) responds:

C: A142 SELECT INBOX
S: * 172 EXISTS
S: * 1 RECENT
S: * OK [UNSEEN 12] Message 12 is first unseen
S: * OK [UIDVALIDITY 3857529045] UIDs valid
S: * OK [UIDNEXT 4392] Predicted next UID
S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited
S: A142 OK [READ-WRITE] SELECT completed

The messages that are sent by the client and server are not designed to be viewed by an end user so they are not particularly descriptive. These messages are precisely formatted and are sent in a precise order so that they can be generated and read by networked computer applications on each end of the connection.

Flow Control

When we looked at the Transport layer, we talked about the “window size”, which was the amount of data that the Transport layer on the sending computer will send before pausing to wait for an acknowledgement.

Buffering in the Transport Layer.

Buffering in the Transport Layer.

In this figure, we see a message broken into packets, with some of the packets sent and acknowledged. Six packets have been sent but not yet acknowledged and the sending Transport layer has reached the limit of the transmit window, so it is pausing until it receives an acknowledgement from the receiving computer’s Transport layer. The receiving computer has received three packets, one of which is out of order.

When we were looking at this example before from the point of view of the Transport layer, we ignored where the packets to be sent were coming from and where the packets were going to in the receiving computer. Now that we are looking at the Application layer, we can add the two applications that are the source and the destination of the stream of data.

Let’s assume the web browser has made a transport connection to the web server and has started downloading an image file. The web server has opened the image file and is sending the data from the file to its Transport layer as quickly as possible. But the Transport layer must follow the rules of window size, so it can only send a certain amount of data at a time. When the window fills up, the web server is paused until the Transport layer on the destination computer has started to receive and acknowledge packets.

As the Transport layer on the destination computer starts to receive packets, reconstruct the stream of data, and acknowledge packets, it delivers the reconstructed stream of packets to the web browser application display on the user’s screen. Sometimes on a slow connection you can see your browser “paint” pictures as the data is downloaded. On a fast connection the data comes so quickly that the pictures appear instantaneously.

If we redraw our picture of packets in the Transport layer, adding both of the application layers where the packets are in the middle of an image, now it looks like this:

Buffering in the Application and Transport Layers.

Buffering in the Application and Transport Layers.

The web server is reading the image file (‘F’) and sending it as a stream to the web browser as quickly as it can send the data. The source Transport layer has broken the stream into packets and used IP to send the packets to the destination computer.

The Transport layer has sent six packets (‘S’) and has stopped sending because it has reached its window size and paused the web server. Three of those six packets have made it to the Transport layer on the destination computer (‘R’) and three of the packets are still making their way across the Internet (‘S’).

As the destination Transport layer pieces the stream back together, it both sends an acknowledgement (ACK) and delivers the data to the receiving application (the web browser). The web browser reconstructs the image (‘A’) and displays it to the user as the data is received.

A key thing to notice in this picture is that the transport layers do not keep the packets for the entire file. They only retain packets that are “in transit” and unacknowledged. Once packets are acknowledged and delivered to the destination application, there is no reason for either the source or destination Transport layer to hold on to the packets.

When the acknowledgment flows back from the destination computer to the source computer, the Transport layer on the source computer unpauses the web server application and the web server continues to read data from the file and send it to the source Transport layer for transmission.

This ability to start and stop the sending application to make sure we send data as quickly as possible without sending data so fast that they clog up the Internet is called “flow control”. The applications are not responsible for flow control, they just try to send or receive data as quickly as possible and the two transport layers start and stop the applications as needed based on the speed and reliability of the network.

Writing Networked Applications

The applications which send and receive data over the network are written in one or more programming languages. Many programming languages have libraries of code that make it quite simple to write application code to send and receive data across the network. With a good programming library, making a connection to an application running on a server, sending data to the server, and receiving data from the server is generally as easy as reading and writing a file.

As an example, the code below is all it takes in the Python programming language to make a connection to a web server and retrieve a document:

Programming with Sockets in Python.

Programming with Sockets in Python.

While you may or not know the Python programming language, the key point is that it only takes ten lines of application code to make and use a network connection. This code is simple because the Transport, Internetwork, and Link layers so effectively solve the problems at each of their layers that applications using the network can ignore nearly all of the details of how the network is implemented.

In the Python application, in this line of code

mysock.connect(('www.py4inf.com', 80))

we have specified that we are connecting to the application that is listening for incoming connections on port 80 on the remote computer www.py4inf.com.

By choosing port 80 we are indicating that we want to connect to a World Wide Web server on that host and are expecting to communicate with that server using the HyperText Transport Protocol.

Summary

The entire purpose of the lower three layers (Transport, Internetwork, and Link) is to make it so that applications running in the Application layer can focus the application problem that needs to be solved and leave virtually all of the complexity of moving data across a network to be handled by the lower layers of the network model.

Because this approach makes it so simple to build networked applications, we have seen a wide range of networked applications including web browsers, mail applications, networked video games, network-based telephony applications, and many others. And what is even more exciting is that it is easy to experiment and build whole new types of networked applications to solve problems that have not yet been imagined.

Glossary

HTML: HyperText Markup Language. A textual format that marks up text using tags surrounded by less-than and greater-than characters. Example HTML looks like: <p>This is <strong>nice</strong></p>.

HTTP: HyperText Transport Protocol. An Application layer protocol that allows web browsers to retrieve web documents from web servers.

IMAP: Internet Message Access Protocol. A protocol that allows mail clients to log into and retrieve mail from IMAP-enabled mail servers.

flow control: When a sending computer slows down to make sure that it does not overwhelm either the network or the destination computer. Flow control also causes the sending computer to increase the speed at which data is sent when it is sure that the network and destination computer can handle the faster data rates.

socket: A software library available in many programming languages that makes creating a network connection and exchanging data nearly as easy as opening and reading a file on your computer.

status code: One aspect of the HTTP protocol that indicates the overall success or failure of a request for a document. The most well-known HTTP status code is “404”, which is how an HTTP server tells an HTTP client (i.e., a browser) that it the requested document could not be found.

telnet: A simple client application that makes TCP connections to various address/port combinations and allows typed data to be sent across the connection. In the early days of the Internet, telnet was used to remotely log in to a computer across the network.

web browser: A client application that you run on your computer to retrieve and display web pages.

web server: An application that deliver (serves up) Web pages.

Questions

  1. Which layer is right below the Application layer?
    • a)Transport
    • b)Internetworking
    • c)Link Layer
    • d)Obtuse layer
  2. What kind of document is used to describe widely used Application layer protocols?
    • a)DHCP
    • b)RFC
    • c)APPDOC
    • d)ISO 9000
  3. Which of these is an idea that was invented in the Application layer?
    • a)0f:2a:b3:1f:b3:1a
    • b)192.168.3.14
    • c)www.khanacademy.com
    • d)http://www.dr-chuck.com/
  4. Which of the following is not something that the Application layer worries about?
    • a)Whether the client or server starts talking first
    • b)The format of the commands and responses exchanged across a socket
    • c)How the window size changes as data is sent across a socket
    • d)How data is represented as it is sent across the network to assure interoperability.
  5. Which of these is an Application layer protocol?
    • a)HTTP
    • b)TCP
    • c)DHCP
    • d)Ethernet
  6. What port would typically be used to talk to a web server?
    • a)23
    • b)80
    • c)103
    • d)143
  7. What is the command that a web browser sends to a web server to retrieve an web document?
    • a)RETR
    • b)DOCUMENT
    • c)404
    • d)GET
  8. What is the purpose of the “Content-type:” header when you retrieve a document over the web protocol?
    • a)Tells the browser how to display the retrieved document
    • b)Tells the browser where to go if the document cannot be found
    • c)Tells the browser whether or not the retrieved document is empty
    • c)Tells the browser where the headers end and the content starts
  9. What common UNIX command can be used to send simple commands to a web server?
    • a)ftp
    • b)ping
    • c)traceroute
    • d)telnet
  10. What does an HTTP status code of “404” mean?
    • a)Document has moved
    • b)Successful document retrieval
    • c)Protocol error
    • d)Document not found
  11. What characters are used to mark up HTML documents?
    • a)Less-than and greater-than signs < >
    • b)Exclamation points !
    • c)Square brackets [ ]
    • d)Curly brackets { }
  12. What is a common application protocol for retrieving mail?
    • a)RFC
    • b)HTML
    • c)ICANN
    • d)IMAP
  13. What application protocol does RFC15 describe?
    • a)telnet
    • b)ping
    • c)traceroute
    • d)www
  14. What happens to a server application that is sending a large file when the TCP layer has sent enough data to fill the window size and has not yet received an acknowledgement?
    • a)The application switches its transmission to a new socket
    • b)The application crashes and must be restarted
    • c)The application is paused until the remote computer acknowledges that it has received some of the data
    • d)The closest gateway router starts to discard packets that would exceed the window size
  15. What is a “socket” on the Internet?
    • a)A way for devices to get wireless power
    • b)A way for devices to get an IP address
    • c)An entry in a routing table
    • d)A two-way data connection between a pair of client and server applications
  16. What must an application know to make a socket connection in software?
    • a)The address of the server and the port number on the server
    • b)The route between the source and destination computers
    • c)Which part of the IP address is the network number
    • d)The initial size of the TCP window during transmission

Licenses and Attributions


Speak Your Mind

-->