What are HTTP cookies?
This is important to architects because… cookies are an essential and sensitive part of any web user interface application.
The need for state management. Cookies are an essential part of the web for state management. The concept of “state” as applied to a web session simply implies a means for associating a series of web requests with one another. Otherwise, each web request would be treated independently.
State management example. The necessity for state management is best illustrated by example. Consider if I wanted to purchase a book at Amazon.com. Purchasing the book requires a series of steps – I have to look for books, select a book, enter payment information, and finally make a payment. Without some type of state management, when I submitted my payment information, Amazon would not know what the payment was for. Amazon needs some way to “remember” what the book selection was – that is where state management comes in.
Multiple ways to manage state. State can be stored on the back end, in caches, and in various modern browser storage mechanisms. The emergence of these options over the past 25 years has changed the nature of how cookies are used, but cookies are still a mainstay for state management.
The criticality of client state. As we have just suggested, there are server-side mechanisms for storing state, but you need a means to correlate the server state with multiple web requests – client state.
RFC 6265. RFC stands for “Request for Comments”. RFCs are standards published by the Internet Engineering Task Force (IETF). RFC 6265, published in April 2011, is the standard for cookies. Cookies have been around since the mid-90s, of course, so the date of April 2011 may seem strange. This simply reflects that RFCs are updated over time: RFC 6265 was the third RFC to specify cookies, after RFC 2109 and RFC 2965.
Types of Cookies
There are multiple types of cookies, including session cookies, persistent cookies, authentication cookies, third-party cookies, zombie cookies, super cookies, and tracking cookies. They all work fundamentally the same way. Several of these types are not described in detail here in the interests of brevity.
Session vs. Persistent. A session cookie is designed to be used only while the browser session continues; the cookie is destroyed when the session ends. Persistent cookies are stored on the user’s computer and are sent again when the same user uses the same browser on the same computer to access the same web site they accessed during the original session.
Tracking Cookies. A “tracking cookie” is typically used by an organization to keep track of information that allows them to learn a little more about you. For instance, the organization will know how often you have visited their web site over a period of time. This is notable because it is more common than typically understood. In fact, the 50 web sites with the most traffic in the United States each store an average of 64 pieces of tracking information.
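In practice, the difference between the two shows up in the Set-Cookie header: a cookie with neither an Expires nor a Max-Age attribute is a session cookie. The following is a minimal sketch of that check using Python's standard http.cookies module; the function name is_persistent and the cookie name “theme” are assumptions made for illustration.

```python
from http.cookies import SimpleCookie


def is_persistent(set_cookie_value: str) -> bool:
    """Return True if a Set-Cookie value carries Expires or Max-Age."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # The header value describes exactly one cookie; take it.
    morsel = next(iter(jar.values()))
    return bool(morsel["expires"] or morsel["max-age"])


# No lifetime attribute: the browser drops it when the session ends.
print(is_persistent("theme=dark"))
# Max-Age of 30 days (in seconds): the browser keeps it across sessions.
print(is_persistent("theme=dark; Max-Age=2592000"))
```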
Encryption. Cookies may be encrypted or unencrypted.
What is Typically Stored in Cookies?
The web server decides. The web server application decides what should be set in various cookies. There are few restrictions other than size: browsers typically limit each cookie to about 4096 bytes. The original intent of cookies was to store shopping carts, but modern shopping cart implementations store the shopping cart in a database. Cookies were originally also used to store things like credit card numbers, passwords, and other sensitive identity information. While there is nothing preventing that, it is rare in 2018 because of the obvious security implications. The main things that are stored in cookies as of 2018 are “session identifiers” and personalization information.
Session Cookies. A session cookie is simply a unique identifier. This unique identifier is passed to the server. The server manages the rest of the state, but needs the unique session id to distinguish the state of one session from another session.
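The division of labor described above can be sketched as follows. This is only an illustration under stated assumptions: the in-memory dictionary SESSIONS, the helper names create_session and get_state, and the “cart” field are all hypothetical, and a real application would use a database or cache instead of a dictionary.

```python
import secrets
from typing import Dict, Optional

# Hypothetical in-memory store: session id -> server-side state.
SESSIONS: Dict[str, dict] = {}


def create_session() -> str:
    """Create server-side state and return the id that goes in the cookie."""
    session_id = secrets.token_urlsafe(32)  # unguessable random identifier
    SESSIONS[session_id] = {"cart": []}
    return session_id


def get_state(session_id: str) -> Optional[dict]:
    """Look up the state for the id received in the Cookie header."""
    return SESSIONS.get(session_id)


alice = create_session()
bob = create_session()
get_state(alice)["cart"].append("book-123")

print(get_state(alice))  # Alice's cart holds the book
print(get_state(bob))    # Bob's cart is unaffected
```

Only the identifier travels in the cookie; the cart itself never leaves the server.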
Personalization Information. Non-sensitive personalization information is often stored in cookies. This is not personal identity information, but simple information like web site color and layout preferences.
Working with Cookies
The server can set a cookie with the “Set-Cookie” HTTP header. The browser/client transmits cookies back to the server with the “Cookie” HTTP header.
Step 1: Request a web page from the server. In the first example below, we are simply requesting a web page. There are no Set-Cookie or Cookie HTTP headers at this point.
Step 2: Get a response from the server. In the response to the request from Step #1, the server is including two Set-Cookie headers. This instructs the browser to attach those cookies to subsequent requests that are sent to the server.
Step 3: Subsequent requests from the client. For the rest of the session (or for later sessions also if the cookie is persistent), the browser will use the “Cookie” HTTP header to send the same information back to the server that the server had instructed it to send.
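The three steps can be sketched with Python's standard http.cookies module. The host www.example.com, the paths, and the cookie names and values (session_id, theme) are assumptions chosen for illustration, and the “browser” here is just a second SimpleCookie object standing in for a real cookie store.

```python
from http.cookies import SimpleCookie

# Step 1: the first request carries no Cookie header.
first_request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# Step 2: the server answers with two Set-Cookie headers.
server_cookies = SimpleCookie()
server_cookies["session_id"] = "abc123"
server_cookies["theme"] = "dark"
response_headers = server_cookies.output()  # one "Set-Cookie: ..." line per cookie

# Step 3: the browser stores both cookies and echoes them back
# in a single Cookie header on the next request.
browser_jar = SimpleCookie()
for line in response_headers.split("\r\n"):
    browser_jar.load(line.split(":", 1)[1].strip())

cookie_header = "; ".join(
    f"{name}={morsel.value}" for name, morsel in browser_jar.items()
)
next_request = (
    "GET /cart HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    f"Cookie: {cookie_header}\r\n"
    "\r\n"
)

print(first_request)
print(response_headers)
print(next_request)
```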
Cookies can have “attributes” as well. The attributes look just like the key-value pairs for other cookies that are set, but there are six attributes that have special meaning.
Expires and Max-Age. As you can see in the example below, there is a key/value pair with the “Expires” key. This tells the browser when the cookie expires. We could also use the “Max-Age” attribute to indicate how long the cookie should be allowed to exist.
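A minimal sketch of a header that carries both attributes, built with Python's standard library. The helper name persistent_cookie, the cookie name “theme”, the fixed start date, and the 30-day lifetime are assumptions made so the example is reproducible.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from http.cookies import SimpleCookie


def persistent_cookie(name: str, value: str,
                      lifetime: timedelta, now: datetime) -> str:
    """Build a Set-Cookie header with both Expires and Max-Age."""
    cookie = SimpleCookie()
    cookie[name] = value
    # Expires is an absolute HTTP date in GMT.
    cookie[name]["expires"] = format_datetime(now + lifetime, usegmt=True)
    # Max-Age is a relative lifetime in seconds.
    cookie[name]["max-age"] = int(lifetime.total_seconds())
    return cookie.output()


start = datetime(2025, 1, 1, tzinfo=timezone.utc)
header = persistent_cookie("theme", "dark", timedelta(days=30), start)
print(header)
```

Expires names a point in time, while Max-Age counts seconds from the moment the browser receives the cookie, so Max-Age does not depend on the client's clock agreeing with the server's.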
Domain and Path. In the example below, the “Domain” and “Path” attributes are set. Cookies are typically used only for a specific “domain” – basically a single web site. If the Domain attribute is not set, the cookie is assumed to apply only to the web site that the browser is currently communicating with.
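A “third-party” cookie is one that belongs to a domain other than the web site the user is visiting. It is typically set when a page embeds content, such as an ad or a tracking script, that is served from another domain. Note that a server cannot use the Domain attribute to set a cookie for an unrelated domain; browsers reject a cookie whose Domain does not cover the host that sent it.
The path attribute limits the use of the cookie to a subset of the web site that set the cookie. In the example below, we have set the Path to /accounts. The browser will then send the cookie only with requests for /accounts and the paths beneath it.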
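The sketch below sets both attributes and then approximates how a browser decides whether to send the cookie. The matching functions follow the domain-match and path-match rules of RFC 6265 in simplified form (IP-address hosts are not handled, for instance); the names domain_matches and path_matches, the cookie name account_view, and the domain example.com are assumptions.

```python
from http.cookies import SimpleCookie

# A cookie scoped to example.com and to the /accounts part of the site.
cookie = SimpleCookie()
cookie["account_view"] = "summary"
cookie["account_view"]["domain"] = "example.com"
cookie["account_view"]["path"] = "/accounts"
header = cookie.output()
print(header)


def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if the host equals the cookie domain or is a subdomain of it."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if the request path is the cookie path or lies beneath it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a "/" boundary, so /accountsfoo does not match.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(path_matches("/accounts/settings", "/accounts"))   # beneath the path
print(path_matches("/accountsfoo", "/accounts"))         # different path
print(domain_matches("www.example.com", "example.com"))  # subdomain
```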
The simple fact that information is being stored on a client system suggests correctly that cookies can pose a bit of a security problem. The “Secure” attribute tells the browser to send the cookie only over HTTPS, and the “HttpOnly” attribute hides the cookie from scripts running in the page. Both help, but they are enforced by the browser, so they offer no protection against a client that makes requests programmatically outside of the browser.
You will typically want to encrypt cookies; doing so lets you be reasonably confident that you have mitigated the main concerns about cookies.
No matter what you do, you can only mitigate the danger around cookies and never eliminate it entirely from a theoretical perspective. While beyond the scope of this post, cookies are susceptible to several types of attacks, including “man-in-the-middle” and “cross-site scripting” attacks.
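A minimal sketch of setting both attributes with Python's standard http.cookies module. The cookie name session_id and its value are assumptions for illustration.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"
cookie["session_id"]["path"] = "/"
cookie["session_id"]["secure"] = True    # send only over HTTPS
cookie["session_id"]["httponly"] = True  # hide from scripts in the page

header = cookie.output()
print(header)
```

Both attributes are flags with no value. They are instructions to the browser, which is why a client that builds its own requests is free to ignore them.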
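Python's standard library does not include a symmetric cipher, so the sketch below shows a related but different technique: signing the cookie value with an HMAC so that the server can detect tampering. This is not encryption, and the value remains readable by the client. The key SECRET_KEY and the helper names sign and verify are hypothetical.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical key; a real one would be random and kept on the server only.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign(value: str) -> str:
    """Append an HMAC-SHA256 tag to the value before putting it in a cookie."""
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(signed: str) -> Optional[str]:
    """Return the original value if the tag is valid, otherwise None."""
    value, _, mac = signed.rpartition(".")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return value
    return None


signed = sign("user=alice")
tampered = signed.replace("alice", "admin")

print(verify(signed))    # the original value comes back
print(verify(tampered))  # the modified cookie is rejected
```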
(1) State management. Cookies are vital for managing state between multiple web requests.
(2) Types of cookies. There are other types, but you should at least understand the difference between session cookies and persistent cookies.
(3) What should be in a cookie? Besides non-sensitive personalization information, you generally just want a unique session id in the cookie: just enough so the web server can correlate its state information with the client state, which is the cookie.
(5) Cookie security. Cookies are very vulnerable to security attacks, so don’t store sensitive information in cookies, consider security attributes, and consider encryption.
(6) Expiration. You always want to consider when your cookie should expire. If you set neither Expires nor Max-Age, you have created a session cookie rather than a persistent cookie. If you set the cookie to expire in the past, the browser deletes it.
(7) RFC 6265. This “Request for Comments” is the cookie specification and it is worth at least a brief look.