Have you ever visited an online store, added a few items to your cart, then closed your browser, only to return later and find that your items are still there waiting for you? Or maybe you've logged into a website and noticed that you're already signed in, without having to enter your credentials again. All of these conveniences are made possible by a technology called cookies. In this article, we'll explore the fascinating world of HTTP cookies.

Introduction

"Cookies are a general mechanism which server side connections (such as CGI scripts) can use to both store and retrieve information on the client side of the connection. The addition of a simple, persistent, client-side state significantly extends the capabilities of Web-based client/server applications." Netscape Proposal (First Cookie proposal).

The following image, created by SecurityZines, offers a comprehensive yet intuitively explained overview of HTTP Cookies. In SecurityZines, they provided Concepts of Infosec extremely simplified by Rohit and Anshu.

The simplest definition of cookies is that they are an identifier generated by the server, which enables the server to recognize the user among other users. There are a few RFC documents for the standard formalization of cookies (RFC2109 in 1997 oldest)(RFC6265 in 2011 latest). The term ‘User-Agent’ is used in the RFC to refer to the browser. In the following picture I present a basic example of the use of cookie:

"...It remembers stateful information for the stateless HTTP protocol." Mozilla Web Docs.

The primary purpose of cookie management is to facilitate various functions on a website, such as session management (e.g. shopping cart), preferences (e.g. dark-light theme), and tracking user behavior.

Third-party resources

What happens if the webpage that you visit imports a third-party resource? The browser would request these resources from the third-party. If it is the first time you are requesting that endpoint, it could happen that the endpoint returns a Set-Cookie header. If it is not the first time, the browser will request the resource using the cookie for that third-party domain, if any. Let’s take the previous example, but this time imagine that it imports an image from a third-party:

As depicted in the preceding images, two keys have been incorporated into the HTTP protocol for cookies: set-cookie and cookie. If you prefer, you can read the RFC for the specification (link).

"Similarly, cookies for a given host are shared across all the ports on that host, even though the usual 'same-origin-policy' used by web browsers isolates content retrieved via different ports." RFC 6265.

As demonstrated in the picture, the Set-Cookie header is a mechanism used by the server to inform the user’s browser of a specific cookie that should be stored and sent in subsequent requests to the server. This header is included in the server’s response to the user’s initial request, and contains information about the cookie’s name, value, expiration time, path, and other optional attributes. Once the browser receives this header, it stores the cookie locally and sends it back to the server in all subsequent requests to the same domain. For further read of this header or any existing header I recommend (Mozilla Web Docs). The syntax of this header is the following one:

// Reference: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie
Set-Cookie: <cookie-name>=<cookie-value>
Set-Cookie: <cookie-name>=<cookie-value>; <option>
Set-Cookie: <cookie-name>=<cookie-value>; Domain=<domain-value>; Secure; HttpOnly

All of the set-cookie options are important but we are going to focus in some of them. If you want to check all the headers (Mozilla Docs). One important option is HttpOnly which disables any interacion with cookies from the javascript. Suppose you discover a Cross-Site Scripting (XSS) vulnerability in a webpage. This vulnerability grants you the ability to execute your JavaScript code on the client-side. If the HttpOnly header has not been set, you can extract the admin’s cookie if they access the URL with the XSS using document.cookie in javascript. You can then use the cookie to login as the admin. Warning: Setting the HttpOnly header doesn’t mean that the cookie will not be included in the request made by JavaScript (e.g., fetch). For this issue (CSRF attacks), SameSite header exist. This header defines if the header could be send from other origins. For this header, I recommend another post from Mozilla Web Docs.

"For example, if I set the cookie’s path to /siteA, then when I’m on /siteB, I can’t read the cookie from /siteA because the paths are different, so I can’t get it. But actually, it’s not necessarily true. If your website doesn’t block iframe embedding and the cookie isn’t set to HttpOnly, you can use an iframe to read cookies from different paths. This is because https://example.com/siteA and https://example.com/siteB are the same origin even though their paths are different. Therefore, you can directly access the document of other same-origin web pages through an iframe and use this feature to get document.cookie". Huli's blog..

Cookie header is the header where the cookie is send in the user requests, after been created (set-cookie or javascript). The syntax for this header is:

// Reference: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cookie
Cookie: <cookie-list>
Cookie: name=value
Cookie: name=value; name2=value2; name3=value3

JavaScript provides functionality for reading and writing cookies.

// You can try this logic using the console of the browser inspector

// Read
// Will return all the cookies (if HttpOnly is not enable)
let cookies = document.cookie;
console.log(cookies);

// Write
document.cookie = "user=babyyoda; path=/";"

Security

One fundamental keyword for restricting cookie usage exclusively to HTTPS environments (employing SSL) is Secure. Another frequently used keyword is HttpOnly, indicating that the cookie is inaccessible to JavaScript and exclusively utilized in HTTP connections. In addition to conventional safeguards against common threats such as XSS attacks, there is another recently added mitigation measure known as Same-site. For further information, I refer to Mozilla’s blog, Soheil’s Research Paper Wiki or this one done by Jub0bs.

For this subsection, I also recommend this blog entry by Ankur or this research called “Cookie Crumbles: Unveiling Web Session Integrity Vulnerabilities” (BlackHat video) by Pedro Adão and Marco Squarcina. And finally I would like also to mention this entry blog called “The limits of the same-origin policy: cross-origin (but same-site) attacks” by David Dworken.

Privacy issues

As the reader may notice, cookies can be used to track a user across different websites. Some of these issues are briefly mentioned in the RFCs. As example, one mitigation for third-party tracking is to block the use of third-party cookies. To further exploration of web-tracking topic.

Historical Notes

"This specification is an amalgam of Kristol's State-Info proposal and Netscape's Cookie proposal. "

HTTP Cookies were first proposed in 1994 by Netscape Communications Corporation as a way to maintain stateful sessions between web clients and servers. The original specification was called “Magic Cookies” and was later renamed to “Cookies”. The proposal was later submitted to the Internet Engineering Task Force (IETF), which is responsible for standardizing Internet protocols.

"Brian Behlendorf proposed a Session-ID header that would be user- agent-initiated and could be used by an origin server to track "clicktrails". It would not carry any origin-server-defined state, however. Phillip Hallam-Baker has proposed another client-defined session ID mechanism for similar purposes. While both session IDs and cookies can provide a way to sustain stateful sessions, their intended purpose is different, and, consequently, the privacy requirements for them are different. A user initiates session IDs to allow servers to track progress through them, or to distinguish multiple users on a shared machine. "

The first version of the HTTP Cookies specification was published in 1997 as RFC 2109, which defined the format of the cookie header that is used to set and send cookies between clients and servers. The original specification included several attributes that can be used to control the behavior of cookies, such as the cookie’s name, value, expiration time, and path.

In 1999, the specification was revised and published as RFC 2965, which added support for more advanced features such as secure cookies, which can only be sent over a secure HTTPS connection. In the case of HttpOnly flag, was added in 2011, by the RFC 6265

Thanks for reading

If you have any recommendation/mistake/feedback, feel free to reach me twitter :)

References:

Netscape Cookie Proposal.
HTTP State Management Mechanism: RFC 1997
HTTP State Management Mechanism: RFC 2011
Recommended: Mozilla Developers Docs: Using HTTP Cookies.
Recommended: Huli Blog: link to the blog.
Recommended: Cookie Bugs - by Ankur: link to the blog.
Recommended: The Cookie Monster in Your Browsers pptx - by @filedescriptor: link.
Recommended: Handling Cookies is a Minefield - by @April King: link.

Introduction

Third-party resources

Baking a Cookie

Set-Cookie

Cookie

Document.cookie (Javascript)

Security

Privacy issues

Historical Notes

Thanks for reading

References:

Other articles of the blog: