HTTP Networking Notes

date

Jul 21, 2024

slug

http-networking-note

status

Published

Why HTTP (Hypertext Transfer Protocol)

Before we jump into the nitty-gritty details of the protocol and how it works, it's important to understand why we should care.

HTTP is the most popular protocol for Internet communication. In fact, it powers websites.

You've probably noticed that every time you go to a website, you see that HTTP:// section of the URL. That's because every time you visit a website online, you're using the HTTP protocol to do it.

URLs are not specific to HTTP. If they were, we wouldn't need to prefix URLs with that HTTP://. The purpose of the prefix is to tell the computer that's making the request which protocol to use. There are plenty of others, like HTTPS, mailto, and postgres.

When you enter the URL of a website into your browser, you're making an HTTP request to a server, and that server is responding with the data that makes up their website. The images, text, HTML, CSS, all of that comes back via an HTTP response.

What is a protocol

Let's imagine that I write a note to you on a piece of paper that says “raise your hand”. And when you get the piece of paper, you read it and you raise your hand.

So the question is, upon reading the piece of paper, how did you know that I wanted you to raise your hand? It's because we have a protocol for communicating over text.

We've decided that these symbols that look like R-A-I-S-E form a word, which means to lift something up. And H-A-N-D means the thing at the end of my arm.

So a protocol is just a set of rules that I, as the person communicating, and you, as the person I'm communicating with, have agreed upon. And now that we've agreed upon that set of rules, we can communicate over the medium of a piece of paper.

A protocol on the Internet works in the same way. We decide upon a set of rules, and then two computers can both follow those rules and communicate with each other.

In the case of computers, instead of sending plain text on paper, we send strings of ones and zeros over a network, and the Hypertext Transfer Protocol defines what those ones and zeros mean. Because we have this protocol, this way of parsing the ones and zeros as useful information, I am able to send you a photo over the Internet and your phone is able to parse all that information and display the photo.
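To make the "ones and zeros" idea a bit more concrete, here is a minimal Python sketch. It assumes a made-up request for /photo.jpg on example.com, and the parse_request_line helper is purely illustrative, not a full HTTP parser. The point is only that both sides agree on how to read the same bytes.

```python
# A text-based HTTP/1.1 request, written as the raw bytes that travel over
# the network. The host and path are made-up values for illustration.
raw_request = (
    b"GET /photo.jpg HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"\r\n"
)


def parse_request_line(data: bytes) -> tuple[str, str, str]:
    """Split the first line of a request into method, path, and version."""
    first_line = data.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = first_line.split(" ")
    return method, path, version


# The first byte of the request, shown as literal ones and zeros.
first_byte_bits = format(raw_request[0], "08b")

print(first_byte_bits)
print(parse_request_line(raw_request))
```

Because the protocol says the first line is "method, space, path, space, version", the receiver can turn the raw bytes back into something meaningful.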

Client and Server

At the heart of HTTP is a simple request-response system.

When we make an HTTP request, the requesting computer (client) makes a request to another computer (server). The server processes that request and sends information back to us in a response. That's the entire HTTP life cycle. It's all based on request-response.

When we say client, we just mean the computer sending the request. When we say server, we mean the computer that's responding to the request. There's not a special type of computer. It's just who's doing the sending and who's doing the receiving in any given communication.

Even though any computer can be a server, the best servers are built for serving data. In fact, when we're developing our backend applications, even if we will eventually deploy them in a data center run by Google Cloud Platform, Amazon Web Services, or Azure, we're generally developing them on our local machines. So you can run server software on a laptop, but the computers in those data centers are heavily optimized to be servers. Their hardware is tuned for server-side workloads.

Unlike most applications that terminate when they're done doing what they're supposed to do, servers are generally always on. You turn on the server and then it just sits there on a computer listening for incoming HTTP requests. As they come in, it handles them and sends responses. You'd actually need to manually stop your server from running because they're not designed to shut down automatically.
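Here is a small sketch of that life cycle in Python, using only the standard library. Everything in it is an assumption for illustration: the HelloHandler class, the response text, and the choice to run the server in a background thread on the same machine so the example can also play the client. A real server would simply keep listening until someone stops it.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with the same plain text."""

    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet instead of logging each request


def fetch_once() -> tuple[int, bytes]:
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    # serve_forever() listens until shutdown() is called.
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The client side: send one request and read one response.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
        server.shutdown()      # servers do not stop on their own
        server.server_close()


status, body = fetch_once()
print(status, body)
```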

Generally speaking, clients exist on the front end of an application and servers exist on the backend of an application. The front end of an application is what the users see and directly interact with, which is typically a mobile app or a website. The backend is essentially everything else.

Let's take a look at the example of YouTube. The YouTube app exists locally on your phone and it contains all of the code that is required to render the visual YouTube interface. So we can consider that app and your phone as the front end, because it's the visual interface you're interacting with directly as a user.

However, your phone does not locally store every YouTube video and comment on the planet. So when you load a video, your phone actually needs to make an HTTP request to YouTube's backend. And there is a computer on YouTube's backend, probably in a data center somewhere, that's going to serve or handle that HTTP request.

In this case, your phone is the client because it's the one sending the HTTP request, and the backend computer, the one in the data center processing the request, is acting as the server.

Now let's pretend that all this server has locally is video data, but the client actually asked for videos and comments. So there's probably some other server within YouTube's backend that stores the comments locally.

This first server, that's trying to process the HTTP request that asks for videos and comments, is going to send its own HTTP request over to the other server.

In this case, the first server is actually now acting as a client because it's sending an HTTP request. It's totally okay for a computer to act as both a server and a client.

The other server grabs the comments that were asked for and sends them back to the first server in an HTTP response. The first server now has all the data it needs to respond to the client, and it can send a response back to the client.
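The flow above can be sketched with two tiny local servers. This is only a toy model under stated assumptions: the handler classes, the /watch and /comments paths, and the JSON payloads are all made up, and it says nothing about how YouTube's backend is really built. It just shows one process being a server for one request while being a client for another.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def http_get(port: int, path: str) -> bytes:
    """Act as a client: send one GET request to a local server."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()


class JSONHandler(BaseHTTPRequestHandler):
    """Shared plumbing: subclasses only decide what data to send back."""

    def build_payload(self) -> dict:
        raise NotImplementedError

    def do_GET(self):
        body = json.dumps(self.build_payload()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


class CommentsHandler(JSONHandler):
    def build_payload(self) -> dict:
        return {"comments": ["nice video"]}


class VideoHandler(JSONHandler):
    comments_port = 0  # set below, once the comments server has a port

    def build_payload(self) -> dict:
        # While serving this request, the video server becomes a client.
        comments = json.loads(http_get(self.comments_port, "/comments"))
        return {"video": "cats.mp4", **comments}


def start(handler) -> HTTPServer:
    server = HTTPServer(("127.0.0.1", 0), handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


comments_server = start(CommentsHandler)
VideoHandler.comments_port = comments_server.server_address[1]
video_server = start(VideoHandler)

# The "phone" only ever talks to the video server.
result = json.loads(http_get(video_server.server_address[1], "/watch"))
print(result)

for server in (video_server, comments_server):
    server.shutdown()
    server.server_close()
```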

At the end of the day, the word client and the word server only matter within the context of a single HTTP request.

DNS

DNS or the domain name system is essentially a giant phone book for the Internet. With a phone book, you could look up your friends by their name and find their phone number. With DNS, we can look up a website by its domain name and find its IP address.

So how does DNS work under the hood?

There's a not-for-profit organization called ICANN, which stands for the Internet Corporation for Assigned Names and Numbers. ICANN manages the domain name system for the entire Internet. It’s basically the publisher of the phone book and controls what goes into the phone book.

Whenever your computer has a domain name that it wants to find the IP address of, it asks a DNS resolver, which can start from one of the root name servers that ICANN coordinates (how the resolver knows where to find the root name servers is typically built into its configuration). From there, the lookup is referred through the distributed DNS system until a name server returns the IP address.

IP addresses

I'm over here on my computer, and I want to communicate with some server. Trouble is, we have all these other devices that are connected to the internet. So how can I tell the routing mechanisms of the internet that I want to communicate directly with that server?

That's where web addresses, more specifically, what we call IP addresses or Internet Protocol addresses come in.

There are many possible valid IP addresses. An IP address looks like this: 8.13.156.7. There are four sections separated by periods (.), and each section can be between 0 and 255. That's four numbers, and each number is one byte of information.

This is just one format of an IP address, and it's actually the more popular format. It's called IPv4. But there is a newer format that's being used called IPv6 or IP version 6.

In IPv6, each section is separated by a colon (:) rather than a period. There are 8 sections in an IPv6 address, and each section holds 16 bits instead of 8, written in hexadecimal. So there are way more possible IPv6 addresses than IPv4 addresses, and that's really important because we're actually running out of IPv4 addresses.

The same principle applies. The addresses are unique and uniquely identify machines on the internet, but we just have more options available to us. So as you're looking at IP addresses, just know that you'll see them in both formats.
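As a quick illustration, Python's standard ipaddress module can parse both formats. The IPv4 address is the one from above. The IPv6 address is an assumption picked for the example (it comes from the range reserved for documentation).

```python
import ipaddress

v4 = ipaddress.ip_address("8.13.156.7")
v6 = ipaddress.ip_address("2001:db8::1")  # illustrative documentation address

# IPv4: four sections, one byte each.
print(v4.version, list(v4.packed))

# IPv6: sixteen bytes, shown as 8 colon-separated sections when written out in full.
print(v6.version, len(v6.packed), v6.exploded)

# How many distinct addresses each format can represent.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
print(ipv4_total)
print(ipv6_total)
```

Roughly 4.3 billion IPv4 addresses sounds like a lot, but it is fewer than the number of devices online, which is why the larger IPv6 space matters.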

Just like I could give you my physical address and you could send me a package from anywhere in the world, I could give you my IP address and you could contact me over the internet from anywhere in the world.

Every device connected to the internet has its own unique IP address. In the example from before, where I'm trying to communicate with the server, all I need to know is its unique IP address. If I know that, I can communicate with it.

But if I want to contact Amazon's servers, it's not very convenient to have to know their IP address. I just want to navigate to amazon.com and have their servers give me a web page.

That's where DNS (Domain Name System) comes into play. One of its main purposes is to map human readable names like amazon.com to IP addresses. So I can type into my computer amazon.com, and the DNS is responsible for looking up the IP address that's associated with amazon.com and giving it back to me, so that I can make that request to the server that I'm looking to get to.

So there are essentially two steps every time we want to send an HTTP request to the server on a given domain name. The first step, which our computers do under the hood automatically for us, is to resolve DNS. What that means is we're resolving the domain name into an IP address. Step number two is to use the obtained IP address to actually make that request across the internet.
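Step one can be sketched in Python with the standard socket module, which hands the lookup to the operating system's resolver. The resolve helper is a made-up name for this example, and it uses localhost so that it works without a network connection. Swapping in a real domain name like amazon.com would trigger an actual DNS lookup.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Step one: ask the system resolver for a hostname's IP addresses."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr), and the
    # address is the first item of sockaddr for both IPv4 and IPv6.
    return sorted({sockaddr[0] for *_, sockaddr in results})


addresses = resolve("localhost")
print(addresses)
```

Step two is then just an ordinary HTTP request sent to one of the returned addresses. HTTP libraries and browsers normally do both steps for you when you hand them a URL.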

This has a side benefit. If Amazon wants to change their IP address, they can update this mapping without needing to change their domain name. So all the users can still navigate to amazon.com, but now they're actually going to a different server or a different IP address under the hood.

Domain names

A domain name is just one part of a URL. And it's specifically the part that we use to look up the IP address of a server.

For example, here we've got a full URL: https://en.wikipedia.org/wiki/Miniature_pig. The en.wikipedia.org part is the domain name. The rest of the URL is not part of the domain name, and we don't need it in order to look up the IP address of the server. All we need is the domain name.
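Python's standard urllib.parse module can pull the domain name out of that same URL. This is just a small sketch, and the variable names are my own.

```python
from urllib.parse import urlsplit

url = "https://en.wikipedia.org/wiki/Miniature_pig"
parts = urlsplit(url)

domain_name = parts.hostname  # the part used for the DNS lookup
rest = parts.path             # not needed to find the server's IP address

print(domain_name)
print(rest)
```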

Now let's run through an example of how you might deploy your own website to the Internet using IP addresses and domain names.

You need web server software and you need a website that you want to serve, which is basically a collection of HTML and CSS files.

Take that software and deploy it on a machine. Typically, what you would do today is deploy it on some cloud platform like Netlify, GCP, Azure or AWS. And when you do that, the cloud provider gives you the IP address of the machine where they deployed your code.

Once you have that IP address, you need to go buy a domain name from a provider of domain names. These providers are called registrars, for example Amazon, Google, or Namecheap. You buy xyz.com and tell them the IP address you want it to point to, and then they go and update the domain name system for you.

Subdomains

If we look at the root domain boot.dev, you'll see there are two parts. There's .dev, which is the TLD or top-level domain. There are quite a few different options, such as .dev, .com, .org and .net. And then boot is the domain name.

A subdomain actually prefixes a domain name. Subdomains can be used to break up the resources hosted on a domain name without having to go buy new domain names.

For example, boot.dev hosts our website, but api.boot.dev hosts our API separately, which is what the website uses to go fetch data, update user records and passwords, etc. We can also have a blog.boot.dev, which is a separate website that hosts our blog. So we paid for boot.dev, and now we own all of the subdomains that can be prefixed under boot.dev.
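Here is a deliberately naive Python sketch of how those pieces line up. The split_hostname helper is hypothetical, and it assumes the TLD is a single label like .dev or .com, so it would get suffixes such as .co.uk wrong.

```python
def split_hostname(hostname: str) -> dict[str, str]:
    """Naive split that assumes a single-label TLD such as .dev or .com."""
    labels = hostname.lower().split(".")
    if len(labels) < 2:
        raise ValueError(f"expected at least a domain and a TLD: {hostname!r}")
    return {
        "subdomain": ".".join(labels[:-2]),  # empty string when there is none
        "domain": labels[-2],
        "tld": labels[-1],
    }


for name in ("boot.dev", "api.boot.dev", "blog.boot.dev"):
    print(name, split_hostname(name))
```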

URIs and URLs

URI stands for Uniform Resource Identifier, and it's essentially a superset of URLs.

A URL is just one type of URI. Another common type of URI is a URN. URIs in general, and URNs in particular, can refer to things that aren't necessarily accessible via the Internet, like the ISBNs of books.

Generally speaking, when we're working with the Internet, we're working with URLs. Sometimes, you'll be asked to provide a URI. In that case you can just provide a URL, because a URL is a valid URI.

Sections of a URL

Take a look at this giant URL http://testuser:testpass@testdomain.com:8080/testpath?testsearch=testvalue#testhash. This URL is so large because every section is present. Typically, some sections are optional. The protocol and the domain are the only two sections that are completely required.
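As a rough sketch, Python's standard urlsplit function breaks that same URL into its sections. The dictionary keys are just labels chosen to match the names used in these notes.

```python
from urllib.parse import urlsplit

url = "http://testuser:testpass@testdomain.com:8080/testpath?testsearch=testvalue#testhash"
parts = urlsplit(url)

sections = {
    "protocol": parts.scheme,
    "username": parts.username,
    "password": parts.password,
    "hostname": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in sections.items():
    print(f"{name}: {value}")
```

Note that the separators themselves (://, @, ?, #) are not part of the values that come back.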

Protocol (http:)

The protocol is required. The computer needs to know how to take your message and transmit it in code. HTTP is like a language that computers use to speak to one another over a network. And so when we fetch a resource using a URL, we need to know what language or what protocol to use to do that.

HTTP is not the only scheme, and not all schemes end in that //. The // is there because an HTTP URL has an authority component, which is the part that holds the optional username:password, the hostname, and the optional port. A mailto URI has no authority component, and that's why it is written as mailto: without a //.

You may see email addresses in your browser that you're able to click and it will automatically open your email client. That's because your computer understands the mailto URI, and knows that that means to open an email client with a draft email ready to send to the email address in that URI.
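The difference shows up if we parse both kinds of URI with Python's urlsplit. The email address here is a made-up placeholder.

```python
from urllib.parse import urlsplit

web = urlsplit("http://testdomain.com/testpath")
mail = urlsplit("mailto:someone@example.com")  # placeholder address

# netloc is the authority component, the part that follows "//".
print(web.scheme, repr(web.netloc), repr(web.path))
print(mail.scheme, repr(mail.netloc), repr(mail.path))
```

For the mailto URI the authority is empty, and the email address ends up in the path instead.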

We can also use a different protocol, like postgres for a database connection. But because this URL is accessible via an HTTP call, it's prefixed with the HTTP protocol.

Username and Password (testuser:testpass)

These are the two different sections for username and password. The : is the separator.

Usernames and passwords are optional. If a resource doesn't need a username and password, it's considered public and doesn't require credentials to get access to.

You might use them a lot to access resources in code, like getting access to a database or a piece of infrastructure. But you'll very rarely see usernames and passwords in a URL when you're just browsing the internet. We typically use usernames and passwords on a website via form submission, not in the URL itself.

Hostname/Domain (testdomain.com)

A domain is required. There's no way for us to reach out to another server and get information unless we can use that domain name, resolve it into an IP address, and then make a request across a network.

The @ symbol is an optional delimiter that separates the username and password from the hostname. You only need it if you've included a username and password.

localhost is the domain of our own computer.

Port (8080)

This is the port that we'll be using to access the information on the server.

A port is not really optional, in the sense that we're always using a port to make an HTTP request. However, every protocol has a default port, and that default is used if we don't provide one.

For the HTTP protocol, 80 would be the default. And HTTPS, which is what you use for a secure HTTP call, defaults to port 443.
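A minimal sketch of that rule in Python, assuming we only care about the two protocols mentioned here. The effective_port helper and the DEFAULT_PORTS table are made up for the example, and any other scheme would raise a KeyError.

```python
from urllib.parse import urlsplit

# Only the two defaults mentioned in these notes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port if the URL has one, else the protocol default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://testdomain.com/testpath"))
print(effective_port("https://testdomain.com/testpath"))
print(effective_port("http://localhost:8080/"))
```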

When you're browsing the Internet in your browser, you don't normally see a port in the URL, because pretty much every website online uses the default ports. You'll only need to specify a port if for some reason you're deviating from the defaults. This is used a lot in local web development.

When you're building websites on your own machine and you want to access the test version of the site in the browser, you might just use different ports so that you can access many different test sites on your own machine, without having to get domain names for every single one of them.

Path (/testpath)

The path comes after a /. Very often in websites, the path is used to show different web pages. So depending on the path in the URL, you'll navigate to a different page.

The pages can be nested, so you can have multiple / in there with multiple sections of the path. In this case, we just have the one section of the path.

The path is optional. The default is just the root, which is /.

Search/Query parameters (?testsearch=testvalue)

The search starts with a ?. The question mark separates the rest of the URL from the query parameters. And then we have key value pairs.

There could be more key value pairs after the testsearch=testvalue, and they would be separated by &(ampersands).

A query is optional.
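Here is a short Python sketch of splitting a query into its key value pairs. The second pair, page=2, is an assumption added just to show the ampersand separator.

```python
from urllib.parse import parse_qsl, urlsplit

# page=2 is a made-up extra parameter.
url = "http://testdomain.com/testpath?testsearch=testvalue&page=2"

query = urlsplit(url).query  # everything after the "?"
pairs = parse_qsl(query)     # split on "&", then on "="

print(query)
print(pairs)
```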

Hash/Fragment (#testhash)

Very often in websites, the fragment is used to link to a specific section on a page. But it could be used for other things outside of the context of websites.

A fragment is optional.

Ports

Let's say I have my web server, and I've got some piece of software that's hosting my website. Some customers are sending HTTP requests to my server so they can get access to my website.

The trouble is, what if I want to host multiple pieces of software on the same physical machine?

For example, in addition to my website, I also want to host a database. Maybe it's just for my own purposes and isn't even public. I want to be able to get access to the database from my home computer.

So how can my computer know whether the incoming request should be handled by my web server or by my database?

That's where ports come in. Ports are virtual little hubs managed by the operating system that allow us to segment incoming requests and incoming data streams.

If I have port 5432 for my database and port 80 for my web server, I can direct all my web traffic at port 80 for my web server, and can access my database on port 5432. Those are pretty common ports for those use cases.

With ports, I can now actually run many different instances of different kinds of software, and network with them all at the same time. When I'm building a few different websites on my own machine, I can actually run all of those websites connected to different ports at the same time.

You can't bind two pieces of software to the same port, otherwise the operating system wouldn't know which piece of software should handle which incoming network request. But we can run many pieces of software all bound to different ports. And your operating system actually allows for over 65,000 different ports at the same time. So there are a ton of different ports that we can use when we are networking.
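We can watch the operating system enforce this with a small Python sketch. It assumes plain TCP sockets with default options on the local machine, and second_bind_fails is a made-up helper name.

```python
import socket


def second_bind_fails() -> bool:
    """Try to bind two sockets to the same address and port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError:
            return True  # the OS reports the address as already in use
        return False
    finally:
        first.close()
        second.close()


conflict = second_bind_fails()
print(conflict)
```

This is the same error you run into in local development when you start a second dev server on a port that another one is still using.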

If a port is not present in a URL, the default is used based on the protocol.

Paths

By convention, traditional file servers will typically route the path of a directory to a file called index.html in that directory. If I were to add index.html to the end of that path myself, I'd get the same thing.

By convention, many static file servers just use the path to the file on your disk as the path in the URL. So a URL path is just the same thing as a file system path once you've hooked up a web server. But it doesn't have to work that way.
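A minimal sketch of that convention in Python. Everything here is an assumption for illustration: the url_path_to_file helper, the /var/www root directory, and the shortcut of treating a trailing slash as "this is a directory" (a real file server would check the disk instead).

```python
import posixpath


def url_path_to_file(url_path: str, root: str = "/var/www") -> str:
    """Map a URL path to a file path the way many static servers do."""
    # Normalize away "." and ".." segments so the result stays under root.
    clean = posixpath.normpath("/" + url_path.lstrip("/"))
    # Directory-style paths are served from the index.html inside them.
    if url_path.endswith("/") or clean == "/":
        clean = posixpath.join(clean, "index.html")
    return root + clean


print(url_path_to_file("/"))
print(url_path_to_file("/blog/"))
print(url_path_to_file("/blog/index.html"))
print(url_path_to_file("/styles/main.css"))
```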

At the end of the day, the server software can do whatever it wants. We could completely change how we parse the path of a URL and what we want to respond with. Many back-end web frameworks work this way.

If you write a server in Go or with Django, there will often be a lot of custom logic that determines how different paths in a URL are handled, and which files and data are sent back when certain paths are accessed.

So while it's common for a URL path to map directly to a file on a file system when we're talking about websites, that's rarely the case when we're talking about web APIs. Essentially, the API is parsing the path and determining based on the path what kind of information we want, so that our front-end code can then render that information.
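To show what "the server can do whatever it wants" might look like, here is a toy router in Python. The route table, the handler functions, and the data they return are all hypothetical. Real frameworks offer far richer routing, but the idea is the same: the path is looked up in code, not on disk.

```python
def list_videos() -> dict:
    return {"videos": ["cats.mp4"]}


def list_comments() -> dict:
    return {"comments": ["nice video"]}


# The path is just a key that selects some code to run. No files involved.
ROUTES = {
    "/videos": list_videos,
    "/comments": list_comments,
}


def handle(path: str) -> tuple[int, dict]:
    """Return a (status code, data) pair for a request path."""
    handler = ROUTES.get(path.rstrip("/") or "/")
    if handler is None:
        return 404, {"error": "not found"}
    return 200, handler()


print(handle("/videos"))
print(handle("/comments/"))
print(handle("/index.html"))
```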

Query parameters

At the end of the day, the server can do whatever it wants with the path or the query parameters. But by convention, the path of an HTTP request typically changes something big, like the web page that you're looking at or the entire resource that you're requesting from the server, while a query parameter typically changes something small about the request, like some metadata, filtering options, or contents.

For example, the page usually depends on the path, but rarely changes when the query parameters change. If we go to google.com and search for “hello world”, we'll get a page with many different search results for the “hello world” query. Google is using /search as the results page, and then using the query parameter ?q=hello+world to specify which term is being searched for. So when we change the query parameter, the page doesn't change. We're still on the results page, but the search term we're getting results for does change.
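The same observation in a short Python sketch. The second URL, with a different search term, is made up for comparison.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

first = urlsplit("https://www.google.com/search?q=hello+world")
second = urlsplit("https://www.google.com/search?q=miniature+pig")  # made-up second search

# Same path, so the same results page...
same_page = first.path == second.path

# ...but a different search term in the query parameter.
first_term = parse_qs(first.query)["q"][0]
second_term = parse_qs(second.query)["q"][0]

# Going the other way: building the query string from a search term.
encoded = urlencode({"q": "hello world"})

print(same_page, first_term, second_term, encoded)
```

Notice that the + in the URL stands for the space in the search term, and parse_qs decodes it back.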

Additionally, query parameters are very often used for marketing and analytics information, like affiliate link tracking or commission tracking.