Klaus Elk Books

HTTP

Introduction

In Embedded Software for the IoT, I go into depth on the backbone of the internet - the TCP/IP protocols. However, I only touch briefly on the backbone of the web - the Hypertext Transfer Protocol - HTTP. We will study it a bit closer here. This will also bring us to HTTPS - the secure version. Along the way we will use tools like curl and Chrome with DevTools, and we will try out AJAX.

The Basics

When I taught the internet protocols at DTU - the Technical University of Denmark - I started with a simple telnet command.

You can enable telnet on Windows, but it is a pain to work with, whereas the Linux version is quite OK. Instead, I use PuTTY here - a terminal emulator. For the first basic drill I use my local XAMPP server, which is set up to allow basic HTTP - without the "s" at the end - because my production server demands HTTPS.

Below, I have asked PuTTY to connect to the local webserver - 127.0.0.1 port 80 - using a "raw" text protocol. I also configure PuTTY not to close the window on termination. When the terminal window opens, I have a TCP connection to the server, but nothing else has been sent or received. I then type the first two lines below in the terminal window. That is basically the simplest HTTP Request there is.

GET / HTTP/1.1
Host: klauselk.com

HTTP/1.1 200 OK
Date: Fri, 19 Sep 2025 21:09:38 GMT
Server: Apache/2.4.58 (Win64) OpenSSL/3.1.3 PHP/8.2.12
Content-Security-Policy: upgrade-insecure-requests;
X-Powered-By: PHP/8.2.12
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

3e0d
<!DOCTYPE html>
               <html lang="en">

                                   <head>

This is very similar to what happens in the browser if I type "127.0.0.1" in the browser's address line. The browser connects to that address (by default on port 80) and sends, as a minimum, the request line and the mandatory Host header.

Back to the terminal output above. First the "GET / HTTP/1.1" - stating that I want to GET the root page - the "/" - using HTTP version 1.1. Note that even though I have asked the terminal emulator to connect to 127.0.0.1:80, I ask for "Host: klauselk.com" in the second line. Normally I would need to do this even if I had originally asked PuTTY to connect to e.g. the live "klauselk.com", because the hostname is resolved to an IP address long before the server receives the HTTP Request - as we will see later - and we therefore need to tell the webserver which host it is asked to represent. In many cases a webserver hosts many websites.

After my two lines, I tell the server that I am done with my HTTP Request by hitting the return key twice - thus sending two CR-LF sequences. What follows is the webserver's HTTP Response: another set of HTTP headers, followed by my content - the HTML.
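The minimal request above can also be assembled programmatically. Here is a small Python sketch (purely illustrative - not code from the site) that builds the same two lines, with CR-LF line endings and the terminating blank line that the two return-key presses produce:

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request.

    Every line ends in CR-LF, and an extra CR-LF marks the
    end of the header section.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "",   # blank line terminates the request headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

# The request typed into the PuTTY session:
raw = build_get_request("klauselk.com")
print(raw)  # b'GET / HTTP/1.1\r\nHost: klauselk.com\r\n\r\n'
```

These bytes could then be written to a plain TCP socket - which is exactly what the terminal emulator does for us.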

Note that the headers line up neatly, because they all end with CR-LF - Carriage Return and Line Feed. The HTML, on the other hand, is clearly missing carriage returns - having only line feeds. This is because HTTP is specified to have both characters at the end of each line, whereas HTML doesn't care. I simply decided to write my PHP and HTML in Linux style. This gave me some issues with Windows and git, which are described in the Site Log. The lone "3e0d" before the HTML is not part of the content either - it is a chunk size in hex (0x3e0d = 15885 bytes), announced because of the "Transfer-Encoding: chunked" header.
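With chunked transfer encoding, each chunk of the body is prefixed by its size in hex on a line of its own, and a zero-size chunk terminates the body. A minimal Python decoder sketch (an illustration, not production code):

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked transfer-encoded body.

    Layout per chunk: <size in hex>CRLF<data>CRLF,
    terminated by a zero-size chunk.
    """
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # Chunk extensions after ';' are ignored for brevity
        size = int(body[pos:eol].split(b";")[0], 16)
        if size == 0:
            break                           # last chunk
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2              # skip data and its trailing CRLF
    return bytes(out)

sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(sample))  # b'Wikipedia'
```

Real HTTP libraries of course handle this transparently; the point is just to show how simple the framing is.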

Now the browser basically has two tasks: render the HTML, and fetch the elements it references - images, scripts, stylesheets and so on.

Using Chrome with DevTools

The figure below shows almost the same thing - an HTTP GET - now using Chrome with DevTools, with cache disabled, forcing all elements to be transferred. It is only almost the same, because Chrome in this case, of its own accord, actually connects using HTTPS - but that doesn't matter here.

I have selected the line that does the first GET from the "localhost" root - "index.php" - containing the figures and scripts that follow. On the right side I have chosen "Headers", where a "General" section describes what happens overall. Here we see that it is HTTPS and therefore port 443 instead of port 80. Before "443" we see "[::1]". This is the short notation for the IPv6 equivalent of the IPv4 127.0.0.1. Thus, this connection uses IPv6.

Below this we have the actual Response above the Request (strange order), and I have ticked "Raw", so we can see the resemblance with the terminal output. It is clear that the browser sends more headers than the two I used. We will dig more into headers in a moment. The expected request header containing "Host: localhost" is not the second line here, but it is there. Note that sometimes you do not see a "Host:" header, but instead ":authority" - its HTTP/2 equivalent. We also see that the browser afterwards requests all the referenced images - as well as some JavaScript.

DevTools Showing Raw Request-Response

Note that in the above figure, both Request and Response contain "Connection: Keep-Alive" (upper/lower case is irrelevant). Basically this is a handshake where the client (browser) says to the server: "By the way, I support keep-alive", and the server says: "Me too - it's a deal." In Embedded Software for the IoT, I describe how basic connections are created and maintained as TCP sockets. Suffice to say here that the operating systems at both ends spend a lot of CPU cycles on establishing and maintaining connections - and we spend time waiting on roundtrips as well as actual transmissions. If the physical distance is huge, the number of roundtrips can hurt speed more than the size of the payload, as roundtrips make everything happen serially. I look at the actual transmission time in Ethernet Tx Delay.

Anyway, in HTTP version 1.0 every request-response demanded a new connection, but with version 1.1 the parties can negotiate to use one connection for multiple request-responses. That is what we see here. HTTP is stateless - each element in the webpage can be retrieved independently of the others.
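The effect can be sketched with Python's standard library: a throwaway local server (names like EchoHandler are my own invention for this illustration) answers two requests arriving over one and the same TCP connection:

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from http.client import HTTPConnection

class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"            # 1.1 => keep-alive by default

    def do_GET(self):
        body = ("you asked for " + self.path).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # needed for keep-alive
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):            # keep the demo output clean
        pass

# Port 0 lets the OS pick a free port
server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One TCP connection, two request-responses
conn = HTTPConnection("127.0.0.1", server.server_address[1])
results = []
for path in ("/", "/about"):
    conn.request("GET", path)
    resp = conn.getresponse()
    results.append((resp.status, resp.read().decode()))
conn.close()
server.shutdown()
print(results)  # [(200, 'you asked for /'), (200, 'you asked for /about')]
```

With "Content-Length" set and HTTP/1.1 on both sides, the socket is simply reused - no new three-way handshake for the second request.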

This also means that it is normal for the browser to fire up multiple sockets in order to retrieve the embedded images, JavaScript etc. in parallel. As an example, if a webpage contains 80 pictures, we may see that first the containing page is fetched, then e.g. 9 more connections are spawned, and each of the now 10 connections retrieves 8 pictures. This can easily involve e.g. six load-balanced servers in parallel. These servers can act completely independently because HTTP is stateless. It is the web browser that keeps state - knowing how far it is at any given time in retrieving all the relevant elements of a page. This is much more scalable than e.g. FTP, where the server needs to be in sync with the client.

Let's take a step back and look at the full loading of a page. This is what we see in the figure below:

DevTools Showing all elements of a page

Now we see that the first request is the one we analyzed above - the index.php page. In the "Initiator" column we see that most of the following items are referenced in this page - we even see the line numbers. We also see in the "Type" column that even though some PNG files are requested, we get WebP files. This is discussed in detail on the Site Log page. We also see that all Requests are met with successful "200" Responses (204 is also success - but with no data).

In relation to the subject of this page - HTTP - the "Protocol" column is interesting. We see that most items are fetched with HTTP v1.1 - just like in the simple terminal example. One is a chrome-extension. This is not really fetched - it is pre-installed on my PC. Finally, there are two items fetched with "h3". This is HTTP version 3 (h2 also exists - not used here). It is a very fast update of HTTP that avoids the first "empty" roundtrip of the TCP handshake, because it runs on top of QUIC (over UDP). From a high altitude, however, it is still HTTP.

Introducing curl

Now that we have seen the basics, let's move on to a real-life tool - curl - short for "Client URL". When I wrote the following, I used the git bash shell that comes with a standard git installation on Windows. It has its own version of curl that behaves very Unix-like.

However, I also tried the "Command Prompt" or "DOS box". It uses a version from "c:\windows\system32" that delivers exactly the same results - at least on my PC - albeit with header names in bold. If, instead, I use PowerShell, "curl" is aliased to "Invoke-WebRequest". This is a tool that does some of the same things - but differently.

Now I use curl to address the production site. With the "-i" option, the HTTP headers are included in the output, while with "-I" I only get the headers. The "-v" means "verbose" as usual. So here it goes:

curl -v -I  127.0.0.1
*   Trying 127.0.0.1:80...
* Connected to 127.0.0.1 (127.0.0.1) port 80
> HEAD / HTTP/1.1
> Host: 127.0.0.1
> User-Agent: curl/8.8.0
> Accept: */*

< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Wed, 17 Sep 2025 14:16:44 GMT
Date: Wed, 17 Sep 2025 14:16:44 GMT
< Server: Apache/2.4.58 (Win64) OpenSSL/3.1.3 PHP/8.2.12
Server: Apache/2.4.58 (Win64) OpenSSL/3.1.3 PHP/8.2.12
< Content-Security-Policy: upgrade-insecure-requests;
Content-Security-Policy: upgrade-insecure-requests;
< X-Powered-By: PHP/8.2.12
X-Powered-By: PHP/8.2.12
< Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=UTF-8
<

* Connection #0 to host 127.0.0.1 left intact

In the above, curl uses "*" for what happens locally, ">" for the Request from the client and "<" for the server's Response. "HEAD" is an HTTP command asking the server to send only the headers (incl. Content-Length) it would have sent if we had used GET. Each received header appears twice because the verbose trace and the "-I" header output are written separately. Now, let's try the live site:

curl -v -I  klauselk.com
* Host klauselk.com:80 was resolved.
* IPv6: 2a02:2350:5:10e:8086:51cc:105c:c422
* IPv4: 46.30.215.48
*   Trying [2a02:2350:5:10e:8086:51cc:105c:c422]:80...
* Connected to klauselk.com (2a02:2350:5:10e:8086:51cc:105c:c422) port 80
> HEAD / HTTP/1.1
> Host: klauselk.com
> User-Agent: curl/8.8.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Date: Tue, 16 Sep 2025 16:43:15 GMT
Date: Tue, 16 Sep 2025 16:43:15 GMT
< Server: Apache
Server: Apache
< Content-Security-Policy: upgrade-insecure-requests;
Content-Security-Policy: upgrade-insecure-requests;
< Location: https://klauselk.com/
Location: https://klauselk.com/
< Cache-Control: max-age=86400
Cache-Control: max-age=86400
< Expires: Wed, 17 Sep 2025 16:43:15 GMT
Expires: Wed, 17 Sep 2025 16:43:15 GMT
< Content-Length: 229
Content-Length: 229
< Content-Type: text/html; charset=iso-8859-1
Content-Type: text/html; charset=iso-8859-1
< X-Varnish: 13677605596 12077534070
X-Varnish: 13677605596 12077534070
< Age: 78380
Age: 78380
< Via: 1.1 webcache2 (Varnish/trunk)
Via: 1.1 webcache2 (Varnish/trunk)
< Connection: keep-alive
Connection: keep-alive
<

* Connection #0 to host klauselk.com left intact

Now we are using the hostname instead of an IP address, and we see that via DNS it is "resolved" into both IPv4 and IPv6 addresses, with IPv6 tried first. We also see a lot of interesting headers - among them "Location", the caching headers "Cache-Control" and "Expires", and the "X-Varnish", "Age" and "Via" headers from the Varnish cache in front of the server.
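The resolution step that curl reports can be reproduced with Python's socket module. A small sketch (illustrative only - the addresses you get depend on your resolver) that lists the IPv4 and IPv6 addresses for a name, in the order the OS would try them:

```python
import socket

def resolve(host: str, port: int = 80):
    """Return (family, address) pairs DNS gives for a host."""
    pairs = []
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        pairs.append((label, sockaddr[0]))
    return pairs

# "localhost" resolves even without internet access
for label, addr in resolve("localhost"):
    print(label, addr)
```

Running it against a live hostname typically shows both an IPv6 and an IPv4 entry - exactly the two lines curl prints after "was resolved".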

Using Secure Connections - HTTPS

We saw above that the live site does not really answer the HTTP-based request - it just says: "Ask for the HTTPS version". This is what happens when I try exactly that:

curl  -I  https://klauselk.com
curl: (35) schannel: next InitializeSecurityContext failed: CRYPT_E_NO_REVOCATION_CHECK (0x80092012) - The revocation function was unable to check revocation for the certificate.

The problem above is that curl cannot check whether my webserver's certificate has been revoked. There are several fixes for this. The following worked for me:

$ curl  --ssl-revoke-best-effort -I  https://klauselk.com
HTTP/1.1 200 OK
Date: Wed, 17 Sep 2025 15:10:09 GMT
Server: Apache
X-Powered-By: PHP/8.4.10
Content-Security-Policy: upgrade-insecure-requests;
Cache-Control: max-age=86400
Expires: Thu, 18 Sep 2025 15:10:09 GMT
Vary: Accept-Encoding
Content-Type: text/html; charset=UTF-8
X-Varnish: 14030015721
Age: 0
Via: 1.1 webcache2 (Varnish/trunk)
Accept-Ranges: bytes
Connection: keep-alive

What we are looking for is the "200 OK" in the response. I also tried without the "-I" option and received the full HTML as expected.
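curl built against Schannel performs the revocation check that failed above; most other clients only verify the certificate chain and the hostname. As an illustration of how a client typically configures that verification, here is a Python ssl sketch (no connection is made - just the setup):

```python
import ssl

# The default context enforces chain validation and hostname checking -
# but, like most OpenSSL-based clients, it does not do revocation
# (CRL/OCSP) checks by default.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)   # True: chain must validate
print(ctx.check_hostname)                     # True: name must match cert

# Revocation checking via CRLs must be enabled explicitly, and the
# relevant CRLs must also be loaded into the context:
# ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
```

This is why the plain HTTPS request works in most tools out of the box, while Schannel-based curl insisted on the revocation check.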

Now we can see what really happens when the live site is accessed with HTTP and redirects to HTTPS:

curl -I -L -v --ssl-revoke-best-effort  http://klauselk.com
* Host klauselk.com:80 was resolved.
* IPv6: 2a02:2350:5:10e:8086:51cc:105c:c422
* IPv4: 46.30.215.48
*   Trying [2a02:2350:5:10e:8086:51cc:105c:c422]:80...
* Connected to klauselk.com (2a02:2350:5:10e:8086:51cc:105c:c422) port 80
> HEAD / HTTP/1.1
> Host: klauselk.com
> User-Agent: curl/8.8.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
HTTP/1.1 301 Moved Permanently
< Date: Fri, 19 Sep 2025 20:12:30 GMT
Date: Fri, 19 Sep 2025 20:12:30 GMT
< Server: Apache
Server: Apache
< Content-Security-Policy: upgrade-insecure-requests;
Content-Security-Policy: upgrade-insecure-requests;
< Location: https://klauselk.com/
Location: https://klauselk.com/
< Cache-Control: max-age=86400
Cache-Control: max-age=86400
< Expires: Sat, 20 Sep 2025 20:12:30 GMT
Expires: Sat, 20 Sep 2025 20:12:30 GMT
< Content-Length: 229
Content-Length: 229
< Content-Type: text/html; charset=iso-8859-1
Content-Type: text/html; charset=iso-8859-1
< X-Varnish: 19487856652 18916365505
X-Varnish: 19487856652 18916365505
< Age: 8202
Age: 8202
< Via: 1.1 webcache2 (Varnish/trunk)
Via: 1.1 webcache2 (Varnish/trunk)
< Connection: keep-alive
Connection: keep-alive
<

* Ignoring the response-body
* Connection #0 to host klauselk.com left intact
* Clear auth, redirects to port from 80 to 443
* Issue another request to this URL: 'https://klauselk.com/'
* Host klauselk.com:443 was resolved.
* IPv6: 2a02:2350:5:10e:8086:51cc:105c:c422
* IPv4: 46.30.215.48
*   Trying [2a02:2350:5:10e:8086:51cc:105c:c422]:443...
* Connected to klauselk.com (2a02:2350:5:10e:8086:51cc:105c:c422) port 443
* schannel: disabled automatic use of client certificate
* using HTTP/1.x
> HEAD / HTTP/1.1
> Host: klauselk.com
> User-Agent: curl/8.8.0
> Accept: */*
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< Date: Fri, 19 Sep 2025 22:29:12 GMT
Date: Fri, 19 Sep 2025 22:29:12 GMT
< Server: Apache
Server: Apache
< X-Powered-By: PHP/8.4.10
X-Powered-By: PHP/8.4.10
< Content-Security-Policy: upgrade-insecure-requests;
Content-Security-Policy: upgrade-insecure-requests;
< Cache-Control: max-age=86400
Cache-Control: max-age=86400
< Expires: Sat, 20 Sep 2025 22:29:12 GMT
Expires: Sat, 20 Sep 2025 22:29:12 GMT
< Vary: Accept-Encoding
Vary: Accept-Encoding
< Content-Type: text/html; charset=UTF-8
Content-Type: text/html; charset=UTF-8
< X-Varnish: 19150093645
X-Varnish: 19150093645
< Age: 0
Age: 0
< Via: 1.1 webcache2 (Varnish/trunk)
Via: 1.1 webcache2 (Varnish/trunk)
< Accept-Ranges: bytes
Accept-Ranges: bytes
< Connection: keep-alive
Connection: keep-alive
<

* Connection #1 to host klauselk.com left intact

The above shows how the HTTP request is met with a "301 Moved Permanently" response; curl then reads the "Location" header in the response and connects to that address instead. Note also that with HTTPS, port 443 is used instead of port 80 - as it should be.
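The redirect logic that "-L" enables is simple to sketch: read the status line, and on a 3xx pick up the Location header and reissue the request there. A minimal Python parser for that first step (illustration only):

```python
def redirect_target(raw_response: bytes):
    """Return the Location of a 3xx response, or None."""
    head = raw_response.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status = int(lines[0].split()[1])   # e.g. "HTTP/1.1 301 Moved Permanently"
    if 300 <= status < 400:
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "location":
                return value.strip()
    return None

response = (b"HTTP/1.1 301 Moved Permanently\r\n"
            b"Location: https://klauselk.com/\r\n"
            b"Content-Length: 229\r\n\r\n")
print(redirect_target(response))  # https://klauselk.com/
```

A real client would also cap the number of redirects it follows - curl's default limit is what keeps a misconfigured redirect loop from running forever.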

Other tools

I have used Wireshark a lot earlier. It also works great for HTTP tracing. However, if you are only working at the application level - as HTTP(S) is - you often don't need such heavy artillery. curl is fine. Another simple tool is Postman. It may be easier to use with POSTs and PUTs than curl. You find links to both in the right menu (bottom on phones).

There is also another command-line tool - "wget". It is a lot like curl, but can get a webpage including all the items it references. This is practical when you script full retrievals of pages.

Other HTTP commands

We have spent a long time dealing with GET - and its shorter version, HEAD. This makes a lot of sense because, for a static website, that is basically all there is. Even if there are also forms etc., it is the basic GET Requests we really need to be fast. But as you probably know - whenever there is a GET, there is also a PUT. The following are the most relevant HTTP commands:

HTTP     Usage                   SQL
POST     Creates information     INSERT
GET      Retrieves information   SELECT
PUT      Updates information     UPDATE
DELETE   Deletes information     DELETE

The above table originates from Embedded Software for the IoT. As you can see, I compare the four major HTTP commands with CRUD in databases - with the corresponding SQL commands thrown in. This is a warmup for introducing REST - more on this in Embedded Software for the IoT.
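As a concrete sketch of this mapping, here is a tiny in-memory "database" with an HTTP-verb dispatcher (purely hypothetical code, just to mirror the table):

```python
store = {}      # stands in for the database table
next_id = 1

def handle(method: str, key=None, data=None):
    """Dispatch an HTTP method to its CRUD operation."""
    global next_id
    if method == "POST":            # Create   (SQL: INSERT)
        key, next_id = next_id, next_id + 1
        store[key] = data
        return key
    if method == "GET":             # Retrieve (SQL: SELECT)
        return store.get(key)
    if method == "PUT":             # Update   (SQL: UPDATE)
        store[key] = data
        return key
    if method == "DELETE":          # Delete   (SQL: DELETE)
        return store.pop(key, None)

book_id = handle("POST", data="Embedded Software for the IoT")
print(handle("GET", book_id))   # Embedded Software for the IoT
```

A RESTful API is essentially this dispatch table exposed over HTTP, with the key carried in the URL.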

AJAX

Since this page is about HTTP, I decided to try out AJAX - Asynchronous Javascript And XML.

While I have worked a lot with HTTP via RESTful APIs and data transmission, I hadn't really tried out AJAX before. It seems that basic AJAX is not so much in fashion anymore - probably because several frameworks now build on and around it.

We have seen earlier on this page how a webpage is loaded - along with all its embedded items. If there is a form on the page and it is "submitted", this results in reloading the page - or loading another page. A good example is the small search form at the top of this site's menu (on the left on bigger screens).

Let's say that I type "TCP" into that menu-based search-form and hit enter. Now the form treats it like a link to GET the search.php page with the search string: "search.php?q=TCP" (You can "View page source" if you want to see this html-form).

Once on the new search-page, you may perform a new search. This results in a new link being followed - this time to the page itself - with the new search-string. This all happens very fast, because the search-page is rather small.

But what if we have a big web-based application where we want to show e.g. real-time measurements? Nobody wants the page to constantly reload. This is where AJAX comes in. The main tool here is XMLHttpRequest - aka XHR.

XMLHttpRequest is an object supported by all modern browsers, allowing you to stay on the current page while the browser does an HTTP Request for you in the background. The HTTP Response can be plain text, JSON, HTML or what have you. When this was first invented, it started the whole Web 2.0 wave that we now take for granted, where users interact with the webpage. Today many organisations prefer web-based applications over "normal" PC applications, because of the much simpler clients - with simpler installation and maintenance.

The "form" below is a revamped version of the search form mentioned above. Now it uses XMLHttpRequest and a different PHP file (uiless.php) that has no header, footer etc. It just spits out the text I need (here HTML).

Search

Search Results:

The code below is all the code needed for the AJAX demo on this page. First we see the "form", which is not a form but a "div" section with a label, a text field and a button. It started as a form, but took ages to debug, as the form single-handedly decided to reload the page when submitted - even without an "action" part. That was exactly the behavior I wanted to demonstrate avoiding. It is much more robust now.

When the button is pressed, the JavaScript is called - with the search-string as parameter.

Then there is the "span" with the search-result. This expands with the text when received.

Next we have the JavaScript, invoked by the button press. It instantiates the important XMLHttpRequest object and registers an anonymous ("lambda") function that handles the "onload" event. Then it opens the request and performs a normal HTTP Request, as we have seen earlier on this page.

<div>
    <label for="kword">Keyword:</label>
    <input type="text" id="kword" name="kword" >
    <button type="button" onclick="showSearch(document.getElementById('kword').value)">Search</button>
</div>

<p>Search Results: <br><br><span id="txtSearch"></span></p>


<script>
    function showSearch(str) {
        var xmlhttp = new XMLHttpRequest();

        // Runs when the response has been received
        xmlhttp.onload = function() {
            document.getElementById("txtSearch").innerHTML = this.responseText;
        };

        // Encode the search string, so characters like "&" survive the URL
        xmlhttp.open("GET", "private/uiless_search.php?q=" + encodeURIComponent(str), true);
        xmlhttp.send();
    }
</script>

The screenshot below shows the Chrome DevTools at a breakpoint in the code. Breakpoints are set by clicking at the relevant line - in the left margin - just as in many other debuggers. On the right we see the XMLHttpRequest object with its data - just as the response has been received. We see that the onload event has a function, and that status is 200 (OK). The readyState is 4, which verifies that we have received the complete response.

To debug XMLHttpRequest objects, this must be enabled via the setup (gear-wheel). Note also that I have selected to pause on exceptions. This is normally not needed, but it helped me realize that the original form was reloading the page. If you want to see more DevTools, I also use it in the Site Log.


Breakpoint on response

So this is what I learned from this drill:


© 2026 KlausElk.com & ElkTronic.dk