Streaming files with HTTP.jl

Panagiotis Georgakopoulos
download files http httpjl javascript js julia upload

HTTP.jl is the de facto library for HTTP in Julia, both for making requests and for implementing HTTP servers and WebSocket connections. HTTP.jl is fairly low level, in the sense that it abstracts away fewer parts of HTTP than many users coming from other languages are accustomed to. Additionally, there are two different ways to launch a server, HTTP.serve! and HTTP.listen!, and each takes slightly different arguments and slightly different handlers which, of course, are not compatible with each other.

The scenario we're going to investigate.

Let's say your Julia script is running a big analysis. That analysis produces a file that is quite large, and you want to download it. You don't have (or don't want to pay for) enough RAM on the server to load the whole file; you only want to send it over. Let's add to the mix that, in order to run this analysis, you also need to send a file. You also don't want that file to live in the server's memory, ever. Just save it, please!

The HTTP.jl server

using HTTP: HTTP

router = HTTP.Router()
HTTP.register!(router, "GET", "/**", stream_download)  # We're going to write this in a bit!
HTTP.register!(router, "POST", "/**", stream_upload)  # We're going to write this in a bit!
HTTP.serve!(router, "0.0.0.0", 1235; stream=true)  # stream=true is important!

This server will now accept requests on the /** endpoint. The ** matches the rest of the path, whatever it is, so the handler can find the file the user wants. We asked HTTP.serve! not to read the whole request body up front by saying stream=true.

This means that a GET /file.html request will trigger stream_download with /file.html as the (HTTP.Request).target, and a POST /folder/file.zip request will trigger stream_upload with /folder/file.zip as the (HTTP.Request).target.

HTTP.serve!(...args; stream=true)

According to the docs,

Where f is a Handler function, typically of the form f(::Request) -> Response, but can also operate directly on an HTTP.Stream of the form f(::Stream) -> Nothing while also passing stream=true to the keyword arguments.

We're going to use the second case, where we'll be operating on the stream.

A mental model for streaming

Not streaming is the equivalent of taking a film 🎞️, unrolling it completely and laying it on the floor.

Streaming, on the other hand, is pretty much exactly what an old cinema projector 📽️ or a videocassette 📼 does: you get two reels, you unroll one image (a chunk of bytes), you show it on the cinema wall (save it for an upload, send it for a download) and then you carefully roll it onto the second reel (reuse/free the memory).

At any given time, only a small amount of data is "unrolled", so the memory requirements are constant.
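To make the two-reel picture concrete, here is a minimal sketch in plain Julia, with no HTTP.jl involved. The helper name copy_in_chunks and the chunk size are assumptions made up for this illustration; Base's own write(to::IO, from::IO), which the handlers below rely on, runs a similar loop for you.

```julia
# Copy `src` to `dst` through one fixed-size buffer, so memory use does not
# depend on how much data flows through. `copy_in_chunks` is a hypothetical helper.
function copy_in_chunks(dst::IO, src::IO; chunk_size::Int=64 * 1024)
    buffer = Vector{UInt8}(undef, chunk_size)     # the only "unrolled" part of the film
    total = 0
    while !eof(src)
        n = readbytes!(src, buffer, chunk_size)   # read at most chunk_size bytes
        total += Int(write(dst, view(buffer, 1:n)))  # write exactly the bytes we read
    end
    return total
end

# Usage: push 1 MB through a 4 KB buffer.
data = rand(UInt8, 1_000_000)
src = IOBuffer(data)
dst = IOBuffer()
copied = copy_in_chunks(dst, src; chunk_size=4096)
```

However large the source is, the process only ever holds chunk_size bytes of it in the buffer.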

Streaming download handler (server -> user)

So, what will our handler look like?

function stream_download(http::HTTP.Stream)::Nothing
  request::HTTP.Request = http.message  # This includes `method`, `target`, `headers` etc, but not body.
  filepath = request.target  # Let's assume that's 1-1 with your filesystem - don't do this IRL without checking!
  # TODO: Check file exists etc.
  # Now we know that we want to download a file, and which file to download. 

  # In the lines below, we're helping the client (browser or other)
  # to understand what is coming.
  # Content-Type: application/octet-stream means "a file we don't know much about",
  # so it's a good generic option for downloads. You can find a better MIME type
  # for your use case, but this one is always a safe choice.
  # Content-Disposition tells the browser to download the file and what name to give it.
  # Content-Length is the size of the file in bytes. We get that from `filesize`.
  browserfriendlydownloadheaders = [
    "Content-Type" => "application/octet-stream",
    "Content-Disposition" => "attachment; filename=\"$(basename(filepath))\"",
    "Content-Length" => string(filesize(filepath)),
  ]
  request.response = HTTP.Response(200, browserfriendlydownloadheaders)

  HTTP.startwrite(http)  # "Open" stream for writing
  open(filepath) do io   # "Open" the file for reading!
    write(http, io)      # Start copying bytes from the file to the HTTP.Stream!
  end                    # The file is closed automatically here
  HTTP.closewrite(http)  # "Close" the stream to be nice
  return nothing         # We wrote to the stream and closed it, so there is nothing left to do
end

Note that we never have the whole file that lives at filepath in memory. We get "references" to both the file and the stream, and we start copying bytes from one to the other, keeping only "chunks" of the file in the Julia process at any given time. That means we can stream any file with a constant memory footprint (of course not a constant disk footprint 😉).
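The handler above warns against mapping request.target straight onto the filesystem. One possible check, sketched here with the standard library only, resolves the target under a fixed directory and rejects anything that escapes it. The names safe_path and root are hypothetical and not part of HTTP.jl. The sketch does not percent-decode the target and does not follow symlinks, so treat it as a starting point rather than a complete defence.

```julia
# Map a request target such as "/folder/file.zip" to a path under `root`,
# or return `nothing` if the target would escape `root`.
# `safe_path` and `root` are hypothetical names for this sketch.
function safe_path(root::AbstractString, target::AbstractString)
    base = abspath(root)
    # Drop any query string and the leading slashes, so the target is relative.
    relative = String(lstrip(first(split(target, '?')), '/'))
    candidate = normpath(joinpath(base, relative))  # resolves "." and ".." segments
    # Compare against `base` plus a trailing separator, so "/srv/files-secret"
    # is not mistaken for something inside "/srv/files".
    inside = candidate == base || startswith(candidate, joinpath(base, ""))
    return inside ? candidate : nothing
end

# Usage
root = mktempdir()
ok_path = safe_path(root, "/reports/../report.csv")  # stays inside root
escaped = safe_path(root, "/../etc/passwd")          # tries to climb out
```

In a handler, a nothing result would be a reason to answer with an error status instead of opening the file.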

Streaming upload handler (user -> server)

Can we apply the same trick when uploading? Of course we can!

function stream_upload(http::HTTP.Stream)::Nothing
  request::HTTP.Request = http.message  # This includes `method`, `target`, `headers` etc, but not body.
  filepath = request.target  # Let's assume that's 1-1 with your filesystem - don't do this IRL without checking!
  # TODO: Check that you're not overwriting another file
  # TODO: Check Content-Length to make sure you have enough disk space (!)
  # Maybe it's a good idea to reject uploads with unknown Content-Length,
  # especially if you _don't_ reverse proxy with NGINX or similar.
  # NGINX caps request bodies at 1 MB by default, so you're safe there.
  # This can be tweaked with `client_max_body_size 2G;` in the NGINX config

  # Please do some checks to `filepath` here!
  # We will write the stream's body to the file as-is!
  write(filepath, http)  # Yes, this was _that_ simple!

  # Now we need to tell the client that the file was uploaded ok!
  # All this is literally `HTTP.serve!(; stream=true)` tax
  request.response = HTTP.Response(200,
    ["Content-Type" => "text/plain"];
    body="ok")
  request.response.request = request
  HTTP.startwrite(http)
  write(http, request.response.body)
  HTTP.closewrite(http)
  # Tax ends here!
  return nothing  # We wrote to the stream and closed it, so there is nothing left to do
end
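The Content-Length TODO in the handler can be sketched as a small pure function. The name upload_allowed and the max_bytes limit are assumptions for illustration. Inside the handler, the header value could be read with something like HTTP.header(request, "Content-Length", ""), and a rejected upload would get an error response (413, for example) instead of being written to disk.

```julia
# Decide whether to accept an upload based on its Content-Length header value.
# `upload_allowed` and `max_bytes` are hypothetical names for this sketch.
function upload_allowed(content_length::AbstractString; max_bytes::Integer=2 * 1024^3)
    nbytes = tryparse(Int, strip(content_length))  # `nothing` for "" or non-numeric input
    nbytes === nothing && return false             # unknown length: reject
    return 0 <= nbytes <= max_bytes
end

# Usage
small_ok = upload_allowed("1024")
missing_header = upload_allowed("")
too_big = upload_allowed(string(3 * 1024^3))
```

Rejecting a missing header also rejects chunked uploads, which carry no Content-Length; whether that is acceptable depends on your clients.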

I don't know JavaScript. How do I send a file to my server?

First add an <input type="file"> and a submit button:

<input type="file" id="my-file-input"/>
<input type="submit" onclick="uploadFile(event)">

Then add a script with some javascript to handle it:

async function uploadFile(event) {
  event.preventDefault() // Will prevent "submitting" the form
  const input = document.querySelector("#my-file-input")  // Will give you the input tag
  const file = input.files[0]
  if (!file) {
    alert("No files!")
    return  // Nothing to upload
  }
  // Assumption: we upload under the file's own name. Use whatever path your server expects.
  const path = encodeURIComponent(file.name)

  try {
    const resp = await fetch(`https://my-julia-server-ip:1235/${path}`, {
      body: file,
      method: 'POST',
      headers: {
        'Content-Type': 'application/octet-stream',
        'Content-Disposition': `attachment; filename="${encodeURIComponent(file.name)}"`
      },
    })
    if (await resp.text() === "ok") {  // This is what we return from the "tax" code above!
      console.log("File uploaded successfully!")
    }
  } catch (e) {
    console.error("Something went wrong!", e)
  }
}

That was the tiny struggle of the day! As we demonstrated, HTTP.jl gives you all the magic power you need, if only you find the right spell to cast!

© JuliaHub