How Heroku’s H12 birthed a website

This is the tale of a single application requirement absolutely messing up my sleep schedule and mental ability. The story of how one minor bug sent me down a rabbit hole of errors until I conquered my demons.

The stage is set: a user needs to download a PDF that’s stored on the server. Easy enough. I save that PDF in a location and deliver it to the user when asked. I send it from my Express server to the React client as a blob, and it works perfectly fine. Voilà, problem solved, end of story. Or so you’d hope.

Virgin blob delivery spinner

Here’s where it goes downhill. I now need to watermark the PDFs before they’re delivered to the user. Ten minutes of searching later, I couldn’t find a feasible way to watermark a PDF using JavaScript and Node. *sighs*. Time to look elsewhere.

Here’s where my genius comes in: what if I use Python? Use a library in Python to generate the file, save it, have Express pick it up as a blob and send it to the user. I didn’t want to go down that route because of all the possible points of failure, but after another hour of searching for ideas, it’s the one I ended up going with.

Call me masterchef because I make the best spaghetti around here. (r/ProgrammerHumor)
It got the job done, but it was spaghetti at its finest.

I looked up a few guides on what I wanted to do, and soon I had a Python file that generated PDFs watermarked with the user’s details, plus child_process spawning the Python instance and collecting the created file to deliver to the user. It worked like a charm, and when you spend a whole day implementing a solution, watching it work just as you imagined is as euphoric as it gets. Package it, clean it up a little, push it, and off to bed, never to be bothered by this problem again.

Oh, how I was wrong.

A few users complained that their files weren’t downloading. I went through the server logs, and their files had been successfully created and delivered. When I asked them to switch networks, almost all of them were able to download the files without a problem. A couple of users still had the issue, but it was a tiny set of users, and I couldn’t diagnose the problem since all the logs looked fine.

Until 3 weeks later.

(It wasn’t.)

The latest file upload was more than four times larger than any previous PDF, and a lot of customers began complaining that they couldn’t download it. It was the same problem, yet the logs still showed no error. Annoyed by this error and imagining a short and sweet fix, I set out on my Google adventure to figure out and fix this issue once and for all. It was obviously going to be a quick, easy solution, or so I thought.

Heroku popped up an H12 in my face. Finally, at least I had a lead! A diagnosis was close. I looked up the documentation and realized that Heroku had a 30-second limit on requests. If the request wasn’t completed in under 30 seconds, it was abandoned.

That explained everything! That explained every single user complaint about not receiving their files. The Python file generation was expensive, so not much of the 30 seconds was left for the transfer, and not everyone has internet speedy enough to download a ~10MB file in under 15 seconds. So the solution to the problem would be simple: raise the Heroku timeout limit!
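A back-of-the-envelope sketch of that arithmetic, taking the post's reading of the limit at face value (generation plus transfer must fit inside 30 seconds). The 15-second generation time is an assumption inferred from the figures above, and `min_download_speed_mbps` is a hypothetical helper, not anything from Heroku.

```python
def min_download_speed_mbps(file_mb: float, generation_s: float,
                            limit_s: float = 30.0) -> float:
    """Minimum sustained speed (megabits per second) a user needs so that
    generation plus transfer finishes inside the time limit."""
    transfer_window_s = limit_s - generation_s
    if transfer_window_s <= 0:
        raise ValueError("generation alone already exceeds the time limit")
    # 1 megabyte = 8 megabits
    return file_mb * 8 / transfer_window_s


if __name__ == "__main__":
    # Figures from the post: ~10 MB file, 30 s limit.
    # Assumed: roughly 15 s of that is spent generating the PDF.
    speed = min_download_speed_mbps(file_mb=10, generation_s=15)
    print(f"Need at least {speed:.2f} Mbit/s")  # 10 * 8 / 15 = 5.33
```

Anyone on a connection slower than that would never see the file, no matter how clean the server logs looked.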

why write long code when big timeout do trick

Except you cannot do that. If it were that straightforward, this website wouldn’t exist.

Heroku recommends moving long-running tasks to a background worker, both to stay under the H12 limit and to keep the experience seamless between requesting and receiving the file. Having no experience with implementing a system like that, I set out yet again, this time to understand Redis and Bull and refactor my code to use them.

The way I understood it (and the way I explained it to my tech-inept girlfriend) was that my system relied on a single worker, X. When a user requests a file, X leaves his station, creates that file, and gives it to the user. If some other user requests a file while X is busy, they’re turned away without a response. If the entire exchange takes over 30 seconds, the original user is left empty-handed too.

Node’s Bull package implements a queue, so I would have two workers, X and Y. When a user requests a file, X acknowledges the request and hands it to Y. If another user wants a file, X takes that request, adds it to Y’s queue, and goes back to waiting. As soon as Y is done with his first task, he returns the file to X and picks up the next task in his queue, while X delivers the file to the user who requested it. This system was miles better, and so impressive to me that I couldn’t help but try using it in the project.
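The X and Y hand-off can be sketched with nothing but Python's standard library. This is an in-process analogy, not Bull or Redis: `queue.Queue` stands in for the job queue, a thread plays Y, and `generate_file`, `worker_y`, and `serve` are hypothetical names made up for the illustration.

```python
import queue
import threading
import time


def generate_file(user: str) -> str:
    """Hypothetical stand-in for the slow watermarking step."""
    time.sleep(0.05)
    return f"{user}-watermarked.pdf"


def worker_y(jobs: queue.Queue, results: dict) -> None:
    """Y: pull one job at a time off the queue until told to stop."""
    while True:
        user = jobs.get()
        try:
            if user is None:  # sentinel value: no more work is coming
                return
            results[user] = generate_file(user)
        finally:
            jobs.task_done()


def serve(users: list) -> dict:
    """X: acknowledge every request immediately by enqueueing it."""
    jobs: queue.Queue = queue.Queue()
    results: dict = {}
    y = threading.Thread(target=worker_y, args=(jobs, results))
    y.start()
    for user in users:
        jobs.put(user)  # X hands the job to Y and is free again at once
    jobs.put(None)      # tell Y to stop once the queue is drained
    y.join()
    return results


if __name__ == "__main__":
    print(serve(["alice", "bob"]))
```

The point of the design is that `jobs.put` returns immediately, so X never blocks on file generation, and nobody gets turned away while Y is busy.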

I really should’ve investigated better.

A full day of programming ensued, but by the end of it, I had reworked my application’s request handling to enqueue a task, with a worker to process the queue. In Heroku’s documented example, the client sends a GET request at fixed intervals to ask for updates. That’s a straightforward way of receiving updates, but it seemed inefficient to me. I had worked with sockets before on a social networking project, and this felt like the perfect time to put that knowledge to use. After hacking together a quick socket implementation as well, I had resolved all of the major issues.
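The difference between the two update styles can be shown with a small in-process analogy. There is no HTTP or socket code here: a `threading.Event` stands in for the socket push, and `Job`, `poll`, and `wait_for_push` are hypothetical names for the sketch.

```python
import threading
import time


class Job:
    """Hypothetical in-memory job record, just for this illustration."""

    def __init__(self) -> None:
        self.done = threading.Event()

    def run(self, work_seconds: float) -> None:
        time.sleep(work_seconds)  # stand-in for generating the PDF
        self.done.set()           # the "push": wake anyone who is waiting


def poll(job: Job, interval: float = 0.02) -> int:
    """Polling: ask "done yet?" at fixed intervals; return how often we asked."""
    asks = 1
    while not job.done.is_set():
        time.sleep(interval)
        asks += 1
    return asks


def wait_for_push(job: Job, timeout: float = 5.0) -> bool:
    """Push: block once and get woken when the worker signals completion."""
    return job.done.wait(timeout)


if __name__ == "__main__":
    job = Job()
    threading.Thread(target=job.run, args=(0.2,)).start()
    print("times the poller had to ask:", poll(job))
    print("push received:", wait_for_push(job))
```

Polling costs one round trip per interval whether or not anything changed, while the push variant costs a single notification, which is the inefficiency that made sockets look attractive.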

But there was still one small thing bothering me.

Even if the file was generated in time, a download that took over 30 seconds would still be abandoned, and other users would have to wait until it was complete. It felt as if all the tough grunt work was done but the result was still only slightly better. I contemplated setting up S3 storage, but the setup time and cost put me off the idea. That’s when I thought of Cloudinary instead. It has a free tier, I had already implemented it for my social networking site, and I had an account set up as well. In under 30 minutes, I “wrote” (i.e. copied) the upload functions from my previous project and...

It was glorious.

Chad queue file generator and cloud uploader

This might be my favorite programming accomplishment yet, to the point that I bought a whole domain name just so I could forever remember this. The story of how I turned an inefficient, failure-prone system into a scalable and ideal solution. Programming a bug away truly is the best kind of high.

Makes you wonder: if the timeout had been 60 seconds instead of 30, I wouldn’t have spent an hour rambling about my own work.

2 thoughts on “How Heroku’s H12 birthed a website”

  1. Saairah Mehta says:

    i am tech inept, but i loved reading this so so much (even if it required twice the number of brain cells I generally use). cant wait to read more of you work!!!

  2. yaaju says:

    I had to read half of everything you wrote twice but it was insanely entertaining and so interesting. Not bad at all g
