Because your web code is as clean as 8 year olds at day care. That’s why. I really should have invested in a hand sanitizer company back in April. I wonder how well they’re doing.
Anyway, as part of the continual blogging for my Open Source Development course, this week I’m going to discuss a PR I put together for our class project, Telescope. For our third assignment, we’re required to contribute to a repo of our choosing and also make some sort of contribution to Telescope.
Telescope is an open source web server and client application for aggregating and presenting a timeline of Seneca’s open source blogs. Telescope makes it easy to see what’s happening with open source at Seneca right now.
My JavaScript is near nonexistent but I really enjoy working with web technologies. Because of this, the week was hectic, confusing, yet also very exciting.
I’d really like to contribute towards the back end but my front end skills really need work. It was either this or contribute towards writing front end tests, which is on hold right now since we’re pivoting to next.js. Nonetheless, I decided to jump right into an issue that affected the User eXperience and, as a bonus, was partially based on something that really excites me: security! This issue revolves around how an image (specifically, a particular type of image) wasn’t being loaded, and that is exactly why I found it so interesting:
Telescope had scraped the original post, but not the <img> tag. What gives? Time to get my magnifying glass and deerstalker hat… like a nerdier Sherlock Holmes.
My first step was to explore Telescope and find out how it works, what makes it tick, maybe it enjoys long walks on the beach, or maybe it spends its free time dreaming of electric sheep ‘neath the clouds. I got a hint from the issue post on GitHub that the sanitizer used with the project’s parser was likely the culprit. I dug around until I found the sanitizer module, then spent some time reading about how it works. I also spent an embarrassingly long amount of time reading about HTML tags and attributes… it’s been a while. Finally I spent some time reading about how data: URIs work.
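For anyone else who is rusty: a data: URI packs the file itself into the URL, roughly in the shape data:[<mediatype>][;base64],<data>. Here is a minimal sketch of pulling one apart. The function name parseDataUri and the sample payload are mine for illustration, not anything from Telescope or the sanitizer.

```javascript
// Split a data: URI into its parts. Illustrative only; not Telescope code.
function parseDataUri(uri) {
  if (!uri.startsWith('data:')) {
    return null; // not a data: URI at all
  }
  const comma = uri.indexOf(',');
  if (comma === -1) {
    return null; // malformed: the comma between header and payload is required
  }
  const header = uri.slice('data:'.length, comma); // e.g. "image/png;base64"
  const payload = uri.slice(comma + 1); // the encoded file contents
  const parts = header.split(';');
  const isBase64 = parts[parts.length - 1] === 'base64';
  // When the media type is left out, plain text is the assumed default.
  const mediaType = parts[0] || 'text/plain';
  return { mediaType, isBase64, payload };
}

// Hypothetical, truncated payload just to show the shape of the result.
console.log(parseDataUri('data:image/png;base64,iVBORw0KGgo='));
```

The part that matters for this issue is the very front: as far as a URL scheme check is concerned, the scheme of such an image is data, not http or https.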
Once I oriented myself I spun up my local copy of the project, and got to work. My second step was to hunt down exactly what is and is not being accepted to the database when parsing a blog post. I decided that I should first determine if changing this file actually does anything, i.e. if I was even on the right track. I decided to (hilariously) tell the sanitizer to restrict all html tags.
Injecting a side note here: this is when I discovered that dev.to will block assets if hit enough times. You see, the entire time I was testing, I was making hits to a specific blog post of mine that had an image on it. I made my change and refreshed. And look! My image is gone! Great! Wait… wtf? Why are other people’s images still appearing then? Furthermore, why does my post still have tags? Turns out dev.to uh… blocked my image from loading (it wasn’t loading on the actual Telescope site either). Haha very funny guys. Just another thing I learned I guess. Anyway.
Each post should just be plaintext at this point right? Right! Well, no. I made a bunch of changes and turns out once a post is indexed to the database, it is how it is and forever shall be, sanitized tags et al. I noticed this behaviour when a conveniently timed post was indexed for the first time, and was just in plaintext. Perfect, I’m on the right track. Now I just have to reverse my changes and figure out how to unblock these types of images.
I went back to the sanitizer’s documentation and found what I needed:
allowedSchemesByTag: { img: ['data'] },
This simple one-line change allows img tags whose source uses the data scheme. So… how do I test that this works? I asked around and my wonderful professor suggested, in much nicer words, that I stop wasting my time and instead write a unit test for this fix to see if an image with a data scheme was being received as expected. Good idea! And I also get to finally write a test! Something I’ve been tortured with for the last 2 years is now my own power. And here it is:
(Sorry for the image, dev.to isn’t letting me post this code in a code block.)
This is essentially just saying “I want this line of code to look the same when it comes out the other end of the sanitizer.” And it did! I also made sure of this by modifying the test to see if it would break, and I made sure to reverse the changes made on the sanitizer to see if it would be blocked as expected, and it was! Success! Or… was it?
It was then that I realized the hubris and greed of my ways. Why on Earth was my new addition causing other tests to fail?? I stumbled over this issue for an hour or so until I noticed the problem. Maybe you’ll notice it faster than me:
allowedSchemesByTag: { img: ['data'] },
See that’s the thing about programming. Computers do exactly as instructed. I just hadn’t told it the right schemes to allow… sigh.
allowedSchemesByTag: { img: ['http', 'https', 'data'] },
Gee I wonder why all the images were being blocked. Hmm.
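To see why the first attempt blocked everything else, here is a toy model of the check. It assumes, based on the behaviour I ran into, that a per-tag list replaces the default list for that tag rather than adding to it. The function isSchemeAllowed, the default list, and the sample URLs are all hypothetical; this is a sketch, not the sanitizer’s actual code.

```javascript
// Simplified stand-in for a sanitizer's scheme check; NOT the real library code.
function isSchemeAllowed(tag, url, options) {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  if (!match) {
    return true; // no scheme (relative URL): nothing to check in this model
  }
  const scheme = match[1].toLowerCase();
  const byTag = options.allowedSchemesByTag || {};
  // Assumption: a per-tag list, when present, is used INSTEAD of the defaults.
  const allowed = Object.prototype.hasOwnProperty.call(byTag, tag)
    ? byTag[tag]
    : options.allowedSchemes || [];
  return allowed.includes(scheme);
}

const defaults = ['http', 'https', 'mailto']; // assumed default list
const firstAttempt = { allowedSchemes: defaults, allowedSchemesByTag: { img: ['data'] } };
const fixed = { allowedSchemes: defaults, allowedSchemesByTag: { img: ['http', 'https', 'data'] } };

const remote = 'https://example.com/cat.png';
const inline = 'data:image/png;base64,iVBORw0KGgo=';

console.log(isSchemeAllowed('img', remote, firstAttempt)); // false: regular images blocked
console.log(isSchemeAllowed('img', inline, firstAttempt)); // true
console.log(isSchemeAllowed('img', remote, fixed)); // true
console.log(isSchemeAllowed('img', inline, fixed)); // true
```

In this model, listing only data for img quietly drops http and https for that tag, which is the same symptom the failing tests were pointing at.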
With the fix in place, and my tests written, it was time to git rebase, push, and comment.
Here is the final PR for the Telescope portion of this assignment.
Overall I’m feeling really… well, okayish about this PR (and everything, really). I look around and see some really great PRs by other students. Why can’t I be that good? Why can’t I code this well? Well, in time maybe. Sucking at something is the first step towards not sucking at something. At least I always tell myself that. What a sucker I am.