Content Cleaning Service - Django & AWS Lambda

Tobias Abdon

Founder & Writer

Sat 17 Jun 2017

I recently built an application that allowed users to submit content. I wanted to make sure that whatever users sent in was safe before showing it to other users on the site.

I had a couple of choices. Since the backend was written in Django, I could have used a Python library to clean the content. However, the content coming from the frontend could include a wide range of tags, and the Python libraries I could find didn't support content like SVG.

After some investigation I came across DOMPurify, which supports cleaning a wide range of content and is actively developed. The only snag is that my application is written in Python, and DOMPurify is a JavaScript library.

There are a number of ways to handle this kind of situation. Since I was already using AWS, I opted to build a Lambda function that could accept any type of HTML content, clean it using DOMPurify, and return the cleaned version.
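To make the Django side of this concrete, here is a minimal sketch of how a view or model could hand content to the Lambda function and get the cleaned version back. It is not the code from my project: the function name `content-cleaner` and the `{"html": ...}` / `{"clean": ...}` payload shapes are assumptions made up for illustration. The `invoke` call and the `Payload` and `FunctionError` fields in its response are part of the boto3 Lambda client.

```python
import json


def clean_html(raw_html, lambda_client, function_name="content-cleaner"):
    """Send raw HTML to a sanitizing Lambda function and return the cleaned HTML.

    `lambda_client` is expected to behave like boto3.client("lambda").
    The function name and the JSON field names are hypothetical.
    """
    # Assumed request shape: {"html": "<user-submitted markup>"}
    payload = json.dumps({"html": raw_html}).encode("utf-8")

    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # synchronous: wait for the result
        Payload=payload,
    )

    # Lambda sets FunctionError when the function itself failed.
    # Fail closed: never fall back to the unsanitized input.
    if response.get("FunctionError"):
        raise RuntimeError("content cleaning Lambda reported an error")

    # Assumed response shape: {"clean": "<sanitized markup>"}
    body = json.loads(response["Payload"].read())
    return body["clean"]
```

In a Django project you would create the client once with `boto3.client("lambda")` and pass it in. Taking the client as an argument also makes the function easy to test with a stub, and raising on failure means unsafe content is never saved or displayed just because the cleaner was unavailable.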

This diagram shows how my solution works. You can click each component of the diagram to see the code that was used. Please explore and let me know if you have any questions!