Cleansing HTML in Django Forms

Suppose our Django form should accept a few HTML tags while neutralizing everything else. Let’s take a quick look at how we can add some basic HTML cleaning to a Django ModelForm. We will be using Bleach to do all the dirty work.

from yourapp.models import YourModel
from django import forms
import bleach

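class YourForm(forms.ModelForm):
    class Meta:
        model = YourModel
        # Recent Django versions require an explicit field list on a ModelForm;
        # 'somefield' is the placeholder field cleaned further down.
        fields = ['somefield']

    def bleachData(self, data, whitelist=None):
        # Tags named in `whitelist` are kept; with no whitelist, no tags are allowed.
        allowed = whitelist if whitelist is not None else []
        clean_data = bleach.clean(data, tags=allowed)

        return clean_data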

    def clean_somefield(self):
        # Django calls clean_<fieldname>() automatically once the field's own validation has passed.
        somefield = self.cleaned_data['somefield']
        whitelist = ['b', 'i']
        somefield = self.bleachData(somefield, whitelist)

        return somefield

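    def clean(self):
        # By the time clean() runs, Django has already called clean_somefield()
        # and stored its result, so nothing needs to be re-run here.
        # Use this hook for validation that spans several fields.
        super(YourForm, self).clean()

        return self.cleaned_data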

As you can see, Django runs the field’s normal validation first and then calls clean_somefield, which bleaches ‘somefield’ while allowing a whitelist of two tags: bold (b) and italics (i).
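
One detail worth checking for yourself is what Bleach does with the tags that are not on the whitelist. By default bleach.clean escapes them, so they show up as literal text, and passing strip=True removes them instead. The snippet below is a minimal sketch of the difference, not part of the form above; the sample string and the ALLOWED_TAGS name are made up for illustration.

```python
import bleach

# Hypothetical sample input; the allowed tags mirror the whitelist used in the form.
ALLOWED_TAGS = {"b", "i"}
sample = "<b>bold</b> <span>plain</span>"

# Default behaviour: disallowed tags are escaped and appear as literal text.
escaped = bleach.clean(sample, tags=ALLOWED_TAGS)

# strip=True drops disallowed tags and keeps only their inner text.
stripped = bleach.clean(sample, tags=ALLOWED_TAGS, strip=True)

print(escaped)   # <b>bold</b> &lt;span&gt;plain&lt;/span&gt;
print(stripped)  # <b>bold</b> plain
```

If you would rather remove unwanted markup than display it escaped, strip=True could be passed through from the form's bleaching helper in the same way.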