Prevent Spoofing with Paperclip

Jon Yurek

Egor Homakov recently brought to my attention a slight problem with how Paperclip handles some content type validations. Namely, if an attacker puts an entire HTML page into the EXIF tag of a completely valid JPEG and names the file “gotcha.html”, they could potentially use it to mount an XSS attack against your users.

Now, this is kind of a convoluted means of attacking. It involves:

  • A server that’s running Paperclip configured to not validate content types or filenames
  • A front-end HTTP server that will serve the assets with a content type based on their file name
  • A user whom the attacker can get to load the crafted image directly (injecting it in an img tag is not enough)

Even with this list of requirements, it’s possible, and so we need to take it seriously.

Content Type Spoof Detection

To combat this, we’ve released Paperclip 4.0 (and then quickly released 4.1), which has a few new restrictions in order to improve out-of-the-box security. The change that handles this problem directly is an automatic validation that checks uploaded files for content type spoofing. That is, if you upload a JPEG and name it .html, it’s not going to get through. This happens automatically during the upload process, and uses the file command in order to determine the actual content type of the file. If you don’t have file already (for example, because you’re on Windows), you can install the file command separately.
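As a rough illustration of the idea (this is not Paperclip’s actual implementation), a check along these lines asks file for the MIME type and compares it with what the file name implies. The method names and the tiny extension table are assumptions made up for this sketch.

```ruby
require "open3"

# Hypothetical, deliberately tiny extension table for this sketch only.
EXTENSION_TYPES = {
  ".jpg"  => "image/jpeg",
  ".jpeg" => "image/jpeg",
  ".png"  => "image/png",
  ".html" => "text/html"
}.freeze

# Ask the `file` command what the bytes on disk look like.
# Requires `file` to be installed and on the PATH.
def detected_content_type(path)
  output, status = Open3.capture2("file", "--brief", "--mime-type", path)
  status.success? ? output.strip : nil
end

# True when the type implied by the name disagrees with the detected type.
# Extensions missing from the table have no expected type, so this sketch
# treats them as mismatches too.
def spoofed?(filename, detected_type)
  expected = EXTENSION_TYPES[File.extname(filename).downcase]
  expected != detected_type
end
```

With this sketch, a valid JPEG uploaded as “gotcha.html” would be flagged, because file reports image/jpeg while the name implies text/html.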

Required Content Type or Filename Validations

Next, we’re also turning on a new requirement: you must have a content type or filename validation, or you must explicitly opt out of it.

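# "User" is a placeholder; use whichever model has the attachment.
class User < ActiveRecord::Base
  has_attached_file :avatar

  # Use a content type validation, a filename validation,
  # or the explicit opt-out shown below.

  # Validate content type
  validates_attachment_content_type :avatar, :content_type => /\Aimage/

  # Validate filename
  validates_attachment_file_name :avatar, :matches => [/png\Z/, /jpe?g\Z/]

  # Explicitly do not validate
  do_not_validate_attachment_file_type :avatar
end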

Note that older versions of Paperclip are susceptible to this attack if you don’t have a content type validation. If you do have one, then you are protected against people crafting images to perform this type of attack.

The filename validation is new with 4.0.0. We know that some people don’t store the content types on their models, but still need a way to satisfy the new validation requirement. Checking the file name can help ensure you’re only getting the kinds of files you expect, and every Paperclip attachment has a file name. This will allow those users to upgrade without having to implement a possibly costly migration of that data into their database.
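To get a feel for which names the patterns in the validation example above would accept, here is a plain-Ruby approximation. The helper allowed_name? is hypothetical and only mimics the matching; it is not Paperclip’s own code.

```ruby
# The same patterns used in the validation example.
ALLOWED_NAMES = [/png\Z/, /jpe?g\Z/].freeze

# Hypothetical helper: true when any pattern matches the file name.
def allowed_name?(filename, patterns = ALLOWED_NAMES)
  patterns.any? { |pattern| pattern.match?(filename) }
end

allowed_name?("avatar.png")   # => true
allowed_name?("photo.jpeg")   # => true
allowed_name?("gotcha.html")  # => false
allowed_name?("AVATAR.PNG")   # => false, the patterns are case-sensitive
```

If upper-case extensions should pass as well, adding the i flag to each pattern (for example /png\Z/i) is one way to allow them.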

Content Type Mapping

Immediately, some users reported problems with the spoof detection added in 4.0. To fix this, we released 4.1, which added an option called :content_type_mappings. It lets you specify the content type for an extension that cannot otherwise be mapped. For example:

Paperclip.options[:content_type_mappings] = {
  :pem => "text/plain"
}

This will allow users to upload “.pem” files (public certificates for encryption), which file reports as “text/plain”. The mapping tells Paperclip “I consider a .pem file that file calls ‘text/plain’ to be correct”, and Paperclip will accept it.
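Conceptually, the mapping acts as an extra lookup consulted when the extension and the detected type would otherwise disagree. The sketch below shows that idea in plain Ruby; the constant and the mapping_allows? method are hypothetical stand-ins, not Paperclip internals.

```ruby
# Hypothetical stand-in for Paperclip.options[:content_type_mappings].
CONTENT_TYPE_MAPPINGS = { :pem => "text/plain" }.freeze

# True when the extension has an explicit mapping that matches
# the content type reported by `file`.
def mapping_allows?(filename, detected_type, mappings = CONTENT_TYPE_MAPPINGS)
  extension = File.extname(filename).sub(/\A\./, "").downcase
  return false if extension.empty?

  mappings[extension.to_sym] == detected_type
end

mapping_allows?("server.pem", "text/plain")  # => true
mapping_allows?("server.pem", "text/html")   # => false
```

Note that the mapping only vouches for the exact pairing you list: a “.pem” file that file identifies as anything other than “text/plain” would still be rejected in this sketch.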