23.07.2013
by Esa Turtiainen
tags: S3 MIME

I uploaded a complete HTML site to S3 using s3cmd and wondered why the CSS did not load at all and why, although most HTML files worked, some did not.

It turned out that the MIME types of the files were wrong.

A MIME type is a string that tells what type a file is. If a page has the MIME type "text/html", it is rendered as an HTML document. If it is not "text/html", it is usually shown as plain text. The browser does not use the filename extension in the URL (like .html or .css) to decide what the file is. It trusts the MIME type given in the Content-Type header of the web server's response.

In my case, the MIME types of the files on S3 were totally random.

Usually the web server decides the MIME type based on the filename extension. If the file ends in .html, it is reported as MIME type "text/html".
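As a rough illustration of extension-based detection, here is a minimal sketch using Python's standard mimetypes module. The helper name guess_mime_type and the fallback value are my own choices for the example, not something a particular web server uses.

```python
import mimetypes


def guess_mime_type(filename, default="application/octet-stream"):
    """Guess a MIME type from the filename extension alone.

    The file content is never read; only the extension matters.
    `default` is a hypothetical fallback for unknown extensions.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type or default


if __name__ == "__main__":
    for name in ("index.html", "about.htm", "style.css", "README"):
        print(name, "->", guess_mime_type(name))
```

The lookup is a plain table from extension to type, which is why it is predictable for ordinary web files.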

It seems that S3 does not make this kind of decision. Every file in S3 has a MIME type attribute that is used as-is when the file is served over HTTP.
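One way to see which MIME type is actually served for a file is to send a HEAD request and read the Content-Type header. The sketch below uses only the Python standard library. The function name served_content_type is made up for this example, and any URL you pass in is your own bucket address, not something from this post.

```python
import urllib.request


def served_content_type(url, timeout=10):
    """Return the MIME type a server reports for `url`.

    Sends a HEAD request, so the file body is not downloaded.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # get_content_type() strips parameters such as "; charset=utf-8"
        # and falls back to "text/plain" if the header is missing.
        return response.headers.get_content_type()
```

Calling it for a stylesheet URL should give "text/css" if the upload stored the right type. Anything else explains why the browser ignores the file.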

This means that the decision is left to s3cmd when the file is uploaded. s3cmd sync (or s3cmd put) tries to be smart and checks the MIME type using the python-magic library. This library uses an elaborate rule database to deduce from the content of the file what type it is.

At least in Ubuntu 12.04 through 13.10, python-magic gives s3cmd totally random MIME types.

The nice thing is that this is easy to fix. If you remove the python-magic package, s3cmd falls back to simple deduction of the MIME type based on the filename extension (.html, .htm or .css in the case of a web site). And it works perfectly in these simple cases.
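Before uploading, it can help to check which files an extension-based lookup cannot classify at all. The sketch below walks a local site directory with Python's mimetypes module. I am assuming this is close to what the s3cmd fallback does, but I have not checked its source. The function name audit_mime_types and the directory name in the usage part are made up for the example.

```python
import mimetypes
import os


def audit_mime_types(root):
    """Map each file under `root` to its extension-based MIME type.

    Keys are paths relative to `root`; the value is None when the
    extension is unknown (or missing).
    """
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            mime_type, _encoding = mimetypes.guess_type(filename)
            path = os.path.join(dirpath, filename)
            results[os.path.relpath(path, root)] = mime_type
    return results


if __name__ == "__main__":
    # "site" is a hypothetical local directory holding the HTML site.
    for path, mime_type in sorted(audit_mime_types("site").items()):
        print(path, "->", mime_type or "UNKNOWN")
```

Files reported as UNKNOWN are the ones worth checking by hand after the upload.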