BLOB is bad for your (mental) health

If you ever mention that your web application uses database to store files, you risk being flamed into oblivion.  Indeed, in most cases, it is a bad idea, since file system is more effecient when it comes to files.  However, there are cases when it makes sense to have files saved in the database.

Maybe I am doing something wrong, but in the last six month, I had to develop at least three systems that used MySQL for file storage (uploaded files that have to be synchronized across several hosts, etc).  Yesterday, for the third time I stumbled across the same problem, that almost drove me insane.

MySQL has four data types for storing binary data – TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB.  Somehow I always forget about these and use BLOB.  BLOB works just fine, but it has a limit on size, which is rather low – 64 KBytes.  The mean thing here is that it will work just fine with most of the test data – text files, short PDFs, and small pictures.  Once the application is tested and put into production, the corrupted files will start coming in.  Re-writing all parts that deal with uploading, moving, cleaning, escaping, and encrypting binary data takes time.  Going through file reading and writing routines is boring too, and it won’t help either.

By the time, the issue is discovered and all fields are changed to LONGBLOB, it is often very late, and you’ve lost your weekend, as well as a lot of large files. This post is an attempt to save my (and your) sanity.

Reminder: use LONGBLOB instead of BLOB for file storage, unless you are absolutely sure about the maximum size of incoming data.

2 thoughts on “BLOB is bad for your (mental) health”

  1. Why did you use database for file syncronisation between hosts? I’ve read some articles about it some time ago – there were described many ways of doing it. All what I remember is Rsync protocol. Isn’t it better to not to use database but some network protocol, developed spesially for such purpose?

  2. Tiamat,

    usually – yes. However there are cases, when even rsync is not an option – cross-platform hosting, firewall policies, lack of full control over some hosts, lack of expertise, etc.

Leave a Comment