Skip navigation.
Home

Doing it wrong, the database way

With the availability of free database systems, such as the 20 MySQL forks, SQL light, and my favorite: PostgreSQL, a lot of people are trying to integrate RDBMS functionality into their software.

It's amazing to me how many of these people really shouldn't be doing this, or who should at least ask for some help from someone who understands a little bit about databases.

Let me paraphrase a conversation I saw on a mailing list recently:

The field was varchar(40) and the code was trying to put a value in it bigger than 40 characters, so I enlarged the field to varchar(255). That should fix the problem.

"That should fix it" is essentially the same as saying, "our software should only crash occasionally."

If you already know how stupid that statement is, then you can stop reading, because I'm just going to rant about it.

Firstly, why the hell are you trying to stick more than 40 characters in a 40 character field? I mean, somebody made that field 40 characters, are you sure it isn't the code that's broken by trying to stick more in? What does the rest of your code look like anyway? Are there buffer overflows everywhere because you frequently try to create strings larger than the space you've allocated?

Secondly, are you designing this for database software from the 60s? Because you obviously picked a (rather arbitrary) limit of 255, which is the upper limit for an 8-bit integer. What database are you using that's based on 8 bit math? I hear 16 bit is all the rage lately, you might want to ditch the vacuum tubes and upgrade to an IBM model 50!

Lastly, why the fuck did you pick a limit at all? I mean, the arbitrary selection of 255 is a hidden way of saying, "I have no idea what size this could get to" which is going to bite you sooner or later somewhere, and for absolutely no reason whatsoever.

Let me give you a hint: limiting that field size to 255 does nothing to improve the database layout, yet it guarantees breakage if you ever use more than 255 characters for that field. Does your code have a 255 character limit? If not, then you just guaranteed that your code won't work with the database all the time. Guaranteed!

Would you arbitrarily limit an integer field to 7000, because you don't think it will exceed that? No? Why not, it wouldn't be any stupider that what you did with that varchar field.

There's probably someone out there who's going to try to claim some space savings. Hello! I was serious about it not being the sixties any more. Any RDBMS software that's from this decade is only going to store the characters that you stuff in there. Hell, anything worth its salt will compress it while it's at it.

That limit is available for the purpose of data validation. If you pick an arbitrary limit, you're not validating anything!

But, apparently the number 255 is magical (I can't count how many shitty database schemas I've seen where every varchar is limited to 255 characters, and nobody has a good explanation for why!)

You could almost write it off as ignorance or a beginner mistake, except it's being done by people who should know better. Systems that are otherwise well-designed have this bullshit in them as a matter of practice.

It's everywhere! It's spreading like swine flue, and it's a menace. These young programmers see it and their fucking brains shut off. Apparently they never ask why ... they just copy it mindlessly into their designs.

You should know the size of the data you'll be putting in the database. It's your code for crying out loud, what's the limit on that variable? You didn't limit it ... well, bigger problems are coming down the pike, then.

I've got 15 years of database design experience that guarantees that if you limit a field arbitrarily, some day, something will be broken as a result.

If you legitimately don't know the limit, then leave it unbounded. Is it that hard?