Maggie Nelson

php|architect: Database Design for PHP Programmers by Mac Newbold

by maggie on Feb.27, 2008, under entry

The February 2008 issue of php|architect features an article by Mac Newbold titled “Database Design for PHP Programmers”. Here’s the blurb:

While PHP can do a lot of great things by itself, it can do much more when you add a relational database. Whether you use MySQL, PostgreSQL, Oracle, MicroSoft SQL Server or SQLite, the way you design your schema and build your tables has a big impact on the abilities of your application. By the same token, mistakes in your database structure can be very difficult to fix or overcome. A little bit of careful planning can go a long way.

It’s a pretty good article about database design and I think all PHP developers who are just starting to integrate databases in their applications should read it. However, there are a few things I’d like to add and point out.


SQL does not mean “all non-PHP code that has to do with the database”
Here’s a quick primer:
SQL – Structured Query Language. This includes all your SELECT queries.
DML – Data Modification Language. This includes any statements that change data in your RDBMS. All your INSERT, UPDATE, DELETE, MERGE go here.
DDL – Data Definition Language. How you define objects in your RDBMS. A database object is a table, index, sequence, etc. Your CREATE TABLE, DROP INDEX, etc. fall under this category.
It makes me cringe when people say “SQL statements to create your tables.” I understand that SQL has kind of become to mean “all database code”, but it really isn’t. There’s a huge difference between the three categories I listed above and it really helps developers organize how they think about database development if they understand the differences between the three.
For example, did you know that you can’t create a table with auto-increment starting at a value which is the next auto-increment value from another table? Why is that? The answer is simple: you can’t mix DDL (create table) with SQL (selecting a new value). If you understand that DDL and SQL are different, the answer is obvious. If you don’t get the distinction between the three, the answer can be kind of magical and confusing.
constraints vs. indexes vs. keys
You can think of a constraint as a “check” that your RDBMS performs when you execute a DML statement.
An index is a physical object in the database (created with DDL).
A key is more of a conceptual construct which combines the characteristics of a constraint and an index. It is a constraint in a way that it makes you RDBMS enforce a rule when DML is executed (e.g. ids of users must be unique and sequential). It is an index in a way that it is stored separately in the database and can be used for querying (via SQL). You create it via DDL, of course.
I think the article’s explanation about keys, primary keys and foreign keys was aimed at newer database developers, and especially ones using MySQL. Good database design stems from understanding the relationships between database objects and constraints and indexes help enforce those relationships. Since you, the developer is who puts these constraints and indexes in place, it is in your benefit to learn how they work.
Designing for performance
I applaud the article for pointing out that good design takes precedence over optimization and that good design provides good potential for good optimization practices. I also really liked him pointing out that it’s important to know what you should optimize. Figuring out what your bottleneck is will go a long way toward figuring out what next steps you should take.
Speaking of optimization…
Mac’s article is definitely very MySQL-oriented. I would really love for everyone to enjoy reading Oracle documentation as much as I do. But if you don’t, at least read this awesome part about Oracle’s EXPLAIN PLAN which beats MySQL’s by a long shot. I think the article does not focus much on the explain plan because MySQL’s support for it is very limited. Oracle gives you so much information in an explain plan that it is really no excuse to not know what your queries are doing.
Let the database handle data
The article points out that it’s sometimes difficult to know what code to store in the database and what in PHP. Personally, I’d love to keep it all in the database (just so you know my bias), but even if my views were more liberal in this respect, I’d disagree with the article that the choice may sometimes be unclear (and definitely in the examples the article provides).
The word “database” contains the string “data” – that should be a pretty good clue what the database is for. (Ok, that’s my snarkiness for this post.) A database is really REALLY good at handling data. Enforcing auto-increments or managing referential integrity (e.g. foreign keys) via PHP would not only be cumbersome and error-prone, but it also violates the principle of using the best tool for the job.
Overall, it was a pretty good article, albeit with a MySQL slant. I’m glad that the PHP community recognizes that most PHP applications will at some point use a database and that solid database development skills in PHP developers are important. Learning how to write database code well can also give you a whole new set of solutions to a lot of challenges you encounter in your application.
<3 databases!

1 comment for this entry:

Looking for something?

Use the form below to search the site:

Still not finding what you're looking for? Drop a comment on a post or contact us so we can take care of it!

Tags

RSS Planet PHP

  • An intriguing use of lambda functions
    I’ve been working hard on Goodsie.com lately trying to bring it to launch. It’s been great being in on a new PHP project from (near) the beginning, as it frees up a number of things. One of those, is the fact that I can be using PHP 5.3 and all the new features that come with PHP 5.3. While I’ve used my fair share of the short-cut ternary already (?:), the […]
    Eli White
  • Accelerando
    This is a very unusual blog post for the open source area of my website, since it contains a recommendation for a science fiction book. The reason I've put it here instead of the private section is on the one hand, that it will definitely reach more geek - and therefor probably scifi interested - people here. On the other hand, the book I'm writing […]
    Tobias Schlitt
  • First year of Qaiku, and a travel writing challenge
    Qaiku, the conversational microblogging service that launched a year ago had a refresh that launched today. While it hasn't yet convinced the twittering masses, it has already proven itself as a lot more thoughtful platform for the Finnish online community, and as a valuable workstreaming tool. The new version looks quite nice and fresh. Notice the priv […]
    Henri Bergius
  • Getting started with the Midgard content repository
    I'm doing a talk today in the Bossa Conference about using Midgard as a content repository for mobile applications. As part of my presentation I wrote some simple example code for using the Midgard APIs in Python, and thought they would be good to share to those not attending the event as well. The idea of a content repository is that instead of coming […]
    Henri Bergius
  • Neural Networks in PHP
    Neural Networks in PHP By Louis Stowasser Neural networks allow emulating the behavior of a brain in software applications. Neural Networks have always had a too steep learning curve to venture towards, especially in a Web environment. Neural Mesh is an open source, pure PHP code based Neural Network manager and framework that makes it easier to work with […]
    PHP Classes