Sponsored by Dragon Age: Origins
Can't get enough Dragon Age: Origins? Check out new footage. view!
DragonAge.BioWare.com - EA presents BioWare's new dark fantasy epic Dragon Age: Origins. '9/10' from Game Informer.
84 Comments
- headzoo, on 10/12/2007, -2/+21I'm really digging IBM these days (who wrote the article). They've taken a very strong stance for open source software, and have gone out of their way to promote the software, and educate people on using it.
- aNoble, on 10/12/2007, -1/+10Personally I like PHP 5 a lot. The OO stuff they've added is really pretty nice. PHP is the best choice for many situations and for many situations and there are many other languages that are better for others. I'd say Ruby and Perl and PHPs bigest competitors at this point.
As for MySQL it's simple and fast. But if you want a serious database I'd suggest PostgreSQL. - jellygraph, on 10/12/2007, -4/+11@fjc8
thats funny, because thats how I view ASP.net as well! - mork29, on 10/12/2007, -0/+5I'd have to say only the first is a problem regarding PHP and DBs. The remaining are purely people who don't understand what a database is. Still useful advice and a good article.
- johnnybravoh, on 10/12/2007, -1/+6The author had some reasonable SQL101 type points, but some of the content was not up to snuff. When the author states that this is more correct:
require_once("DB.php");
function get_user_id( $name )
{
$dsn = 'mysql://root:password@localhost/users';
$db =& DB::Connect( $dsn, array() );
if (PEAR::isError($db)) { die($db->getMessage()); }
....
I mean why not just have all that ***** in a function called DBConnect().
Here's the one I use...
function DBConnect(){
global $db;
$dbhost = 'DBhost';
$dbuser = 'DBUser';
$dbpass = 'DBPass';
$dbname = 'DBNAme';
if (!is_object($db)){
require_once('DB.php');
PEAR::setErrorHandling(PEAR_ERROR_DIE);
$db = DB::connect( "mysql://$dbuser:$dbpass@$dbhost/$dbname");
$db->setFetchMode(DB_FETCHMODE_ASSOC);
}
return $db;
}
Turns that whole mess that the author was talking about into
function get_user_id( $name )
{
$db=DBConnect(); - dvws, on 10/12/2007, -2/+7The function naming convention (or lack thereof) is terrible.
That and lack of namespaces are the biggest PHP gripes out there. I can understand the namspace thing to an extent, but the naming conventions for functions? Thats hardly something to cry about, honestly if it bothers you that much, wright up some function wrappers and stop complaining.
@LucasOman: this isn't directed at you personally, just a general rant on the subject because I think that in general people complain about that one way too much. - jclawson, on 10/12/2007, -0/+5This is not a very good article at all. For one, they suggest PEAR DB. PEAR DB is slow as hell. If you are making an open source application where many people will use it... ok maybe. However, if you want performance, and scalability: decide on what database you are going to use and do it native. Don't use mysql use mysqli and prepared statements-- then you will avoid SQL injection problems.
Also, the author claims you cannot do "realational" operations (I think he means joins) between databases. This is completely false. Of course MySQL can do joins between databases. No, you shouldn't put 1 table per database (who is stupid enough to do that?) but you certainly can use more than one database and do joins on them.
Its hard to believe the author really has 20 years of web development experience. From reading it, it looks more like 6 months.
How did this crap make it to the front page? Is anyone actually reading it? No Digg! - johnnybravoh, on 10/12/2007, -1/+6I prefer using JOINs because it means that the only purpose of your WHERE statements is for filtering. For instance:
Select a.OwnerName, b.PetName from
Owners a inner join
Pets b on
a.OwnerID=b.OwnerID
where b.PetName="Sparky".
In that example it's very clear where you're JOINing different tables and where you're filtering data.
Doing it again without joins...
Select a.OwnerName, b.PetName from
Owners a, Pets b where
a.OwnerID=b.OwnerID
and b.PetName="Sparky"
It's less obvious where the join is and where the filter is.
I just find this a little more hard to read without JOINs. - nofxjunkee, on 10/12/2007, -1/+6I think there's a lot of crap PHP code out there that people learn from, as well as crap tutorials and the like. PHP isn't a terrible language itself. (It's not amazing either, but it's damn useful)
- inactive, on 10/12/2007, -1/+5Good article. A couple of these are applicable to _any_ database programmer, but it's nice to know a few bits and pieces in PHP that I'd forgotten.
The 'n+1' thing he mentions is prevalent and I've seen it in lots of code I've had to deal with. It's a pain! - elephantdog, on 10/12/2007, -0/+4I would have though SQL injection was on the top of the list. I've come across many different web-apps that took input from the user and fed it directly to SQL, no escaping. Then they wonder why people can't have apostrophes in their input.
- jclawson, on 10/12/2007, -0/+4You should always use the JOIN keyword. MySQL 5.1 no longer allows you to do LEFT joins mixed with INNER JOINs without the use of the keyword. This will not work:
SELECT *
FROM table1,
table2
LEFT JOIN table3 ON table3.id = table1.id
You must do this:
SELECT *
FROM table1
JOIN table2
LEFT JOIN table3 ON table3.id = table1.id - joshink, on 10/12/2007, -0/+3Digg uses PHP and it gets about 500 million page request a day!
http://www.alexaholic.com/digg.com?y=p&r=6m&z=6
But I guess all you .NET developers think that's peanuts, eh? - Aculeus, on 10/12/2007, -0/+3I can't believe they suggested PEAR DB as a good API in this day in age of PHP5. They forgot to mention Propel/Creole, PHP ADODB, and one more I can't remember. It's fairly new but uses PDO. PDO is a great move for PHP but it still needs a layer on top of it for it to be useful.
- psyon, on 10/12/2007, -2/+5For an article about common problems, the guy has a few of his own.
on his examples with the jpeg files, why does he store the filename in its own field, and still in the path? Why not just store the path as "media/"?
on #5 he has
SELECT files.* FROM users,files WHERE users.login=? AND users.id=files.user_id
Wouldn't using a JOIN be better? I know its always been faster for me. - rcook, on 10/12/2007, -2/+5He may be doing it without the "JOIN" keyword, but saying "SELECT ... FROM TABLE_A, TABLE_B ..." **is** a join. The comma in the FROM clause is the same as an INNER JOIN. You guys saying "without using a JOIN" is hurting my head ...
- aNoble, on 10/12/2007, -1/+4Yea, when there's a "_2" in the url (http://digg.com/programming/Five_common_PHP_database_problems_2) you know it's a dupe.
- voodooVince, on 10/12/2007, -0/+2One thing not mentioned is that PearDB uses a separate sequence table to emulate auto-increment. I understand this makes it more compatible with Oracle as well as solving the second problem.
- merreborn, on 10/12/2007, -1/+3Ryosen: You have no idea what you're talking about.
1) databases do a very good job of only working on the rows they absolutely have to. If you join two million row tables that you only need 20 rows from, your database will only work with those 20 rows. That's what indexes are _for_.
2) The solution to your proported "Users only look at the first 20 rows" problem is simply to LIMIT your queries. In the real world, you hit situations where you need data from more than 20 rows at a time, all the time -- reporting, for example.
I administer about a dozen multi-gigabyte databases (with millions of rows in some tables). I've benchmarked. And I've converted many, many scripts from the braindead 'n+1' model to single queries with a JOIN for huge performance gains over and over (10x or better -- scripts that used to take minutes to execute now take seconds). If you index your table correctly, the difference between a single JOIN query and 20 single table SELECTS is usually the time it takes to set up and tear down the 19 extra queries -- the actual processing time on the database end is nearly identical, if not *lower* for the join. - Tialys, on 10/12/2007, -0/+2I'm surprised they didn't mention using the wrong type of feild. Over and over again I hear: Why wont my feild go past 127? People never research feild types and assume tinyint will work for a users database... Infuriating.
- Obsidian743, on 10/12/2007, -2/+4All of those problems, except #1, are common problems with any database application, not specific to PHP.
I believe IBM is trying to imply that those who are more likely to use PHP for these tasks are perhaps more likely to make these mistakes? - heiden, on 10/12/2007, -0/+2A simplier way of getting the last inserted autoincremental id is by using mysql_insert_id(), however, this makes the query mysql specific.
- Julz, on 10/12/2007, -3/+5I would have to agree with all of these five problems, Ive only encountered one in my few years of using PHP in conjunction with MySQL, in my opinion PHP is good but not great, there has to be something better out there, same goes for MySQL!
- error401, on 10/12/2007, -0/+2Right. Encourage people to do joins, and make not a single mention of indexing.
Worse, don't use primary keys, or even unique keys. One of his arguments for using auto increment is that it forces the field to be unique - but it doesn't. In MySQL (and any other sane database), you're totally free to insert whatever values you want into an auto incremented field *unless* there's some kind of constraint like a unique or primary key. His code doesn't use either, so it doesn't actually fix this. Funny that he brings up the issue and doesn't mention the obvious and necessary solution. Since MySQL doesn't have foreign key constraints (most of the time) things get pretty weird and wacky if you don't at least enforce that primary keys be unique...
I'm pretty disappointed in IBM for publishing this, to be honest. Maybe they don't feel the need to mention indexes, but their examples should at least represent good form. - scotty79, on 10/12/2007, -1/+3Badness of n+1 is not that obvious ... imagine you have few thousands books but only ten authors, and you want to display just 20 books (with their authors) selected with conditions on book fields. And imagine that authors are described by a lot of data, including their DNA sequence. In such conditions join will be slower than getting books first, and pulling only necessary authors from database later. And avoiding joins result in much cleaner OO data access layer architecture. Plus you can make this fast even for bad cases with smart caching (which can be smarter without joins).
Cramming joins where you can is just premature optimization, often mistargeted. - Kethinov, on 10/12/2007, -0/+2Agreed. While some of his points are valid, much of it reads like "everyone should do things this way" without much rationalle. Especially with regards to his first "common problem" there are numerous valid reasons for not abstracting SQL queries.
- Ryosen, on 10/12/2007, -0/+2@merreborn
Please read the last paragraph of my post. It specifically says that you should NOT use n+1 for reports. Also, as a database administrator, you are looking at this from a back-end perspective. A software architect would look at it from a different, over-all perspective. Yes, joins are unquestionably the way to go. However, in delivering those results to the end-client, the entire set should not be consumed all at once.
The rest of my post talks about UI design which, as I stated, makes up the bulk of query consumers for PHP-based applications. My criticism of the article was based on the author's blanket assertion that n+1 is always bad. It is not. Like all things, it has its applications.
Also, in high-volume concurrent systems, it is not advisable to pull large resultsets in one shot. Then again, I wouldn't use PHP for a large-scale transactional system anyway. - jclawson, on 10/12/2007, -0/+2Do not do this to figure out the next primary key. You are opening yourself up to race conditions... use autoincrementing columns. mysql_insert_id() should only be used to get a tuple's primary key that you just inserted so that you can do other database operations such as inserting another row in another table that uses it as a foriegn key... or putting the id in the query string of a link on the page.
- motoole22, on 10/12/2007, -0/+1"or u can strip or escape characters that may apply depending on your code and database (like ',",`,;)"
Some people have apostrophe's in their names! - error401, on 10/12/2007, -0/+1If you ever need to join two tables, they belong in the same database. His advice is sound. Separate databases are for separate unrelated datasets and shouldn't ever be related with a PK/FK relationship. Most databases preclude this. I'm not sure about MySQL. Namespaces are more appropriate for cases where you might want to do this, which of course MySQL doesn't support. So maybe there is a use-case...
Nevermind. - zeeneo, on 10/12/2007, -0/+1I started to learn MySQL within PHP and I often messed up the syntax and spend ages trying to debug PHP when it was infact the SQL.
Learn MySQL first - then plug it into PHP. Write an SQL statement in query browser (or what ever) before you put it into your PHP code. If you only have a small amount of data in the DB then fill it with 1000+ records and do your queries again - if it's slow then do an EXPLAIN to find where it's cocking up.
Left, Right, Inner and Outer joins will make your SQL more readable. - Vinvin, on 10/12/2007, -0/+1I got a lack of knowledge on SQL area, and I had a question regarding the fourth example: what if one file can be owned by multiple users? Should you put another table in between to define the relationship file-owners? I guess not, but what then?
- merreborn, on 10/12/2007, -0/+1Does this code prevent SQL injection? Is it that the "?" manages to take care of sanitation?
Yes. That's what prepared statements are for. Don't use mysqli_*, that's exactly what the first point in the article was -- use a db abstraction layer (they all support prepared statements):
1) You may be developing a piece of software clients will be installing on their own boxes (ala phpBB) -- using an abstraction layer means your software will probably work on any database of the client's choosing
2) Even if (1) doesn't apply, you may find that the database you're currently using just doesn't cut it. You'll have a hell of an easier time transitioning to a new DB if you've been using an abstraction layer all along.
Pear DB is outdated, although it still works fine. MDB2 is a much more active project. - inactive, on 10/12/2007, -1/+2@ johnnybravoh:
Agreed. For awhile I joined all of my tables within the where clause and always thought that it was a bit grueling on the brain. Using the JOIN keywords allows MUCH better organization of queries and separation of joins and filters. - sdether, on 10/12/2007, -0/+1Not last I checked. As much as I think sequences are the superior way of handling this, auto-increment seems to be just as common. mySQL, Sybase and SQL Server (came out of Sybase) all have a form of auto-increment. PostgreSQL and Oracle have sequences. Not sure about the rest.
- loconet, on 10/12/2007, -0/+1Most of these problems seem to be related more to lack of knowledge of SQL rather than "PHP using databases". I guess that is one of the drawbacks of PHP being such an easy language to pick up. Many newcomers will dive straight into using PHP's database functioanlity without taking the time to learn the the basics of SQL and relational databases.
- jinushaun, on 10/12/2007, -0/+1People keep mentioning use mysqli instead of mysql. I wish it were that simple. Where I work, over 90% of our sites are hosted on servers that do NOT run PHP 5--which is required to enable mysqli. Nor do we have the ability to install PHP 5 on there ourselves.
- nofxjunkee, on 10/12/2007, -4/+5These are pretty basic, but I remember being new to PHP and using the mysql_* functions because that's what websites would teach. Dugg in hopes that any new PHP coders will read and heed these warnings! It's all good advice.
PHP as a language really is a big hack. The OO stuff in 5 is good compared to 4, but there are cases where PHP 4 still shows through the OO disguise. Popular frameworks built on PHP tend to have the hack-ish feeling to them as well. Seagull seems nice, Quanta is cool, but my favourite is the Zend Framework so far. It's not revolutionary but it does a good job of staying out of the way, and making PHP feel less like PHP.
FWIW on the project I'm doing now I did a quick prototype in Rails first, because I like Rails and I needed to do the prototype. It took 3x as many lines of code for the PHP/Zend Framework one to be at the same level as the Rails prototype. ~ 760 vs. ~ 2100, according to rake stats and sloccount. That might not be fair since the ZF is very new, but there it is. - ax0n, on 10/12/2007, -0/+1I'm guilty as charged on some of these, but I already knew it was poor coding. I'm still a PHP noob, though (relatively speaking).
I cross posted this on http://www.kcpug.org/
IBM DeveloperWorks has been kicking names and taking butt lately. - sholdowa, on 10/12/2007, -0/+1There's some poor examples in there!
INSERT INTO users VALUES ( null, 'jack', 'pass' );
assumes there's 3 columns in the table. What happens if you need to add a fourth? Look at the amount of code you'll have to rewrite because you couoldn't be bothered to define the columns you're adding data into.
Written by someone with minimal practical knowledge, I'd say. - alarion, on 10/12/2007, -0/+1people actually span multiple tables, across multiple databases for a single web app? I can see the reason for this in certain situations - but they seem to be talking about mysql which doesn't utilise tablespaces or datafiles (with the exception of innodb). I can't believe this is actually one of the five most common problems.
What they should include is:
-the lack of proper nomalization of databases
-the lack of naming columns with meaningful, non-ambigous names: using 'id' as the PK for every single table might be cute for your silly RoR/Activerecord apps, but it's slightly retarded from a database design perspective - error401, on 10/12/2007, -0/+1I'm not sure how it's done in PEAR since it just uses the native mysql_ calls in PHP, which don't support parameter binding or prepared execution (nor does MySQL prior to 4.x IIRC), so I assume the PEAR classes just handle building the query string and properly quoting any parameters for you, so you can't possibly forget or miss something.
With most platforms this is done on the database side, with the database API receiving the data in two separate variables and memory areas, making SQL injection attacks possible only in cases where the database engine has a very stupid bug, or where user data somehow ends up in the query string (ie. poor programming). As an added bonus, such platforms allow you to 'prepare' a query and compile it on the database side, and then just fire datasets into it (each inserting/updating a row, say) without having to build query strings or recompile the query on the database side.
Both PHP and MySQL were at least 5 years behind the pack on this, until very recently, which is why you don't see much adoption or documentation yet. Check out Perl's DBI or the Java database models as good examples of how things are done in the 'real world' :). - Lazybones, on 10/12/2007, -0/+1Good point...
- Vinvin, on 10/12/2007, -0/+1Well, indeed, I did not take a db course, all I had to do it with is some mentioning in a free tutorial on PHP.
- Aculeus, on 10/12/2007, -0/+1What is wrong with plain 'id' for any app. If you want to be specific use tableName.id. If you want to make a relation to it name the remote column tableNameId. Id is not ambiguos. It is known the be the PK of the table.
- jclawson, on 10/12/2007, -0/+1The file table should be just that... a file table... you need another table to indicate the owners where a property on the owner table is the foreign key _FileID which links to the primary key in the file table
- Vinvin, on 10/12/2007, -0/+1But then you could only specify one file per user, right?
- alarion, on 10/12/2007, -0/+1@rcook
close but not quite.
While selecting from multiple tables does indeed automatically signify a join, without actually joining the tables together by specific attributes either in the where clause, or by using a join/using or join on clause you are creating a cartesian join, NOT a inner/left join as you have stated. BIG difference. - jclawson, on 10/12/2007, -1/+2Don't use Pear DB... use MySQLi with prepared statements and you never have to worry about SQL injection
- freeed, on 10/12/2007, -0/+1Even though most of the problems in there are quite obvious especially if you ever had a database class, it's good to get reminded of them just to keep in mind what to look at when writing a db applications.
The database abstraction part was interesting. Is PDO PHP5? -
Show 51 - 84 of 84 discussions



What is Digg?
The Digg Toolbar for Firefox lets you Digg, submit content, and keep track of Digg even when you're not on the Digg site. Download the official