72 Comments
- inactive, on 10/12/2007, -7/+32Thinking MS Access is a real database should be up there at the number one mistake.
And having good version control and backups are par for the course in any type of technology implementation. - Olle, on 10/12/2007, -0/+14OK, I'm intrigued. I'll make a brief top ten mistakes list. Lets see if I get as much flak. Some of my top mistakes will seem obvious. But I've seen some horrors in my day. Mistakes by people who really shouldn't be making those types of mistakes.
Here's what people fail to do.
1) Make sure you put unique indexes on values that are supposed to be unique.
2) Make check constraints up the wazoo. Should the number be positive? Make a check constraint.
3) Be freaking consistent. Come up with a convention and stick to it.
4) The database is "the sacred resource". You can scale out an app server but not a database. You can only scale a database up. This gets very expensive. Do what you can to not stress the database more than you have to.
5) Normalize with common sense. Understand what you are doing and break the rules when you feel like it.
6) Make sure you understand how transactions work. Really. It never seizes to amaze me how people continue to not understand this. Not only do people not understand the different levels or serialization, but they don't even use transactions.
7) Don't forget that your database should be administered. Think about the physical design before you deploy your system. Decide what data files you need and on what hard drives they should be on and make sure that you will be able to back up your database not only 1 month after you deploy, but after 5 years too.
8) If you are going to write a stored procedure. Make sure you understand how the error handling works.
9) Make sure you take the time to understand the collation that you want to use and make a conscious decision before you start developing.
10) Oh, and make sure that you have foreign key relationships wherever you can.
I did it. 10 items. I feel like Rocky after he climbed all those stairs. - NoodlyAppendage, on 10/12/2007, -3/+11It's a matter of scale; there are many more small companies around than larger companies, yet every company needs to manage data. Access handles this quite well. Disregard that market and it's yours to lose.
- Bogtha, on 10/12/2007, -0/+6Actually, plenty of people use Access. Years ago I worked at a particularly clueless web development firm that used nothing but Access. For high-profile e-commerce websites for clients that you will have heard of. It was routine to stop the webserver, download the database file, "compact" it with Access, "fix" it with Access, upload it again, and restart the server. The clients had been with the firm for so long that this was all that they knew - they expected the downtime and didn't realise that it wasn't normal. Utterly messed up, I know, but it is quite common for clueless people to make money doing clueless things.
- bloodmoney, on 10/12/2007, -0/+6butchcassidy503,
I agree about embedded SQL in applications being the wrong way to do things, but I also think that hiding business logic in stored procedures can bite you. Experience tells me it's best to keep domain logic seperate from data. I have nothing against stored procedures, but I am careful about what I put in there. - inactive, on 10/12/2007, -3/+8putting application logic into SP's is a ***** good idea ok. and it's not something i see developers doing anyway. normally they harcode sql into their app then wonder why none of their reports are getting the same results.
also, you can change SP's and effect changes to the system on the fly where hardcoded logic requires software updates.
no digg for that single crappy rule. - PlancksCnst, on 10/12/2007, -0/+51) Don't digg it until you've read it.
2) Since you didn't follow #1, then go you your profie (there's a link at the very top that says "qtuner Profile"), then find the dugg story and click the "undigg" link next to it. - butchcassidy503, on 10/12/2007, -1/+5I disagree with the rule about staying away from stored procedures. I absolutely hate seeing SQL code embedded inside of another programming language. I've been guilty of that in the past, but it's best to put the SQL where it belongs! Just call stored procedures from your APP and leave the SQL code in SQL. This also makes migrating your app to other programming language(s) much easier.
- captaindan, on 10/12/2007, -0/+4If you want to keep SQL out of your app, you might try putting it in a library that your app uses.
- omegaworks, on 10/12/2007, -1/+5"APIs such as ODBC, JDBC, and OLE DB have promoted the notion of database independence - the idea that you can write your application code in such a manner that you can plug any database at all in for data storage."
Interfaces are actually very good things. They allow for code separation - if you let your program get too tangled up in the intricacies of the particular database implementation you may end up needing to rewrite entire swaths of it when switching databases. It also contradicts his first statement - how does anyone know how big their business will grow? Joe Shmoe chooses a database that fits his needs now, a small one. He finds out later he needs to upgrade it. By wrapping the database in an abstraction layer, its a very simple (almost always automatable) process. If not, he may need to rewrite significant amounts of code. - Jerph, on 10/12/2007, -0/+4I'm a developer, not a sysadmin. My small team doesn't have a DBA. People can easily be very knowledgable in one area and clueless in another, even if it's closely related.
- Omicron, on 10/12/2007, -5/+9i thought this was a helpful article. some of them were a little obvious, but overall well written.
+digg - Ryosen, on 10/12/2007, -0/+4Stored procedures are great if you only have one small application accessing it. But when you have multiple applications accessing the same stored procedure and that SP is changed, you can easily end up breaking the other applications. This is due in large part to inexperienced database programmers mixing business logic in with data management logic.
Also, the comment belittling developers that don't have database experience is a little naive. Sure, if you're working on an Access or small MySql database, it would be fair to expect the developer to be able to design the database. But get into a larger environment and your programmer absolutely should not be your database designer and/or administrator.
As a consultant and one of the chief architects for the global financial system of a major bank, I have seen many projects suffer from ignoring the need for specialization on the database side. - csrster, on 10/12/2007, -0/+3Why are people claiming that this article is arguing against stored procedures? What
he actually says is that sp's need to be subject to the same development controls as
all other code - design review, testing, code review, version control ....
Sounds like common sense to me. - mbourgon, on 10/12/2007, -0/+3Lemme add one thing to Olle's comment: It's a SET. A database. Not a series of records you can step through.
Example: we had a program hitting a database. Data mining. Each book and each store combo took 30 _minutes_. 2 books, 2 stores? 2 hours. Wielding the index-bat I whittled it down to 7 minutes. People were happy. I was furious.
So, I had a coworker who knew ASP look at their code. What they did was come up with every possible combo, and run that as a query. For my above example, it would run 1 _million_ queries. We changed it to actual SQL code, using WHERE clauses and the like. 2 Seconds. Period. - oepapel, on 10/12/2007, -0/+3@mbourgon,
I had a similar experience. I took over a project where the previous coder had a select * with inner and outer joins and then proceeded to perform a bubble sort in a 4GL language before generating a 1 page summary report. It took 8 hours to run when I got the code. I boiled it down to less than a second. The worst part is that the department assumed I must have cut corners and didn't trust the result and continued using the old report code for 6 months (daily) until they were convinced that it was correct. - Olle, on 10/12/2007, -1/+4Regarding business logic in stored procedures. I have done a lot of large scale database driven database development and I consider the database to be the "sacred resource". I write stored procedures sometimes. Sometimes I make the query from another programming language. I make the query from wherever I believe that it'll tax the database the least unless the query is unlikely to cause any load to talk about, in which case I avoid stored procedures.
You can scale out your app servers but scaling up a database quickly becomes very expensive. So, some authority said to put/not put your SQL in stored procedures? I don't care.
I do what I need do to postpone a performance bottleneck in the database as long as possible
On another note. I agree with the author that 3NF isn't the holy grail. But I see no reason why a lookup table with only two lookup values can't be used, for instance. There are two main ways (three if you count triggers) to guarantee that values conform to some rule, one is check constraints, the other is relational integrity. I guess he is arguing the check constraint, but I'd really like to know why he thinks it is so much better than the foreign key relationship to a lookup table that it warrants a spot on a top ten list.
Well, at least the guy is sticking his neck out. I guess if I tried to write ten big DB mistakes list there'd be someone like me telling me I'm a moron. - m85476585, on 10/12/2007, -1/+3What about storing SSNs as unencoded integers?
- oepapel, on 10/12/2007, -0/+2The number one mistake made by Database Developers is that they use a database when it is not necessary. The author touched on it but I think it bears expanding. I've seen FAR too many apps where the data is locked in a database when there was a perfectly good standard file format that could have been used instead.
- inactive, on 10/12/2007, -1/+3Dugg for:
"Don't Assume Everything is a Nail Just Because You Have a Really Big Hammer" - fgb00, on 10/12/2007, -0/+2It didn't say to stay away from stored procedures at all. The article said not to hide business logic in stored procedures, which is a very good idea. High-level logic should never be placed in stored procedures.
- donatj, on 10/12/2007, -2/+4heh, I've seen over normolization quite a few times.
- Olle, on 10/12/2007, -0/+2@rodball,
Yes, the identity column versus natural primary keys is a subject that can be discussed at length. There are some situations where one is better than the other. But there is a large group of situations in the middle where it pretty much comes down to personal preference (in my opinion) and that's why people can't agree I think.
two extreme examples:
1) Lets say you have 10 tables in a 10 level parent child relationship. Table A has a natural primary key. Table B has a foreign key relationship to A so table B has table A's primary key, plus another column as its primary key. If we keep going then table J, will need 10 columns for its primary key. Also, should you want to change the primary key column in table A then you are in deep doo doo. All its children must also be updated.
2) If you have a really big table and you happen to want to search in that table using one or more members of the natural primary key then using a natural primary key is more efficient. You will use less space since you don't have to carry an extra identity column, also you don't have to have an extra index for what would have been the natural primary key, in order to have fast searches. - CrackHappy, on 10/12/2007, -0/+2I prefer to use 'NULL' as the default value for all fields in a DB. Really confuses the ***** out of anyone else.... Why the hell is ISNULL() not working?????
- bluehouse, on 10/12/2007, -0/+2I can't totally agree with you on this one. Of course access shouldn't be used for large projects, but it's great for small implementations. I find it works really well if used as a data store only. What I mean by that is write your own front end (I like c#). Don't use their build in tools to make your database interface. As long as the DB doesn't get over 1 GB it's stable.
- CrackHappy, on 10/12/2007, -0/+2Sure, abstraction layers are cool and helpful things.
I think he's just talking about at what level do you do that abstraction? Also, should you really be building abstraction for ALL possible contingencies? I think the author has a point about at least starting with a single connection, but also think you are right in that abstraction is a useful thing.
The problem being that if you only start with a single connection and optimize toward that, you're going to end up having to completely reinvent the wheel if you ever need to change it. So building with a single connection, but built so that you build an abstraction layer between your application and the DB makes sense in that it doesn't really take much more time but gives you wiggle room later. - brlewis, on 10/12/2007, -0/+1I'm with qtuner. I wouldn't recommend this article to anybody.
- joshsz, on 10/12/2007, -0/+1You can actually scale out the database if you use something like MySQL Cluster. It's still limited in some respects, but it's getting much better.
- Olle, on 10/12/2007, -0/+1Ooooh, and I forgot my pet peeve.
11) If you are developing using an application server then don't listen to arguments which stem from when client/server 2-tier systems were all the rage and that is presented to you, without critical thinking, as common wisdom. - scruffles, on 10/12/2007, -0/+1He didn't say ODBC, JDBC, etc are bad things. He was saying you should write your SQL for the database your using now, and worry about next year later (in 11 months or so). I've seen this problem too.... developers writing an app for Oracle, but afraid to use an outer join because they MIGHT want to switch to some crappy little database in version 2.
- cully, on 10/12/2007, -1/+2All of these are pretty obvious. One that I disagree with is the "over-normalization" mistake. There is no such thing as "over-normalization." Your schema is either normalized or it isn't. Normalization doesn't mean putting some data into a new table when it could be in the original table. That isn't normalization at all. It's just silliness. Normalization means respecting the idea of 1 to 1, 1 to many, and many to many relationships. If you put data that has a 1 to 1 relationship into multiple tables, you aren't normalizing.
- dasil003, on 10/12/2007, -0/+1Stored procedures are very important, but I think you're exaggerating. The author is very upfront that SPs are critical for databases with multiple applications accessing them. However, the majority of databases are only accessed by one application. In that case, why have your business logic in two places? There's no way ALL the business logic can be crammed in the database, so might as well have it all in one place.
- rodball, on 10/12/2007, -0/+1Agreed. You can normalize to the point of absurdity, and worse, obscurity and inefficiency.
One thing that wasn't addressed in the article was the "natural" primary keys vs. some auto-incremented integer (that has no relationship to reality). I don't have strong opinions myself, but it has caused some long threads at thedailywtf.com. I wonder what his thoughts are on that subject. - CrackHappy, on 10/12/2007, -0/+1We're having data issues for a few reasons:
1) Relational integrity - the integrity is not built in the SQL Server, instead the app code has to do it. It's a proprietary system that is a pain in the ass to change as there is no actual code we as developers actually have access to, but have to deal with this wack ass GUI the crack heads over there came up with.
2) No resources for full-time database management. I'm having to do it (as a web developer!) when I am able which is not often enough. Thankfully I think I will be able to devote more significant time to the task and can improve overall integrity significantly soon.
#1 is the biggest problem as #2 becomes a lot easier if that is taken care of at a higher level than it is now. - beowulf, on 10/12/2007, -0/+1Putting business logic into a Stored Procedure isn't a good idea in point of fact. I've built systems both ways, and embedding your application logic directly into the database doesn't scale well. There are several major problems with it:
1. If you have multiple instances of your database and you make a logic change, you can't make the change in one place, recompile and redeploy. Instead, you have to hunt down each and every database where the logic resides and update the stored proc. It may not sound like a big deal, but if you have a big system with hundreds or thousands of stored procedures and dozens of installs, it quickly becomes a logistical nightmare.
2. It's very difficult to reuse the logic of stored procedures inside another component (think inheritance). If your logic is properly abstracted in code you can reuse your logic across multiple projects and maintain a single code base. If you make a change to a base level component you can easily propagate it to all of your other projects, something you can't easily do if the logic is in stored procedures.
3. Dynamic SQL execution is extremely difficult to accomplish from within stored procedures ( I know, I've tried it ). Good programming languages give you excellent templating and string parsing abilities that allow you to dynamically generate some very complicated cases. The upshot is you can often solve a problem with much much less lines of code than you would if you had to create a stored proc for each different case, or try to generate dynamic sql. In fact dynamic sql can sometimes seriously degrade a stored procedure's execution plan if the database system caches the execution plan from a previous run.
All that being said, there are some special cases where it makes sense to embed application logic into a stored procedure, however there are huge potential drawbacks and it really shouldn't be the default way you operate. In general, it's better to keep application level logic in the application logic layer. - sbritner, on 10/12/2007, -0/+1
It's a scary world we live in when transportation, government and financial service companies use Excel as their primary means to store and manage data. Tell them they don't need at least MS Access.
It's only the 21st century on the calendar. Our 21st century business infrastructure is still a chisel and stone compared to where we should be. - kenstone, on 10/12/2007, -0/+1I consider 5th Normal Form to be generally over-normalized, but it really depends. 3rd is a good sweet spot for many cases.
- Yossarian78, on 10/12/2007, -2/+3Stored procedures are the most ridiculous and obfuscated place to keep data!!! Why in God's name does anybody think that keeping *ANY* of your logic in seperate places is a good idea?
If your SQL queries are generating any of your application logic, then they should be in line with the rest of it. There is no reason you should have to be going through code, see a SQL call, then have to open up a SQL manager (some of which are better than others), track down the procedure, open and read it (which by the way, is written in a *DIFFERENT* language, and quite probably with a different variable name scheme), and then decode what is being sent back to your original application.
Taking that many steps means you end up forgetting what the heck you were doing in the first place. - dwclifton, on 10/12/2007, -0/+1The biggest mistake I see, and I see it all the time, are tables that have unnecessarily large rows. Do you need a bigint to store primary keys? Unless you're planning on putting a Google dataset's worth of records in there, probably not. I could go on and on, poor table design is the #1 mistake.
- renatoram, on 10/12/2007, -1/+2MySQL is fast, but getting slower. And it has a long record of quirks and missing features.
Moreover, it will need to rewrite the engine for the tables *another time*, since Oracle bought the InnoDB technology.
Just get PostgreSQL if you wand a "real" database for free. http://www.postgresql.org
Practically the same horsepower as Oracle, for free. - count_z, on 10/12/2007, -0/+0He author didn't say not to use SPs, he said that you shouldn't blindly put all of your logic into SPs (which happens a lot and is a pain to debug).
- usidoesit, on 10/12/2007, -0/+0No mention of OR mapping anywhere? Maybe that's just too big of a mistake. Straight SQL embedded in apps is good for complex legacy (or any) schemas. Otherwise you can't leverage SQL easily to debug. Funny how they ended up inventing EJB-QL. Just give me SQL. Why do we have to deal seamlessly with the database through the programming language? Always what is going to end up on my plate is some nasty data problem and on top of that I have to deal with an object layer that's in the way and really provides no added value.
- frakilk, on 10/12/2007, -1/+1tblAnswers lol
- mbourgon, on 10/12/2007, -0/+0Crackhappy - I don't honestly remember. I'm pretty sure it was a million or two records. And the database we had to hit - 5th normal form, IIRC, "given" (aka forced) on us from corporate. Ever see a 20-table join? I could write them in my sleep by the time I was done.
- cbehrens, on 10/12/2007, -0/+0My experience (20years) has been that stored procedures and a reasonably normalized database add a fair measure of modularity to any application. Many many times I have been able to fix bugs in distrubuted applications by fixing or adjusting stored procedures.
- mpswaim, on 10/12/2007, -0/+0 I prefer to use embedded SQL for most data accesses because we have multiple applications hitting our DB. If we make a schema change, it's easier to just update the affected applications than to rebuild the world, or to have n slightly different SPs that differ in what they return.
I reserve stored procedures for code that needs to do significant heavy lifting of data to return something meaningful to the user. My last SP had 2 temporary tables , and about a dozen selects/updates. - rjean99, on 10/12/2007, -0/+0After re-reading this article a couple times I will admit that there is a lot more that can be said on any one of these points. Also, I think much more could have been mentioned re: surrogate keys vs. user keys, foreign keys, nulls, defaults, etc. But overall, a good starting point or refresher on things to consider in database development. I realized we had NOT been doing a good job at versioning our schema's as changes are made. So I re-learned something.
On SP's: We use stored procedures when and where it makes sense, especially for performance reasons. We also use them for updating/returning incremental values in SQL, for doling out records in a round-robin work-flow scenario, etc. But generally, I consider SP's to be proprietary and non-standard. For instance, T-SQL syntax is not the same as P-SQL in Oracle and therefore migrating from one database to another would be very difficult if all access went through SP's. But it really depends on whether you are trying to be more database agnostic vs. language agnostic.
We use an nTier framework that has data abstraction classes which allow for dynamic SQL using parameter and field collections. This allows coding to interface in the business objects and the database can be changed fairly easily without affecting the business or UI tier.
Overall, some very good, albeit somewhat passionate comments on this so far. - koko27513, on 02/10/2008, -0/+0I've heard "Ok, now let's get a real good Database." can be rather disastrous.However, that can apply to just about anything, right?
http://www.VATOSPLACE.COM - mlayog, on 10/12/2007, -0/+0Good article. As a programmer, I see the disjoint between the developer and the DBA world quite often. I agree with some of the comments that is a matter of scale too.
- al4444, on 10/12/2007, -0/+0yes, good, very good
-
Show 51 - 72 of 72 discussions



What is Digg?