40 Comments
- inactive, on 10/12/2007, -1/+16 150 MB *is* reasonably small. The comments in this thread indicate pretty clearly that the general Digg-reading population has little to no experience building high-volume enterprise-class applications.
eBay is not just "an auction site." Its functionality encompasses not just bidding and selling, but also listing, billing, Buy It Now, stores, search, and tons of other functionality, with each subsystem rivaling the complexity and featureset of such dinky "Web2.0" applications like digg, configured to serve simultaneously in multiple languages and multiple countries in slightly yet fundamentally different ways. Most large sites do not operate using the beginner-style "one PHP page for each function" pattern, because they need a central executable to marshal data, verify it, initialize common data structures, and dispatch it off to the middle-tier services. This makes the executable quite large, even when optimized. There's clearly not a single person here who's worked with an application of any notable scale, or you'd know that this is pretty typical for systems of such size and complexity. I know for a fact that PayPal and Amazon's main executables are also of similar size.
Joel Spolsky complained a few years ago about ignorant kids on slashdot attacking Microsoft for being slow and buggy (see http://blogs.msdn.com/ericlippert/archive/2003/10/28/53298.aspx) without having any grasp of the development process for real software. It sure looks like digg is trawling the same cesspool today.
- febryle, on 10/12/2007, -4/+17Anything with the phrase "Web 2.0" should be buried, IMHO. I mean, isn't it time for Web 2.1 to come out already? At least, a beta? :)
- Kilroy2004, on 10/12/2007, -0/+9Yeah... and since when is Ebay considered Web2.0? As far as I know... nothing has changed on Ebay to make it worthy of Web2.0-ness.
- elqed, on 10/12/2007, -0/+9well... I work there and that story is a wee bit off. The author needs to learn how to read, and possibly how to use Google: the only C++ on the site now is in the search engine. Everything else user-visible is Java, written in a framework called V3, as can be seen here - http://www.sun.com/service/about/success/ebay5.html - a case study written over three years ago.
- Yarnage, on 10/12/2007, -1/+7The entire site has been compiled 2 million times. Sounds like their developers go with the "compile and see if it works" route to programming...
- Vandel405, on 10/12/2007, -0/+5"boast 3.3 million line C++ ISAPI DLL (150MB binary)"
"are hitting compiler limits on number of methods per class."
I don't see how this could possibly be construed as positive.
Especially the second statement. There is no excuse for that. Perhaps we should send them a couple hundred copies of Mr Fowler's book.
- damber, on 10/12/2007, -1/+6Does anyone know what they mean by 'build' in this statement (last sentence):
====
# three Solaris servers: build and deploy eBay.com to QA; compile Java & C++; consolidate/optimize/compress XSL, JS and HTML
# time to build site: was once 10 hours; now only 30 minutes
# in the last 2.5 years, there have been 2 million builds.
====
In particular the last sentence.......
Wouldn't that be 1.5 builds per minute.. ??
Doesn't that kinda contradict their 30 minute boast ?
Not to mention that they must have the least stable system in the world, with all those changes......
- seopher, on 10/12/2007, -0/+5WHY the hell is this web2.0?! Buzzwords != being dugg.
- damber, on 10/12/2007, -0/+4my thoughts exactly - a lot of the statements, such as the 'we probably use every major storage providers products' and those you mention above are *not* things enterprises would want to boast about...
- elqed, on 10/12/2007, -1/+4Well, amazingly enough - there are lots of developers at ebay, and many of them work independently, so 1.5 builds/min doesn't contradict the build time.
- inactive, on 10/12/2007, -0/+3Well, it's not like they have to boast anyway.. The fact that it runs as well as it does, with the amount of use it gets and the storage it has to take into account daily, is proof enough that what they are doing simply works.
- jawadde, on 10/12/2007, -0/+3nice hearing it from someone inside... does the part of the story about the 150MB DLL make any sense ? About "pushing the compiler to its limits" ?
cause it seems insane to have such a complex codebase for a webapp that is in fact not that terribly complex
- elqed, on 10/12/2007, -0/+3ebayisapi.dll is a beast. I've never had to work on it (thank god), but they still keep the source in scm, for laughs I assume. I don't believe it ever got that big; a similar isapi dll used for internal stuff, in many ways more complex and nearly as big, was ~30M unoptimized, compiled with VC++ 5.0. Tools have come a long way since then.
- jawadde, on 10/12/2007, -2/+5FTA : "boast 3.3 million line C++ ISAPI DLL (150MB binary)"
duh... 150MB code for an online auction site ! I mean, i can understand that the DB backend is huge, but I would expect the code to be "reasonably" small...
I really wonder what stuff they put in there
- inactive, on 10/12/2007, -0/+2Pretty poor writing, although that isn't exactly rare on the web. What, for example, does "near-no-latency" mean? It means "some latency". What's wrong with "very fast"?
- elqed, on 10/12/2007, -0/+2well.. as I said below... the author of that story needs to check his facts. 1B/day is pretty old... last time I was in the NOC (bright and early last Monday morning) the big screen on the wall said one colo was pushing ~24gb/s... while I know that ebay's home page is on the porky side, that's still a _lot_ of hits.
- Dcherub, on 10/12/2007, -0/+2"Among Web 2.0 companies, San Jose, Calif.-based eBay is up there with Google, Amazon, Yahoo, eHarmony, Digg.com, and social networking sites MySpace.com and FaceBook.com as far as traffic, popularity and profitability are concerned"
eBay has certainly gotten pretty big to compete with digg in terms of popularity and profitability!
eBay gets a billion page views a day (apparently), just like digg.
- damber, on 10/12/2007, -0/+2elqed,
you did read the bit where they apparently use 3 solaris servers to do the builds and with those servers it takes 30 mins now ??
To create a single 'ISAPI DLL' you have to build all the source together - so each developer is unlikely to be doing that on just 3 sun machines based on their numbers........ regardless of how many developers there are.
- elqed, on 10/12/2007, -0/+2hehe. not to sound condescending here, but in many instances raw capacity is not quite as important as "speed" (I/O's per sec). 13 750G drives deliver the capacity, but probably won't deliver the I/O to do 1B complex selects per hour. One of the few accurate statements in the article is that ebay uses lots of Hitachi SAN storage; it's rather expensive (I promise it costs _far_ more than $6k) and pretty damn fast - it will do 1B complex selects per hour with room to spare (people who do capacity planning like that room).
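To put rough numbers on the capacity-versus-I/O point, here is a back-of-the-envelope sketch. The 100 IOPS per drive figure is an assumed ballpark for a commodity disk, not something from the article, and the function names are made up for illustration. It also generously assumes one I/O per select.

```cpp
// Query rate implied by "1B complex selects per hour".
constexpr double selects_per_second(double selects_per_hour) {
    return selects_per_hour / 3600.0;
}

// Aggregate random I/O of a set of plain drives.
// iops_per_drive is an assumption supplied by the caller.
constexpr double aggregate_iops(int drives, double iops_per_drive) {
    return drives * iops_per_drive;
}

// Assumed figures: 13 commodity drives at roughly 100 IOPS each.
constexpr double kRequired  = selects_per_second(1.0e9);
constexpr double kAvailable = aggregate_iops(13, 100.0);

static_assert(kAvailable < kRequired,
              "13 plain drives fall far short of the required rate");
```

Under those assumptions the load works out to roughly 278 thousand selects per second against about 1,300 I/Os per second from the drives, which is the gap the comment is pointing at.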
- jawadde, on 10/12/2007, -0/+2you forgot to mention that digg and ebay are at the same level of profitability as well :-)
- elqed, on 10/12/2007, -0/+2@damber... please read my comments below... I have issue with this story because the author is stunningly wrong in many (most) of the statements made in this article.
The presentation tier is win2k basically doing XSL transforms. The application tier is java (V3) running in websphere (at least it was a few months ago) also on win2k. The "back end" is quite a bit more complex with _lots_ of servers.
If you want more detail on part of the backend watch http://www.podtech.net/scobleshow/technology/1185/meeting-ebays-top-developer-eric-billingsley (not that I want to pimp scoble's show... I was just really curious when I saw him walking around my floor a few weeks ago, and the interview is reasonably good)
- elqed, on 10/12/2007, -0/+1yes. I read the story... lemme check our build system (called ICE)... right now I can see ~100 build servers in the score board. The author is factually wrong on many many things. The number of build machines is one of them.
- etnu, on 10/12/2007, -6/+7There's no reason for a 150MB DLL. The claim of "enterprise" is *****. Every single system I've ever encountered that was that bloated was due to a simple lack of modularity.
I've worked on lots of such "enterprise" systems. The reason the libraries get so bloated is simple: mixing static & dynamic libraries, and templates.
The first problem is huge. Imagine this:
binary A links against dynamic libraries B,C,D,E, and F
Each of B,C,D,E, and F links in the same static library (G), and the duplicate copies of the code can not be optimized out since the linker isn't aware of the other libraries at link time. D'oh!
Assuming G is a 2MB library, you've now got at least an extra 10MB of waste being loaded into memory.
Of course, in practice this actually winds up being something more like a 5MB library being linked against in 10 or 20 places.
The next problem is templates.
If you're not familiar with C++, you might not understand this, so ignore it.
A typical C or C++ library has two files: a header (which has the function prototypes and/or class definitions) and an implementation file (which has the actual code).
One of the limitations of C++ templates is that you must include the implementation code in the header file.
Why is this a problem?
Well, because now any system that needs to use your templated functions / classes needs to have a copy of the code compiled in. You get many copies of the exact same code. Yikes!
The next problem that templates cause is that a complete copy of the code is generated for every variation of the template. Consider the following stl containers:
map
map
This will include 2 FULL copies of the stl "map" class (plus one copy of the stl "string" class). If you have significantly complex applications, you run into crap like this:
MyEnterpriseyServiceHandler
MyEnterpriseyServiceHandler
MyEnterpriseyServiceHandler
Of course, a C programmer would just use void* and a type flag for this type of thing, but that wouldn't be "enterprisey" enough. If you genuinely think this is "more maintainable" or "less error prone", you're a horrible software professional and should quit.
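To make the instantiation point concrete, here is a minimal sketch. The template arguments in the examples above did not survive, so every type below (`CountsByName`, `LabelsByName`, `BidRequest`, `ListRequest`, and the handler's `item_of` member) is a hypothetical stand-in rather than anything from eBay's code. It only shows that each distinct argument list yields a distinct type, for which the compiler generates separate code when it is used.

```cpp
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for two differently parameterised maps.
using CountsByName = std::map<std::string, int>;
using LabelsByName = std::map<std::string, std::string>;

// Different arguments give unrelated types, each with its own
// generated member functions once they are used.
static_assert(!std::is_same<CountsByName, LabelsByName>::value,
              "each instantiation is its own type");

// Hypothetical request types and handler template.
struct BidRequest  { int item; int amount; };
struct ListRequest { int item; };

template <typename Request>
struct MyEnterpriseyServiceHandler {
    // A separate copy of item_of is generated per Request type used.
    static constexpr int item_of(const Request& r) { return r.item; }
};

static_assert(!std::is_same<MyEnterpriseyServiceHandler<BidRequest>,
                            MyEnterpriseyServiceHandler<ListRequest>>::value,
              "handlers for different requests are different types");
```

How much this costs in binary size depends on how many instantiations are really used and on what the linker can fold, so treat it as an illustration of the mechanism, not a measurement.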
Anyway, anybody who actually thinks that 150MB binary (dll, executable, or anything) is acceptable doesn't know how to write software. I'll bet a year's salary on that statement.
I've got my share of horror stories working with eBay's engineers, but I'll give them the benefit of the doubt and assume that their code isn't actually all that bad. I do find it frightening that this article is suggesting that there is essentially one great big handler for the entire eBay site. If that's true, then it's no wonder that eBay hasn't done anything exciting (other than buy other companies) since they first started.
Statements like "...are hitting compiler limits on number of methods per class." make me laugh. Surely eBay's engineering isn't THAT bad, right?
Right?
- KIERANMULLEN, on 10/12/2007, -0/+1Article was so so.. But the worst thing: NO PICTURES! What is the point of Web 2.0 crap buzzwords if you can't even come up with pictures of what you are talking about?
Lame
- elqed, on 10/12/2007, -0/+1@damber... one sanity check... do you _really_ think isapi dll's for a windows box are (were) compiled on solaris? I don't know that many gluttons for punishment :)
- damber, on 10/12/2007, -0/+1:-) that didn't even cross my mind (obviously you can build ISAPI DLL's on a Solaris.. but are/would they, is a good point)... I don't think it actually states which of the servers are used in the presentation tier - I was just unhappy about the way the numbers were adding up based on the 'facts' provided in the text.. it really doesn't make sense as it is.
- Yarnage, on 10/12/2007, -0/+1Maybe they build that much but it's put into a queue so the last version that gets built was added to the queue... several weeks in advance? Dunno
- Yarnage, on 10/12/2007, -2/+3Seriously, I don't see the compiled DLL being over 1MB with their site. Some applications have very complex methods and classes and have much smaller DLLs.
Maybe they embed lots of resources into the DLL (e.g. images, possibly video).
- biotique, on 10/12/2007, -0/+1I was just wondering if Preimesberger could possibly squeeze Strong's name into more sentences because there just wasn't enough of "Strong said" and "according to Strong".
Who else was interviewed, that we need to be reminded of his name at every step?
Final verdict: Good info, _weak_ article.
- tearmeapart, on 10/12/2007, -0/+1Check out http://www.bea.com/content/products/weblogic/portal/ .
A "Hello World" application is around 120 MB, but if you include WebLogic, Apache, and Java, the simplest of applications is close to 1 GB.
150 MB is big when you compare it to other C code, but compared to Java applications, it is almost nothing.
- tommorris, on 10/12/2007, -0/+1eBay is cool. Web 2.0 is cool. In someone's mind, that means they are somehow related.
- geezusfreeek, on 10/12/2007, -0/+1Maybe the reference to Web 2.0 isn't about style or AJAX, but more about social interaction, an idea very common across most Web 2.0 sites. While the term "Web 2.0" may be new, the idea that it describes is not. I believe maybe this case is simply an application of a new term to an old web site.
- geezusfreeek, on 10/12/2007, -0/+1"eBay's application servers are hitting compiler limits on number of methods per class."
I haven't ever had the "privilege" of working on a large-scale system of this sort, but doesn't this sound like terrible design? What good is an object-oriented programming environment if you are not minimizing scope to a manageable level? It seems like there would be little advantage over purely procedural code.
- damber, on 10/12/2007, -2/+2Yarnage, I think you missed the last part of my post... the first sentence was mostly [sarcasm] - the last sentences make my point.. 1.5 builds per minute is impossible with a compile time of 30 mins, they would need 45 machines, which would result in 45 different builds................ besides, who 'constantly' (all day every day) builds every 40 seconds..!?!
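For anyone who wants to check the numbers, the arithmetic behind the 1.5 builds/min, 45 machines and 40 seconds figures can be sketched as below. It assumes 365-day years and builds spread evenly around the clock, and the function names are made up for illustration.

```cpp
// Average builds started per minute over the whole period.
constexpr double builds_per_minute(double builds, double years) {
    return builds / (years * 365.0 * 24.0 * 60.0);
}

// Little's law: builds in flight = arrival rate * build duration.
constexpr double concurrent_builds(double rate_per_min, double build_minutes) {
    return rate_per_min * build_minutes;
}

// Average gap between consecutive build starts, in seconds.
constexpr double seconds_between_builds(double rate_per_min) {
    return 60.0 / rate_per_min;
}

// Figures quoted from the article: 2 million builds in 2.5 years.
constexpr double kRate = builds_per_minute(2.0e6, 2.5);  // about 1.52

static_assert(kRate > 1.5 && kRate < 1.6, "roughly 1.5 builds per minute");
static_assert(concurrent_builds(1.5, 30.0) == 45.0, "45 builds in flight");
static_assert(seconds_between_builds(1.5) == 40.0, "one build every 40 s");
```

So the figures are at least self-consistent: at the rounded rate of 1.5 per minute, a 30 minute build implies about 45 builds running at any moment, which only works if far more than three machines are building.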
- mattyfu, on 10/12/2007, -0/+0A $50 purchase that they made money off of...
- mikeazorin, on 10/12/2007, -1/+1 10 terabytes might be a bit of data, but it's not as huge as it used to be. 10 terabytes storage would cost only about $5,850 at about $450 for 750GB HD.
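A quick sketch of that drive-count arithmetic, assuming decimal units (10 TB = 10,000 GB) and no redundancy or spares; the function names are made up for illustration.

```cpp
// Whole drives needed to reach a capacity (ceiling division).
constexpr long drives_needed(long capacity_gb, long drive_gb) {
    return (capacity_gb + drive_gb - 1) / drive_gb;
}

// Raw purchase price for a number of identical drives.
constexpr long total_cost(long drives, long price_per_drive) {
    return drives * price_per_drive;
}

// $5,850 is the price of 13 drives, which hold 9,750 GB.
static_assert(total_cost(13, 450) == 5850, "13 drives at $450");
static_assert(13 * 750 < 10000, "13 drives stop just short of 10 TB");

// Covering the full 10,000 GB takes one more drive.
static_assert(drives_needed(10000, 750) == 14, "14 drives for 10 TB");
static_assert(total_cost(14, 450) == 6300, "14 drives at $450");
```

Under those assumptions the quoted $5,850 buys 13 drives, 250 GB short of 10 TB; a 14th drive brings the raw cost to $6,300.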
- iQuinn, on 10/12/2007, -1/+1Do people still use Ebay? I thought it was dead! I really don't understand why they need to add servers; I figured that they would be selling them off, or at least renting them to growing (rather than dying) companies. Then again, I just lost $50 on a crappy Ebay purchase, so maybe I am jaded.
- rbanffy, on 10/12/2007, -1/+1"are hitting compiler limits on number of methods per class."
Let's stop pretending it's a class and just call it a library.
- damber, on 10/12/2007, -7/+2... wrong reply..

