I stayed out of the discussion because I'm kind of burned out on benchmarks in general. I got fired up about benchmarks a while ago and even sent an email with some proposals to Curt. He was kind enough to respond and his response can be summed up as "What's in it for the DB vendor?". Great question and, to be honest, not one I could find a good answer for.
For the database buyer; a perfect benchmark tells them which database has the best mix of cost and performance, especially in data warehousing. This is what TPC-H appears to offer (leaving aside the calculation of their metrics). However, a lot of vendors have not submitted a benchmark. It's interesting to note that vendors such as Teradata, Netezza and Vertica are TPC members but have no benchmarks. The question is why not.
For a database vendor; a perfect benchmark is a benchmark that they can win. Curt has referred to Oracle's reputed policy of WAR (win all reviews). This why their licenses specifically prohibit you from publishing benchmarks. There is simply no upside to being 3rd, 5th or anything but first in a benchmark. If Oracle are participating in a given benchmark the simple economic reality is that they know they can win it.
This is the very nature of the TPC-H, it is designed to be very elastic and to allow vendors wiggle room so that they can submit winning figures. I'm sure the TPC folks would disagree on principle but TPC is an industry group made of up of vendors. Anything that denied them this wiggle room will either be vetoed or get even less participation than we currently see.
This is a bitter pill to swallow but seems unlikely to change. These days I'm delivering identical solutions across Teradata, Netezza, Oracle and SQL Server. I have some very well formed thoughts on the relative cost and performance of these databases but of course I can't actually publish any data.
By the way, the benchmark I suggested to Curt was about reducing the hardware variables. Get a hardware vendor to stand up a few common configurations (mid-size SMP using a SAN, 12 server cluster using local storage, etc.) at a few storage levels (1TB, 10TB, 100TB) and then test each database using identical hardware. The metrics would be things like max user data, aggregate performance, concurrent simple queries, concurrent complex queries, etc. Basically trying to isolate the performance elements that are driven by the database software and establish some approximate performance boundaries. With many more metrics being produced there can be a lot more winners. Maybe the TPC should look into it…