Some data was released yesterday that purports to show that SaaS BI customers are very pleased with its ease of use, etc., etc. Boring. Seriously, I really like the idea of SaaS BI but I haven't seen anyone making great leaps forward. I'd say that they *can't* take us forward because of the box that they've painted themselves into. The box actually has a name: it's called BI.
The BI sandbox
Eh? What? Here's the thing: BI as we currently know it is the last stage in the information pipeline. It's the beautiful colours on the box that holds the cereal. But it's not the cereal and it's not even the box. It is *very* important (who would buy cereal in a plain cardboard box?) but it is also *very* dependent on the other elements in the pipeline.
I don't want to get into a long discussion about definitions of BI. Suffice it to say this: why are terms like 'data warehouse' and 'OLAP cube' still prevalent? Simply because BI does not imply data gathering, preparation and storage. Last example on this theme. If I tell you I'm a Business Intelligence manager, what would you guess is my remit? Does it include the entire data warehouse? The OLAP cubes? All of the ETL processing? No? It could but it rarely does.
It's all about the data
I once worked for a clever chap whose mantra was "it's all about the data". His daily struggle was to get the business to invest more time, effort and money into the data itself. It was a hard fight. We had a very fast data warehouse (NZ), some perfectly serviceable BI software (BO) and nearly a dozen newly minted graduates to turn out our reports. What we did not have was a strong mandate to get the data itself absolutely sorted, to get every term clearly defined and to remove all of the wiggle room from the data. As a consequence we had the same problems that so many BI teams have: conflicting numbers, conflicting metrics, and political battles using our data as ammunition.
Data is the 'other' 90%
I'd estimate that gathering, preparing, and storing the data for BI represents at least 90% of the total effort, with analysis and presentation being the last 10%. I really hope no one is surprised by that figure. I'd think that figure is consistent for any situation in which decisions need to be made from data. For instance, a scientist in a lab would have to spend a lot of time collecting and collating measurements before she could do the final work of analyzing the results. A research doctor conducting a study will have to collect, organize and standardize all of the study data before he can begin to evaluate the outcome.
It's NOT about speed
One of the tragedies of the Inmon-Kimball data warehouse definition war is that the data warehouse has been conceived as something that you create because you want to speed up your data access. It's implied that we'd prefer to leave the data in its original systems if we could, but alas that would be too slow to do anything with. What a load of tosh! Anyone who's been in the trenches knows that the *real* purpose of a data warehouse is to organize and preserve the data somewhere safe, away from the many delete-ers and archive-ers of the IT world. We value the data for its own sake and believe it deserves the respect of being properly stored and treated.
Nibbling at the edges
So, back to the topic: how does SaaS BI help with this issue? Let's assume that SaaS BI does what it claims and makes it much easier for "users" to produce reporting and analysis. Great, how much effort have we saved? Even if it halves the time and effort, that's half of the last 10%, so we've only knocked 5% off our total.
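If you want to see the arithmetic spelled out, here is a quick sketch. The 90/10 split is my own estimate from above, not a measured figure, and the function name `total_saving` is just something I made up for the illustration. The second calculation assumes, purely for comparison, that the same halving could be applied to the data side.

```python
# Rough arithmetic only: the 90/10 split is an estimate, not a measurement.
DATA_SHARE = 0.90  # gathering, preparing and storing the data
BI_SHARE = 0.10    # analysis and presentation


def total_saving(share_of_total: float, reduction: float) -> float:
    """Fraction of the total effort saved when one stage shrinks by `reduction`."""
    return share_of_total * reduction


# SaaS BI halves the analysis and presentation effort.
bi_saving = total_saving(BI_SHARE, 0.5)

# Hypothetical comparison: the same halving applied to the data work instead.
data_saving = total_saving(DATA_SHARE, 0.5)

print(f"Halving the BI stage saves {bi_saving:.0%} of the total effort")
print(f"Halving the data stage would save {data_saving:.0%} of the total effort")
```

Same improvement, applied to a different stage, and the payoff is nine times bigger.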
The real opportunity
And finally I come to my point: the great untapped opportunity for the SaaS [BI-DW-OLAP-ETL] acronym feast is the other 90%, where most of the hard work happens. Customers are increasingly using online applications in place of their old in-house apps. Everything from ERP to invoicing to call centre IVRs and diallers is moving to a SaaS model. And every SaaS service that's worth its salt offers an open API for accessing the data that it holds.
The holy grail - instant data
This is the mother lode, the shining path for data people. Imagine an end to custom integrations for each customer. Imagine an end to customers having to configure their own ETL and design their own data warehouse before they can actually do anything with their data. The customer simply signs up to the service and you instantly present them with ready-to-use data. Magic. Sounds like a service worth paying for.
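To make the idea a little more concrete, here is a minimal sketch of what "no custom integration per customer" could look like. Everything in it is an assumption for illustration: the source names (`invoicing_app`, `erp_app`), their field names, and the `standardize` and `load` helpers are all invented, and the payloads are hard-coded stand-ins for what a vendor's API might return. The point is only that the mapping is written once per SaaS source by the service provider, not once per customer.

```python
import sqlite3

# Hypothetical: one field mapping per SaaS source, maintained by the provider.
# Source names and field names are invented for this illustration.
SOURCE_MAPPINGS = {
    "invoicing_app": {"inv_no": "invoice_id", "cust": "customer", "amt": "amount"},
    "erp_app": {"DocumentNumber": "invoice_id", "AccountName": "customer", "Total": "amount"},
}


def standardize(source, records):
    """Rename each source-specific field to the shared warehouse column name."""
    mapping = SOURCE_MAPPINGS[source]
    return [{target: rec[field] for field, target in mapping.items()} for rec in records]


def load(conn, rows):
    """Append standardized rows to a single shared table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices (invoice_id TEXT, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO invoices VALUES (:invoice_id, :customer, :amount)", rows
    )


# Hard-coded stand-ins for what each vendor's API might return.
payloads = {
    "invoicing_app": [{"inv_no": "A-1", "cust": "Acme", "amt": 100.0}],
    "erp_app": [{"DocumentNumber": "E-7", "AccountName": "Acme", "Total": 250.0}],
}

conn = sqlite3.connect(":memory:")
for source, records in payloads.items():
    load(conn, standardize(source, records))

row_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM invoices").fetchone()[0]
print(row_count, total)
```

A new customer on either source needs no new mapping and no warehouse design of their own; their data lands in the same shape as everyone else's.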