Tableau Software Adds In-Memory Database Engine
Hard to believe, but it's more than three years since my review of Tableau Software’s data analysis system. Tableau has managed quite well without my attention: sales have doubled every year and should exceed $40 million in 2010; they have 5,500 clients, 60,000 users and 185 employees; and they plan to add 100 more employees next year. Ah, I knew them when.
What really matters from a user perspective is that the product itself has matured. Back in 2007, my main complaint was that Tableau lacked a data engine. The system either issued SQL queries against an external database or imported a small data set into memory. This meant response time depended on the speed of the external system and that users were constrained by the external files' data structure.
Tableau’s most recent release (6.0, launched on November 10) finally changes this by adding a built-in data engine. Note that I said “changes” rather than “fixes”, since Tableau has obviously been successful without this feature. Instead, the vendor has built connectors for high-speed analytical databases and appliances including Hyperion Essbase, Greenplum, Netezza, PostgreSQL, Microsoft PowerPivot, ParAccel, Sybase IQ, Teradata, and Vertica. These provide good performance on any size database, but they still leave the Tableau user tethered to an external system. An internal database allows much more independence and offers high performance when no external analytical engine is present. This is a big advantage since such engines are still relatively rare and, even if a company has one, it might not contain all the right data or be accessible to Tableau users.
Of course, this assumes that Tableau's internal database is itself a high-speed analytical engine. That’s apparently the case: the engine is home-grown but it passes the buzzword test (in-memory, columnar, compressed) and – at least in an online demo – offered near-immediate response to queries against a 7 million row file. It also supports multi-table data structures and in-memory “blending” of disparate data sources, further freeing users from the constraints of their corporate environment. The system is also designed to work with data sets that are too large to fit into memory: it will use as much memory as possible and then access the remaining data from disk storage.
Tableau has added some nice end-user enhancements too. These include:
- new types of combination charts;
- ability to display the same data at different aggregation levels on the same chart (e.g., average as a line and individual observations as points);
- more powerful calculations including multi-pass formulas that can calculate against a calculated value
- user-entered parameters to allow what-if calculations
The Tableau interface hasn’t changed much since 2007. But that's okay since I liked it then and still like it now. In fact, it won a little test we conducted recently to see how far totally untrained users could get with a moderately complex task. (I'll give more details in a future post.)
Tableau can run either as traditional software installed on the user's PC or on a server accessed over the Internet. Pricing for a single user desktop system is still $999 for a version that can connect to Excel, Access or text files, and has risen slightly to $1,999 for one that can connect to other databases. These are perpetual license fees; annual maintenance is 20%.
There’s also a free reader that lets unlimited users download and read workbooks created in the desktop system. The server version allows multiple users to access workbooks on a central server. Pricing for this starts at $10,000 for ten users and you still need at least one desktop license to create the workbooks. Large server installations can avoid per-user fees by purchasing CPU-based licenses, which are priced north of $100,000.
Although the server configuration makes Tableau a candidate for some enterprise reporting tasks, it can't easily limit different users to different data, which is a typical reporting requirement. So Tableau is still primarily a self-service tool for business and data analysts. The new database, calculation and data blending features add considerably to their power.