In-memory tools vs query-based tools
January | 2020
New data analytics platforms use the processing power of the source data systems for reporting and analytics, without having to store anything in memory. Does this mean that we are going back to the days when reporting tools were query-based, with no processing or storage power of their own?
Let me take you a couple of decades back.
Disk-based technology
It all started with disk-based technology: data was stored on disk in tables and multi-dimensional structures, and every reporting tool queried those structures to get its information.
Be it OBIEE, BO (now SAP BO) or SSRS, all of them ran queries against a SQL-based relational database management system (RDBMS) such as SQL Server, MySQL or Oracle to fetch results. An RDBMS was designed for transactional processing, whereas BI queries required data aggregations, and those aggregations hurt performance.
Later, OLAP came into the picture: data from OLTP systems was pre-aggregated and loaded into cubes to improve query performance.
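The difference between aggregating on every request and reading a pre-aggregated structure can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for an RDBMS, and the table names (sales, sales_by_region), columns and figures are hypothetical, not taken from any of the tools named above. A real OLAP cube pre-aggregates across many dimensions, not just one.

```python
import sqlite3

# Stand-in for a transactional (OLTP-style) table: one row per sale.
# Table name, columns and values are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Widget", 100.0),
        ("North", "Gadget", 250.0),
        ("South", "Widget", 75.0),
        ("South", "Widget", 125.0),
    ],
)


def total_by_region_live(conn):
    """Query-based reporting: aggregate the detail rows on every request."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def build_region_summary(conn):
    """Cube-style pre-aggregation: compute the totals once and store them."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )


def total_by_region_precomputed(conn):
    """Reports read the small summary table instead of the detail rows."""
    cursor = conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    )
    return dict(cursor.fetchall())


build_region_summary(conn)
live = total_by_region_live(conn)
precomputed = total_by_region_precomputed(conn)
```

Both paths return the same totals. The trade-off is that the summary has to be rebuilt whenever the detail data changes, which is the price paid for faster report queries.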
In-memory technology
To overcome the underlying challenges of disk-based technology, in-memory technology was introduced. Here, the data from OLTP systems was loaded into system memory (RAM) instead of being kept on hard disk.
This changed the way reporting happened. BI tools used in-memory technology to load data into memory, using columnar databases, and allowed users to query data held in memory instead of on disk. Data was highly compressed when stored in memory, which allowed more data to be stored and queried quickly.
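To make the idea of columnar storage with compression more concrete, here is a minimal sketch in plain Python. It shows one common compression technique, dictionary encoding, where repeated values in a column are replaced by small integer codes. This is an assumption chosen for illustration; the article does not say which compression scheme any particular BI tool uses, and the function names and sample rows below are hypothetical.

```python
# Row-oriented records, as they might arrive from a source system.
# The data is made up for illustration.
rows = [
    {"region": "North", "amount": 100},
    {"region": "North", "amount": 250},
    {"region": "South", "amount": 75},
    {"region": "South", "amount": 125},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def dictionary_encode(values):
    """Replace repeated values with integer codes plus a lookup table."""
    lookup = []   # code -> original value
    index = {}    # original value -> code
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(lookup)
            lookup.append(value)
        codes.append(index[value])
    return lookup, codes


def sum_by(lookup, codes, measures):
    """Aggregate a measure column using the compact codes, not the strings."""
    totals = [0] * len(lookup)
    for code, measure in zip(codes, measures):
        totals[code] += measure
    return dict(zip(lookup, totals))


columns = to_columns(rows)
lookup, codes = dictionary_encode(columns["region"])
totals = sum_by(lookup, codes, columns["amount"])
```

Here the region column shrinks to the lookup table `["North", "South"]` plus the codes `[0, 0, 1, 1]`, and the aggregation only touches the two columns it needs. With millions of rows and few distinct values, the saving in memory becomes significant.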
BI reporting tools like Qlik and Tableau became very popular with the advent of in-memory capabilities. SAP HANA and Oracle Exalytics are other examples of tools that store the entire data warehouse in memory.
Analytical storage (Cloud)
Now, with the emergence of high-performance cloud-based analytics storage like Google BigQuery, AWS S3 and others, BI tools are looking to leverage these instead of in-memory storage. Looker is one such example: it uses the processing power of the high-performance data store rather than replicating the data into memory.
BigQuery is a RESTful web service that enables interactive analysis of massively large datasets and works in conjunction with Google storage. Other systems built for high-performance analytics storage are Amazon Redshift and Snowflake. Even legacy BI vendors like SAP (with BO) and Oracle have come up with cloud versions of their BI tools.
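The contrast between replicating data into the BI tool and pushing the query down to the data store can be sketched as below. This is a simplified illustration, not how Looker or any other product is implemented: sqlite3 from the Python standard library stands in for a cloud warehouse, and the orders table, its columns and the function names are hypothetical.

```python
import sqlite3


def make_source():
    """Create a stand-in for the source data store with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (country TEXT, revenue INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("IN", 10), ("IN", 20), ("US", 5), ("US", 15), ("US", 30)],
    )
    return conn


def in_memory_style(conn):
    """Copy every detail row out of the source, then aggregate locally."""
    extracted = conn.execute("SELECT country, revenue FROM orders").fetchall()
    totals = {}
    for country, revenue in extracted:
        totals[country] = totals.get(country, 0) + revenue
    return totals, len(extracted)


def query_based_style(conn):
    """Send the aggregation to the source; only result rows come back."""
    result = conn.execute(
        "SELECT country, SUM(revenue) FROM orders GROUP BY country"
    ).fetchall()
    return dict(result), len(result)


source = make_source()
local_totals, rows_copied = in_memory_style(source)
pushed_totals, rows_returned = query_based_style(source)
```

Both approaches produce the same totals, but the first moves all five detail rows into the tool while the second moves only the two aggregated rows. At warehouse scale that difference is the main argument for letting the data store do the work.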
Are we back to where it all started?
With cloud platforms built to provide high-performance analytics storage, are we back to query-based BI tools again?
Though tools like Qlik, Tableau, ThoughtSpot and Incorta are still going strong with their in-memory offerings, there is certainly strong competition in the form of query-based BI tools. Google's recent acquisition of Looker puts a stamp on that.
Gaurav Savdekar
Principal Consultant, Decision Sciences, Data Analytics & AI, Wipro
Gaurav has about 14 years of experience in the Data & Analytics domain. He has worked on various BI and analytical tools and has been part of multiple implementations. He has also been involved in providing consultation, drawing up the BI roadmap, and evaluating multiple tools for various clients.