The need for data virtualisation for the modern, data-driven business
Jessie Rudd, Technical Business Analyst at PBT Group
Data virtualisation can be seen as a layer that sits between disparate sources of data, whether they are stored in the cloud or across different database types, each structured differently. Because very few of these sources integrate with one another, if at all, virtualisation acts as a buffer that lets users view the data in an integrated manner.
Denodo, positioned as a Leader within the Gartner® Magic Quadrant™ for data virtualization tools, provides the following description. “Data virtualization is a logical data layer that integrates all enterprise data siloed across the disparate systems, manages the unified data for centralised security and governance, and delivers it to business users in real time. Data virtualization provides a virtual approach to accessing, managing, and delivering data without replicating it in a physical repository.”
This matters in complex environments where users need access to specific data held in different organisational databases. A virtualisation tool provides that access out of the box, so the organisation does not have to deal with the underlying complexities of having people directly access separate databases that are not integrated.
Such a tool, or layer, sits on top of all the databases and provides the ‘buffer’ between what the data consumer wants to see and the source data itself. The benefit of going the virtualisation route is that it offers a centralised solution that delivers access to an organisation’s databases without having to deal with the inherent complexities of each data source. This can make it more cost-effective, as one tool means one licence rather than a database-specific solution for each environment.
Furthermore, virtualisation acts as a seamless integration tool, with the associated benefit of simpler governance.
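To make the idea of a layer sitting on top of the sources more concrete, the following is a minimal sketch in Python. It is not how Denodo or any other product works internally; the class `VirtualLayer`, the source names, and the sample records are all hypothetical and exist only for illustration. It assumes two unconnected sources, an in-memory SQLite database and a plain list standing in for a cloud service, and shows a consumer getting one combined view without the data being copied into a new repository.

```python
import sqlite3


class VirtualLayer:
    """Hypothetical logical layer: it knows how to reach each source
    but keeps no copy of the data itself."""

    def __init__(self):
        self._sources = {}

    def register(self, name, fetch):
        # 'fetch' is a function that reads the source when it is called.
        self._sources[name] = fetch

    def query(self, left, right, key):
        # Read both sources at request time and join matching rows on 'key'.
        right_by_key = {row[key]: row for row in self._sources[right]()}
        combined = []
        for row in self._sources[left]():
            match = right_by_key.get(row[key])
            if match is not None:
                combined.append({**row, **match})
        return combined


# Source 1 (assumed): a relational database holding customer details.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
crm.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
crm.commit()


def fetch_customers():
    cursor = crm.execute("SELECT customer_id, name FROM customers")
    return [{"customer_id": cid, "name": name} for cid, name in cursor]


# Source 2 (assumed): records from a separate billing service,
# represented here by a simple in-memory list.
billing_records = [
    {"customer_id": 1, "balance": 250.0},
    {"customer_id": 3, "balance": 80.0},
]


def fetch_billing():
    return list(billing_records)


layer = VirtualLayer()
layer.register("customers", fetch_customers)
layer.register("billing", fetch_billing)

# The consumer asks the layer, not the individual databases.
unified = layer.query("customers", "billing", key="customer_id")
print(unified)
```

Because the layer reads from the sources each time it is queried, a change in either source shows up in the next result. A real platform adds far more on top of this, such as security, governance, caching and query optimisation, which is where much of the cost and complexity discussed in this article lies.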
While data virtualisation is not a new concept, it has recently gained momentum as organisations are desperate to gain maximum benefit from all the data available to them. This urgency is a direct result of the pandemic, as organisations try to succeed or simply to survive. Think of data virtualisation as the middleman that gives sight of all data in the organisation, pulling it together and putting the power in users’ hands to analyse and draw insights from it through data visualisation. A data virtualisation platform can become a data service provider to data visualisation tools.
As every business wants to become a data-driven organisation, the potential benefits of virtualisation become a tempting proposition. The promises of integration, unified data, centralised governance, no replication or duplication, and real-time delivery are all important considerations that cannot be ignored. However, success will depend on various factors, including the level of maturity within the organisation.
Each of these benefits is a complex topic in its own right and, depending on the organisation, can involve sizeable projects and large budgetary implications.
Embracing benefits and minding pitfalls
Decision-makers must never assume that adopting a data virtualisation tool merely requires some drag-and-drop to solve an organisation’s data problems. Whilst the stated benefits can be achieved, the implementation of data virtualisation should be justified by a well-defined data strategy and a supporting data architecture. The size of the organisation, volume of data, complexity of business rules, and state of the existing data platform are all important considerations when establishing data virtualisation. Data virtualisation certainly does not eliminate the need for the typical data structures that we have become accustomed to. The data lake, data warehouse, or operational data store might still be required and ultimately become another data source for the data virtualisation layer.
Data virtualisation will not fix the underlying problem of poor data quality. It comes down to first getting the basics around data right, ensuring a solid data foundation is in place and then looking at these tools as a next step.
Of course, there are times when virtualisation will not be the best fit, or when it is used as part of a hybrid approach. For example, an organisation might combine the efficiency and effectiveness of virtualisation with the robustness and best practices of well-architected data pipelines and the many utilities available in the cloud. Ultimately, it depends on the organisation and what it is willing to do on the journey to become a data-driven enterprise.