Assessing your data ecosystem: What to evaluate, where inefficiencies hide, and how to optimise for change
Julian Thomas, Principal Consultant at PBT Group
In the first part of this blog series, I spoke about why assessing your data ecosystem is critical, and why it should be treated as a regular internal audit rather than a once-off exercise. The reality is that most organisations do not operate according to a single, uniform plan. Different parts of the business move at different speeds, often in different directions, influenced by priorities, people, and practical constraints.
That is exactly why the assessment matters. But once you have committed to doing it properly, the next question becomes more practical. What are you actually evaluating, where do the real inefficiencies sit, and how do you move from insight to improvement?
Most organisations think they know the answers. In practice, they usually only see part of the picture.
What to evaluate beyond the obvious
At a basic level, every data ecosystem has a platform. You have storage, and you have processing. That is the easy part, and it is usually the one best understood.
The problem is that many assessments stop there.
A meaningful assessment needs to look across four dimensions: technology, process, people, and governance. These are interdependent. They overlap and influence each other, and a weakness in one area will surface in another.
On the technology side, you are not only asking which platforms exist, but how they are used. Where is the data stored? How is it processed? What role does each component play in the business context? It is not enough to know that a system exists. You need to understand what it is used for, and whether it actually supports the organisation’s needs.
Then there are the components that separate a mature ecosystem from an immature one.
A business glossary, a data dictionary, and data lineage are not optional extras. They are what allow the business to connect meaning to data. Without them, you cannot answer simple questions such as where a metric comes from, what it represents, or whether it is fit for use.
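To make the "where does this metric come from" question concrete, here is a minimal sketch of a lineage lookup. It assumes lineage is recorded as a simple mapping from each dataset or metric to its direct inputs. The dataset names and the `upstream_sources` helper are hypothetical, invented for illustration, and real lineage tools hold far richer detail.

```python
from collections import deque

# Hypothetical lineage records: each item maps to its direct upstream inputs.
LINEAGE = {
    "monthly_revenue": ["fact_sales"],
    "fact_sales": ["crm_orders", "erp_invoices"],
    "crm_orders": [],
    "erp_invoices": [],
}


def upstream_sources(name, lineage):
    """Return every upstream dependency of `name`, nearest first (breadth-first)."""
    seen = []
    queue = deque(lineage.get(name, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited via another path
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen


print(upstream_sources("monthly_revenue", LINEAGE))
# ['fact_sales', 'crm_orders', 'erp_invoices']
```

If a lookup like this cannot be answered for a key business metric, that gap is itself an assessment finding.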
In many environments, these elements are incomplete, out of date, or missing entirely. The result is predictable. Data exists, but it cannot be used with confidence.
Monitoring and observability form another area that is often misunderstood. They are not only about whether pipelines run successfully. You need visibility into how data is being used, where costs are incurred, how data ages, and where performance issues sit. Without that, you are operating blindly.
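As one small illustration of the "how data ages" point, the sketch below flags datasets whose last successful load is older than an agreed threshold. The dataset names, timestamps, and the `stale_datasets` helper are assumptions made for the example. In practice the load times would come from your scheduler or catalogue, and the threshold would differ per dataset.

```python
from datetime import datetime, timedelta


def stale_datasets(last_loaded, max_age, now):
    """Return the names of datasets last loaded more than `max_age` ago, sorted."""
    return sorted(
        name for name, loaded_at in last_loaded.items() if now - loaded_at > max_age
    )


# Hypothetical load times, for illustration only.
last_loaded = {
    "customer_master": datetime(2024, 5, 1, 6, 0),
    "daily_sales": datetime(2024, 5, 9, 2, 0),
    "branch_targets": datetime(2024, 3, 15, 9, 0),
}

now = datetime(2024, 5, 10, 8, 0)
print(stale_datasets(last_loaded, timedelta(days=7), now))
# ['branch_targets', 'customer_master']
```

A pipeline can run successfully every night and still appear on this list if its source stopped delivering new data, which is why run status alone is not enough.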
Auditability adds another layer. It is one thing to know where data comes from. It is another to trace what happened to it, at a detailed level, as it moved through the system. That level of traceability is what allows you to diagnose problems quickly, rather than spending days trying to work out what went wrong.
Where inefficiencies actually hide
If you ask most technical teams where the problems are, they will point to tools, platforms, or skills. In my experience, that is rarely where the real issues sit.
Inefficiencies are usually hidden in processes, governance, and organisational structure.
Take something as simple as onboarding a new data source. The question is not how long it takes to build the pipeline. The question is how long it takes from the moment the business requests it to the moment it is available in production. In many organisations, that timeline stretches to months, sometimes longer.
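A rough way to see this is to record the date of each milestone for a request and measure the gaps between them. The sketch below assumes four milestones and uses invented dates. The milestone names and the `stage_durations` helper are hypothetical and should be replaced with whatever your own request process actually tracks.

```python
from datetime import date


def stage_durations(milestones):
    """Return (stage label, days elapsed) for each pair of consecutive milestones."""
    return [
        (f"{start_name} -> {end_name}", (end_date - start_date).days)
        for (start_name, start_date), (end_name, end_date) in zip(
            milestones, milestones[1:]
        )
    ]


# Invented dates for a single data source request, for illustration only.
milestones = [
    ("requested", date(2024, 1, 8)),
    ("approved", date(2024, 2, 19)),
    ("build_started", date(2024, 3, 4)),
    ("in_production", date(2024, 3, 18)),
]

for stage, days in stage_durations(milestones):
    print(f"{stage}: {days} days")

total_days = (milestones[-1][1] - milestones[0][1]).days
print(f"total: {total_days} days")
```

In this made-up case the build takes 14 of the 70 days. The remaining time sits in approval and queuing, which is exactly the part a tool-focused review would miss.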
The same applies to accessing data or generating new insights. Data may exist, but if it takes excessive time and effort to find it, request access, and understand it, then, from a business perspective, it may as well not exist.
There is also the question of coverage. What is missing from the ecosystem? How much of the business need is actually supported? Many ecosystems are built around initial projects with dedicated funding, but struggle to evolve afterwards because there is no sustained funding model to bring in new data.
Then there is the business perspective, which is often overlooked. How easy is it for stakeholders to engage with the ecosystem? Do they understand what is available? Do they know how to use it? Do they trust it?
You can build what looks like a high-performance platform, but if the business cannot engage with it, it will sit unused. A technically sound ecosystem that is difficult to use is still a failing ecosystem.
The role of structure, funding, and governance
When you start to unpack these inefficiencies, you quickly see that they are not primarily technical problems. They are structural.
Funding models determine what gets built and what gets ignored. Management models determine whether delivery scales or becomes a bottleneck. Governance determines whether the system enables progress or restricts it.
A heavily centralised model may work during initial implementation, when there is a large budget and a dedicated team. Over time, that model often becomes a constraint. Demand increases, resources do not scale, and the backlog grows. The result is predictable. Business units begin to find their own ways of working, and fragmentation increases.
A federated or hub-and-spoke model, where standards are defined centrally, but development capabilities are distributed, tends to be more sustainable. It allows the organisation to scale while maintaining coherence.
Governance also needs to be approached differently. It is not about removing governance, but about making it fit for purpose. A single, rigid standard applied to every use case does not work. Different use cases require different levels of control, and the ecosystem needs to accommodate that without compromising integrity.
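One way to picture fit-for-purpose governance is as a set of tiers, each with its own list of required controls. The tier names, control names, and the `missing_controls` helper below are hypothetical, chosen only to illustrate the idea. Every organisation would define its own.

```python
# Hypothetical governance tiers and the controls each one requires.
CONTROLS_BY_TIER = {
    "exploration": {"catalogue_entry"},
    "internal_reporting": {"catalogue_entry", "data_owner", "quality_checks"},
    "regulatory": {
        "catalogue_entry",
        "data_owner",
        "quality_checks",
        "lineage",
        "audit_trail",
    },
}


def missing_controls(tier, controls_in_place):
    """Return the controls a use case still needs for its tier, sorted by name."""
    return sorted(CONTROLS_BY_TIER[tier] - set(controls_in_place))


print(missing_controls("internal_reporting", ["catalogue_entry"]))
# ['data_owner', 'quality_checks']
```

An exploratory use case can move quickly with a light set of controls, while a regulatory one carries the full set, and both are held to a standard that is defined centrally.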
How to optimise without overcomplicating
There is a tendency to treat optimisation as a separate phase. In reality, if the core components are implemented correctly, much of the need for optimisation disappears.
That said, there are some consistent areas of focus.
Automation plays a key role in reducing manual effort and improving consistency. Governance needs to be streamlined so that it enables rather than blocks progress. Collaboration across teams is essential, particularly in larger organisations where skills and resources are distributed.
But the most important point is this.
A data ecosystem without effective metadata, lineage, searchability, and monitoring is guaranteed to fail.
You can invest heavily in platforms and infrastructure, but if users cannot see what data exists, understand it, and trust it, the investment does not translate into value. The system becomes expensive, complex, and underutilised.
What a healthy ecosystem actually delivers
When an ecosystem is functioning well, the benefits are not theoretical.
People can access the data and insights they need, when they need them, at a reasonable cost. The organisation can support business initiatives without excessive delays or overhead. The environment is agile enough to adapt as requirements change.
Over time, it also shapes the data culture of the organisation. It creates a shared understanding of how data is used, how decisions are made, and how value is generated.
Importantly, it helps guard against bias. When decisions are guided by structured, transparent processes rather than individual preference, the organisation is less exposed to subjective or inconsistent choices.
From assessment to action
The final step is often the most difficult. An assessment, no matter how well executed, only creates value if it leads to change. That depends less on the quality of the assessment itself, and more on how it is positioned and who is driving it.
If it is treated as a compliance exercise, it will produce a report that is filed away. If it is initiated at the right level, aligned to strategy, and shared with the right stakeholders, it becomes a tool for driving meaningful change.
That is ultimately the difference. Not whether you assess your data ecosystem, but whether you are prepared to act on what you find.
