As organizations and companies grow, data is frequently stored and managed by different teams or systems. As a result, data silos are often the consequence of a company’s organizational structure combined with tunnel vision about how the data might be used downstream. When you look for insights across an organization, these data silos are problematic because they prevent business managers and data analysts from gaining an overall, strategic view that transcends areas of specialization. With the emerging ubiquity of generative, data-driven AI and other modern analytics approaches, the problem will only get worse.
At the transactional level, the specificity and fitness for purpose of data are useful. Each function – marketing, manufacturing, finance, engineering – separately generates and tries to derive the most value from its data, usually with little regard to how other functions could use it. But to develop strategy that adequately represents all functions, it’s necessary to access and use all the data in the enterprise. Data silos prevent that.
This article examines why data silos are problematic and how you can begin to overcome the obstacles they pose to creating a full view of your organization.
What are data silos?
Data silos are separations that prevent data in one part of your organization from being accessed – or even found – and assimilated by another part. The separations make it difficult to embrace and analyze all of your organization’s data and to derive the most benefit from it.
As departments gravitate toward platforms and tools (e.g., Oracle, SAP, MySQL, Cassandra), it’s not unusual to discover a wide range of data sources at work in your organization. If left unchecked, that variety of sources also brings about data silos.
Why do they happen?
In most cases, data silos arise because different parts of the organization regard their data differently and invest in technology in different areas and at different paces. This often comes up during modernization, when legacy data remains in old IT systems. Most companies hesitate to migrate systems with legacy data: the systems still work properly and there is little obvious benefit to justify the high cost of replacing them.
For example, relational database management systems (RDBMSes) are designed for structured data. How should you store unstructured data like photos and video? That’s a data silo waiting to happen, either because you decide to use NoSQL databases or because you turn to hybrid databases.
Data silos are also related to data lockout. An increasing reliance on software-as-a-service tools that don’t allow exporting or external use of the data makes it difficult to make that data available, which in turn is detrimental to your goal of getting your data in front of the people who can turn it into valuable insights.
It’s difficult to take advantage of your organization’s data when it is spread among data silos.
Why are data silos problematic?
Data silos cause problems for data analysts and IT administrators for several reasons.
They introduce latency, which wastes time and resources while creating the risk of data quality and consistency issues
In the traditional model of decision support and business intelligence, you first created or captured transactions at different sites around the enterprise. Next, you aggregated the data from those varied sources into a data warehouse and figured out a way to fit it all together. Then, you ran extract, transform and load (ETL) processes and massaged the data to answer questions like “What is our book-to-bill ratio this quarter?” or “What is the burn rate on XYZ Project?”
But the separation among data silos meant that there was always latency – a matter of days or weeks. The time and resources you put into roping all the data together kept you from making decisions based on current data.
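The latency described above can be made concrete with a minimal Python sketch. It assumes two hypothetical silo extracts (sales bookings and finance billings) with invented field names and figures, and it takes book-to-bill as total bookings divided by total billings for the period. It is an illustration of the idea, not a description of any particular warehouse or ETL tool.

```python
from datetime import date

# Hypothetical extracts from two silos; all names and figures are illustrative.
sales_bookings = [
    {"order_id": "A1", "amount": 120_000.0, "extracted_on": date(2024, 3, 25)},
    {"order_id": "A2", "amount": 80_000.0, "extracted_on": date(2024, 3, 25)},
]
finance_billings = [
    {"invoice_id": "F7", "amount": 100_000.0, "extracted_on": date(2024, 3, 11)},
    {"invoice_id": "F8", "amount": 60_000.0, "extracted_on": date(2024, 3, 11)},
]


def book_to_bill(bookings, billings):
    """Ratio of total booked amount to total billed amount."""
    booked = sum(r["amount"] for r in bookings)
    billed = sum(r["amount"] for r in billings)
    if billed == 0:
        raise ValueError("no billings in the period")
    return booked / billed


def staleness_days(records, as_of):
    """Age in days of the oldest extract feeding the report."""
    return max((as_of - r["extracted_on"]).days for r in records)


as_of = date(2024, 3, 31)
ratio = book_to_bill(sales_bookings, finance_billings)
lag = staleness_days(sales_bookings + finance_billings, as_of)
print(f"book-to-bill: {ratio:.2f}, oldest input is {lag} days old")
```

In this sketch the ratio is only as current as its oldest input: the finance extract is nearly three weeks older than the report date, so the answer describes a business state that no longer exists.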
They can lead you to execute strategy based on flawed data
On the path to becoming a data-driven organization, you learn to inform all decisions with data. But a decision based on flawed data is, arguably, worse than no decision at all. If you predicate a strategic move on flawed data, you can suddenly find yourself too far down the wrong path.
That applies to poor data quality as well. It is always fair to pose the question, “Can we trust the quality of our data?” When decision makers are uncertain about the quality and reliability of the data they’re using, that uncertainty becomes an almost insurmountable hurdle to growing your bottom line.
They leave you with an incomplete view of the business
Naturally, when you have data silos, you have an incomplete view of your data. Moreover, you may also face the problem of dark data. Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).” You don’t use the data, but you don’t delete it either. How can you say that you know what’s happening in your organization when you can’t access a certain (usually unknown) percentage of your data?
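One way to start estimating that percentage is to flag data sets that nobody has read for a long time. The sketch below is a minimal Python illustration under stated assumptions: the catalog entries, the `last_read` field and the 365-day idle threshold are all hypothetical, and a real inventory would draw on whatever access metadata your platforms actually expose.

```python
from datetime import date, timedelta

# Hypothetical data catalog; names, owners and dates are illustrative.
catalog = [
    {"name": "crm.contacts", "owner": "Marketing", "last_read": date(2024, 3, 28)},
    {"name": "plant.sensor_logs", "owner": "Manufacturing", "last_read": date(2022, 11, 2)},
    {"name": "fin.archive_2015", "owner": "Finance", "last_read": None},
]


def find_dark(entries, as_of, max_idle_days=365):
    """Return names of data sets never read, or not read within the idle window."""
    cutoff = as_of - timedelta(days=max_idle_days)
    return [
        e["name"]
        for e in entries
        if e["last_read"] is None or e["last_read"] < cutoff
    ]


dark = find_dark(catalog, as_of=date(2024, 3, 31))
print(dark)
```

A check like this can only flag data sets that appear in the catalog. Data hidden inside a silo that was never catalogued stays invisible, which is exactly the unknown percentage the paragraph above describes.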
They complicate data security and regulatory compliance
If your efforts at analysis and business intelligence are thwarted by data silos, your efforts at data security and compliance probably are as well. In other words, if you cannot see all your data, how can you be certain that you’re handling it in a manner that is both secure and compliant? Can you be sure that the owner of each silo – Marketing, Engineering, Operations, Manufacturing, etc. – is enforcing compliance on its own? That’s a lot to presume.
Don’t forget that protecting the privacy of data – whether personally identifiable information (PII), your company’s intellectual property or your customers’ data – extends beyond a particular data set. You’re on the hook for protecting privacy back to the source of the data. If sensitive data inside one of your silos is somehow unprotected, it puts the company at risk.
They spawn duplicate data platforms and processes
Even if you have clear insight into your data silos, you may still have to comply with a requirement to identify organizational data within them. The search for that data may give rise to duplicate processes and to combing through duplicate data platforms around the company.
Once your organization has mustered the will to be data-driven, you’ll find that identifying that data is just the first step. Your process of asset discovery rapidly expands to identifying relevant data that falls outside of those regulatory compliance requirements. Discovering all of your data puts you squarely on the road to acting on your data.
Breaking down data silos
Once you recognize the data silos in your organization, breaking them down and democratizing data will become a priority. Consider some relevant strategies to make that happen.
Develop a data-driven culture
Collaborating on the development of a data-driven culture is a big step in eliminating data silos – particularly when you have the buy-in and involvement of your C-suite. Whatever excitement you may generate among data analysts and their managers will be multiplied once the execs are seen as champions of the effort to become data-driven. For that matter, since execs want to base their strategic decisions on high-quality data, it’s in their interest to help break down data silos.
Promote data integration
Data silos keep data in one part of the enterprise from fitting with data in another part. The goal of a modern data architecture is to promote smooth integration and interoperability of data without the degradation that comes from transforming and massaging data together. You can achieve that goal when you establish and maintain standards for capturing data in your organization. When data across the enterprise is designed to fit together, you spend less time in transformation and more time in analysis and decision making.
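As a rough illustration of what a capture standard can look like, the Python sketch below assumes a shared `Customer` record and two hypothetical departmental formats (the field names `CustID`, `customer_no` and so on are invented for the example). Each department maps its rows to the shared form once, so records from different sources can be compared directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Shared record format that every department agrees to produce (hypothetical)."""
    customer_id: str
    email: str
    country: str  # two-letter country code, upper case


def from_marketing(row: dict) -> Customer:
    # Marketing's hypothetical export uses its own column names and loose formatting.
    return Customer(
        customer_id=str(row["CustID"]).strip(),
        email=row["Email"].strip().lower(),
        country=row["Country"].strip().upper(),
    )


def from_finance(row: dict) -> Customer:
    # Finance's hypothetical export names the same facts differently.
    return Customer(
        customer_id=str(row["customer_no"]).strip(),
        email=row["contact_email"].strip().lower(),
        country=row["country_code"].strip().upper(),
    )


marketing_row = {"CustID": 1042, "Email": " Ana@Example.com ", "Country": "de"}
finance_row = {"customer_no": "1042", "contact_email": "ana@example.com", "country_code": "DE"}

a = from_marketing(marketing_row)
b = from_finance(finance_row)
print(a == b)  # both rows describe the same customer once normalized
```

The mapping functions are still a form of transformation. The point of a capture standard is to push the shared format upstream so that, over time, adapters like these shrink or disappear.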
Pursue enterprise data management and governance
Do the owners of data in the silos perceive them as silos? What’s in it for them to keep the data there, inaccessible to others in the organization? If you mainstream practices like enterprise data management and data governance, then the incentive for breaking down silos comes about spontaneously.
Many business managers associate data governance with policing. They have experienced it as a discipline aimed at telling them what they should not do with data. However, data governance has evolved into a set of guidelines for deriving the greatest benefit from a data set without incurring needless risk. That means that users take responsibility for and enjoy the benefits of using data profitably. Data quality improves along with interoperability, data silos can disintegrate, and users can easily see the results of derived insights.
Conclusion
Data silos tend to arise as a consequence of differences among parts of the business. As departments and divisions evolve separately, they develop their own approach to describing and managing their data sets. It never seems like a silo until the time comes to view the enterprise’s entire data landscape. Then, the silos appear.
Data silos are problematic, unquestionably. But they are not insurmountable, and breaking them down is a big step in deriving full value from all your organization’s data. By developing a data-driven culture, promoting data integration and pursuing data management and governance, your enterprise can reduce the impact of data silos on deeper analytical insight and greater profitability.