I saw a fascinating post by David Linthicum on InfoWorld recently in which he discussed the top three data integration mistakes that companies make and how to avoid them.
One of these is essentially being unable to answer the question ‘Where’s the data and what does it mean?’
For the sake of completeness, the three mistakes in full are: first, “failing to understand the types of data you will integrate”; second, failing “to consider performance”; and third, forgetting “about security and governance.”
Of those three, the first seems to me absolutely fundamental. Indeed, Mr Linthicum himself says: “While this seems like something that’s obvious, most major data integration mistakes can be traced back to failures around understanding what data exists in the source and target systems.”
All data is equal, but some data is more equal than others (with apologies to George Orwell and Animal Farm). What I mean by this is that while all data is ultimately just a collection of 1s and 0s, it varies widely in complexity and accessibility.
This means that answering the question ‘where’s the data?’ is not always as straightforward as one might assume.
For example, with many data sources the task of finding the data and understanding how it is structured is relatively simple. For flat files, RDBMSs, sensor data and the like, the job can often be handled quite easily by any number of data integration, ETL or even data modeling tools.
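As a minimal sketch of why this is straightforward, here is how a conventional relational database might be introspected with Python and SQLAlchemy. The connection URL, database and schema are hypothetical; any generic integration or modeling tool performs essentially the same catalogue walk.

```python
# Minimal sketch: metadata discovery against an ordinary RDBMS using
# SQLAlchemy's inspection API. The connection URL is a placeholder.
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://user:pass@host/salesdb")  # hypothetical database
inspector = inspect(engine)

# In a conventional schema the table and column names are largely
# self-describing, so a walk of the catalogue answers 'where's the data?'.
for table in inspector.get_table_names():
    columns = [col["name"] for col in inspector.get_columns(table)]
    print(f"{table}: {', '.join(columns)}")
```

A table named customer_orders with columns like order_id, customer_id and order_date more or less explains itself, which is why generic tooling copes well with sources of this kind.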
However, where that data is held in packaged applications such as those from Oracle, SAP or Salesforce, locating what you need is a much harder problem.
Traditional tools, whether from the application vendors themselves or from the information management software vendors, do not address it effectively.
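To make the contrast concrete, consider pointing the same catalogue walk at the database underneath a packaged application. SAP ERP is the classic example: its schema runs to tens of thousands of tables with terse, largely German-derived names, and the business meaning lives in the application's own data dictionary rather than in the database catalogue. The connection details below are hypothetical and the output is illustrative.

```python
# The same catalogue walk, pointed at a packaged application's database.
# Connection URL is a placeholder; the output shown is illustrative.
from sqlalchemy import create_engine, inspect

engine = create_engine("oracle://user:pass@host/sapdb")  # hypothetical SAP ERP backend
inspector = inspect(engine)

for table in sorted(inspector.get_table_names()):
    print(table)

# Typical output looks something like:
#   KNA1   <- customer master (general data)
#   MARA   <- material master (general data)
#   VBAK   <- sales document header
#   ...and tens of thousands more that the catalogue does nothing to explain.
```

Nothing in the database itself tells you that MARA holds material master data; that knowledge sits in the application layer, which is exactly where generic introspection stops.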
Faced with this gap, customers continue to adopt a variety of approaches to discovering where their data is and what it means: hunting for documentation, asking internal technical specialists, hiring external consultants, searching the internet, relying on guesswork or making do with partial solutions.
We explored these topics recently in a white paper called ‘Where’s the data?’.
In it we discuss the traditional methods of metadata discovery and why they do not work effectively. We illustrate how they can place a significant drag on the delivery of many types of project, including data integration. We also show how this can increase the risks associated with inaccurate data entering an information ecosystem.
Finally, we outline a different solution: a software product designed specifically to meet the metadata discovery and navigation needs associated with complex packages.
That product is Safyr®, the only one dedicated to helping data professionals find their way around the data models of these large packages and to find what they need, when they need it, quickly and accurately, thereby accelerating delivery and increasing productivity.