{"id":49888,"date":"2018-10-24T00:00:00","date_gmt":"2018-10-24T00:00:00","guid":{"rendered":"https:\/\/www.techopedia.com\/data-catalogs-and-the-maturation-of-the-machine-learning-market\/"},"modified":"2018-11-20T13:42:36","modified_gmt":"2018-11-20T13:42:36","slug":"data-catalogs-and-the-maturation-of-the-machine-learning-market","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/data-catalogs-and-the-maturation-of-the-machine-learning-market\/2\/33425","title":{"rendered":"Data Catalogs and the Maturation of the Machine Learning Market"},"content":{"rendered":"

This is the age of big data<\/a>. We get inundated with information, and businesses find it a challenge to manage and extract the value from it.<\/p>\n

Today's flow of big data entails not just volume, variety and velocity<\/a>, but also complexity. As SAS notes in Big Data History and Current Considerations<\/a>, that complexity arises because data streams in "from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems." (Want to learn more about big data? Check out (Big) Data's Big Future<\/a>.)<\/p>\n

Finding valuable insight is not a question of simply amassing as much data as possible, but of finding the right data. It's impossible to work through it all with manual processes. This is why more and more businesses are "turning to data catalogs to democratize access to data, enable tribal data knowledge to curate information, apply data policies, and activate all data for business value quickly."<\/p>\n\n\n\n

This is where data catalogs (sometimes also known as information catalogs) enter the picture. As defined here<\/a>, they empower "users to explore their required data sources<\/a> and understand the data sources explored, and at the same time assist organizations to achieve more value from their present investments." One of the ways they do that is by giving much greater access to data to the different types of users who can make use of it or contribute to it.<\/p>\n

The Infonomics Imperative<\/span><\/h2>\n

Noting the dramatically increased demand for data catalogs at the end of 2017, Gartner dubbed them "the new black."<\/a> They were becoming recognized as a quick and economical solution "to inventory and classify the organization's increasingly distributed and disorganized data assets and map their information supply chains." The need for this has grown with the rise of "infonomics,"<\/a> which calls for tracking information with the same meticulousness that a business applies to managing its other assets. (For more on supply chains, see How Machine Learning Can Improve Supply Chain Efficiency<\/a>.)<\/p>\n

Gartner's take jibes with The Forrester Wave™: Machine Learning Data Catalogs, Q2 2018<\/a>. Over half of the survey participants in that report said they planned to expand their data catalog implementations. They were likely motivated in large part by the fact that each had at least seven data lakes<\/a> in its organization. As Gartner explains, data catalogs are particularly useful for pulling out "the context, meaning and value of data" that is typically left in an unclassified form in a data lake.<\/p>\n

Forrester reports that more than a third of data and analytics<\/a> decision-makers were dealing with 1,000TB or more of data in 2017, a volume that only 10 to 14 percent had reported the year before. Managing data on that scale is a growing challenge or, more specifically, two challenges:<\/p>\n

“1) merging existing business processes<\/a> to source data to analyze it and implement insights and 2) sourcing, gathering, managing, and governing the data as it grows.”<\/p>\n

What Data Catalogs Can Do for Businesses<\/span><\/h2>\n

Gartner identifies specific ways in which data catalogs can improve an organization's flow of information and productivity:<\/p>\n