“Data is not smart in its own right; rather, it’s only smart if it drives smart decisions,” says James Kobielus, IBM’s Big Data evangelist, delivering the keynote address at the DATAVERSITY® Smart Data Online 2016 Conference. He considers his core focus, Big Data, to be “a subset of the broader concept of Smart Data.”
Kobielus says that Smart Data is a strategy that sets the groundwork for smart decisions, rather than a particular type of data. “It’s a particular set of practices for how you can leverage your data for greater insights in a wider range of scenarios,” he says, allowing smart decisions to flow organically from data that meets a range of criteria. Big Data is scalable, fast, and comprehensive, and because some insights are available only at a larger scale, he says that in that respect “Big Data is Smart Data.”
To be considered Smart Data, your data should enable insights that are trusted, contextual, relevant, cognitive, predictive, and consumable. Combining the attributes of Big Data with those of Smart Data can help you “harness the power of your data” and drive decisions more effectively.
What Really is Big Data?
Big Data has high volumes, travels at high velocity, and offers high variety, Kobielus says. If your data is scalable, it can provide a more powerful exploratory repository, allowing for a broader historical perspective. A higher volume provides an opportunity to see things at greater scale that you can’t see at low volumes, he says, like micro-segmentation of target populations, or fine-grained, second-by-second behavioral analysis.
He goes on to say that because Big Data moves at high velocities, it’s possible to achieve a granularity in terms of time that can’t be seen at the batch level. The more quickly data can be ingested, the more quickly questions can be posed, and the closer to real-time insights the data can provide. The more varied and comprehensive your data is, the more easily you can have a 360-degree view of any topic. By drawing data from a variety of sources and formats, such as historical profiles, clickstreams, geospatial data, social sentiment, and mobile devices, he says, it’s possible to fill in “a fine-grained portrait of who [your customers] are, what they’re doing, and what they’re likely to do.”
Kobielus remarks, “Big Data is a natural consequence of what I call ‘ravenous analytics.’” As people ask more questions of their data, the impulse is to aggregate and correlate a broader range of data sources, and Big Data is the result.
What is Smart Data?
Kobielus defines Smart Data as a superset of Big Data. It is the “ability to achieve big insights from trusted, contextualized, relevant, cognitive, predictive and consumable data at any scale, great or small.” He says that even small data can drive smart decisions if it has these attributes.
Smart Data is Trusted
The more consolidated, conformed, cleansed, consistent, and current your data is, the more likely you are to make the best decisions, he says. Trusted data stands as a single version of truth and reflects the fourth V in the taxonomy of Big Data: Veracity. Kobielus says there’s a need for “a repository in your data environment where officially sanctioned systems of record are consolidated, after they’ve undergone a process of profiling, matching, merging, correction and augmentation. So it’s all about Data Governance and Master Data Management and working with a single version of truth.”
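As a rough illustration of the matching and merging steps Kobielus describes, here is a minimal Python sketch. The record layout, the use of a normalized email address as the match key, and the function names `normalize` and `consolidate` are all assumptions made for this example; they are not taken from the keynote or from any particular Master Data Management product.

```python
from datetime import date

def normalize(record):
    """Cleanse one record: trim whitespace and lowercase the email match key."""
    return {
        "email": record["email"].strip().lower(),
        "name": record["name"].strip(),
        "city": (record.get("city") or "").strip(),
        "updated": record["updated"],
    }

def consolidate(records):
    """Match records on normalized email and merge each group into one record.

    Records are applied oldest first, so newer non-empty values win and
    gaps in newer records are filled from older ones.
    """
    golden = {}
    cleansed = sorted((normalize(r) for r in records), key=lambda r: r["updated"])
    for rec in cleansed:
        merged = golden.setdefault(rec["email"], {})
        for field, value in rec.items():
            if value != "":  # skip empty values so they never overwrite data
                merged[field] = value
    return golden

# Hypothetical source records from two systems of record.
records = [
    {"email": "Ana@Example.com ", "name": "Ana Diaz", "city": "Austin",
     "updated": date(2015, 3, 1)},
    {"email": "ana@example.com", "name": "Ana M. Diaz", "city": "",
     "updated": date(2016, 1, 15)},
    {"email": "bo@example.com", "name": " Bo Li", "city": "Boston",
     "updated": date(2016, 2, 2)},
]

golden = consolidate(records)
for email, rec in golden.items():
    print(email, rec["name"], rec["city"], rec["updated"])
```

Real consolidation would typically add fuzzy matching, survivorship rules per field, and an audit trail; this sketch only shows the basic shape of building a single version of truth from overlapping records.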
Smart Data is Contextual
Ideally, he says, you should describe, mark up, tag, ontologize, semanticize, and schematize your data to contextualize it so it can be used to make evidence-driven smart decisions, and “here’s where rich Metadata becomes essential.” Context includes all the variables that express the full meaning of the data, its uses, and constraints. “The more you can contextualize data, the more relevant the downstream decisions that you make from that data can be.”
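One simple way to picture this kind of contextualization is a small metadata catalog. The sketch below is an illustration only: the `DatasetMetadata` class, its fields, the sample datasets, and the `find_by_tags` helper are hypothetical names chosen for this example, not a standard or anything Kobielus prescribes.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    """Descriptive context attached to a dataset (hypothetical structure)."""
    name: str
    description: str
    tags: set = field(default_factory=set)
    schema: dict = field(default_factory=dict)       # column -> type and unit
    constraints: list = field(default_factory=list)  # usage restrictions

catalog = [
    DatasetMetadata(
        name="clickstream_2016",
        description="Page views per session from the web storefront",
        tags={"customer", "behavior", "web"},
        schema={"session_id": "string", "url": "string", "ts": "timestamp (UTC)"},
        constraints=["contains no payment data"],
    ),
    DatasetMetadata(
        name="store_locations",
        description="Latitude and longitude of retail stores",
        tags={"geospatial", "reference"},
        schema={"store_id": "string", "lat": "float (degrees)",
                "lon": "float (degrees)"},
    ),
]

def find_by_tags(catalog, required):
    """Return names of datasets whose tags include every required tag."""
    return [d.name for d in catalog if set(required) <= d.tags]

matches = find_by_tags(catalog, {"customer", "web"})
print(matches)
```

Even this small amount of description, tagging, and schema information lets a downstream user find the right dataset and see what its columns mean and how it may be used.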
Smart Data is Relevant
Relevance refers to the need for data curation, Kobielus says. Curation involves classifying, reviewing, ranking, sorting, and recommending the data’s relevance to decisions by downstream users. It entails finding what is most useful to those users and what they need to do with the data. He adds that there are automated systems that can organize the data into Data Marts or package it for a variety of downstream uses.
Smart Data is Cognitive
Cognitive Computing refers to the ability to detect deep statistical patterns in unstructured content such as streaming media data, video, and images, he says, and it allows people to use tools such as Machine Learning, Deep Learning, and Artificial Intelligence to call out useful patterns in complex data. In his words, Cognitive Computing makes it possible:
“To algorithmically accelerate human learning by enabling you to drill automatically through extraordinarily complex datasets that increasingly involve images and audio and video, to find visual and pictorial patterns that human beings, unaided, can’t do at scale, the way machines can.”
Smart Data is Predictive
This refers to the ability to drive evidence-based predictions into the full range of downstream decision points. Kobielus considers Predictive Analytics the heart of real-world experimentation, because practices like A/B testing are now often embedded into digital marketing practices and other operational business processes. “It’s all about modeling, mining, analyzing and scoring your data by its predictive power to support smart decisions.”
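To make the A/B testing reference concrete, here is a minimal sketch of one common way to evaluate such an experiment: a two-proportion z-test on conversion rates. The choice of test, the function name `ab_test`, and the visitor and conversion counts are assumptions for illustration; the keynote does not specify a method or any figures.

```python
import math

def ab_test(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test comparing conversion rates of variants A and B.

    Returns both rates, the z statistic, and a two-sided p-value.
    Assumes both samples are large and the pooled rate is not 0 or 1.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return p_a, p_b, z, p_value

# Hypothetical experiment: 5,000 visitors per variant.
p_a, p_b, z, p_value = ab_test(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
print(f"A: {p_a:.3f}  B: {p_b:.3f}  z: {z:.2f}  p-value: {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be chance alone, which is the kind of evidence that can then be pushed into a downstream decision point.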
Smart Data is Consumable
Data is useless if it can’t be consumed by its target users, whether those are human beings or automated components. Kobielus cites demands from the mobile world, social media, the Cloud, and the Internet of Things as driving the need for actionable insights and intelligence at all levels:
“You need to package up this Smart Data, this algorithmically-derived intelligence, and deliver it downstream to all the tools and applications, including your Business Intelligence (BI) environment, to drive smart decisions. That involves tailoring, targeting, personalizing, mobilizing, visualizing, and sharing your data-driven insights with downstream decision-makers.”
Big Data and Smart Data: Drivers for Smart Decision-Making
Kobielus says that “Big Data is an important part of the overall Smart Data strategy, but it is not the only thing.” He stresses the importance of using tools like the Semantic Web to make the data more self-describing, and the use of algorithmic tools like Machine Learning to pull out latent insights:
“Smart Data is all about the ability to derive insights and drive decisions based on the best, most contextualized, most relevant data that has been algorithmically analyzed, and with the results – [those] insights being driven down and pushed into all manner of applications, and dashboards, and devices where decisions need to be made from it. At a high, diagrammatic level, that’s what Smart Data is.”