AI/Internet of Things

Five (5) Steps in Building a Data Approach for AI

February 10, 2019

Our data-centric world is driving many organizations to apply advanced analytics that use artificial intelligence (AI). AI provides intelligent answers to challenging business questions. AI also enables highly personalized user experiences, built when data scientists and analysts learn new information from data that would otherwise go undetected using traditional analytics methods.

AI-driven analytics search more deeply into organizational data, deriving smarter insights that can give businesses a powerful competitive edge.

Applying critical thinking to AI analytics

A well-considered data approach is essential from the start. When organizations identify a business problem to be solved—and the decisions to be supported by the analytics—they reach the point where they need to think critically about the data required to solve that problem. Here’s a five-step process for helping ensure a successful AI analytics project.

1. Make a plan

Many enterprises struggle with data silos that can render a unified view of analytical data highly challenging. Achieve clarity on the goals of your analytics project first. Then, identify potential data sources across the enterprise. Integrating this data may require a data lake in addition to conventional enterprise data warehouses.

For example, relational databases hold a wealth of structured, quantitative data. Quantitative data is useful for answering questions such as how many units were sold and when—and with what other products. However, structured data is much less useful for questions such as why one product tends to sell with another, or which new line of business to pursue. Augmenting structured data is necessary to answer these kinds of soft, strategic questions.
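To make the contrast concrete, here is a minimal sketch of the kind of quantitative question structured data answers well. The table and values are hypothetical, not from any real system:

```python
import sqlite3

# Hypothetical sales table used to illustrate what structured data answers well.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, units INTEGER, sold_on TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("widget", 3, "2019-01-15"),
     ("gadget", 5, "2019-01-15"),
     ("widget", 2, "2019-02-01")],
)

# Quantitative question: how many units of each product were sold?
for product, total in conn.execute(
    "SELECT product, SUM(units) FROM sales GROUP BY product ORDER BY product"
):
    print(product, total)
```

A query like this answers "how many and when" directly, but no SQL aggregate over this table can answer "why did these products sell together?"—that requires the qualitative sources discussed in the next step.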

2. Bring together a diversity of data

Data required to answer strategic questions is often qualitative in nature. Qualitative data generally comes from unstructured sources, such as text documents or notes, external website content, social media posts, and images. Organizations need to determine how they can use such data to get additional value.

One example might involve the Internet of Things. Organizations with sensor data streaming in from a smart device, for instance, might augment the quantitative data with engineering notes or other types of softer data to enhance machine reliability and repair prediction.
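The IoT example above can be sketched in a few lines. This is an illustrative toy, assuming made-up device IDs, temperature thresholds, and note keywords—not a real reliability model:

```python
# Hypothetical sensor readings (quantitative) and engineering notes (qualitative).
sensor_readings = {"pump-1": [71.2, 74.8, 79.5], "pump-2": [65.0, 64.7, 65.3]}
engineering_notes = {
    "pump-1": "Bearing noise reported during last inspection.",
    "pump-2": "Routine service completed; no anomalies.",
}

def flag_for_maintenance(readings, notes, temp_limit=78.0,
                         keywords=("noise", "vibration")):
    """Combine a quantitative signal (temperature over limit) with a
    qualitative one (worrying words in free-text notes) into one flag."""
    flagged = {}
    for device, temps in readings.items():
        quantitative = max(temps) > temp_limit
        qualitative = any(k in notes.get(device, "").lower() for k in keywords)
        flagged[device] = quantitative or qualitative
    return flagged

print(flag_for_maintenance(sensor_readings, engineering_notes))
# {'pump-1': True, 'pump-2': False}
```

The point is the shape of the combination: neither the sensor stream nor the notes alone flags pump-1 as confidently as both together.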

3. Define the data architecture

Organizations that have undergone mergers and acquisitions or have diverse lines of business often have many diverse data sets—including different views of the same data. This situation raises several questions: Who owns the data? What is the best version to use? What is the right data architecture?

To address these questions, organizations need to move beyond database administration to architect data across diverse sources. The data sources must then be integrated in a meaningful way. While the initial prototyping for an analytics project might be ad hoc, a repeatable data architecture and data flow are necessary for long-term success.

A repeatable data flow can pull in data from various endpoints, from operational business processes to mobile devices to sensors. Enterprises may want to work with a company like IBM that offers the tools and expertise required for defining the data architecture.
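The difference between an ad hoc script and a repeatable flow can be sketched as a small pipeline: each endpoint gets an extractor, and one shared pipeline normalizes every record. Source names and record fields here are hypothetical:

```python
# Each endpoint contributes an extractor; the pipeline applies the same
# normalization to all of them, so the flow can be re-run, not rebuilt.
def from_operational_db():
    yield {"source": "erp", "event": "order", "value": 120}

def from_mobile():
    yield {"source": "mobile", "event": "login", "value": 1}

def from_sensors():
    yield {"source": "sensor", "event": "temp", "value": 74.8}

def pipeline(extractors):
    """Pull from every registered endpoint, normalize, and emit records."""
    for extract in extractors:
        for record in extract():
            record["value"] = float(record["value"])  # shared normalization step
            yield record

records = list(pipeline([from_operational_db, from_mobile, from_sensors]))
print(len(records))  # 3
```

Adding a new endpoint then means registering one more extractor, not writing a new one-off integration.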

4. Establish data governance

Another key consideration is data governance, which helps ensure that information from diverse sources is trustworthy—particularly for organizations in regulated industries. Along with protecting security and privacy, maintaining visibility into the data supply chain is also critical: organizations need to know where data came from in order to validate it. Credible analytics models require the ability to detect and track down any issues in the data pipeline.
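A minimal way to get that visibility is to attach provenance metadata at ingest time, so a bad value can be traced back to its source. The field names and sources below are illustrative, not a specific product's API:

```python
import datetime

def ingest(value, source):
    """Wrap an incoming value with lineage metadata for later auditing."""
    return {
        "value": value,
        "lineage": {
            "source": source,
            "ingested_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        },
    }

records = [ingest(19.99, "crm_export.csv"), ingest(-1.0, "legacy_feed")]

# When validation fails (a negative price here), lineage says where to look.
suspect_sources = {r["lineage"]["source"] for r in records if r["value"] < 0}
print(suspect_sources)  # {'legacy_feed'}
```

Real governance platforms (such as those built on Apache Atlas) track this kind of lineage at the dataset and process level rather than per record, but the principle is the same.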

Emerging technologies provide stronger data governance for advanced analytics. For example, Hortonworks, IBM and others are part of the open source Apache Atlas project, whose mission is to bring data governance to data lake technology.

5. Maintain a safe data pipeline

Enterprises make the most of AI analytics by establishing policies and procedures that allow data to flow continuously into the analytics pipeline. A vital step is to build security and privacy into both the design of the infrastructure and the software used to deliver this capability across the organization.
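One concrete way to build privacy into the pipeline itself is to pseudonymize personally identifiable fields before records ever reach the analytics store. This is a toy sketch—the field list and salt are placeholders, not a recommended production scheme:

```python
import hashlib

# Hypothetical set of fields the policy classifies as personally identifiable.
PII_FIELDS = {"email", "name"}

def pseudonymize(record, salt="example-salt"):
    """Hash PII fields so analytics can still group by person without
    ever storing the raw identity; other fields pass through unchanged."""
    safe = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            safe[key] = hashlib.sha256((salt + str(value)).encode()).hexdigest()
        else:
            safe[key] = value
    return safe

clean = pseudonymize({"name": "Ada", "email": "ada@example.com", "spend": 42})
print(clean["spend"])  # 42 — non-PII survives; name/email are now hashes
```

Because the transformation runs inside the pipeline rather than as an afterthought, every downstream consumer inherits the protection automatically.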