Unlock Your Data With The Power of Real-Time AI
To succeed with real-time AI, data ecosystems need to excel at handling fast-moving streams of events, operational data, and machine learning models to leverage insights and automate decision-making.
Increased operational efficiencies at airports.
Instant reactions to fraudulent activities at banks.
Improved recommendations for online transactions.
Better patient care at hospitals.
Investments in artificial intelligence are helping businesses to reduce costs, better serve customers, and gain competitive advantage in rapidly evolving markets. Titanium Intelligent Solutions, a global SaaS IoT organization, even saved one customer over 15% in energy costs across 50 distribution centers, thanks in large part to AI.
To succeed with real-time AI, data ecosystems need to excel at handling fast-moving streams of events, operational data, and machine learning models to leverage insights and automate decision-making. Here, I’ll focus on why these three elements and capabilities are fundamental building blocks of a data ecosystem that can support real-time AI.
Real-time data and decisioning
First, a few quick definitions. Real-time data involves a continuous flow of data in motion. It’s streaming data that’s collected, processed, and analyzed on a continuous basis. Streaming data technologies unlock the ability to capture insights and take instant action on data that’s flowing into your organization; they’re a building block for developing applications that can respond in real-time to user actions, security threats, or other events. AI is the perception, synthesis, and inference of information by machines, to accomplish tasks that historically have required human intelligence. Finally, machine learning is essentially the use and development of computer systems that learn and adapt without following explicit instructions; it uses models (algorithms) to identify patterns, learn from the data, and then make data-based decisions.
Real-time decisioning can occur in minutes, seconds, milliseconds, or microseconds, depending on the use case. With real-time AI, organizations aim to provide valuable insights during the moment of urgency; it’s about making instantaneous, business-driven decisions. What kinds of decisions are necessary to be made in real-time? Here are some examples:
Fraud: It’s critical to identify bad actors using high-quality AI models and data.
Product recommendations: It’s important to stay competitive in today’s ever-expanding online ecosystem with excellent product recommendations and aggressive, responsive pricing against competitors. Ever wonder why an internet search for a product reveals similar prices across competitors, or why surge pricing occurs?
Supply chain: With companies trying to stay lean with just-in-time practices, it’s important to understand real-time market conditions, delays in transportation, and raw supply delays, and adjust for them as the conditions are unfolding.
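To make the fraud example concrete, here is a minimal sketch of one kind of real-time decision: a "velocity check" that flags a card when it produces too many transactions inside a sliding time window. The event shape (card ID plus timestamp in seconds), the class name, and the thresholds are all assumptions made for illustration; a production system would combine a rule like this with trained models and far richer data.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag a card that makes too many transactions inside a sliding time window."""

    def __init__(self, window_seconds=60.0, max_events=3):
        self.window_seconds = window_seconds
        self.max_events = max_events
        # card_id -> timestamps of events that are still inside the window
        self._recent = defaultdict(deque)

    def observe(self, card_id, timestamp):
        """Record one event; return True if the card now exceeds the limit."""
        recent = self._recent[card_id]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_events


def flag_stream(events, check):
    """Yield every (card_id, timestamp) event that trips the check."""
    for card_id, timestamp in events:
        if check.observe(card_id, timestamp):
            yield card_id, timestamp
```

Because each event is evaluated as it arrives and only the timestamps inside the window are kept, the decision can be made without waiting for a batch job.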
Demand for real-time AI is accelerating
Software applications enable businesses to fuel their processes and revolutionize the customer experience. Now, with the rise of AI, this power is becoming even more evident. AI technology can autonomously drive cars, fly aircraft, create personalized conversations, and transform the customer and business experience into a real-time affair. ChatGPT and Stable Diffusion are two popular examples of how AI is becoming increasingly mainstream.
With organizations looking for increasingly sophisticated ways to employ AI capabilities, data becomes the foundational energy source for such technology. There are plenty of examples of devices and applications that drive exponential growth with streaming data and real-time AI:
Intelligent devices, sensors, and beacons are used by hospitals, airports, and buildings, or even worn by individuals. Devices like these are becoming ubiquitous and generate data 24/7. This has also accelerated the execution of edge computing solutions so compute and real-time decisioning can be closer to where the data is generated.
AI continues to transform customer engagements and interactions with chatbots that use predictive analytics for real-time conversations.
Augmented or virtual reality, gaming, and the combination of gamification with social media leverage AI for personalization and for enhancing online dynamics.
Cloud-native apps, microservices and mobile apps drive revenue with their real-time customer interactions.
It’s clear how these real-time data sources generate data streams that need new data and ML models for accurate decisions. Data quality is crucial for real-time actions because decisions often can’t be taken back. A decision to close a valve at a power plant, offer a coupon to 10 million customers, or send a medical alert has to be dependable and on time. The need for real-time AI has never been more urgent.
Lessons not learned from the past
Over the past decade, organizations have put a tremendous amount of energy and effort into becoming data-driven, but many still struggle to achieve the ROI from data that they’ve sought. A 2023 NewVantage Partners/Wavestone executive survey highlights that being data-driven is not getting any easier, as many blue-chip companies still struggle to maximize ROI from their plunge into data and analytics and to embrace a truly data-driven culture:
19.3% report they have established a data culture
26.5% report they have a data-driven organization
39.7% report they are managing data as a business asset
47.4% report they are competing on data and analytics
Outdated mindsets, institutional thinking, disparate siloed ecosystems, applying old methods to new approaches, and a general lack of a holistic vision will continue to impact success and hamper real change.
Organizations have balanced competing needs to make more efficient data-driven decisions and to build the technical infrastructure to support that goal. While big data technologies like Hadoop were used to get large volumes of data into low-cost storage quickly, these efforts often lacked the appropriate data modeling, architecture, governance, and speed needed for real-time success.
This resulted in complex ETL (extract, transform, and load) processes and difficult-to-manage datasets. Many companies today struggle with legacy software applications and complex environments, which leads to difficulty in integrating new data elements or services. To truly become data- and AI-driven, organizations must invest in data and model governance, discovery, observability, and profiling while also recognizing the need for self-reflection on their progress towards these goals.
Unifying your organization’s real-time data and AI strategies
Data, when gathered and analyzed properly, provides the inputs necessary for functional ML models. An ML model is an application, built on mathematical ML algorithms, that finds patterns in datasets and makes decisions from them. Once ML models are trained and deployed, they help to guide decisions and actions more effectively, making the most of the data input. So it’s critical that organizations understand the importance of weaving together data and ML processes in order to make meaningful progress toward leveraging the power of data and AI in real-time. From architectures and databases to feature stores and feature engineering, a myriad of variables must work in sync for this to be accomplished.
ML models need to be built, trained, and then deployed in real-time. Flexible and easy-to-work-with data models are the oil that makes the engine for building models run smoothly. ML models require data for testing and developing the model and for inference when the ML models are put in production (ML inference is the process of an ML model making calculations or decisions on live data).
Data for ML is made up of individual variables called features. Features can be raw data or values that have been processed, analyzed, or derived from it. ML model development is about finding the right features for the algorithms. The ML workflow for creating these features is referred to as feature engineering. The storage for these features is referred to as a feature store. Data and ML model development fundamentally depend on one another.
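The relationship between raw data, feature engineering, and a feature store can be sketched in a few lines. This is an illustration only: the transaction fields, the feature names, and the `InMemoryFeatureStore` class are hypothetical, and a real feature store would persist features in a database rather than a Python dictionary.

```python
from statistics import mean


def engineer_features(transactions):
    """Feature engineering: derive model inputs from raw transaction records."""
    amounts = [t["amount"] for t in transactions]
    return {
        "txn_count": len(amounts),
        "avg_amount": mean(amounts) if amounts else 0.0,
        "max_amount": max(amounts, default=0.0),
    }


class InMemoryFeatureStore:
    """Toy feature store: keeps the latest feature values per entity."""

    def __init__(self):
        self._rows = {}

    def put(self, entity_id, features):
        self._rows[entity_id] = dict(features)

    def get(self, entity_id, names):
        """Return the requested features as an ordered vector (0.0 if missing)."""
        row = self._rows.get(entity_id, {})
        return [row.get(name, 0.0) for name in names]
```

The same `get` call can serve both training and inference, which is the point of keeping features in one place: the model sees identically defined inputs in both settings.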
That’s why it is essential for leadership to build a clear vision of the impact of data-and-AI alignment, one that can be understood by executives, lines of business, and technical teams alike. Doing so sets up an organization for success, creating a unified vision that serves as a foundation for turning the promise of real-time AI into reality.
A real-time AI data ingestion platform and operational data store
Real-time data and supporting machine learning models are about data flows and machine-learning-process flows. Machine learning models require quality data for model development and for decisioning when the machine learning models are put in production. Real-time AI needs the following from a data ecosystem:
A real-time data ingestion platform for messaging, publish/subscribe (“pub/sub” asynchronous messaging services), and event streaming
A real-time operational data store for persisting data and ML model features
An aligned data ingestion platform for data in motion and an operational data store working together to reduce the data complexity of ML model development
Change data capture (CDC) that can send high-velocity database events back into the real-time data stream or into analytics platforms or other destinations
An enterprise data ecosystem architected to optimize data flowing in both directions.
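The data flows in this list can be pictured with a toy, in-process sketch: events are published to a topic, a subscriber persists them in an operational store, and every write to the store is emitted back onto the stream as a change event. The `Broker` and `OperationalStore` classes and the topic names are hypothetical, and the broker here delivers messages synchronously for simplicity, whereas real pub/sub platforms are asynchronous and distributed.

```python
from collections import defaultdict


class Broker:
    """Toy publish/subscribe broker (synchronous, single process)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)


class OperationalStore:
    """Key-value store that publishes a change event for every write (CDC)."""

    def __init__(self, broker, cdc_topic="store.changes"):
        self._data = {}
        self._broker = broker
        self._cdc_topic = cdc_topic

    def upsert(self, key, value):
        before = self._data.get(key)
        self._data[key] = value
        # Send the change back into the stream for downstream consumers.
        self._broker.publish(
            self._cdc_topic, {"key": key, "before": before, "after": value}
        )

    def get(self, key):
        return self._data.get(key)
```

A usage sketch: subscribe a handler on a `sensor.readings` topic that calls `store.upsert(...)`, and subscribe an analytics handler on `store.changes`. Data then flows in both directions, from the stream into the store and from the store back out.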
Let’s start with the real-time operational data store, as this is the central data engine for building ML models. A modern real-time operational data store excels at integrating data from multiple sources for operational reporting, real-time data processing, and support for machine learning model development and inference from event streams. Working with the real-time data and the features in one centralized database environment accelerates machine learning model execution.
Data that takes multiple hops through databases, data warehouses, and transformations moves too slowly for most real-time use cases. A modern real-time operational data store (Apache Cassandra® is a great example of a database used for real-time AI by the likes of Apple, Netflix, and FedEx) makes it easier to integrate data from real-time streams and CDC pipelines.
The operational data store needs a real-time data ingestion platform with the same type of integration capabilities, one that can ingest and integrate data from streaming events. The streaming platform and data store will be constantly challenged with new and growing data streams and use cases, so they need to be scalable and work well together. This reduces the complexity for developers, data engineers, SREs, and data scientists to build and update data models and ML models.
A real-time AI ecosystem checklist
Despite all the effort that organizations put into being data-driven, the NewVantage Partners survey mentioned above highlights that organizations still struggle with data. Understanding the capabilities and characteristics for real-time AI is an important first step toward designing a data ecosystem that’s agile and scalable. Here is a set of criteria to start with:
A holistic strategic vision for data and AI that unifies an organization
A cloud-native approach designed for scale and speed across all components
A data strategy to reduce complexity and break down silos
A data ingestion platform and operational data store designed for real-time
Flexibility and agility across on-premises, hybrid-cloud, and cloud environments
Manageable unit costs for ecosystem growth
Wrapping up
Real-time AI is about making data actionable with speed and accuracy. Most organizations’ data ecosystems, processes and capabilities are not prepared to build and update ML models at the speed required by the business for real-time data. Applying a cloud-native approach to applications, data, and AI improves scalability, speed, reliability, and portability across deployments. Every machine learning model is underpinned by data.
A powerful datastore, along with enterprise streaming capabilities, turns a traditional ML workflow (train, validate, predict, re-train …) into one that is real-time and dynamic, where the model augments and tunes itself on the fly with the latest real-time data.
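One way to picture a model that tunes itself on the fly is online learning, where the model is updated one event at a time instead of being retrained in batches. The sketch below assumes a plain logistic regression updated by stochastic gradient descent; the class and method names are hypothetical, and real deployments would add validation, monitoring, and safeguards against drift before trusting such updates.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression that learns from one labeled event at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        """Inference: probability that the event belongs to class 1."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        """Training: one gradient step on a single (features, label) pair."""
        error = self.predict_proba(x) - y
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
```

In a streaming setting, `predict_proba` would be called when an event arrives and `learn_one` once the true outcome is known, so training and inference share the same flow of data.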
Success requires defining a vision and execution strategy that delivers speed and scale across developers, data engineers, SREs, DBAs, and data scientists. It takes a new mindset and an understanding that all the data and ML components in a real-time data ecosystem have to work together for success.
How Data Science Can Help Logistics And More
Business investment in data science is one of the most important advances of the past decade. While some may argue data science has already worked its way into every industry, there are still areas where it has only scratched the surface.
Data science is defined as "an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from noisy, structured and unstructured data, and apply knowledge and actionable insights from data across a broad range of application domains." The key part is the "extraction of knowledge and insights" portion.
Businesses talk about becoming data-driven organizations where better decision making is possible. Data science and analysis have already had a positive impact on marketing, sales and HR, but there are places to apply it that aren't related to purchasing decisions or consumer preferences.
Data Science Empowers
The logistics, shipping, and supply chain industries are often thought of as legacy functions, but applying data science in the correct places can improve efficiency and competition. A recent study by the Council of Supply Chain Management Professionals found that data science is becoming more important to this industry.
Organizations need to look inward and determine where better knowledge and understanding can improve their business. For some, it's optimizing shift times. For others, it's organizing goods or where to place facilities geographically. Just be sure the starting point will create value and tell your organization something it didn't know before. Once you've had success, it will be easy to build from the foundation.
Build Or Partner?
Applying data science is much easier than before, but that doesn't mean it's easy. To drive value, systems need to gather and categorize data efficiently while simplifying the process of adding new data. It's critical to continuously access real-time insights; the saying "bad data in, bad data out" exists for a reason.
Some companies build out capabilities in-house, while others turn to best-of-breed providers to handle their data collection, analysis and insight needs. For many, the decision comes down to just how integral it is to the business to be analyzing data with its own tools.
Either way, the endgame is the same — to empower companies and employees with the market observations and insights needed to make better decisions.
Where Data Science Can Help
Where can data science and analysis help the business? Ask yourself — would a faster analysis of market data lead to more sales? A competitive advantage? A better market position? Reviewing operations with these thoughts in mind can focus efforts on what's needed and should be addressed first.
• Logistics. In logistics, the application of data science can help companies better optimize operations. This includes everything from which delivery routes to take, how to better manage fuel (and which times of day to travel) and more accurate forecasting of supply and demand. Applying data science to logistics can help companies use quickly delivered insights to make adjustments as needed along the way, as different variables (such as consumer desire or gas prices) can be acted on faster. DHL's Smart Truck system uses data science and analytics to calculate the best routes for efficiency, cost and time savings.
• Supply chain. The supply chain itself has become a more strategic element of a company's business. Organizations have started to analyze how to automate demand forecasting, optimize replenishment and lead times, make inventories more accurately reflect market demand, and improve on-time production and delivery. The goal is to make the supply chain more efficient and predictable. Improved insights can also lead to better agility so that adjustments can be made in real time — and global crises can be successfully weathered. For example, as CIO notes, PepsiCo "uses analytics and machine learning to predict out-of-stocks and alert retailers to reorder."
• Shipping management. Up until recently, there had been little information available about shipping at all, leaving many companies in the dark as to what the rates might be from carriers and how they stacked up against their competitors. Without this analysis, it can be impossible for shippers to understand the impact that shipping costs (and the potential variables that can affect them such as packaging, location, discounts and seasonal rates) have on profitability. By analyzing the shipping process, from carrier negotiation to the way goods are packed, businesses can optimize their operations and find ways to cut costs without sacrificing service or speed. We worked with e-commerce company Nisolo, and it used data analytics to identify that its shipping minimums were costing the company money, allowing it to make adjustments. Data science and analytics also enabled it to model out future expansion costs.
• Manufacturing. By applying data science to the manufacturing space, organizations can get closer to the industry's goal of delivering the right products in the right quantities at the right time. Achieving this can lower the cost of goods and make items less expensive for all. There are many ways data science can be applied to manufacturing systems to help make this a reality — monitoring facility processes, modeling maintenance scenarios, recognizing patterns in downtime, reviewing safety practices, and then building out and improving operations to reflect what was learned. Data science can minimize risk, lower costs and improve productivity. Automotive manufacturer Ford is a terrific example, as it uses data science to analyze the wear and tear of equipment and identify potential machinery breakdowns before they occur.
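As an illustration of the demand-forecasting and out-of-stock ideas in the supply chain item above, here is a minimal sketch that forecasts daily demand with simple exponential smoothing and raises a reorder flag when stock is projected to run out within the supplier lead time. It is not the method used by any company named in this article; the function names, the smoothing factor, and the reorder rule are assumptions chosen for clarity.

```python
def exponential_smoothing_forecast(demand_history, alpha=0.5):
    """One-step-ahead demand forecast using simple exponential smoothing."""
    if not demand_history:
        raise ValueError("demand_history must not be empty")
    level = demand_history[0]
    for observed in demand_history[1:]:
        # Blend the newest observation with the running level.
        level = alpha * observed + (1 - alpha) * level
    return level


def days_until_stockout(on_hand, daily_forecast):
    """Days of cover left at the forecast rate of demand."""
    if daily_forecast <= 0:
        return float("inf")
    return on_hand / daily_forecast


def should_reorder(on_hand, demand_history, lead_time_days, alpha=0.5):
    """True when stock is projected to run out within the lead time."""
    forecast = exponential_smoothing_forecast(demand_history, alpha)
    return days_until_stockout(on_hand, forecast) <= lead_time_days
```

A higher `alpha` makes the forecast react faster to recent demand, at the cost of more noise; choosing it is a tuning decision that depends on the product.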
Summing Up
Becoming a data-driven business can benefit aspects of your organization not often connected to data science and analytics such as shipping or logistics. Previously unknown details can be uncovered and optimized for efficiency, effectiveness and cost. Data science should be applied across any critical area of your business where better insights and improvements are needed.
Remember that data science is not a one-and-done process; your company and the industry you compete in are constantly evolving. This means your data and analytics systems must constantly update. If done correctly, you should be able to capture and maintain a competitive advantage.