This is the fifth article in a series dedicated to the various aspects of machine learning. Today’s article will outline what is called the machine learning cycle: a breakdown of how a machine learning agent develops from an untrained agent into a capable, autonomous one that learns in real time, from data identification all the way to self-assessment.
The Cycle of Life
Most machine learning agents live a cyclical and predictable existence. They are at rest, then given an objective, perform (or fail to perform) the objective, then return again to the initial state of rest. Think of how the Noid’s mortal enemy, Domino’s new automated delivery vehicle, Nuro’s R2 robot, does its job in this ad.
This four-wheel cheese chauffeur, sophisticated and showy as it is in appearance, is a simple beast at heart: It begins at its home base, i.e. a Domino’s, waiting to be loaded with the goods. Following what is likely to be a GPS route up-to-date with the latest weather and traffic conditions, it manages to expertly dodge that most bizarre of corporate anti-mascots, the Noid, who in the context of this commercial can be taken to symbolize any pesky weather, traffic, and/or other obstacles the delivery bot may encounter en route to a drop-off. After arriving safely at the drop-off location (a curb in front of a customer’s house), it takes off to return again to home base, leaving behind a foiled Noid.
It is a thirty-second ad, but it captures, partially at least, the machine learning cycle of a trained and deployed ML agent. Keep reading to learn about this cycle, with the R2 robot as our model. We’ll walk through each phase as though we were a less technically skilled Nuro designing and preparing the R2 robot to make deliveries.
Phase 1: Data Identification
Phase 1 sees us identifying which data sources are relevant to the R2 robot. This can include trustworthy sources on weather and traffic conditions, a determination of which navigation app (Google Maps, Waze, etc.) is the most reliable and accurate, and general information about the various types of vehicles, pedestrians, and animals one may encounter out and about on the street. Think of any data that a human could consider relevant to their own navigation of busy roads and sidewalks on wheels, and identify the sources where it is most trustworthy.
Phase 2: Data Preparation
After we’ve identified all the relevant and trustworthy sources for the R2 robot, we need to prepare the data. We need to double-check that it is accurate and clean (inaccurate or unclean data is a recipe for ML failure), because the R2 robot, though it performs the task of a human and can even reason like one, fundamentally reads its environment in a different way.
For instance, object recognition, specifically differentiating one object from another, is much more difficult for an AI agent, so it needs as much empirical information as possible about, say, a yellow glove and a banana peel before it can tell them apart on the fly.
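To make this concrete, here’s a toy sketch in Python of what the preparation step might look like: incoming records get checked for missing fields and implausible values before any training happens. The field names and thresholds here are invented for illustration, not taken from Nuro’s actual pipeline.

```python
# A minimal sketch of data preparation: filter out records that are
# incomplete or physically implausible. Field names and the 0-100 cm
# size range are hypothetical, purely for illustration.

RAW_RECORDS = [
    {"label": "banana_peel", "width_cm": 12.0, "confidence": 0.97},
    {"label": "yellow_glove", "width_cm": None, "confidence": 0.91},  # missing value
    {"label": "banana_peel", "width_cm": -4.0, "confidence": 0.88},   # implausible size
    {"label": "yellow_glove", "width_cm": 22.5, "confidence": 0.95},
]

def clean(records):
    """Keep only records that are complete and physically plausible."""
    return [
        r for r in records
        if r["width_cm"] is not None and 0 < r["width_cm"] < 100
    ]

cleaned = clean(RAW_RECORDS)
print(len(cleaned))  # 2 of the 4 raw records survive the checks
```

Real pipelines do far more (deduplication, label verification, format normalization), but the principle is the same: garbage kept out now is failure avoided later.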
Phase 3: Algorithm Selection
As mentioned in our introductory article, an algorithm is a series of instructions an ML agent follows to accomplish its task and learn from what it is seeing. A single agent can rely on several of them. For a self-driving car, the most important algorithms involve environment rendering, i.e. making sense of what the vehicle sees and hears while it is on the road.
Another important feature is the ability to take into account the multiple ways in which that environment can change when, say, a dog breaks free of its leash and starts running rampant down the sidewalk, making people jump out of the way in every which direction. So, prediction algorithms are also a must for a self-driving car.
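The prediction idea can be sketched in a few lines of Python. This toy version assumes the runaway dog keeps moving at a constant velocity, which is a drastic simplification of what a real prediction algorithm does; the coordinates and time step are made up for illustration.

```python
# A toy prediction algorithm: given two recent sightings of a moving
# object (the runaway dog), extrapolate where it will be next under a
# constant-velocity assumption. Real agents use far richer models.

def predict_next(p_prev, p_curr, dt=1.0):
    """Linear extrapolation: next position = current + observed velocity * dt."""
    vx = (p_curr[0] - p_prev[0]) / dt
    vy = (p_curr[1] - p_prev[1]) / dt
    return (p_curr[0] + vx * dt, p_curr[1] + vy * dt)

# Dog seen at (0, 0), then at (2, 1): expect it near (4, 2) a step later.
print(predict_next((0.0, 0.0), (2.0, 1.0)))  # (4.0, 2.0)
```

The point is not the math but the job description: the algorithm must constantly answer “where will that thing be a moment from now?” so the robot can steer around it.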
Phase 4: Training
Now that the robot is locked and loaded with data and ML algorithms, it’s time to take it outside for some testing in a controlled environment. Here we train the algorithms to consistently create accurate and reliable interpretations and predictions about the environment and its own driving.
How fast can the robot get from point A to point B? How quick are its interpretations of and reactions to sudden changes in the environment? Is it showing steady progress in adapting to the ways of the road? Is it demonstrating that the information collected on each previous run-through has been analyzed and implemented to accomplish its tasks more efficiently? These questions, and more performance-based issues, will be accounted for in the training stage.
By the time the robot is released for real-time deliveries, it should already be an expert in road navigation.
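The “steady progress on each run-through” idea is the heart of training, and it can be sketched with a deliberately tiny example: adjust a parameter a little after every pass so the measured error keeps shrinking. The data, learning rate, and single-weight model below are all invented for illustration.

```python
# A minimal sketch of the training loop: run, measure the error, nudge
# the parameters so the error shrinks, repeat. This toy fits a single
# weight by gradient descent on made-up data where the true answer is 2.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # inputs with target outputs

w = 0.0  # the untrained parameter
for _ in range(100):  # each pass is one "run-through" of the course
    # average gradient of the squared error over the data
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= 0.05 * grad  # step against the gradient

print(round(w, 3))  # converges toward 2.0, the true relationship
```

A delivery robot’s training is vastly more complicated, but the shape is identical: every run produces feedback, and every piece of feedback tightens the next run.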
Phase 5: Evaluation
What is ultimately decided at the end of the training stage is which algorithms are the most effective and should be used, and which are not and should not. Sometimes this results in a mere whittling-down of which algorithms are employed, and sometimes it results in a back-to-the-drawing board reworking of existing algorithms to retool them for better performance.
The big question being asked in the evaluation phase is, Is this robot ready to deliver pizzas?
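The whittling-down itself can be pictured as a simple contest: score every candidate on data that was held back from training, and keep the best performer. The candidates and test points below are toy stand-ins for illustration.

```python
# A minimal sketch of evaluation: score candidate "algorithms" on
# held-out data and keep the one with the lowest error. The candidates
# and data points are invented for illustration.

held_out = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (input, expected output)

candidates = {
    "doubles_input": lambda x: 2 * x,
    "adds_one": lambda x: x + 1,
    "always_zero": lambda x: 0.0,
}

def score(predict):
    """Mean absolute error on the held-out set (lower is better)."""
    return sum(abs(predict(x) - y) for x, y in held_out) / len(held_out)

best = min(candidates, key=lambda name: score(candidates[name]))
print(best)  # "doubles_input" tracks the held-out data most closely
```

Evaluating on data the algorithms never trained on is the key design choice here: it is the only honest preview of how they will behave on real streets.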
Phase 6: Deployment
If that robot is indeed ready to deliver pizzas, then it is time to put it out on the streets so it can move that cheese. It is still wise to monitor its performance closely during its initial wave of deliveries, and it may be smarter to let it “build up” to more challenging delivery routes, even if it proved expert at them during the training phase.
Whether it’s thrown into the fire or not, R2 should be able to perform its tasks without a hitch after the rigorous training and evaluation phases.
Phase 7: Prediction
The baby bird has left the nest, so it’s time for the R2 to make predictions on its own via its ML algorithms. The constant stream of new input data during every delivery will require non-stop predictions from R2. At this point, R2 is more or less an autonomous agent while out on the road, so it will need to rely on its predictions to navigate safely.
Phase 8: Prediction Assessment
R2 will need to figure out whether its predictions are valid or not. There is a bit of a trial-and-error aspect to its continual improvement over time. Sometimes it may make a prediction that slows it down a bit, like if it thinks it can drive over a branch but actually can’t, and therefore must go around it. Luckily, as a machine learning agent, it will know that the next time it encounters a similar object, the better course is to swerve around it.
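That branch episode can be sketched as a tiny feedback loop: record how each action turned out, then prefer the action with the better track record next time. The obstacle names and actions below are invented for illustration; real agents update statistical models, not simple tallies.

```python
# A minimal sketch of prediction assessment: log outcomes per
# (obstacle, action) pair and pick the action with the best observed
# success rate. All names here are hypothetical.

from collections import defaultdict

outcomes = defaultdict(lambda: {"success": 0, "fail": 0})

def record(obstacle, action, succeeded):
    outcomes[(obstacle, action)]["success" if succeeded else "fail"] += 1

def best_action(obstacle, actions):
    """Pick the action with the highest observed success rate."""
    def rate(a):
        o = outcomes[(obstacle, a)]
        total = o["success"] + o["fail"]
        return o["success"] / total if total else 0.5  # unknown: neutral prior
    return max(actions, key=rate)

# First run: trying to drive over the branch fails; swerving works.
record("branch", "drive_over", succeeded=False)
record("branch", "swerve", succeeded=True)

print(best_action("branch", ["drive_over", "swerve"]))  # "swerve"
```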
There are many phases in a machine learning cycle, and we see the last, here-I-am-world phases in action in the Domino’s commercial, where R2 reacts to its surroundings and predicts the best course of action during its journey from Domino’s to a customer’s house.