This is the fortieth article in a series dedicated to the various aspects of machine learning (ML). Today’s article will continue the discussion of lazy vs. eager learning by going over the benefits and drawbacks of lazy learning and covering a few of the most popular forms of lazy learning.
Our last article outlined the difference between eager and lazy learning in machine learning, hitting on the main point that lazy learning algorithms tend to store the data they encounter and hold off on learning from it until they are tested, while eager learning algorithms begin to form hypotheses as soon as they encounter new data.
Being lazy is not a bad thing if you are a machine learning agent. One of the biggest drawbacks of eager learning algorithms is that they commit to a hypothesis during training, which can make agents too limited or “scared” in their behavior: when tested, they stick to the already-formed hypothesis rather than consider different ones in response to the test.
A lazy learner, by contrast, builds its answer around each test as it arrives. Because it can approach every test differently, it may hit on ways to complete a task that are more complex (in a good way), and a better fit for that particular test, than what an eager learner might put out.
The downside to lazy learning algorithms is that they can be quite costly, in terms of time and computation, to run. This is because the real work is deferred until test time: the agent has to keep its data in storage and effectively rethink its approach with every test.
Also, many agents encounter a good deal of useless data out in the world, so a lazy learner may waste its time, or arrive at a mediocre hypothesis, by analyzing irrelevant data.
Despite these setbacks, it turns out that many AI developers actually want their machine learning creations to be “lazy”! And, as it is with humans, there is more than one way for a machine learning agent to be lazy. So this article, then, will give you a rundown of some of the most significant and popular forms of lazy learning used in machine learning.
Simple, easy, and versatile, the K-Nearest Neighbor algorithm operates on the principle that like attracts like, especially when it comes to location: data points that are similar tend to sit close to one another.
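Basically, a K-Nearest Neighbor algorithm answers a new query by finding the k stored training examples that are most similar to it, then basing its answer on those examples, for instance by taking a majority vote of their labels.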
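To make that concrete, here is a minimal sketch of a K-Nearest Neighbor classifier in plain Python. The toy data, the function name `knn_predict`, the choice of straight-line (Euclidean) distance, and k = 3 are all assumptions made for illustration. Notice that there is no training step at all: the stored examples are only consulted when a query arrives, which is exactly what makes the method lazy.

```python
import math
from collections import Counter

def knn_predict(training_data, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples.

    `training_data` is a list of (features, label) pairs. Nothing is
    learned ahead of time; all the work happens at query time.
    """
    # Sort every stored example by its Euclidean distance to the query.
    by_distance = sorted(training_data, key=lambda ex: math.dist(ex[0], query))
    # Keep the labels of the k closest examples and take a majority vote.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical toy data: two clusters of points on a plane.
training_data = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 5.5), "B"),
    ((6.5, 7.0), "B"),
]

print(knn_predict(training_data, (2.0, 2.0)))  # prints A
print(knn_predict(training_data, (6.0, 5.0)))  # prints B
```

The sort over every stored example on every query is also a small picture of why lazy learners can be expensive to run: the cost grows with the amount of data kept around.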
Think of going on a streaming service like Netflix. Netflix likely employs K-Nearest Neighbor algorithms in order to offer its customers constantly updating recommendations based on their viewing habits.
As you’ve probably noticed by now, movies and TV shows on Netflix have labels for genre, actors, directors, etc. When you watch a movie of a certain genre, or with a certain actor, you in turn get recommended movies and TV shows from the same genre or featuring the same actor.
This is K-Nearest Neighbor in action: the algorithm has searched for the nearest “neighbors” of that Brad Pitt thriller to offer you fifteen or so new titles, all of which you can tell are “neighbors” by the inclusion of Brad Pitt or the similarity in genre.
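The recommendation idea can be sketched in a few lines. This is an illustration only and not a description of how Netflix actually works: the catalog, the titles, the tags, and the function names `jaccard` and `recommend` are all made up, and measuring closeness by tag overlap (Jaccard similarity) is just one reasonable choice of “distance.”

```python
def jaccard(tags_a, tags_b):
    """Similarity of two tag sets: shared tags divided by all tags."""
    return len(tags_a & tags_b) / len(tags_a | tags_b)

def recommend(catalog, watched, k=2):
    """Return the k titles whose tags are most similar to the watched title."""
    watched_tags = catalog[watched]
    scored = [
        (title, jaccard(watched_tags, tags))
        for title, tags in catalog.items()
        if title != watched  # never recommend the title just watched
    ]
    # Highest similarity first, i.e. the nearest neighbors come first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in scored[:k]]

# Hypothetical catalog: every title and tag here is invented.
catalog = {
    "Heist Thriller": {"thriller", "actor_x", "crime"},
    "Spy Thriller": {"thriller", "actor_x"},
    "Crime Saga": {"crime", "drama", "thriller"},
    "Courtroom Drama": {"drama", "actor_x"},
    "Space Comedy": {"comedy", "sci-fi"},
}

print(recommend(catalog, "Heist Thriller"))  # ['Spy Thriller', 'Crime Saga']
```

Raising `k` simply widens the neighborhood, which is how a service could go from two suggestions to fifteen.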
While K-Nearest Neighbor algorithms make decisions based on the “distance” between data points, case-based reasoning is a bit more complex. It is all in the name: the algorithm stores past problems together with their solutions as “cases,” and when tested it retrieves the cases most similar to the new problem and adapts their solutions to fit it.
Given the complexity of the process, case-based reasoning is typically used for tasks that require a bit more thought than usual.
While a K-Nearest Neighbor algorithm will throw fifteen different movies at you on Netflix, you wouldn’t want to be so spoiled for choice when it comes to, say, planning a road trip.
When plotting the route from point A to B, a GPS’s goals are clear: get a time-efficient route that is also safe to travel. So, drawing on stored data about the layout of the world, a GPS tasked with mapping a trip from Missouri to Arkansas will focus on offering two or three time-efficient routes that have been subjected to significant scrutiny.
So, your GPS will not recommend a dozen ways to make your trip, but will rather offer a handful of thoughtful routes.
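Case-based reasoning is usually described as a cycle of retrieving a similar past case, reusing its solution, revising it if needed, and retaining the result as a new case. Here is a minimal sketch of the retrieve, reuse, and retain steps, loosely themed on the road-trip example. It is not how any real GPS works: the `Case` class, the attribute names, the route labels, and the simple count-the-matches similarity are all hypothetical, and the revise step is left out.

```python
from dataclasses import dataclass

@dataclass
class Case:
    problem: dict   # attributes describing a past trip request
    solution: str   # label of the route that worked for that request

def similarity(new_problem, old_problem):
    """Count the attributes on which two problem descriptions agree."""
    return sum(1 for key, value in new_problem.items()
               if old_problem.get(key) == value)

def solve(case_base, new_problem):
    # Retrieve: find the stored case most similar to the new problem.
    best = max(case_base, key=lambda c: similarity(new_problem, c.problem))
    # Reuse: borrow its solution (a fuller system would revise it here).
    proposed = best.solution
    # Retain: store the new problem and its solution for future queries.
    case_base.append(Case(new_problem, proposed))
    return proposed

# Hypothetical case base; the route labels are placeholders, not real roads.
case_base = [
    Case({"origin": "Missouri", "destination": "Arkansas",
          "avoid_tolls": True, "vehicle": "car"}, "route_1"),
    Case({"origin": "Missouri", "destination": "Kansas",
          "avoid_tolls": False, "vehicle": "truck"}, "route_2"),
    Case({"origin": "Texas", "destination": "Arkansas",
          "avoid_tolls": False, "vehicle": "truck"}, "route_3"),
]

request = {"origin": "Missouri", "destination": "Arkansas",
           "avoid_tolls": True, "vehicle": "truck"}
print(solve(case_base, request))  # prints route_1
print(len(case_base))             # prints 4
```

Unlike the K-Nearest Neighbor sketches, the stored items here carry a full solution rather than just a label, and the case base grows with every problem solved.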
Eager learning methods have their place, just as lazy learning methods have their place. Similarly, the different types of lazy learning each have their place as well. K-Nearest Neighbor algorithms are the most common type of lazy learning algorithm, and make decisions based on the “distance” (thinking of data like points on a plane) between data. Case-based reasoning represents a different type of learning method, where an algorithm reasons from stored cases in response to a test, and the result is a complex but logically sound solution.