Deep Reinforcement Learning: Making the Robots Smarter

If you’ve ever taught a dog to sit or shake, you’re familiar with the concept of reinforcement learning. Positive reinforcement is when an animal—or a child, if you’re lucky—learns a desired behavior based on the rewards it receives for the steps it takes to reach the desired outcome. For example, you give your dog a treat for sitting at the door when he needs to go out for a bathroom break, or you give your child a high five—or in my house, $5—when they do well on their spelling test. The subject—in this case, the dog or child—learns which behavior is good or bad based on the response it receives along the way. The same concept can be applied to artificial intelligence (AI).

Historically, one of the flaws of AI has been that machines and computer programs can’t learn from their mistakes. Instead, they rely on a complex set of data that helps them recognize words, things, and missions. Rather than learning by trial and error, as humans do, they refer to their internal set of hard-coded “instructions” to determine right and wrong. And while deep learning allows them to be retrained with massive amounts of new data to achieve better outcomes, they can’t improve those outcomes on their own. This process, also called “supervised learning,” requires extensive involvement on the part of the programmer. That’s where reinforcement learning comes in. Recently, tech giants like Google and its parent company, Alphabet, have been working to teach artificial intelligence programs to think for themselves through reinforcement learning. In other words, they’re helping them solve perceived problems, “rather than being taught what solutions look like.”

Many would agree the technology is still in its infancy, or as one writer put it, its green-and-black-DOS-screen stage. Although it’s been tremendously successful in gaming (including the much-hyped victory of Google DeepMind’s AlphaGo in the game of Go), few have found solid commercial uses yet outside of content personalization, ad placement, and more modest wins such as saving power or sorting trash. The following are a few ways programmers will be working to develop the technology in coming years to make it more useful in the commercial world, especially in marketing, as well as in our personal lives.

Reward Shaping

AI currently uses reinforcement learning to move through scenarios that have a clear set of perceived rewards. That’s one reason gaming has been such an easy place to start. In the future, however, programmers will focus more on “reward shaping”: teaching AI to work in situations where rewards are more nuanced and more action steps are involved. This will allow robots to move from simple acts like moving through a maze to determining that a maze needs to be navigated in order to achieve another perceived purpose.
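
For readers who want to see what this can look like in practice: in the reinforcement learning literature, reward shaping usually means adding small intermediate hints to a reward that would otherwise arrive only at the very end. The sketch below is one common variant, potential-based shaping, applied to a toy maze. The grid, the goal cell, the discount factor, and all function names are assumptions made up for illustration, not anything described in this article.

```python
from typing import Tuple

State = Tuple[int, int]   # (row, col) cell in a hypothetical grid maze

GOAL: State = (3, 3)      # assumed goal cell
GAMMA = 0.9               # assumed discount factor


def sparse_reward(next_state: State) -> float:
    """Original reward: 1 only when the goal is reached, otherwise 0."""
    return 1.0 if next_state == GOAL else 0.0


def potential(state: State) -> float:
    """Heuristic potential: negative Manhattan distance to the goal."""
    return -float(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))


def shaped_reward(state: State, next_state: State) -> float:
    """Sparse reward plus the shaping term GAMMA * phi(s') - phi(s)."""
    bonus = GAMMA * potential(next_state) - potential(state)
    return sparse_reward(next_state) + bonus


if __name__ == "__main__":
    # A step toward the goal earns a larger shaped reward than a step away.
    print(shaped_reward((0, 0), (0, 1)))
    print(shaped_reward((0, 1), (0, 0)))
```

The idea is that the agent gets a nudge on every step instead of waiting for the single payoff at the exit, which tends to matter more as tasks involve more steps.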

These abilities could manifest in AI that provides better, more relevant recommendations after a customer makes a purchase. AI already assists with content personalization for ad delivery, but imagine Netflix recommendations that are exactly what you feel like viewing at that moment—every time—or Alexa taking a recent request and turning it into a relevant Amazon order for an item you didn’t even realize you needed.

Wider Purpose

So far, reinforcement learning has been most successful in very specific, controlled situations. To be more useful in our work or personal lives, machines and programs will need to move “beyond a single, narrow domain” to develop common sense and handle more complex, less structured challenges. In other words, they’ll need to be able to infer when there is a real problem or mission in a living, changing environment.

Marketing professionals could apply these AI capabilities to respond faster in social media reputation management. AI algorithms could be trained to detect unhappy customers and go one step beyond today’s programs, which only analyze sentiment, by actually replying with a suggestion to solve the problem.

Greater Curiosity

At the moment, machines and programs have no drive of their own to assess or improve situations. In the future, programmers will work to build them with greater curiosity so they can find ways of improving the world around them.
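
One way researchers approximate curiosity is to pay the agent a small extra reward for visiting situations it has rarely seen. The sketch below is a minimal count-based version of that idea. The class name `CuriosityBonus` and the `beta` scaling parameter are hypothetical, chosen only for illustration, and this is not a description of any particular company's system.

```python
from collections import Counter
from math import sqrt
from typing import Hashable


class CuriosityBonus:
    """Count-based exploration bonus: rarely visited states pay more."""

    def __init__(self, beta: float = 1.0) -> None:
        self.beta = beta                    # assumed scaling factor
        self.visits: Counter = Counter()    # visit count per state

    def __call__(self, state: Hashable) -> float:
        # Record the visit, then return a bonus that shrinks with familiarity.
        self.visits[state] += 1
        return self.beta / sqrt(self.visits[state])


if __name__ == "__main__":
    bonus = CuriosityBonus()
    print(bonus("kitchen"))   # first visit: full bonus
    print(bonus("kitchen"))   # second visit: smaller bonus
    print(bonus("garage"))    # new place: full bonus again
```

In a training loop, this bonus would typically be added to whatever reward the environment already provides, so novelty becomes worth seeking for its own sake.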

AI that can actually explore the world around it and make suggestions for positive change could also, in theory, create compelling thought leadership content, or at least, more relevant content marketing articles that are indistinguishable from pieces written by human beings. Since content is such a huge piece of the marketing puzzle today, taking this job off the plates of human experts can free them to work in areas like content strategy, which still require a human touch.

Working in Less Controlled Environments

Humans aren’t always logical. In the past, self-driving cars have found it difficult to share the road with human-driven vehicles because human drivers’ actions don’t always make sense and, in essence, can’t be anticipated. AI agents will need to learn to adjust their actions in human-centered environments, where behavior often changes based on mood rather than clear rules or logic.

In the future, deep reinforcement learning could be a game changer in almost every industry. Not only could it free programmers from creating cumbersome data sets, it could also open up enormous growth potential for AI. This will be useful in self-driving vehicles (not just cars, but planes and trains alike), social media marketing, and customer service, as machines learn to adjust to customer complaints and service issues. Indeed, with reinforcement learning, robots will be able to take on even more “human” qualities of discernment and complex decision-making. Soon, the question of when personal robot assistants become a reality will be answered, and the only question we’ll be asking is what to do with all our free time.

This article was first published on Forbes.

Daniel Newman

Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise. From Big Data to IoT to Cloud Computing, Newman makes the connections between business, people, and tech that are required for companies to benefit most from their technology projects, which has led to his ideas regularly being cited in CIO.com, CIO Review, and hundreds of other sites across the world. A five-time best-selling author whose most recent book is “Building Dragons: Digital Transformation in the Experience Economy,” Daniel is also a Forbes, Entrepreneur, and Huffington Post contributor. An MBA and graduate adjunct professor, Daniel Newman is a Chicago native, and his speaking takes him around the world each year as he shares his vision of the role technology will play in our future.
