New AI Agent “Sees” As Humans Do

And it could have huge potential for robotic systems.

Most of today’s AI agents share a common limitation. We’ve done a good job developing systems that operate in known conditions they’ve experienced before, but most aren’t designed to work outside a single use. Think of a manufacturing environment where a system inspects products for defects. The product is moved into position, an optics system scans it, and the process is repeated again and again. What today’s AI agents aren’t terribly skilled at is entering a totally foreign environment and making actionable decisions based on what they find. Now, computer science researchers at the University of Texas at Austin have taught an AI system to take a few quick glances and infer its surrounding environment. This mirrors how our brains process visual cues, an ability previously thought to belong solely to humans.

How does the system work?

The researchers, led by professor Kristen Grauman, Ph.D., with Dinesh Jayaraman and Santhosh Ramakrishnan, used deep learning (inspired by our brain’s neural networks) to train their system by showing it 360-degree images of various environments. When presented with an unknown scene, the AI agent captures only a few glimpses of the environment, which together typically account for less than 20 percent of the full scene. What makes the new system distinctive is how it chooses those glimpses. After an initial glimpse, the system predicts where in the scene it should “look” next to gather the most useful new information for perception tasks. Based on these glimpses, the agent infers what it would have seen if it had looked in all available directions.

The agent works much like someone who finds themselves in a grocery store they’ve never visited before. If you were standing next to the apples, you could generally guess that there are oranges, bananas, and grapes nearby, but you’d likely have to look in a different direction to find the bread. Likewise, the agent makes intelligent guesses and uses them to reconstruct a full 360-degree image of the surroundings.
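To make the look, predict, and fill-in loop a little more concrete, here is a minimal Python sketch. It is not the researchers’ system: the 4-by-8 grid of views, the function names, the “farthest unseen view” rule, and the copy-the-nearest-view reconstruction are all assumptions chosen for illustration. In the real agent, both the choice of where to look and the reconstruction are learned by neural networks.

```python
import random

ROWS, COLS = 4, 8  # hypothetical grid: 4 elevations x 8 azimuths = 32 views


def distance(a, b):
    """Grid distance between two views; azimuth wraps around the panorama."""
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    return d_row + min(d_col, COLS - d_col)


def choose_next_glimpse(seen):
    """Stand-in for the learned policy (an assumption, not the real method):
    pick the unseen view farthest from everything observed so far, as a
    crude proxy for 'most useful new information'."""
    unseen = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in seen]
    return max(unseen, key=lambda v: min(distance(v, s) for s in seen))


def reconstruct(scene, seen):
    """Guess every view by copying the nearest observed view.
    Observed views are reproduced exactly; the rest are guesses."""
    guess = {}
    for r in range(ROWS):
        for c in range(COLS):
            nearest = min(seen, key=lambda s: distance((r, c), s))
            guess[(r, c)] = scene[nearest]
    return guess


def explore(scene, start, budget):
    """Take `budget` glimpses starting at `start`, then fill in the rest."""
    seen = [start]
    while len(seen) < min(budget, ROWS * COLS):
        seen.append(choose_next_glimpse(seen))
    return seen, reconstruct(scene, seen)


# Toy scene: one brightness value per view, generated at random.
rng = random.Random(0)
scene = {(r, c): rng.random() for r in range(ROWS) for c in range(COLS)}

seen, guess = explore(scene, start=(1, 0), budget=6)
coverage = len(seen) / (ROWS * COLS)
print(f"observed {len(seen)} of {ROWS * COLS} views ({coverage:.0%})")
```

With a budget of 6 glimpses out of 32 views, the toy agent observes just under 20 percent of the scene and guesses the rest, which is the same proportion the article describes.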

With the help of a few supercomputers, it took the researchers approximately a day to train their AI agent through a process known as reinforcement learning. For now, the system works like a person who’s stuck in place. However, the researchers plan to integrate it into a fully mobile robotic platform for use in search-and-rescue or disaster-response situations.

Interested in artificial intelligence?

Are you interested in working on the bleeding edge of artificial intelligence? Black Diamond Networks is a leading contract placement firm that works with companies pushing the limits of artificial intelligence and machine learning. Whether you’re looking for a new place to put your knowledge to work or you’re looking for a new industry to freshen things up, we’d love to talk to you. Visit or give us a call at 1.800.681.4734.
