Researchers at Intel’s AI Lab recently presented several compelling research papers at the International Conference on Machine Learning (ICML) (June 10-15) and the Conference on Computer Vision and Pattern Recognition (CVPR) (June 16-20).
At ICML, the Intel AI Lab presented two papers, detailed below:
Collaborative Evolutionary Reinforcement Learning for Robotics and More
An important branch of machine learning is Reinforcement Learning (RL). In RL, an agent learns which action to take; the action can be physical, like a robot moving an arm, or conceptual, like selecting which chess piece to move and where to move it. After each action, the agent observes the result and receives a reward from the environment depending on whether the outcome was positive or negative. In a chess game, for example, a move that blocks a check might earn a positive reward. One of the central challenges in RL is balancing exploitation of known rewards against exploration of the environment.

Policy gradient-based RL methods, which rely heavily on GPUs, are commonly used by AI researchers today. While these methods exploit rewards effectively for learning, they suffer from limited exploration and costly gradient computations. Another well-known approach is evolution: a population-based, gradient-free algorithm that handles sparse rewards well and improves exploration. Strong candidates are selected at every generation based on an evaluation criterion, and in turn generate new candidates for future generations. A downside is that evolutionary methods require significant processing time because candidates are evaluated only at the end of a complete episode. Here the core dichotomy appears again: explore the environment, or exploit the rewards as they arrive. To resolve this tension, Intel researchers have developed a new approach to RL called Collaborative Evolutionary Reinforcement Learning (CERL) that combines policy gradient and evolutionary methods to balance exploration and exploitation.
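CERL itself pairs gradient-based learners with an evolutionary population of policies that share experience; the toy sketch below illustrates only that high-level pattern on a one-dimensional optimization stand-in. The objective, mutation scale, and synchronization schedule here are illustrative assumptions, not the paper's algorithm:

```python
import random

def reward(w):
    """Toy 'return': peaks at w = 3.0 (stand-in for an RL episode score)."""
    return -(w - 3.0) ** 2

def grad_reward(w):
    """Analytic gradient of the toy reward (a policy-gradient stand-in)."""
    return -2.0 * (w - 3.0)

random.seed(0)
population = [random.uniform(-10, 10) for _ in range(10)]  # evolutionary candidates
learner = random.uniform(-10, 10)                          # gradient-based learner

for generation in range(200):
    # Gradient learner: exploit the reward signal with gradient ascent.
    learner += 0.05 * grad_reward(learner)

    # Evolution: evaluate all candidates, keep the fittest half,
    # and refill the population with mutated copies of the elites.
    population.sort(key=reward, reverse=True)
    elites = population[:5]
    population = elites + [e + random.gauss(0, 0.5) for e in elites]

    # Collaboration: periodically inject the gradient learner into the
    # population so exploration and exploitation inform each other.
    if generation % 10 == 0:
        population[-1] = learner

best = max(population + [learner], key=reward)
print(round(best, 2))  # prints 3.0
```

Because the learner's gradient updates are periodically copied into the population, the evolutionary side benefits from fast exploitation while still exploring via mutation.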
Using CERL, Intel researchers were able to solve robotics benchmarks with fewer cumulative training samples than traditional methods that rely on gradient-based or evolutionary learning alone. They tested CERL on standard academic benchmarks: Humanoid, Hopper, Swimmer, HalfCheetah, and Walker2D. The most complex of these are continuous control tasks, where the robot's actions are continuous rather than discrete. For example, the OpenAI Gym Humanoid benchmark requires a 3D humanoid model to learn to walk forward as fast as possible without falling; a cumulative reward reflects the level of success at this task. The problem is difficult due to the relatively large state space as well as the continuous action space, since the walking speed can be selected from a continuous range of values.
Until recently, the Humanoid benchmark was unsolved (robots could learn to walk, but they couldn’t keep up a sustained walk). The Intel team solved it using CERL and achieved a score of about 4,720 in 1M time steps.
Rethinking “Training” of Neural Nets
Today, neural network models are getting larger. Data scientists and researchers train huge models and then partition them into smaller, more manageable pieces, which puts tremendous strain on hardware and memory. This Intel paper flips that basic assumption of needing large models on its head. The researchers asked, “why don’t we start with a compact model and grow the model as we continue to train it?” Earlier attempts at this idea have struggled with the challenges of memory-constrained training and dynamic reconfiguration. For the first time, Intel researchers showed how to create smaller neural network models and dynamically grow them to any level of complexity.
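The paper's specific growth mechanism isn't described here, but one well-known way to grow a network without disturbing what it has already learned is function-preserving widening (in the style of Net2Net): duplicate a hidden unit and split its outgoing weight between the copies. A minimal sketch with a hand-rolled one-hidden-layer ReLU network; all sizes and weights are illustrative:

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class TinyNet:
    """One hidden ReLU layer: y = w2 . relu(W1 x)."""
    def __init__(self, in_dim, hidden):
        rnd = random.Random(0)
        self.W1 = [[rnd.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
        self.w2 = [rnd.uniform(-1, 1) for _ in range(hidden)]

    def forward(self, x):
        h = relu([dot(row, x) for row in self.W1])
        return dot(self.w2, h)

    def widen(self, unit):
        """Grow by one hidden unit while preserving the network's function:
        copy the unit's incoming weights and split its outgoing weight."""
        self.W1.append(list(self.W1[unit]))
        self.w2[unit] /= 2.0
        self.w2.append(self.w2[unit])

net = TinyNet(in_dim=3, hidden=4)
x = [0.5, -1.0, 2.0]
before = net.forward(x)
net.widen(unit=2)   # capacity grows from 4 to 5 hidden units...
after = net.forward(x)
print(abs(before - after) < 1e-9)  # ...but the computed function is unchanged
```

Because the new unit is an exact copy and the outgoing weight is split in half, the widened network computes the same outputs as before while gaining capacity that subsequent training can differentiate.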
At CVPR, Intel AI Lab and Intel Labs presented three papers as detailed below:
Introducing PartNet: The First Large-Scale 3D Object Dataset with Fine-Grained Part Annotations
Identifying objects and their parts is critical to how humans understand and interact with the world. For example, using a stove requires not only identifying the stove itself, but also its many subcomponents: its burners, control knobs, etc. This same capability is essential to many AI vision, graphics, and robotics applications, including predicting object functionality, human-object interaction, simulation, shape editing, and shape generation. This wide range of applications has spurred great demand for large 3D datasets with “part” annotations. However, existing 3D shape datasets provide part annotations only on a relatively small number of object instances. In other words, if we want AI to be able to make us a cup of tea, large new datasets are needed to better support the training of visual AI applications to parse and understand objects with many small details or with important subcomponents. Given these shortcomings, the Intel AI Lab, UCSD, Stanford, and Simon Fraser University presented PartNet, the world’s first large-scale 3D object dataset with fine-grained part annotations.
The open source dataset also includes mobility annotation, meaning that researchers can use it to create large-scale virtual environments to teach robots about the relationship between moving objects; for example, that pushing the OPEN button on a microwave makes the door swing open.
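PartNet's actual file format isn't described here, but a fine-grained, hierarchical part annotation can be pictured as a tree of named parts, with mobility information attached to movable ones. The sketch below uses an invented, illustrative schema; the part names, the `motion` field, and the `leaf_parts` helper are assumptions, not PartNet's real structure:

```python
# Hypothetical part-annotation tree for a microwave (illustrative only).
microwave = {
    "name": "microwave",
    "children": [
        {"name": "body", "children": []},
        {"name": "door", "motion": "revolute", "children": [
            {"name": "handle", "children": []},
        ]},
        {"name": "control_panel", "children": [
            {"name": "open_button", "children": []},
        ]},
    ],
}

def leaf_parts(node):
    """Collect the finest-grained parts (leaves of the annotation tree)."""
    if not node["children"]:
        return [node["name"]]
    leaves = []
    for child in node["children"]:
        leaves.extend(leaf_parts(child))
    return leaves

print(leaf_parts(microwave))  # ['body', 'handle', 'open_button']
```

A hierarchy like this is what lets an application reason at whatever granularity it needs: the whole door for simulation of the swing, or just the handle and button for grasp planning.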
Leveraging Acoustics For Digital Imaging
In this paper, Intel demonstrates the ability to construct digital images and see around corners using acoustic echoes. Non-line-of-sight (NLOS) imaging enables unprecedented capabilities for applications including robotics and machine vision, remote sensing, autonomous vehicle navigation, and medical imaging. Compared to the leading alternative NLOS imaging technologies, which are based on light detection and sensors, this acoustic method can reconstruct hidden objects using inexpensive, off-the-shelf hardware, at longer distances and with shorter exposure times. In this solution, Intel demonstrates how a system of speakers can emit sound waves while microphones capture the timing of the returning echoes; reconstruction algorithms inspired by seismic imaging then use those timings to build a digital picture of the hidden object or space.
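The paper's reconstruction pipeline is far more sophisticated, but the core geometric idea (an echo's round-trip time constrains where a hidden reflector can be) can be sketched in a toy 2D setup. Everything below, including the speaker/microphone layout, the noise-free timings, and the brute-force grid search, is an illustrative assumption, not the paper's method:

```python
import math

C = 343.0  # speed of sound in air, m/s

speaker = (0.0, 0.0)
mics = [(-0.5, 0.0), (0.5, 0.0), (0.0, 0.5)]
hidden = (2.0, 3.0)  # unknown reflector we want to localize

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Each echo's travel time: speaker -> reflector -> microphone.
times = [(dist(speaker, hidden) + dist(hidden, m)) / C for m in mics]

# Reconstruction: brute-force grid search for the point whose predicted
# echo times best match the measured ones (least-squares residual).
best, best_err = None, float("inf")
for i in range(201):
    for j in range(201):
        p = (i * 0.02, j * 0.02)
        err = sum(
            ((dist(speaker, p) + dist(p, m)) / C - t) ** 2
            for m, t in zip(mics, times)
        )
        if err < best_err:
            best, best_err = p, err

print(best)  # recovers a point near (2.0, 3.0)
```

Each measured echo time constrains the reflector to an ellipse with the speaker and one microphone as foci; with several microphones, the ellipses intersect at the reflector's position, which is what the residual search finds.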
Deeply-supervised Knowledge Synergy for Advancing the Training of Deep Convolutional Neural Networks
AI applications including facial recognition, image classification, object detection, and semantic image segmentation rely on Deep Convolutional Neural Networks (CNNs), technologies inspired by biological neural structures, to process information and efficiently find answers. However, leading CNNs are challenging to train: they require careful parameter tuning, and the more complex they become, the longer they take to train and the more energy they consume. In this paper, Intel researchers present a new training scheme, called Deeply-supervised Knowledge Synergy, that creates “knowledge synergies”, essentially enabling the CNN to transfer what it has learned between layers of the network during training. This improves the training and performance of CNNs, with gains in accuracy, robustness to noisy data, and recognition quality.
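The paper's exact loss isn't given here, but deep supervision generally attaches auxiliary classifiers to intermediate layers, and a "synergy" term can encourage those classifiers to learn from one another. A toy numeric sketch of such a combined loss; the head outputs, the 0.1 weighting, and the pairwise KL agreement term are illustrative assumptions:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, label):
    return -math.log(probs[label])

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Logits from the final classifier plus two auxiliary heads attached
# to intermediate layers (values are made up for illustration).
head_logits = [
    [2.0, 0.5, -1.0],   # deepest (main) head
    [1.2, 0.8, -0.5],   # auxiliary head, mid network
    [0.6, 0.4, 0.1],    # auxiliary head, early network
]
label = 0
heads = [softmax(l) for l in head_logits]

# Deep supervision: every head is trained against the true label.
supervised = sum(cross_entropy(p, label) for p in heads)

# Knowledge synergy: heads also learn from each other's predictions
# (pairwise KL divergence pulls their distributions together).
synergy = sum(kl(p, q) for p in heads for q in heads if p is not q)

total_loss = supervised + 0.1 * synergy
print(round(total_loss, 3))
```

Minimizing a loss of this shape trains the shallow layers directly (not just through backpropagated gradients from the top) while the pairwise term transfers knowledge between the heads.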