Deep Learning And Unsupervised Feature Learning

We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. Authors: Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng (2012).
CONTRIBUTED BY
Andrew Ng
Ellen Klingbeil
Quoc V. Le
Ashutosh Saxena
Morgan Quigley
Olga Russakovsky
Stephen Gould
Paul Baumstarck
Eric Berger


Description

Since its birth in 1956, the AI dream has been to build systems that exhibit broad-spectrum competence and intelligence. In the STAIR (STanford AI Robot) project, we are building a robot that can navigate home and office environments, pick up and interact with objects and tools, and intelligently converse with and help people in these environments. Our single robot platform will integrate methods drawn from all areas of AI, including machine learning, vision, navigation, manipulation, planning, reasoning, and speech/natural language processing. This is in distinct contrast to the 30-year trend of working on fragmented AI sub-fields, and will be a vehicle for driving research towards true integrated AI. Over the long term, we envision a single robot that can perform tasks such as:

Fetch or deliver items around the home or office.

Tidy up a room, including picking up and throwing away trash, and using the dishwasher.

Prepare meals using a normal kitchen.

Use tools to assemble a bookshelf.

A robot capable of these tasks will revolutionize home and office automation, and have important applications ranging from home assistance to elderly care. However, carrying out such tasks will require significant advances in integrating learning, manipulation, perception, spoken dialog, and reasoning.

Selected Papers

High-Accuracy 3D Sensing for Mobile Manipulation: Improving Object Detection and Door Opening, Morgan Quigley, Siddarth Batra, Stephen Gould, Ellen Klingbeil, Quoc Le, Ashley Wellman, Andrew Y. Ng. To appear in International Conference on Robotics and Automation (ICRA), 2009. [pdf]

Learning 3-D Object Orientation from Images, Ashutosh Saxena, Justin Driemeyer, Andrew Y. Ng. To appear in International Conference on Robotics and Automation (ICRA), 2009. [pdf]

Reactive Grasping using Optical Proximity Sensors, Kaijen Hsiao, Paul Nangeroni, Manfred Huber, Ashutosh Saxena, Andrew Y. Ng. To appear in International Conference on Robotics and Automation (ICRA), 2009. [pdf]

Learning grasp strategies with partial shape information, Ashutosh Saxena, Lawson Wong, and Andrew Y. Ng. In AAAI, 2008. [pdf]

A Fast Data Collection and Augmentation Procedure for Object Recognition, Benjamin Sapp, Ashutosh Saxena, and Andrew Y. Ng. In AAAI, 2008. [pdf]

Robotic Grasping of Novel Objects using Vision, Ashutosh Saxena, Justin Driemeyer, and Andrew Y. Ng. International Journal of Robotics Research (IJRR), vol. 27, no. 2, pp. 157-173, Feb 2008. [pdf]

A Vision-based System for Grasping Novel Objects in Cluttered Environments, Ashutosh Saxena, Lawson Wong, Morgan Quigley, and Andrew Y. Ng. In International Symposium on Robotics Research (ISRR), 2007. [pdf]

Robotic Grasping of Novel Objects, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, and Andrew Y. Ng. Neural Information Processing Systems (NIPS 19), 2006. [pdf]

Peripheral-Foveal Vision for Real-time Object Recognition and Tracking in Video, Stephen Gould, Joakim Arfvidsson, Adrian Kaehler, Benjamin Sapp, Marius Meissner, Gary Bradski, Paul Baumstarck, Sukwon Chung and Andrew Y. Ng. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07), 2007. [pdf]

Probabilistic Mobile Manipulation in Dynamic Environments, with Application to Opening Doors, Anya Petrovskaya and Andrew Y. Ng. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07), 2007. [pdf]

Depth Estimation Using Monocular and Stereo Cues, Ashutosh Saxena, Jamie Schulte and Andrew Y. Ng. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07), 2007.

Learning to grasp novel objects using vision, Ashutosh Saxena, Justin Driemeyer, Justin Kearns, Chioma Osondu, and Andrew Y. Ng. International Symposium on Experimental Robotics (ISER), 2006. [pdf]

Bayesian estimation for autonomous object manipulation based on tactile sensors, Anya Petrovskaya, Oussama Khatib, Sebastian Thrun, and Andrew Y. Ng. Proceedings of the International Conference on Robotics and Automation (ICRA), 2006.

Have we met? MDP based speaker ID for robot dialogue, Filip Krsmanovic, Curtis Spencer, Daniel Jurafsky and Andrew Y. Ng. Proceedings of the Ninth International Conference on Spoken Language Processing (InterSpeech--ICSLP), 2006. [pdf]

Workshop Papers

ROS: an open-source Robot Operating System, Morgan Quigley, Brian Gerkey, Ken Conley, Josh Faust, Tully Foote, Jeremy Leibs, Eric Berger, Rob Wheeler, Andrew Y. Ng. To appear in the open-source software workshop of the International Conference on Robotics and Automation (ICRA), 2009.

Integrating Visual and Range Data for Robotic Object Detection, Stephen Gould, Paul Baumstarck, Morgan Quigley, Andrew Y. Ng, Daphne Koller. Presented in Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications workshop, European Conference on Computer Vision (ECCV), 2008.

Learning to Open New Doors, Ellen Klingbeil, Ashutosh Saxena, Andrew Y. Ng. In RSS Workshop on Robot Manipulation, 2008. [pdf]

Learning to Open New Doors, Ellen Klingbeil, Ashutosh Saxena, Andrew Y. Ng. Presented in AAAI 17th Annual Robot Workshop and Exhibition, 2008.

STAIR: The STanford Artificial Intelligence Robot project, Andrew Y. Ng, Stephen Gould, Morgan Quigley, Ashutosh Saxena, Eric Berger. Snowbird, 2008.

STAIR: Hardware and Software Architecture, Andrew Y. Ng, Stephen Gould, Morgan Quigley, Ashutosh Saxena and Eric Berger. Presented in AAAI 2007 Robotics Workshop, 2007.

Ashutosh Saxena, Justin Driemeyer, Justin Kearns and Andrew Y. Ng. Presented in RSS workshop on Robotic Manipulation, 2006.

Datasets

Listed below are some of the datasets used in developing the algorithms for STAIR.

Object Grasping Data - This dataset contains images of objects (real and synthetic), depth maps (range images), and grasp labels (i.e., where to grasp each object); a minimal loading sketch is given at the end of this section. Details

3-D Object Data - This dataset contains images of objects in kitchen and office environments, taken with a 3-D camera (SwissRanger). Labels (the object class, and the point at which to grasp each object) are also given.

Augmented Dataset for Object Recognition - A database of 10 office object classes, collected both in a real-world cluttered environment and on a green screen. This data supported our experimental work and was also used to develop classifiers for our mobile robotics application. It was collected using the technique proposed in: A Fast Data Collection and Augmentation Procedure for Object Recognition, Benjamin Sapp, Ashutosh Saxena, and Andrew Y. Ng. AAAI, 2008. Details

Other Resources - The STAIR Vision Library (SVL) is a cross-platform, open-source computer vision and machine learning software library. Details | Download
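
As a rough illustration of how the Object Grasping Data described above might be consumed, the sketch below pairs a color image with its depth map (range image) and grasp-point labels. The file names, extensions, and array layout here are assumptions made for illustration only; the dataset's actual format is documented on its Details page.

    import numpy as np
    from PIL import Image

    # Hypothetical file layout: each example pairs a color image, a depth map
    # (range image), and a per-pixel mask of labeled grasp points. These names
    # and extensions are illustrative, not the dataset's actual format.
    def load_grasp_example(stem):
        image = np.asarray(Image.open(stem + "_rgb.png"))       # H x W x 3 color image
        depth = np.load(stem + "_depth.npy")                    # H x W range image, in meters
        grasp_mask = np.load(stem + "_grasp.npy").astype(bool)  # H x W grasp-point labels
        assert image.shape[:2] == depth.shape == grasp_mask.shape
        return image, depth, grasp_mask

    if __name__ == "__main__":
        image, depth, grasp_mask = load_grasp_example("example_0001")
        rows, cols = np.nonzero(grasp_mask)
        print(len(rows), "labeled grasp points; nearest labeled point at",
              round(float(depth[rows, cols].min()), 2), "m")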

Video Links

Opening many doors and using elevator buttons

Opening a single door

Operating an elevator button

Operating an elevator while navigating between floors

Grasping (Barrett 3-fingered hand) in cluttered environments

Learning to Open New Doors

Fetching a stapler. This video shows STAIR fetching an item in response to a verbal request, using a learned strategy for grasping novel objects, foveal-peripheral computer vision, depth perception, indoor navigation, and an MDP-based spoken dialog system (6x speed)

Unloading items from a dishwasher (2x speed)

Opening a door (8x speed)
