Cresceptron: The 1st Deep Learning for 3D:
Detecting and Recognizing 3D Objects from 2D Images of Cluttered Scenes and Segmenting Recognized Objects from the 2D Images

This project develops a framework called Cresceptron for automatically learning to recognize and segment real-world 3-D objects from their images, based on examples of how such tasks are performed. The Cresceptron has been tested on the task of visual recognition: recognizing general 3-D objects from 2-D electro-optical images of natural scenes and segmenting the recognized objects from their cluttered image background, without handcrafting a 3-D object model. Specifically, it recognizes and segments image patterns that are similar to those learned, using a stochastic distortion model and view-based interpolation, which allows viewpoints that are moderately different from those used in learning. It incorporates both individual learning and class learning: with the former, each training example is treated as a different individual, and with the latter, each example is a sample of a class. Several types of network mechanisms have been developed, and their properties are addressed in terms of knowledge recallability, positional invariance, generalization power, discrimination power, and space complexity. Experiments with a variety of real-world images are reported to demonstrate the feasibility of the Cresceptron, with almost perfect performance.

Borrowing the three main ideas of the Neocognitron by Fukushima (convolution, paired layers, and receptive fields that increase from early to later layers in a deep cascade), which was designed for a single isolated 2D character, Cresceptron achieved the following seven firsts:

  1. The 1st method for learning large-scale 3D objects using a deep convolutional neural network (CNN)
  2. The 1st deep learning network without human-selected features for large data (fully closed skull)
  3. The 1st concurrent detection and recognition for cluttered scenes (multiple 3D objects), using sensor-plane scans: scales and strides
  4. The 1st automatically generated paired deep-cascade: feature-matching layer paired with subsampling layer
  5. The 1st position blurring in subsampling layer
  6. The 1st max-pooling in subsampling layer
  7. The 1st to perform segmentation using post-analysis
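
The paired feature-matching and subsampling layers named in items 4 and 6 can be illustrated with a minimal sketch. This is a generic illustration in plain Python, not the Cresceptron's actual implementation: the function names (match_feature, max_pool, paired_layer), the dot-product matching score, and the 2x2 pooling window with stride 2 are all assumptions chosen for the example.

```python
def match_feature(image, template):
    """Feature-matching layer (illustrative): slide the template over every
    valid position of the image and record the dot-product response."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    response = []
    for r in range(ih - th + 1):
        row = []
        for c in range(iw - tw + 1):
            score = 0.0
            for dr in range(th):
                for dc in range(tw):
                    score += image[r + dr][c + dc] * template[dr][dc]
            row.append(score)
        response.append(row)
    return response


def max_pool(response, size=2, stride=2):
    """Subsampling layer (illustrative): keep only the maximum response
    inside each size x size window, stepping by `stride`."""
    rh, rw = len(response), len(response[0])
    pooled = []
    for r in range(0, rh - size + 1, stride):
        row = []
        for c in range(0, rw - size + 1, stride):
            row.append(max(response[r + dr][c + dc]
                           for dr in range(size) for dc in range(size)))
        pooled.append(row)
    return pooled


def paired_layer(image, template):
    """One paired stage: feature matching followed by max-pool subsampling."""
    return max_pool(match_feature(image, template))


# A hypothetical 2x2 diagonal feature and a 5x5 image that contains it.
template = [[1, 0],
            [0, 1]]
image = [[0, 0, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
# The same feature shifted one pixel up and to the left.
shifted = [[1, 0, 0, 0, 0],
           [0, 1, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0]]

print(paired_layer(image, template)[0][0])    # peak response after pooling
print(paired_layer(shifted, template)[0][0])  # same peak despite the shift
```

In this toy case the pooled peak is the same for both images, which shows how subsampling by the maximum tolerates a small shift in the position of a matched feature. Stacking several such pairs enlarges the image region that each later-layer unit responds to.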

References

J. Weng, N. Ahuja and T. S. Huang, "Cresceptron: a self-organizing neural network which grows adaptively," in Proc. Int'l Joint Conference on Neural Networks, Baltimore, Maryland, vol. 1, pp. 576-581, June 1992.
J. Weng, N. Ahuja and T. S. Huang, "Learning recognition and segmentation of 3-D objects from 2-D images," in Proc. 4th International Conf. Computer Vision, Berlin, Germany, pp. 121-128, May 1993.
J. Weng, N. Ahuja and T. S. Huang, "Learning recognition and segmentation using the Cresceptron," Int'l Journal of Computer Vision, vol. 25, no. 2, pp. 105-139, Nov. 1997.