Lobe Component Analysis (LCA)

The first stage of biological mental development is the development of the perceptual system.  The purpose of this early learning stage is to create an effective internal representation of the environment using a limited storage capacity.  The constant stream of high-dimensional sensory inputs cannot be stored exactly, but it contains a high degree of repeated, similar structure that the sensory system can take advantage of.  The single-layer Lobe Component Analysis (LCA) algorithm is designed with a similar principle in mind: learning an optimal representation of a set of input samples using a much smaller set of vectors, each called a lobe component.  Each lobe component is the most efficient estimator for the samples in its region.  LCA is an "in-place" algorithm: it operates incrementally (no samples are stored), requires no memory to save higher-order statistics such as the covariance matrix or kurtosis, and does not use a separate developer, such as gradient-based error backpropagation, to learn - the network develops and learns as a side effect of competitive interactions.

The figure above shows a set of samples, marked by crosses, around the origin.  The samples are represented by the three vectors v1, v2, and v3.  Each vector is a lobe component, and is the best single-vector representation (in both direction and length) for the set of samples within its region.  The regions are designated R1, R2, and R3.

The lobe components converge to represent well-separated regions through the neural learning mechanisms of Hebbian learning and lateral inhibition: for each sample, only the lobe component(s) closest in direction "wins," and updates its direction toward the sample's direction.  The variance of the samples for which each lobe component updated is preserved as the length, or energy, of the vector.
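The winner-update cycle described above can be sketched as follows.  This is a minimal illustration, not the downloadable implementation: the function name lca_update is made up here, and a single fixed amnesic parameter mu is assumed in place of the full staged amnesic schedule.

```python
import numpy as np

def lca_update(V, counts, x, mu=2.0, unsigned=True):
    """One incremental LCA step: find the winning lobe component for
    sample x and update it with an amnesic mean (a sketch, assuming a
    constant amnesic parameter mu rather than a staged schedule)."""
    # Response of each lobe component: projection of the sample onto
    # its unit-length direction.
    norms = np.linalg.norm(V, axis=1) + 1e-12
    y = V @ x / norms
    # Unsigned version: opposite directions are folded together by
    # taking the absolute value of the response.
    scores = np.abs(y) if unsigned else y
    j = int(np.argmax(scores))            # lateral inhibition -> one winner
    counts[j] += 1
    n = counts[j]
    # Amnesic-mean weights: retention w1 and learning w2 sum to 1.
    w1 = max(0.0, (n - 1.0 - mu) / n)
    w2 = 1.0 - w1
    # Hebbian update: response times input; the vector's length comes
    # to reflect the variance of the samples in its region.
    V[j] = w1 * V[j] + w2 * y[j] * x
    return j

# Toy usage: 3 lobe components developing from random 2-D samples.
rng = np.random.default_rng(0)
V = rng.standard_normal((3, 2))
counts = np.zeros(3)
for _ in range(500):
    lca_update(V, counts, rng.standard_normal(2))
```

No samples or higher-order statistics are stored between steps, which is the "in-place" property the text describes: the only state is the lobe-component matrix and an update count per component.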

Above is a result of the unsigned version of the algorithm.  Each vector represents samples by the absolute value of their direction.  The version of LCA available for download on this page is unsigned, although this can be changed within the program.

When the input to LCA is a set of samples from natural images, such as the one above, the lobe components converge to a set of mostly localized edge detectors, as seen below.  This is the same result produced by the Independent Component Analysis (ICA) algorithm, which seeks to extract signals based on some measure of independence, such as maximum kurtosis or minimum mutual information.  But LCA is not designed to minimize or maximize any particular criterion.  The results are similar because natural images are super-Gaussian: the correlated areas of high density characteristic of super-Gaussian input lead LCA to converge to the most independent signals.  So, LCA is ICA for super-Gaussians.

The above is a grid of 256 lobe components developed from natural images, sorted from the top left to the bottom right by the number of times each was updated.  LCA naturally leads to sparsely responding features for computer vision.  Observe that most of the area of the lower-order components is gray (zero).  The remaining area, the edge area, is the receptive field of the neuron; it will not respond strongly unless there is similar activity in the input.  Good feature separation with sparse response is a desirable property for a real-world perceptual system, in terms of efficiency.  These low-level features are similar to those observed in the first area of the visual cortex.  They can be combined in different ways over a larger receptive field to create higher-level features for contour detection.  This recursive use of shared features is another useful property of a perceptual system with limited resources.

Considering the many examples of super-Gaussian data, such as real images or real audio, LCA can be a very efficient way to extract the independent components.  As seen above, the LCA algorithm (using the amnesic-averaging fast learning technique) converges to a known set of independent super-Gaussian source signals (compared with ground truth; the y-axis is % error) in many fewer iterations than two ICA algorithms.  FastICA is a batch algorithm, and Extended Infomax (which does not converge at all in this case) is a block-incremental algorithm.  LCA is more constrained than either of these, yet it outperforms both here.
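The amnesic average mentioned above can be illustrated in isolation.  This is a sketch under the same simplifying assumption as before (a constant amnesic parameter mu instead of a staged schedule); the function name amnesic_mean is invented for illustration.

```python
def amnesic_mean(samples, mu=2.0):
    """Amnesic average: like an incremental sample mean, but each new
    sample gets a boosted weight (1 + mu) / t, so recent samples count
    more and the estimate adapts faster.  Sketch with constant mu."""
    m = 0.0
    for t, x in enumerate(samples, start=1):
        w2 = min(1.0, (1.0 + mu) / t)  # learning weight, clamped for small t
        m = (1.0 - w2) * m + w2 * x
    return m

# With mu = 0 this reduces to the ordinary incremental sample mean.
print(amnesic_mean([1.0, 2.0, 3.0, 4.0], mu=0.0))  # -> 2.5
```

With mu > 0 the estimate is pulled toward later samples, which is why the plot above shows LCA converging in fewer iterations: early, poorly-placed estimates are forgotten more quickly.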

An extension of LCA enforces lobe component position within a two-dimensional grid.  An updating unit also updates its neighbors within a limited-size neighborhood window.  Above is an example of topographic LCA using "high-level" digit data.  Below is an example of the result with low-level features.  Topographic updating leads to sparsely responding areas instead of single units; within each area are variations of the same feature.  Representing each feature in several different ways is desirable because of the many possible variations of each.
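The neighborhood window for topographic updating can be sketched as below.  This is an assumed square window clipped at the grid edges (the function name topographic_winners is invented; the actual window shape and size in the downloadable code may differ).

```python
def topographic_winners(grid_shape, winner, radius=1):
    """Return the flat indices of units in a 2-D grid that fall inside
    the winner's neighborhood window, so they can share the update.
    Sketch: square window of the given radius, clipped at the edges."""
    rows, cols = grid_shape
    r, c = divmod(winner, cols)  # winner's position on the grid
    neighbors = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                neighbors.append(nr * cols + nc)
    return neighbors

# Unit 5 in a 4x4 grid (row 1, col 1) updates itself and its 8 neighbors.
print(topographic_winners((4, 4), 5))  # -> [0, 1, 2, 4, 5, 6, 8, 9, 10]
```

Because nearby units share updates, similar features cluster into contiguous patches on the grid, which produces the sparsely responding areas described above.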

There is a description of the code and a brief tutorial here.