Basic Concepts

This page introduces the fundamental concepts behind Self-Organizing Maps (SOMs) and how they work.

What is a Self-Organizing Map?

A Self-Organizing Map (SOM), also known as a Kohonen map, is an unsupervised neural network algorithm that:

Clusters similar data points together
Reduces dimensionality by mapping high-dimensional data to a lower-dimensional grid, usually 2D
Preserves topology by keeping similar data points close together on the map
Visualizes patterns in complex, high-dimensional datasets

Key Characteristics:

Unsupervised: No labeled data required
Competitive learning: Neurons compete to represent input data
Topology preservation: Maintains neighborhood relationships
Dimensionality reduction: Maps N-dimensional data to a 2D grid

How SOMs Work

The SOM Algorithm

1. Initialize the weight vector of each neuron randomly
2. Present an input vector to the network
3. Find the Best Matching Unit (BMU) - the neuron whose weight vector is most similar to the input
4. Update the BMU and its neighbors to be more similar to the input
5. Repeat steps 2-4 until convergence or until the maximum number of iterations is reached
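
The steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `train_som`, the 5x5 rectangular grid, the Gaussian neighborhood, and the asymptotic decay of both the learning rate and the radius are hypothetical choices made for brevity.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=500, alpha0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM on `data` (shape: n_samples x n_features)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    # Step 1: initialize one weight vector per neuron at random.
    weights = rng.random((rows, cols, n_features))
    # Grid coordinates of every neuron, used for neighborhood distances.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )

    for t in range(n_iter):
        # Step 2: present one randomly chosen input vector.
        x = data[rng.integers(len(data))]
        # Step 3: the BMU is the neuron whose weights are closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Assumed schedule: learning rate and radius both shrink over time.
        alpha = alpha0 / (1 + t / n_iter)
        sigma = sigma0 / (1 + t / n_iter)
        # Step 4: pull the BMU and its grid neighbors toward x.
        grid_dist_sq = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_dist_sq / (2 * sigma ** 2))
        weights += alpha * h[..., None] * (x - weights)
    # Step 5: stop once the maximum number of iterations is reached.
    return weights

rng = np.random.default_rng(42)
data = rng.random((200, 3))   # synthetic data in [0, 1)
weights = train_som(data)
print(weights.shape)  # (5, 5, 3)
```

Because every update is a convex combination of the old weight and the input, the weights stay inside the range spanned by the initial weights and the data.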

Mathematical Foundation

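Distance Calculation

The similarity between an input vector \(x\) and the weight vector \(w_i\) of neuron i is typically measured using Euclidean distance over the n features, where a smaller distance means a closer match: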

\[d(x, w_i) = \sqrt{\sum_{j=1}^{n} (x_j - w_{i,j})^2}\]
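
As a small illustration of this distance in use, the sketch below finds the BMU on a toy 2x2 map. The helper name `find_bmu` and the weight values are made up for the example.

```python
import numpy as np

def find_bmu(x, weights):
    """Return the grid index (row, col) of the neuron closest to x."""
    # Euclidean distance from x to every neuron's weight vector.
    dists = np.sqrt(np.sum((x - weights) ** 2, axis=-1))
    idx = np.unravel_index(np.argmin(dists), dists.shape)
    return tuple(int(i) for i in idx)

# A 2x2 map with 2-dimensional weights (toy values).
weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [1.0, 1.0]]])
x = np.array([0.9, 0.2])
bmu = find_bmu(x, weights)
print(bmu)  # (1, 0)
```

The input (0.9, 0.2) lies closest to the neuron with weights (1, 0), which sits at row 1, column 0.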

Weight Update Rule

The weight update follows:

\[w_i(t+1) = w_i(t) + \alpha(t) \cdot h_{BMU,i}(t) \cdot (x(t) - w_i(t))\]

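Where:

\(w_i(t)\) is the weight vector of neuron i at time t
\(\alpha(t)\) is the learning rate at time t
\(h_{BMU,i}(t)\) is the neighborhood function, which scales the update by how close neuron i is to the BMU on the grid
\(x(t)\) is the input vector at time t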
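
A single application of the update rule can be worked through numerically. In this sketch the function name `update_weights` and all values are illustrative assumptions; `h` holds one precomputed neighborhood value per neuron.

```python
import numpy as np

def update_weights(weights, x, alpha, h):
    """Apply one SOM update step; h holds one neighborhood value per neuron."""
    # Each neuron moves toward x by a fraction alpha * h of the remaining gap.
    return weights + alpha * h[..., None] * (x - weights)

weights = np.zeros((1, 2, 2))        # two neurons, both at the origin
x = np.array([1.0, 1.0])
h = np.array([[1.0, 0.5]])           # BMU gets 1.0, its neighbor 0.5
new_weights = update_weights(weights, x, alpha=0.5, h=h)
print(new_weights)
```

With a learning rate of 0.5, the BMU moves halfway to the input (to 0.5 in each coordinate), while the neighbor with h = 0.5 moves a quarter of the way (to 0.25).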

Core Components

1. Grid Topology

SOMs organize neurons in a grid structure:

Rectangular Grid

Each neuron has up to 8 neighbors (4 if diagonal cells are not counted)
Simple, intuitive visualization
Good for most applications

Hexagonal Grid

Each neuron has up to 6 neighbors
More uniform neighborhood distances
Better for circular/radial patterns
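
The neighbor counts for both topologies can be checked with a short sketch. The layout used here for the hexagonal grid (odd rows shifted by half a cell, rows spaced sqrt(3)/2 apart) is one common convention and is an assumption of this example, as are the helper names.

```python
import numpy as np

def grid_positions(rows, cols, topology="rectangular"):
    """Return an array of shape (rows * cols, 2) with (x, y) neuron positions."""
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    x = c.astype(float)
    y = r.astype(float)
    if topology == "hexagonal":
        # Shift every odd row by half a cell and compress the rows so that
        # all six nearest neighbors sit at distance 1.
        x = x + 0.5 * (r % 2)
        y = y * np.sqrt(3) / 2
    return np.column_stack([x.ravel(), y.ravel()])

def count_neighbors(positions, index, radius):
    """Count neurons within `radius` of neuron `index`, excluding itself."""
    d = np.linalg.norm(positions - positions[index], axis=1)
    return int(np.sum((d > 0) & (d <= radius + 1e-9)))

center = 2 * 5 + 2  # interior neuron (row 2, col 2) of a 5x5 map
rect = grid_positions(5, 5, "rectangular")
hexa = grid_positions(5, 5, "hexagonal")
n_rect = count_neighbors(rect, center, np.sqrt(2))  # diagonals included
n_hex = count_neighbors(hexa, center, 1.0)
print(n_rect, n_hex)  # 8 6
```

On the rectangular grid the diagonal neighbors are farther away (sqrt(2)) than the side neighbors (1), whereas all six hexagonal neighbors are equidistant, which is what "more uniform neighborhood distances" refers to.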

2. Neighborhood Function

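Determines how strongly each neuron is updated, based on its grid distance \(d_{BMU,i}\) from the BMU and a neighborhood radius \(\sigma(t)\) that typically shrinks during training: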

Gaussian (Most Common): \[h_{BMU,i}(t) = \exp\left(-\frac{d_{BMU,i}^2}{2\sigma(t)^2}\right)\]
Bubble: Step function - neurons within the radius are updated equally, all others are left unchanged
Triangle: Linear decay from BMU to neighborhood boundary
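
The three shapes can be compared side by side. This is a sketch only: using the same parameter `sigma` as the radius for the bubble and triangle functions is an assumption of this example, and implementations may parameterize them differently.

```python
import numpy as np

def gaussian(d, sigma):
    # Smooth decay; never exactly zero.
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def bubble(d, sigma):
    # 1 inside the radius, 0 outside.
    return (d <= sigma).astype(float)

def triangle(d, sigma):
    # Falls linearly from 1 at the BMU to 0 at distance sigma.
    return np.clip(1 - d / sigma, 0.0, None)

d = np.array([0.0, 1.0, 2.0, 3.0])   # grid distances from the BMU
for name, fn in [("gaussian", gaussian), ("bubble", bubble), ("triangle", triangle)]:
    print(name, np.round(fn(d, sigma=2.0), 3))
```

All three return 1 at the BMU itself. Only the Gaussian still assigns a small nonzero update to neurons beyond the radius.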

3. Learning Rate Decay

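Controls how much the weights change at each training step. In the schedules below, \(\alpha_0\) is the initial learning rate, \(t\) is the current iteration, and \(T\) is a decay constant, typically derived from the maximum number of iterations: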

Asymptotic Decay: \[\alpha(t) = \frac{\alpha_0}{1 + t/T}\]
Linear Decay: \[\alpha(t) = \alpha_0 \cdot (1 - t/T)\]
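
The two schedules can be evaluated directly to see how they differ. The function names and the values of `alpha0` and `T` below are illustrative assumptions.

```python
def asymptotic_decay(alpha0, t, T):
    """alpha(t) = alpha0 / (1 + t / T)"""
    return alpha0 / (1 + t / T)

def linear_decay(alpha0, t, T):
    """alpha(t) = alpha0 * (1 - t / T)"""
    return alpha0 * (1 - t / T)

alpha0, T = 0.5, 100
for t in (0, 50, 100):
    print(t, round(asymptotic_decay(alpha0, t, T), 4), round(linear_decay(alpha0, t, T), 4))
```

Both start at `alpha0`. At t = T the linear schedule reaches zero, while the asymptotic schedule has only halved, so it keeps making small adjustments for as long as training continues.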

4. Distance Functions

Different ways to measure similarity:

Euclidean: Standard geometric distance
Cosine: Measures angle between vectors
Manhattan: Sum of absolute differences
Chebyshev: Maximum absolute difference
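
For reference, the four measures can be written in a few lines of NumPy. The function names are chosen for this sketch, and cosine is expressed here as a distance (1 minus cosine similarity), which is one common convention.

```python
import numpy as np

def euclidean(x, w):
    return np.sqrt(np.sum((x - w) ** 2))

def cosine(x, w):
    # Cosine distance: 1 minus the cosine of the angle between x and w.
    return 1 - np.dot(x, w) / (np.linalg.norm(x) * np.linalg.norm(w))

def manhattan(x, w):
    return np.sum(np.abs(x - w))

def chebyshev(x, w):
    return np.max(np.abs(x - w))

x = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 6.0, 3.0])
for fn in (euclidean, cosine, manhattan, chebyshev):
    print(fn.__name__, round(float(fn(x, w)), 4))
```

For these two vectors the per-feature differences are 3, 4, and 0, giving a Euclidean distance of 5, a Manhattan distance of 7, and a Chebyshev distance of 4. Cosine distance ignores vector length, so it suits data where direction matters more than magnitude.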

5. Quality Metrics

Quantization Error: Average distance between data points and the weight vectors of their BMUs. It measures how well the map represents the data; lower is better.
Topographic Error: Percentage of data points whose best and second-best matching units are not neighbors on the grid. It measures how well topology is preserved; lower is better.
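
Both metrics can be computed from the pairwise distances between samples and neurons. The sketch below assumes a rectangular grid where the 8 surrounding cells count as neighbors, and it returns the topographic error as a fraction rather than a percentage. The function name `quality_metrics` and the toy values are hypothetical.

```python
import numpy as np

def quality_metrics(data, weights):
    """Return (quantization_error, topographic_error) for a rectangular map."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(-1, n_features)
    # Distance from every sample to every neuron: shape (n_samples, n_neurons).
    dists = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    order = np.argsort(dists, axis=1)
    best, second = order[:, 0], order[:, 1]
    # Quantization error: mean distance from each sample to its BMU.
    qe = dists[np.arange(len(data)), best].mean()
    # Topographic error: share of samples whose two best units are not
    # adjacent on the grid (8-neighborhood assumed).
    r1, c1 = np.divmod(best, cols)
    r2, c2 = np.divmod(second, cols)
    adjacent = np.maximum(np.abs(r1 - r2), np.abs(c1 - c2)) <= 1
    te = 1.0 - adjacent.mean()
    return float(qe), float(te)

data = np.array([[0.1], [1.9]])
ordered = np.array([[[0.0], [1.0], [2.0]]])   # 1x3 map, weights in grid order
twisted = np.array([[[0.0], [2.0], [1.0]]])   # same weights, shuffled on the grid
print(quality_metrics(data, ordered))
print(quality_metrics(data, twisted))
```

Both maps contain the same weight vectors, so their quantization error is identical (about 0.1). Only the twisted map has a nonzero topographic error (0.5), because for the first sample the two closest neurons sit at opposite ends of the grid.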

Strengths and Weaknesses

Advantages

No assumptions about data distribution
Topology preservation maintains relationships
Intuitive visualization of complex data
Unsupervised learning - no labels needed

Limitations

Computationally expensive for large datasets
Parameter sensitive - requires tuning
Interpretation challenges for very high dimensions

Best Practices

Data Preparation

Normalize features to similar scales
Remove highly correlated features
Handle missing values appropriately
Consider dimensionality reduction for very high dimensions
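
Feature scaling matters because the BMU search is distance-based, so a feature with a large numeric range would otherwise dominate. A minimal sketch of z-score standardization follows; the function name `standardize` and the sample values are assumptions of this example, and other scalings such as min-max are equally valid.

```python
import numpy as np

def standardize(data):
    """Scale each feature (column) to zero mean and unit variance."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    # Avoid division by zero: constant features are only centered.
    std[std == 0] = 1.0
    return (data - mean) / std

data = np.array([[1.0, 100.0, 7.0],
                 [2.0, 200.0, 7.0],
                 [3.0, 300.0, 7.0]])
scaled = standardize(data)
print(scaled)
```

After scaling, the first two columns are identical even though their original ranges differed by a factor of 100, and the constant third column becomes all zeros.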

Parameter Selection

Experiment with different topologies and functions
Monitor training progress with error curves to guide parameter choice

Interpretation

Use multiple visualizations to understand the map
Combine with domain knowledge for meaningful insights
Validate findings with other analysis methods
Document parameter choices for reproducibility

Next Steps

Now that you understand the basics, explore:

SOM Visualization Guide - Visualization techniques