Training
This guide covers how to configure and monitor SOM training: initialization, the decay schedules for the learning rate and neighborhood width, the key hyperparameters, and the BMU search backend. The update rule itself is derived in Basic Concepts.
The training loop in one call
Training is two steps — initialize the weights, then fit:
import torch
from torchsom import SOM
data = torch.randn(2000, 8)
som = SOM(x=25, y=15, num_features=8, epochs=100, batch_size=16)
som.initialize_weights(data=data, mode="pca")
q_errors, t_errors = som.fit(data=data)
fit shuffles the data each epoch, processes it in batches, applies the
neighborhood-weighted update, decays the learning rate and neighborhood width, and
records the quantization error (QE) and topographic error (TE) per epoch. The two
returned lists are your convergence trace.
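To make the steps inside fit concrete, here is a minimal NumPy sketch of one epoch. It is an illustration under stated assumptions, not torchsom's implementation: the function names train_epoch and quantization_error are hypothetical, the neighborhood is assumed Gaussian on a rectangular grid, and the update is averaged over each batch.

```python
import numpy as np


def quantization_error(weights, data):
    """Mean distance from each sample to its best-matching unit."""
    flat = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return float(dists.min(axis=1).mean())


def train_epoch(weights, data, lr, sigma, batch_size, rng):
    """One illustrative epoch: shuffle, batch, find BMUs, apply the update."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(rows * cols, n_features).copy()
    # (row, col) position of every neuron on the grid, shape (rows*cols, 2)
    grid = np.stack(np.divmod(np.arange(rows * cols), cols), axis=1)

    order = rng.permutation(len(data))  # reshuffle every epoch
    for start in range(0, len(data), batch_size):
        batch = data[order[start:start + batch_size]]
        diff = batch[:, None, :] - flat[None, :, :]      # (B, N, F)
        bmu = (diff ** 2).sum(axis=-1).argmin(axis=1)    # (B,)
        # Gaussian neighborhood around each sample's BMU, measured on the grid
        grid_d2 = ((grid[bmu][:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))        # (B, N)
        # Batch-averaged, neighborhood-weighted step toward the samples
        flat += lr * (h[:, :, None] * diff).mean(axis=0)
    return flat.reshape(rows, cols, n_features)


rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, size=(200, 3))   # toy data, deliberately far from the initial weights
weights = rng.normal(size=(5, 5, 3))
qe_before = quantization_error(weights, data)
for _ in range(5):
    weights = train_epoch(weights, data, lr=0.5, sigma=1.0, batch_size=16, rng=rng)
qe_after = quantization_error(weights, data)
```

The sketch keeps lr and sigma fixed for brevity; in a real run they shrink between epochs according to the chosen decay schedules.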
Initialization
initialize_weights seeds the codebook before training:
| Mode | Behavior |
|---|---|
| `"pca"` | Spread weights along the first two principal components of the data. Faster, more reproducible convergence; the recommended default. |
| `"random"` | Sample initial weights randomly from the data range. |
som.initialize_weights(data=data, mode="pca") # or "random"
Initialization quality strongly affects the final map; PCA initialization usually reaches a lower QE/TE in fewer epochs.
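The idea behind PCA initialization can be sketched in a few lines of NumPy. This is one common way to do it and an assumption on our part: the helper pca_init is hypothetical, and the scaling of the grid along each component (one standard deviation in each direction) is chosen for illustration and may differ from what initialize_weights does.

```python
import numpy as np


def pca_init(data, rows, cols):
    """Lay a rows x cols codebook on the plane of the first two principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by explained variance
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    std = s[:2] / np.sqrt(len(data) - 1)  # standard deviation along PC1 and PC2
    a = np.linspace(-1.0, 1.0, rows)[:, None, None]  # grid coordinate along PC1
    b = np.linspace(-1.0, 1.0, cols)[None, :, None]  # grid coordinate along PC2
    return mean + a * std[0] * vt[0] + b * std[1] * vt[1]


rng = np.random.default_rng(0)
sample = rng.normal(size=(2000, 8))
init_weights = pca_init(sample, rows=25, cols=15)  # shape (25, 15, 8)
```

Because neighboring neurons start with neighboring weights, the map is already ordered before the first update, which is a plausible reason for the faster convergence noted above.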
Decay schedules
The learning rate \(\alpha(t)\) and neighborhood width \(\sigma(t)\) shrink over training so that updates start broad (global ordering) and end local (fine tuning). Pick a schedule per parameter:
| Schedule | Learning rate (\(\alpha\)) | Neighborhood width (\(\sigma\)) |
|---|---|---|
| Asymptotic (default) |  |  |
| Inverse |  |  |
| Linear |  |  |
The inverse and linear schedules drive \(\alpha(t) \to 0\) and \(\sigma(t) \to 1\) as \(t\) approaches the final epoch \(T\), that is, no global drift and effectively single-neuron updates at the end, which is what gives the map its fine local structure. The exact formulas are in Basic Concepts.
som = SOM(
x=25, y=15, num_features=8,
learning_rate=0.95, # initial alpha
sigma=1.75, # initial neighborhood width
lr_decay_function="lr_linear_decay_to_zero",
sigma_decay_function="sig_inverse_decay_to_one",
)
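To get a feel for how the two schedules chosen above behave, the sketch below implements one plausible form of each in plain Python. The formulas are assumptions made for illustration (a straight line down to zero, and an inverse curve that lands exactly on one at the last epoch); the function names are hypothetical, and the authoritative definitions are the ones in Basic Concepts.

```python
def linear_decay_to_zero(lr0, t, total):
    """Assumed form: straight line from lr0 at t=0 down to 0 at t=total."""
    return lr0 * (1.0 - t / total)


def inverse_decay_to_one(sigma0, t, total):
    """Assumed form: inverse curve from sigma0 at t=0 down to 1 at t=total."""
    c = (sigma0 - 1.0) / total
    return sigma0 / (1.0 + c * t)


total_epochs = 100
# (epoch, alpha, sigma) sampled every 25 epochs for the values used above
trace = [
    (t, linear_decay_to_zero(0.95, t, total_epochs), inverse_decay_to_one(1.75, t, total_epochs))
    for t in range(0, total_epochs + 1, 25)
]
```

Printing trace shows the learning rate falling evenly while the neighborhood width shrinks quickly at first and then levels off near one.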
Key hyperparameters
| Parameter | Default | Guidance |
|---|---|---|
| `epochs` | 10 | Full passes over the data. Increase until QE/TE flatten. |
| `batch_size` | 5 | Larger batches use the GPU more efficiently; raise it for big data. |
| `learning_rate` | 0.5 | Initial step size, typically 0.1–1.0. |
| `sigma` | 1.0 | Initial neighborhood radius. Scale with the grid size. |
|  | 1 | Discrete neighborhood extent; also used by Just-in-Time Learning retrieval. |
|  |  | Also |
|  |  | Also |
|  | 42 | Fix for reproducible runs. |
Tip
Always standardize features before training (e.g. scikit-learn’s
StandardScaler). The BMU search compares raw feature distances, so
unscaled features let large-magnitude columns dominate.
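As a minimal sketch of that standardization step, the NumPy helpers below compute per-column statistics on the training set and reuse them for any later data. The helper names are hypothetical; the arithmetic is the usual zero-mean, unit-variance transform that StandardScaler also applies.

```python
import numpy as np


def fit_standardizer(train):
    """Per-column mean and standard deviation, computed on training data only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid dividing by zero
    return mean, std


def standardize(x, mean, std):
    """Apply statistics learned on the training set to any data split."""
    return (x - mean) / std


rng = np.random.default_rng(0)
# Three columns on very different scales, as in many real datasets
raw = rng.normal(size=(500, 3)) * np.array([1.0, 100.0, 0.01]) + np.array([0.0, 50.0, -3.0])
mean, std = fit_standardizer(raw)
scaled = standardize(raw, mean, std)
```

Held-out data should be transformed with the same mean and std rather than its own, so that distances stay comparable with those seen during training. The scaled array can then be turned into a tensor, for example with torch.from_numpy(scaled).float().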
BMU search backend
Finding the Best-Matching Unit is the per-step bottleneck. The search_backend
argument selects the implementation:
| Value | Behavior |
|---|---|
| `"auto"` | Use FAISS if it is installed, otherwise the PyTorch backend. |
|  | Full pairwise distance computation on GPU/CPU. No extra dependency. |
|  | Approximate nearest-neighbor search, faster for large maps and high-dimensional inputs. Install with |
som = SOM(x=90, y=70, num_features=300, search_backend="auto")
For the default 25×15 grids, the PyTorch backend is already fast; FAISS pays off on large maps (e.g. 90×70) or high-dimensional data.
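For intuition about what a full pairwise search costs, here is a brute-force BMU lookup sketched in NumPy. It is an illustration of the technique, not the library's backend: find_bmus is a hypothetical name, and the trick of expanding the squared distance into a matrix product is a standard one that we assume rather than take from torchsom.

```python
import numpy as np


def find_bmus(data, codebook):
    """Index of the nearest codebook row for every sample (exact, brute force)."""
    # ||x - w||^2 = ||x||^2 - 2 x.w + ||w||^2. The ||x||^2 term is constant
    # per sample, so dropping it does not change which neuron wins.
    cross = data @ codebook.T               # (n_samples, n_neurons)
    w_norm = (codebook ** 2).sum(axis=1)    # (n_neurons,)
    return (w_norm[None, :] - 2.0 * cross).argmin(axis=1)


rng = np.random.default_rng(0)
samples = rng.normal(size=(50, 8))
codebook = rng.normal(size=(25 * 15, 8))    # a 25x15 map flattened to 375 neurons
bmus = find_bmus(samples, codebook)

# Reference result from explicit pairwise distances
reference = np.linalg.norm(samples[:, None, :] - codebook[None, :, :], axis=-1).argmin(axis=1)
```

The work grows with samples × neurons × features, which is why an approximate index starts to pay off once the map or the feature dimension gets large.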
Monitoring convergence
Use the returned error traces to decide whether training was long enough:
q_errors, t_errors = som.fit(data=data)
print(f"final QE = {q_errors[-1]:.4f}, final TE = {t_errors[-1]:.4f}")
Plot them with plot_training_errors()
(see Visualization Gallery). Both curves should fall and then flatten; if either
is still dropping at the last epoch, raise epochs.
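If you prefer a programmatic check over reading the plot, a small helper can flag whether a trace has flattened. This is a sketch under our own assumptions: has_plateaued, its window of 5 epochs, and its 1% relative tolerance are hypothetical choices, not part of torchsom.

```python
def has_plateaued(errors, window=5, rel_tol=0.01):
    """True when the last `window` epochs improved the error by less than rel_tol (relative)."""
    if len(errors) <= window:
        return False  # not enough history to judge
    earlier, latest = errors[-window - 1], errors[-1]
    if earlier == 0:
        return True  # already at zero error, nothing left to gain
    return (earlier - latest) / abs(earlier) < rel_tol


still_falling = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
flattened = [1.0, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
```

Applied to the lists returned by fit, a False result for either q_errors or t_errors suggests raising epochs.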
You can also compute the metrics on held-out data (test_data stands for a tensor of unseen samples with the same features as the training data):
qe = som.quantization_error(data=test_data)
te = som.topographic_error(data=test_data)
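To clarify what the topographic error measures, the sketch below computes it by hand in NumPy: the share of samples whose best and second-best matching units are not neighbors on the grid. It assumes a rectangular grid and treats the eight surrounding cells as neighbors; both choices, and the function name, are assumptions for illustration, and the library's own definition may differ.

```python
import numpy as np


def topographic_error_sketch(weights, data):
    """Fraction of samples whose two closest neurons are not adjacent on the grid."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(rows * cols, n_features)
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=-1)
    two_best = np.argsort(d2, axis=1)[:, :2]   # best and second-best neuron per sample
    r, c = np.divmod(two_best, cols)           # flat index -> (row, col)
    adjacent = (np.abs(r[:, 0] - r[:, 1]) <= 1) & (np.abs(c[:, 0] - c[:, 1]) <= 1)
    return float(1.0 - adjacent.mean())


points = np.array([[0.1], [1.2], [2.9]])
ordered_map = np.array([[[0.0], [1.0], [2.0], [3.0]]])     # 1x4 map, weights in grid order
scrambled_map = np.array([[[0.0], [3.0], [1.0], [2.0]]])   # same weights, shuffled on the grid
te_ordered = topographic_error_sketch(ordered_map, points)
te_scrambled = topographic_error_sketch(scrambled_map, points)
```

Both maps have identical quantization error on these points, since they hold the same weight vectors; only the topographic error reveals that the second map has lost its ordering.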
Next steps
Topologies & Boundary Conditions — Grid choice and periodic boundary conditions
Visualization Gallery — Plotting the training curve and maps
Tutorials — Full runs on real datasets
Core API — SOM and fit reference