Utilities API

The utils module provides supporting functions for Self-Organizing Map (SOM) training and analysis.

Distance Functions

Utility functions for distances.

Available distance functions:

Function	Description
`euclidean`	Standard Euclidean distance (default)
`cosine`	Cosine distance (1 - cosine similarity)
`manhattan`	Manhattan (L1) distance
`chebyshev`	Chebyshev (L∞) distance
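The four distances in the table can be illustrated with a dependency-free sketch. These are plain-Python versions operating on two vectors, written only to show the definitions; the library's own functions work on tensors, and their exact signatures are not shown in this section, so the argument lists below are assumptions.

```python
import math


def euclidean(a, b):
    # Square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    # 1 - cosine similarity; undefined if either vector has zero norm.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def manhattan(a, b):
    # Sum of absolute differences (L1).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute difference over all coordinates (L-infinity).
    return max(abs(x - y) for x, y in zip(a, b))


a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
print(euclidean(a, b), manhattan(a, b), chebyshev(a, b), cosine(a, b))
```

Euclidean, Manhattan and Chebyshev depend on vector magnitude, while cosine distance only depends on direction, which can matter when the input features are not normalized.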

Neighborhood Functions

Utilities implementing the neighborhood functions.

Available neighborhood functions:

Function	Description
`gaussian`	Gaussian neighborhood (default, smooth)
`mexican_hat`	Mexican hat (Ricker wavelet, inhibitory surround)
`bubble`	Bubble function (step function)
`triangle`	Triangular function (linear decay)
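As a rough illustration of the shapes named in the table, the sketch below writes each neighborhood as a function of the grid distance `d` from the winning neuron and a width `sigma`. The formulas are common textbook forms and are assumptions here; the library may normalize or parameterize them differently.

```python
import math


def gaussian(d, sigma):
    # Smooth bell curve: 1 at the center, decaying towards 0.
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def mexican_hat(d, sigma):
    # Ricker wavelet (unnormalized): positive center, negative surround.
    r2 = (d * d) / (sigma * sigma)
    return (1.0 - r2) * math.exp(-r2 / 2.0)


def bubble(d, sigma):
    # Step function: full update inside the radius, none outside.
    return 1.0 if d <= sigma else 0.0


def triangle(d, sigma):
    # Linear decay from 1 at the center to 0 at distance sigma.
    return max(0.0, 1.0 - d / sigma)


for d in (0.0, 0.5, 1.0, 2.0):
    print(d, gaussian(d, 1.0), mexican_hat(d, 1.0), bubble(d, 1.0), triangle(d, 1.0))
```

Only the Mexican hat takes negative values, which is what the "inhibitory surround" in the table refers to: neurons just outside the center are pushed away from the input rather than pulled towards it.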

Decay Functions

Utilities implementing the decay schedules for the learning rate and sigma.

Available decay functions:

Function	Description
`asymptotic_decay`	General asymptotic decay (default)
`lr_inverse_decay_to_zero`	Learning rate inverse decay to zero
`lr_linear_decay_to_zero`	Learning rate linear decay to zero
`sig_inverse_decay_to_one`	Sigma inverse decay to one
`sig_linear_decay_to_one`	Sigma linear decay to one
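The schedules in the table can be sketched as follows. The function names, the `(value, t, max_iter)` argument order and the exact formulas are assumptions modelled on conventions used in other SOM implementations; they are meant to show the difference between the schedules, not to reproduce the library's code.

```python
def asymptotic(value, t, max_iter):
    # Falls to one third of the start value at t == max_iter.
    return value / (1.0 + t / (max_iter / 2.0))


def inverse_to_zero(lr, t, max_iter):
    # Hyperbolic decay that approaches, but never reaches, zero.
    c = max_iter / 100.0
    return lr * c / (c + t)


def linear_to_zero(lr, t, max_iter):
    # Straight line from lr at t == 0 to 0 at t == max_iter.
    return lr * (1.0 - t / max_iter)


def inverse_to_one(sigma, t, max_iter):
    # Hyperbolic decay from sigma to 1 at t == max_iter.
    c = (sigma - 1.0) / max_iter
    return sigma / (1.0 + t * c)


def linear_to_one(sigma, t, max_iter):
    # Straight line from sigma at t == 0 to 1 at t == max_iter.
    return sigma + t * (1.0 - sigma) / max_iter


for t in (0, 50, 100):
    print(t, linear_to_zero(0.5, t, 100), linear_to_one(3.0, t, 100))
```

The learning-rate schedules end at (or near) zero so that updates vanish by the end of training, while the sigma schedules end at one so that the neighborhood never shrinks below a single neuron.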

Grid and Topology

Utility functions for grid operations.

torchsom.utils.grid.adjust_meshgrid_topology(xx, yy, topology)

Adjust coordinates based on topology.

Parameters:

xx (torch.Tensor) – Mesh grid of x coordinates
yy (torch.Tensor) – Mesh grid of y coordinates
topology (str) – SOM configuration, usually rectangular or hexagonal

Returns:

Adjusted x and y mesh grids for a hexagonal topology.

Return type:

Tuple[torch.Tensor, torch.Tensor]

torchsom.utils.grid.create_mesh_grid(x, y, device)

Create a mesh grid for neighborhood calculations.

The function returns two 2D tensors representing the x-coordinates and y-coordinates of a grid of shape (x, y). This is useful for computing distance-based neighborhood functions in Self-Organizing Maps (SOM).

Parameters:

x (int) – Number of rows (height of the grid).
y (int) – Number of columns (width of the grid).
device (str) – The device on which tensors should be allocated (‘cpu’ or ‘cuda’).

Returns:

Two tensors (xx, yy) of shape (x, y), representing the x and y coordinates of the mesh grid.

Return type:

Tuple[torch.Tensor, torch.Tensor]
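To make the shape of the two returned grids concrete, here is a list-based sketch of a mesh grid with `x` rows and `y` columns, followed by one possible hexagonal adjustment. The helper names are hypothetical, and the hexagonal layout used (odd rows shifted by half a column, rows spaced by sqrt(3)/2) is an assumption; the library's adjustment may shift a different axis.

```python
import math


def make_mesh_grid(x, y):
    # xx[i][j] holds the row index i, yy[i][j] holds the column index j,
    # the same layout torch.meshgrid produces with indexing="ij".
    xx = [[float(i) for _ in range(y)] for i in range(x)]
    yy = [[float(j) for j in range(y)] for _ in range(x)]
    return xx, yy


def to_hexagonal(xx, yy):
    # Assumed layout: compress the row spacing and shift odd rows right,
    # so that all six neighbors of a cell end up at distance 1.
    row_step = math.sqrt(3.0) / 2.0
    hx = [[value * row_step for value in row] for row in xx]
    hy = [
        [value + 0.5 if i % 2 == 1 else value for value in row]
        for i, row in enumerate(yy)
    ]
    return hx, hy


xx, yy = make_mesh_grid(2, 3)
hx, hy = to_hexagonal(xx, yy)
print(xx, yy)
print(hx, hy)
```

With this layout a cell and the cell directly below it are exactly one unit apart, which is what makes distance-based neighborhood functions behave consistently on a hexagonal grid.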

Utility functions for topology.

torchsom.utils.topology.get_all_neighbors_up_to_order(topology, max_order)

Get all neighbors from order 1 up to max_order.

Parameters:

topology (str) – “rectangular” or “hexagonal”
max_order (int) – Maximum neighborhood order to include

Returns:

All neighbor offsets from order 1 to max_order combined

Return type:

list[tuple[int, int]] | dict[str, list[tuple[int, int]]]

torchsom.utils.topology.get_hexagonal_offsets(neighborhood_order=1)

Get neighbor offset coordinates for hexagonal topology at any order.

Order n has 6*n elements.

Parameters:

neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.

Returns:

Offsets for even and odd rows

Return type:

Dict[str, List[Tuple[int, int]]]
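The reason hexagonal offsets come in an even-row and an odd-row variant, and why a ring of order n has 6*n entries, can be seen in the following sketch. It assumes an "odd-r" layout (odd rows shifted right), `(d_row, d_col)` offsets, and dictionary keys `"even"` and `"odd"`; all three are assumptions for illustration and may not match the library's conventions.

```python
def to_axial(row, col):
    # Convert "odd-r" offset coordinates to axial (q, r) coordinates.
    return col - (row - (row & 1)) // 2, row


def hex_distance(a, b):
    # Number of hexagonal steps between two (row, col) cells.
    (q1, r1), (q2, r2) = to_axial(*a), to_axial(*b)
    dq, dr = q2 - q1, r2 - r1
    return max(abs(dq), abs(dr), abs(dq + dr))


def hexagonal_ring(order):
    # Collect the (d_row, d_col) offsets at exactly `order` steps,
    # separately for a cell on an even row and a cell on an odd row.
    offsets = {}
    for name, base_row in (("even", 0), ("odd", 1)):
        ring = []
        for d_row in range(-order, order + 1):
            for d_col in range(-order - 1, order + 2):
                target = (base_row + d_row, d_col)
                if hex_distance((base_row, 0), target) == order:
                    ring.append((d_row, d_col))
        offsets[name] = ring
    return offsets


ring = hexagonal_ring(1)
print(ring["even"])
print(ring["odd"])
```

Because alternate rows are shifted, the same physical neighbor has a different column offset depending on the parity of the starting row, so a single offset list cannot serve both cases.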

torchsom.utils.topology.get_rectangular_offsets(neighborhood_order=1)

Get neighbor offset coordinates for rectangular topology at any order.

Parameters:

neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.

Returns:

Coordinate offsets for rectangular grid

Return type:

List[Tuple[int, int]]

Notes

Order 1: 8 neighbors
Order 2: 16 neighbors
Order 3: 24 neighbors
Order 3+: All positions at Chebyshev distance (order-1)
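The neighbor counts listed in the notes (8, 16, 24) are what one gets by taking all cells at Chebyshev distance exactly n from the center, which is 8*n cells. The sketch below assumes that ring definition and also shows how rings can be combined up to a maximum order; the helper names are hypothetical.

```python
def rectangular_ring(order):
    # All (d_row, d_col) offsets whose Chebyshev distance equals `order`.
    span = range(-order, order + 1)
    return [
        (d_row, d_col)
        for d_row in span
        for d_col in span
        if max(abs(d_row), abs(d_col)) == order
    ]


def rectangular_up_to(max_order):
    # Concatenate the rings of order 1, 2, ..., max_order.
    offsets = []
    for order in range(1, max_order + 1):
        offsets.extend(rectangular_ring(order))
    return offsets


print([len(rectangular_ring(n)) for n in (1, 2, 3)])
print(len(rectangular_up_to(3)))
```

Combining rings 1 through m covers the full (2m+1) by (2m+1) square around the center, minus the center itself.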

Initialization

Utility functions for initialization.

torchsom.utils.initialization.initialize_weights(weights, data, mode='random', topology='rectangular', device='cpu')

Main function to initialize weights based on specified method.

Parameters:

weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
mode (str, optional) – Initialization method, “random” or “pca”. Defaults to “random”.
topology (str, optional) – Grid configuration, “rectangular” or “hexagonal”. Defaults to “rectangular”.
device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor

Raises:

ValueError – If an invalid initialization mode is provided

torchsom.utils.initialization.pca_init(weights, data, topology, device='cpu')

Initialize SOM weights using PCA for faster convergence.

Parameters:

weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
topology (str) – Grid configuration, “rectangular” or “hexagonal”
device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor

torchsom.utils.initialization.random_init(weights, data, device='cpu')

Initialize SOM weights by sampling random data points.

Parameters:

weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor to sample from [batch_size, num_features]
device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor
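The idea behind random initialization, giving every neuron the value of a randomly chosen input sample, can be sketched without tensors as below. The helper name, the `seed` argument and sampling with replacement are assumptions; the library function instead receives a pre-allocated weight tensor.

```python
import random


def random_init_sketch(rows, cols, data, seed=None):
    # Build a rows x cols grid in which each neuron's weight vector
    # is a copy of one randomly chosen data sample.
    rng = random.Random(seed)
    return [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]


data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = random_init_sketch(2, 3, data, seed=0)
print(weights)
```

Starting from real samples guarantees that every neuron begins inside the data distribution, which avoids neurons that never win because they start far from all inputs.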

Available initialization methods:

Method	Description
`random`	Random sampling from input data (default)
`pca`	PCA-based initialization for faster convergence

Metrics

Utility functions for metrics.

torchsom.utils.metrics.calculate_calinski_harabasz_score(data, labels)

Calculate Calinski-Harabasz index using scikit-learn.

Parameters:

data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Calinski-Harabasz index (higher is better, >= 0)

Return type:

float

torchsom.utils.metrics.calculate_clustering_metrics(data, labels, som=None)

Calculate comprehensive clustering quality metrics.

Parameters:

data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
som (Optional[BaseSOM]) – SOM instance for topological metrics

Returns:

Dictionary of clustering quality metrics

Return type:

dict[str, float]

torchsom.utils.metrics.calculate_davies_bouldin_score(data, labels)

Calculate Davies-Bouldin index using scikit-learn.

Parameters:

data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Davies-Bouldin index (lower is better, >= 0)

Return type:

float

torchsom.utils.metrics.calculate_quantization_error(data, weights, distance_fn)

Calculate quantization error for a SOM.

Parameters:

data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [x, y, num_features]
distance_fn (Callable) – Function to compute distances between data and weights

Returns:

Average quantization error value

Return type:

float
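Quantization error is the mean distance from each sample to its closest neuron. The sketch below shows that computation on nested lists; the helper name and the pairwise `distance_fn(sample, weight)` calling convention are assumptions, and the library function operates on tensors.

```python
import math


def quantization_error_sketch(data, weights, distance_fn):
    # Flatten the [x, y, num_features] grid into a list of weight vectors.
    neurons = [w for row in weights for w in row]
    # For each sample, keep the distance to its best matching unit.
    total = sum(min(distance_fn(sample, w) for w in neurons) for sample in data)
    return total / len(data)


weights = [[[0.0, 0.0], [10.0, 0.0]]]  # a 1 x 2 map with 2 features
data = [[1.0, 0.0], [10.0, 2.0]]
print(quantization_error_sketch(data, weights, math.dist))
```

Lower values mean the neurons sit closer to the data, but the metric says nothing about whether the grid layout preserves neighborhood relations.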

torchsom.utils.metrics.calculate_silhouette_score(data, labels)

Calculate silhouette score for clustering results using scikit-learn.

Parameters:

data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Silhouette score (-1 to 1, higher is better)

Return type:

float
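For readers unfamiliar with the silhouette score, the following dependency-free sketch computes it from its standard definition using Euclidean distance. It is not how the library computes the value (that is delegated to scikit-learn); the helper name is hypothetical and the code assumes at least two distinct labels.

```python
import math


def silhouette_sketch(data, labels):
    # Group sample indices by cluster label.
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Samples alone in their cluster score 0 by convention.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(data[i], data[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(data[i], data[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


data = [[0.0], [1.0], [10.0], [11.0]]
print(silhouette_sketch(data, [0, 0, 1, 1]))
print(silhouette_sketch(data, [0, 1, 0, 1]))
```

The first labeling groups nearby points together and scores close to 1, while the second mixes the two groups and scores below 0.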

torchsom.utils.metrics.calculate_topographic_error(data, weights, distance_fn, topology='rectangular')

Calculate topographic error for a SOM.

Parameters:

data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [x, y, num_features]
distance_fn (Callable) – Function to compute distances between data and weights
topology (str, optional) – Grid configuration. Defaults to “rectangular”.

Returns:

Topographic error ratio

Return type:

float
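Topographic error counts the samples whose best and second-best matching neurons are not neighbors on the grid. The sketch below covers only the rectangular case and assumes that "adjacent" means one of the 8 surrounding cells; the helper name and the pairwise `distance_fn` convention are also assumptions.

```python
import math


def topographic_error_sketch(data, weights, distance_fn):
    # Pair every neuron's grid position with its weight vector.
    cells = [
        ((i, j), w) for i, row in enumerate(weights) for j, w in enumerate(row)
    ]
    errors = 0
    for sample in data:
        # Rank neurons by distance to the sample and take the best two.
        ranked = sorted(cells, key=lambda cell: distance_fn(sample, cell[1]))
        (r1, c1), (r2, c2) = ranked[0][0], ranked[1][0]
        # Not adjacent if they are more than one step apart on the grid.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


weights = [[[0.0], [5.0], [1.0]]]  # a 1 x 3 map with 1 feature
data = [[0.4], [4.0]]
print(topographic_error_sketch(data, weights, math.dist))
```

In this toy map the neurons with weights 0.0 and 1.0 are close in feature space but two cells apart on the grid, so the sample 0.4 counts as a topographic error while the sample 4.0 does not.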

torchsom.utils.metrics.calculate_topological_clustering_quality(som, labels)

Calculate how well clusters respect SOM topological structure.

This metric measures the spatial coherence of clusters on the SOM grid. Higher values indicate that clusters are more spatially compact on the grid.

Parameters:

som (BaseSOM) – Trained SOM instance
labels (torch.Tensor) – Cluster labels for neurons [n_neurons]

Returns:

Topological clustering quality (0 to 1, higher is better)

Return type:

float

Quality metrics for evaluating SOM training:

Metric	Description
`quantization_error`	Average distance between input vectors and their Best Matching Units (BMUs)
`topographic_error`	Proportion of data vectors for which the first and second BMUs are not adjacent on the grid