Utilities API

The utils module provides supporting functions for SOM training and analysis.

Distance Functions

Distance functions used to compare input vectors with neuron weights.

Available distance functions:

Function              Description
euclidean             Standard Euclidean distance (default)
cosine                Cosine distance (1 - cosine similarity)
manhattan             Manhattan (L1) distance
chebyshev             Chebyshev (L∞) distance
weighted_euclidean    Weighted Euclidean distance with feature weights
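
The distances listed above can be illustrated with a dependency-free sketch. These helpers are not torchsom's implementation (which operates on tensors); they are plain-Python restatements of the standard formulas. The weighted variant assumes the common convention sqrt(sum(w_i * (a_i - b_i)^2)), which may differ from the library's.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """L2 distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance: 1 - cosine similarity (undefined for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Sequence[float], b: Sequence[float]) -> float:
    """L-infinity distance: largest absolute difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def weighted_euclidean(
    a: Sequence[float], b: Sequence[float], w: Sequence[float]
) -> float:
    """Euclidean distance with per-feature weights (assumed convention)."""
    return math.sqrt(sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w)))


a, b = [0.0, 3.0], [4.0, 0.0]
print(euclidean(a, b), manhattan(a, b), chebyshev(a, b))
```

For orthogonal vectors such as `a` and `b` here, the cosine distance is 1 regardless of their lengths, which is why it is often preferred when only direction matters.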

Neighborhood Functions

Neighborhood functions that determine how strongly neurons around the BMU are updated.

Available neighborhood functions:

Function       Description
gaussian       Gaussian neighborhood (default, smooth)
mexican_hat    Mexican hat (Ricker wavelet, inhibitory surround)
bubble         Bubble function (step function)
triangle       Triangular function (linear decay)
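
As a rough illustration of the shapes named above, the sketch below evaluates each neighborhood as a function of the grid distance `d` to the BMU and a width `sigma`. The function signatures and the exact scaling (in particular the unnormalized Mexican hat form and the use of `sigma` as the bubble radius) are assumptions for illustration, not torchsom's code.

```python
import math


def gaussian(d: float, sigma: float) -> float:
    """Smooth bell curve: 1 at the BMU, decaying towards 0."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def mexican_hat(d: float, sigma: float) -> float:
    """Unnormalized Ricker wavelet: positive centre, negative surround."""
    r2 = (d * d) / (sigma * sigma)
    return (1.0 - r2) * math.exp(-r2 / 2.0)


def bubble(d: float, sigma: float) -> float:
    """Step function: full update inside the radius, none outside."""
    return 1.0 if d <= sigma else 0.0


def triangle(d: float, sigma: float) -> float:
    """Linear decay from 1 at the BMU to 0 at distance sigma."""
    return max(0.0, 1.0 - d / sigma)


for d in (0.0, 0.5, 1.0, 2.0):
    print(d, gaussian(d, 1.0), mexican_hat(d, 1.0), bubble(d, 1.0), triangle(d, 1.0))
```

The negative values of the Mexican hat beyond `d = sigma` are what produce the inhibitory surround: neurons in that band are pushed away from the input rather than pulled towards it.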

Decay Functions

Decay functions that schedule the learning rate and sigma over the course of training.

Available decay functions:

Function                    Description
asymptotic_decay            General asymptotic decay (default)
lr_inverse_decay_to_zero    Learning rate inverse decay to zero
lr_linear_decay_to_zero     Learning rate linear decay to zero
sig_inverse_decay_to_one    Sigma inverse decay to one
sig_linear_decay_to_one     Sigma linear decay to one
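
This page does not give the decay formulas, so the sketch below only shows forms that are common in SOM implementations and that match the names above. The argument order `(value, t, max_iter)` and the formulas themselves are assumptions; check the torchsom source before relying on them.

```python
def asymptotic_decay(value: float, t: int, max_iter: int) -> float:
    """Assumed form: value / (1 + t / (max_iter / 2))."""
    return value / (1.0 + t / (max_iter / 2.0))


def linear_decay_to_zero(learning_rate: float, t: int, max_iter: int) -> float:
    """Assumed form: learning rate falls linearly, reaching 0 at max_iter."""
    return learning_rate * (1.0 - t / max_iter)


def linear_decay_to_one(sigma: float, t: int, max_iter: int) -> float:
    """Assumed form: sigma moves linearly from its start value to 1."""
    return sigma + t * (1.0 - sigma) / max_iter


max_iter = 100
for t in (0, 50, 100):
    print(
        t,
        asymptotic_decay(0.5, t, max_iter),
        linear_decay_to_zero(0.5, t, max_iter),
        linear_decay_to_one(3.0, t, max_iter),
    )
```

Under these assumed forms, sigma schedules stop at 1 rather than 0 so that the neighborhood never shrinks below a single grid cell.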

Grid and Topology

Utility functions for grid operations.

torchsom.utils.grid.adjust_meshgrid_topology(xx, yy, topology)

Adjust coordinates based on topology.

Parameters:
  • xx (torch.Tensor) – Mesh grid of x coordinates

  • yy (torch.Tensor) – Mesh grid of y coordinates

  • topology (str) – SOM configuration, usually rectangular or hexagonal

Returns:

Adjusted x and y mesh grids for a hexagonal topology.

Return type:

Tuple[torch.Tensor, torch.Tensor]

torchsom.utils.grid.create_mesh_grid(x, y, device)

Create a mesh grid for neighborhood calculations.

The function returns two 2D tensors representing the x-coordinates and y-coordinates of a grid of shape (x, y). This is useful for computing distance-based neighborhood functions in Self-Organizing Maps (SOM).

Parameters:
  • x (int) – Number of rows (height of the grid).

  • y (int) – Number of columns (width of the grid).

  • device (str) – The device on which tensors should be allocated (‘cpu’ or ‘cuda’).

Returns:

Two tensors (xx, yy) of shape (x, y), representing the x and y coordinates of the mesh grid.

Return type:

Tuple[torch.Tensor, torch.Tensor]
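
To make the two grid helpers above more concrete, here is a plain-Python sketch using nested lists instead of tensors. The row-major indexing (`xx[i][j] == i`, `yy[i][j] == j`) is an assumption consistent with the documented shape (x, y). The hexagonal adjustment shown (odd rows shifted by half a cell, rows spaced by sqrt(3)/2) is one common layout and is not necessarily what `adjust_meshgrid_topology` does. Both function names are hypothetical.

```python
import math


def create_mesh_grid_sketch(x: int, y: int):
    """Return (xx, yy), each with x rows and y columns."""
    xx = [[float(i) for _ in range(y)] for i in range(x)]  # row index
    yy = [[float(j) for j in range(y)] for _ in range(x)]  # column index
    return xx, yy


def adjust_for_hexagonal_sketch(xx, yy):
    """Assumed hexagonal layout: compress rows, shift odd rows by 0.5."""
    row_spacing = math.sqrt(3.0) / 2.0
    new_xx = [[v * row_spacing for v in row] for row in xx]
    new_yy = [
        [v + 0.5 if i % 2 == 1 else v for v in row]
        for i, row in enumerate(yy)
    ]
    return new_xx, new_yy


xx, yy = create_mesh_grid_sketch(2, 3)
hx, hy = adjust_for_hexagonal_sketch(xx, yy)
# Distance between cell (0, 0) and its neighbor in the next row, cell (1, 0)
print(math.hypot(hx[1][0] - hx[0][0], hy[1][0] - hy[0][0]))
```

With this layout, a cell and its immediate neighbors all sit at the same Euclidean distance, which is the property distance-based neighborhood functions need on a hexagonal map.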

Utility functions for topology.

torchsom.utils.topology.get_all_neighbors_up_to_order(topology, max_order)

Get all neighbors from order 1 up to max_order.

Parameters:
  • topology (str) – “rectangular” or “hexagonal”

  • max_order (int) – Maximum neighborhood order to include

Returns:

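All neighbor offsets from order 1 to max_order combined: a list for rectangular topology, or a dict with separate lists for even and odd rows for hexagonal topology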

Return type:

list[tuple[int, int]] | dict[str, list[tuple[int, int]]]

torchsom.utils.topology.get_hexagonal_offsets(neighborhood_order=1)

Get neighbor offset coordinates for hexagonal topology at any order.

Order n has 6*n elements.

Parameters:

neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.

Returns:

Offsets for even and odd rows

Return type:

Dict[str, List[Tuple[int, int]]]
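
The claim that a hexagonal ring of order n has 6*n cells can be checked with a short sketch. It uses axial coordinates (q, r), which is an assumption made for simplicity: the function documented above returns row-parity-dependent offsets instead, so the coordinates below are not its output format. The helper name is hypothetical.

```python
def hexagonal_ring_axial(order: int) -> list[tuple[int, int]]:
    """All axial offsets (q, r) at hexagonal distance exactly `order`."""
    return [
        (q, r)
        for q in range(-order, order + 1)
        for r in range(-order, order + 1)
        # Hex distance from the origin in axial coordinates
        if max(abs(q), abs(r), abs(q + r)) == order
    ]


for n in (1, 2, 3):
    print(n, len(hexagonal_ring_axial(n)))
```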

torchsom.utils.topology.get_rectangular_offsets(neighborhood_order=1)

Get neighbor offset coordinates for rectangular topology at any order.

Parameters:

neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.

Returns:

Coordinate offsets for rectangular grid

Return type:

List[Tuple[int, int]]

Notes

  • Order 1: 8 neighbors

  • Order 2: 16 neighbors

  • Order 3: 24 neighbors

  • Order 3+: All positions at Chebyshev distance (order-1)
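
The neighbor counts in the notes above (8, 16, 24) match a square ring at Chebyshev distance n around the centre cell. The sketch below reproduces those counts under that assumption; the helper names are hypothetical and the real function may order or define its offsets differently.

```python
def rectangular_ring(order: int) -> list[tuple[int, int]]:
    """Offsets (d_row, d_col) whose Chebyshev distance equals `order`."""
    return [
        (dr, dc)
        for dr in range(-order, order + 1)
        for dc in range(-order, order + 1)
        if max(abs(dr), abs(dc)) == order
    ]


def rectangular_neighbors_up_to(max_order: int) -> list[tuple[int, int]]:
    """Concatenate the rings from order 1 up to max_order."""
    offsets: list[tuple[int, int]] = []
    for order in range(1, max_order + 1):
        offsets.extend(rectangular_ring(order))
    return offsets


for n in (1, 2, 3):
    print(n, len(rectangular_ring(n)))
```

Each ring is the border of a (2n+1) by (2n+1) square, so it contains (2n+1)^2 - (2n-1)^2 = 8n cells.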

Initialization

Utility functions for initialization.

torchsom.utils.initialization.initialize_weights(weights, data, mode='random', topology='rectangular', device='cpu')

Main function to initialize weights based on specified method.

Parameters:
  • weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]

  • data (torch.Tensor) – Input data tensor [batch_size, num_features]

  • mode (str, optional) – Initialization method, “random” or “pca”. Defaults to “random”.

  • topology (str, optional) – Grid configuration, “rectangular” or “hexagonal”. Defaults to “rectangular”.

  • device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor

Raises:

ValueError – If an invalid initialization mode is provided

torchsom.utils.initialization.pca_init(weights, data, topology, device='cpu')

Initialize SOM weights using PCA for faster convergence.

Parameters:
  • weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]

  • data (torch.Tensor) – Input data tensor [batch_size, num_features]

  • topology (str) – Grid configuration, “rectangular” or “hexagonal”

  • device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor

torchsom.utils.initialization.random_init(weights, data, device='cpu')

Initialize SOM weights by sampling random data points.

Parameters:
  • weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]

  • data (torch.Tensor) – Input data tensor to sample from [batch_size, num_features]

  • device (str, optional) – Device for tensor computations. Defaults to “cpu”.

Returns:

Initialized weights

Return type:

torch.Tensor
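
The idea behind random initialization, assigning each neuron a weight vector copied from a randomly chosen data point, can be sketched without tensors. The function name, the `seed` parameter, and sampling with replacement are all assumptions for illustration; `random_init` itself takes a preallocated weight tensor and may sample differently.

```python
import random
from typing import Optional, Sequence


def random_init_sketch(
    rows: int,
    cols: int,
    data: Sequence[Sequence[float]],
    seed: Optional[int] = None,
) -> list[list[list[float]]]:
    """Build a [rows][cols][num_features] grid from sampled data points."""
    rng = random.Random(seed)
    return [
        [list(rng.choice(data)) for _ in range(cols)]  # copy one sample per neuron
        for _ in range(rows)
    ]


data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
weights = random_init_sketch(2, 3, data, seed=0)
print(len(weights), len(weights[0]), len(weights[0][0]))
```

Starting from actual samples keeps every initial weight inside the data distribution, which avoids neurons that begin far from any input.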

Available initialization methods:

Method    Description
random    Random sampling from input data (default)
pca       PCA-based initialization for faster convergence

Metrics

Utility functions for metrics.

torchsom.utils.metrics.calculate_calinski_harabasz_score(data, labels)

Calculate Calinski-Harabasz index using scikit-learn.

Parameters:
  • data (torch.Tensor) – Input data [n_samples, n_features]

  • labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Calinski-Harabasz index (higher is better, >= 0)

Return type:

float

torchsom.utils.metrics.calculate_clustering_metrics(data, labels, som=None)

Calculate comprehensive clustering quality metrics.

Parameters:
  • data (torch.Tensor) – Input data [n_samples, n_features]

  • labels (torch.Tensor) – Cluster labels [n_samples]

  • som (Optional[BaseSOM]) – SOM instance for topological metrics

Returns:

Dictionary of clustering quality metrics

Return type:

dict[str, float]

torchsom.utils.metrics.calculate_davies_bouldin_score(data, labels)

Calculate Davies-Bouldin index using scikit-learn.

Parameters:
  • data (torch.Tensor) – Input data [n_samples, n_features]

  • labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Davies-Bouldin index (lower is better, >= 0)

Return type:

float

torchsom.utils.metrics.calculate_quantization_error(data, weights, distance_fn)

Calculate quantization error for a SOM.

Parameters:
  • data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]

  • weights (torch.Tensor) – SOM weights [row_neurons, col_neurons, num_features]

  • distance_fn (Callable) – Function to compute distances between data and weights

Returns:

Average quantization error value

Return type:

float
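
Quantization error is the average distance between each sample and its best matching unit (BMU). The sketch below restates that definition in plain Python with nested lists; the function name is hypothetical, and the default `distance_fn` (the standard library's `math.dist`) is an assumption standing in for whichever torchsom distance is passed in practice.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def quantization_error_sketch(
    data: Sequence[Vector],
    weights: Sequence[Sequence[Vector]],
    distance_fn: Callable[[Vector, Vector], float] = math.dist,
) -> float:
    """Mean distance from each sample to its closest neuron."""
    neurons = [w for row in weights for w in row]  # flatten the grid
    total = 0.0
    for sample in data:
        total += min(distance_fn(sample, w) for w in neurons)
    return total / len(data)


weights = [[[0.0, 0.0], [1.0, 0.0]]]  # a 1 x 2 map with 2 features
data = [[0.0, 0.0], [1.0, 1.0]]
print(quantization_error_sketch(data, weights))
```

In this example the first sample coincides with a neuron (distance 0) and the second lies at distance 1 from its BMU, so the average is 0.5.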

torchsom.utils.metrics.calculate_silhouette_score(data, labels)

Calculate silhouette score for clustering results using scikit-learn.

Parameters:
  • data (torch.Tensor) – Input data [n_samples, n_features]

  • labels (torch.Tensor) – Cluster labels [n_samples]

Returns:

Silhouette score (-1 to 1, higher is better)

Return type:

float

torchsom.utils.metrics.calculate_topographic_error(data, weights, distance_fn, topology='rectangular')

Calculate topographic error for a SOM.

Parameters:
  • data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]

  • weights (torch.Tensor) – SOM weights [row_neurons, col_neurons, num_features]

  • distance_fn (Callable) – Function to compute distances between data and weights

  • topology (str, optional) – Grid configuration. Defaults to “rectangular”.

Returns:

Topographic error ratio

Return type:

float
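
Topographic error counts the samples whose best and second-best matching units are not adjacent on the grid. The sketch below covers only the rectangular case and assumes that "adjacent" means one of the 8 order-1 neighbors (Chebyshev distance 1 on the grid); the function name and that adjacency rule are assumptions, and the hexagonal case is not handled.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def topographic_error_sketch(
    data: Sequence[Vector],
    weights: Sequence[Sequence[Vector]],
    distance_fn: Callable[[Vector, Vector], float] = math.dist,
) -> float:
    """Fraction of samples whose two closest neurons are not grid neighbors."""
    cells = [
        ((r, c), w)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    ]
    errors = 0
    for sample in data:
        # Rank neurons by distance to the sample; keep the two closest.
        ranked = sorted(cells, key=lambda cell: distance_fn(sample, cell[1]))
        (r1, c1), (r2, c2) = ranked[0][0], ranked[1][0]
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


weights = [[[0.0], [10.0], [1.0]]]  # a 1 x 3 map with 1 feature
data = [[0.2], [9.0]]
print(topographic_error_sketch(data, weights))
```

Here the sample `[0.2]` has its two closest neurons in columns 0 and 2, which are not adjacent, while `[9.0]` maps to columns 1 and 2, which are. One of two samples violates the topology, giving an error of 0.5.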

torchsom.utils.metrics.calculate_topological_clustering_quality(som, labels)

Calculate how well clusters respect SOM topological structure.

This metric measures the spatial coherence of clusters on the SOM grid. Higher values indicate that clusters are more spatially compact on the grid.

Parameters:
  • som (BaseSOM) – Trained SOM instance

  • labels (torch.Tensor) – Cluster labels for neurons [n_neurons]

Returns:

Topological clustering quality (0 to 1, higher is better)

Return type:

float

Quality metrics for evaluating SOM training:

Metric                Description
quantization_error    Average distance between input vectors and their Best Matching Units (BMUs)
topographic_error     Percentage of data vectors for which the first and second BMUs are not adjacent