Utilities API
The utils module provides supporting functions for SOM training and analysis.
Distance Functions
Utility functions for distances.
Available distance functions:
| Function | Description |
|---|---|
| | Standard Euclidean distance (default) |
| | Cosine distance (1 - cosine similarity) |
| | Manhattan (L1) distance |
| | Chebyshev (L∞) distance |
| | Weighted Euclidean distance with feature weights |
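The five distances listed above can be illustrated with a small standard-library sketch. This is not torchsom's implementation: the function names are hypothetical, the inputs are plain Python lists rather than tensors, and the way feature weights enter the weighted Euclidean distance (multiplying the squared differences) is an assumption.

```python
import math


def euclidean(a, b):
    # Square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    # 1 - cosine similarity; undefined for zero-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def manhattan(a, b):
    # Sum of absolute differences (L1).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute difference over all features (L-infinity).
    return max(abs(x - y) for x, y in zip(a, b))


def weighted_euclidean(a, b, w):
    # Assumption: each squared difference is scaled by its feature weight.
    return math.sqrt(sum(wi * (x - y) ** 2 for x, y, wi in zip(a, b, w)))
```

For the points [0, 0] and [3, 4], the Euclidean, Manhattan, and Chebyshev distances are 5, 7, and 4 respectively.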
Neighborhood Functions
Utility module for neighborhood functions.
Available neighborhood functions:
| Function | Description |
|---|---|
| | Gaussian neighborhood (default, smooth) |
| | Mexican hat (Ricker wavelet, inhibitory surround) |
| | Bubble function (step function) |
| | Triangular function (linear decay) |
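The shapes of the four neighborhood functions can be sketched as functions of the grid distance d to the BMU and a width sigma. These are common textbook forms with hypothetical names; torchsom's normalization, cutoffs, and tensor-based signatures may differ.

```python
import math


def gaussian(d, sigma):
    # Smooth bell curve: 1 at the BMU, decaying with distance.
    return math.exp(-(d ** 2) / (2 * sigma ** 2))


def mexican_hat(d, sigma):
    # Unnormalized Ricker wavelet: positive center, negative surround.
    r = (d ** 2) / (sigma ** 2)
    return (1 - r) * math.exp(-r / 2)


def bubble(d, sigma):
    # Step function: full update inside the radius, none outside.
    return 1.0 if d <= sigma else 0.0


def triangle(d, sigma):
    # Linear decay from 1 at the BMU to 0 at distance sigma.
    return max(0.0, 1.0 - d / sigma)
```

With sigma = 1, the Mexican hat is negative at d = 2, which is the inhibitory surround mentioned in the table.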
Decay Functions
Utility module for decay functions.
Available decay functions:
| Function | Description |
|---|---|
| | General asymptotic decay (default) |
| | Learning rate inverse decay to zero |
| | Learning rate linear decay to zero |
| | Sigma inverse decay to one |
| | Sigma linear decay to one |
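As a rough illustration of how such schedules behave, the sketch below implements three of them in plain Python. The function names and exact formulas are assumptions (the asymptotic form follows a convention used in other SOM libraries) and are not taken from torchsom's source.

```python
def asymptotic_decay(value, t, max_iter):
    # Assumed form: the value halves at t = max_iter / 2 and keeps shrinking.
    return value / (1 + t / (max_iter / 2))


def linear_decay_to_zero(learning_rate, t, max_iter):
    # Straight line from the initial learning rate down to 0 at t = max_iter.
    return learning_rate * (1 - t / max_iter)


def linear_decay_to_one(sigma, t, max_iter):
    # Straight line from the initial sigma down to 1 at t = max_iter.
    return sigma + t * (1 - sigma) / max_iter
```

Decaying sigma to one rather than zero keeps the neighborhood at least one unit wide, so the BMU's immediate neighbors are still updated at the end of training.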
Grid and Topology
Utility functions for grid operations.
- torchsom.utils.grid.adjust_meshgrid_topology(xx, yy, topology)
Adjust coordinates based on topology.
- Parameters:
xx (torch.Tensor) – Mesh grid of x coordinates
yy (torch.Tensor) – Mesh grid of y coordinates
topology (str) – SOM configuration, usually rectangular or hexagonal
- Returns:
Adjusted x and y mesh grids for a hexagonal topology.
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- torchsom.utils.grid.create_mesh_grid(x, y, device)
Create a mesh grid for neighborhood calculations.
The function returns two 2D tensors representing the x-coordinates and y-coordinates of a grid of shape (x, y). This is useful for computing distance-based neighborhood functions in Self-Organizing Maps (SOM).
- Parameters:
x (int) – Number of rows (height of the grid).
y (int) – Number of columns (width of the grid).
device (str) – The device on which tensors should be allocated (‘cpu’ or ‘cuda’).
- Returns:
Two tensors (xx, yy) of shape (x, y), representing the x and y coordinates of the mesh grid.
- Return type:
Tuple[torch.Tensor, torch.Tensor]
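To make the shape convention concrete, here is a plain-Python sketch of a mesh grid of shape (x, y). It uses nested lists instead of tensors and assumes row-major ("ij") indexing, so xx holds the row index and yy the column index; the helper name is hypothetical and the library's actual indexing may differ.

```python
def mesh_grid_sketch(x, y):
    # xx[i][j] is the row index i; yy[i][j] is the column index j.
    xx = [[float(i) for _ in range(y)] for i in range(x)]
    yy = [[float(j) for j in range(y)] for _ in range(x)]
    return xx, yy


xx, yy = mesh_grid_sketch(2, 3)
# Both grids have 2 rows and 3 columns.
```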
Utility functions for topology.
- torchsom.utils.topology.get_all_neighbors_up_to_order(topology, max_order)
Get all neighbors from order 1 up to max_order.
- Parameters:
topology (str) – “rectangular” or “hexagonal”
max_order (int) – Maximum neighborhood order to include
- Returns:
All neighbor offsets from order 1 to max_order combined
- Return type:
list[tuple[int, int]] | dict[str, list[tuple[int, int]]]
- torchsom.utils.topology.get_hexagonal_offsets(neighborhood_order=1)
Get neighbor offset coordinates for hexagonal topology at any order.
Order n has 6*n elements.
- Parameters:
neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.
- Returns:
Offsets for even and odd rows
- Return type:
Dict[str, List[Tuple[int, int]]]
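The 6*n count can be checked with a short sketch that enumerates a hexagonal ring in axial coordinates (q, r). This is an illustration only: the library returns offsets split by even and odd rows, whereas this hypothetical helper uses axial coordinates because the ring condition is simpler to state there.

```python
def hex_ring_axial(order):
    # In axial coordinates the hex distance from the origin is
    # max(|q|, |r|, |q + r|); a ring keeps the cells at exactly `order`.
    ring = []
    for q in range(-order, order + 1):
        for r in range(-order, order + 1):
            if max(abs(q), abs(r), abs(q + r)) == order:
                ring.append((q, r))
    return ring


for n in (1, 2, 3):
    assert len(hex_ring_axial(n)) == 6 * n
```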
- torchsom.utils.topology.get_rectangular_offsets(neighborhood_order=1)
Get neighbor offset coordinates for rectangular topology at any order.
- Parameters:
neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.
- Returns:
Coordinate offsets for rectangular grid
- Return type:
List[Tuple[int, int]]
Notes
Order 1: 8 neighbors
Order 2: 16 neighbors
Order 3: 24 neighbors
Order n in general: all 8n positions at Chebyshev distance n from the center
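The neighbor counts in the notes (8, 16, 24) are those of square rings whose Chebyshev distance from the center equals the order. A minimal sketch under that assumption, with hypothetical helper names, also shows how rings could be combined up to a maximum order:

```python
def rectangular_ring(order):
    # Keep the (row, col) offsets whose Chebyshev distance equals `order`.
    return [
        (dr, dc)
        for dr in range(-order, order + 1)
        for dc in range(-order, order + 1)
        if max(abs(dr), abs(dc)) == order
    ]


def rings_up_to(max_order):
    # Concatenate the rings of order 1 through max_order.
    offsets = []
    for order in range(1, max_order + 1):
        offsets.extend(rectangular_ring(order))
    return offsets
```

Each ring has (2n + 1)^2 - (2n - 1)^2 = 8n cells, so the rings up to order 3 contain 8 + 16 + 24 = 48 offsets in total.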
Initialization
Utility functions for initialization.
- torchsom.utils.initialization.initialize_weights(weights, data, mode='random', topology='rectangular', device='cpu')
Main function to initialize weights based on specified method.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
mode (str, optional) – Initialization method, “random” or “pca”. Defaults to “random”.
topology (str, optional) – Grid configuration, “rectangular” or “hexagonal”. Defaults to “rectangular”.
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
torch.Tensor
- Raises:
ValueError – If an invalid initialization mode is provided
- torchsom.utils.initialization.pca_init(weights, data, topology, device='cpu')
Initialize SOM weights using PCA for faster convergence.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
topology (str) – Grid configuration, “rectangular” or “hexagonal”
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
torch.Tensor
- torchsom.utils.initialization.random_init(weights, data, device='cpu')
Initialize SOM weights by sampling random data points.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor to sample from [batch_size, num_features]
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
torch.Tensor
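The idea behind sampling-based initialization can be sketched without tensors: every neuron receives a copy of a randomly chosen data point. The helper below is hypothetical, works on nested lists, and assumes sampling with replacement, which may not match the library's behavior.

```python
import random


def random_init_sketch(rows, cols, data, seed=None):
    # Each neuron gets a copy of one randomly drawn sample.
    rng = random.Random(seed)
    return [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]


data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
weights = random_init_sketch(2, 3, data, seed=0)
# weights has shape [2][3][2], and every neuron equals some row of data.
```

Starting from actual samples places the initial weights inside the data distribution.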
Available initialization methods:
| Method | Description |
|---|---|
| “random” | Random sampling from input data (default) |
| “pca” | PCA-based initialization for faster convergence |
Metrics
Utility functions for metrics.
- torchsom.utils.metrics.calculate_calinski_harabasz_score(data, labels)
Calculate Calinski-Harabasz index using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Calinski-Harabasz index (higher is better, >= 0)
- Return type:
float
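For readers unfamiliar with the index, the standard Calinski-Harabasz definition (between-cluster dispersion over within-cluster dispersion, each scaled by its degrees of freedom) can be written in a few lines of plain Python. The library function delegates to scikit-learn; the hypothetical helper below only illustrates the formula and assumes at least two clusters, fewer clusters than samples, and non-zero within-cluster dispersion.

```python
def calinski_harabasz(data, labels):
    n, dim = len(data), len(data[0])
    clusters = {}
    for point, label in zip(data, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)

    def mean(points):
        return [sum(p[d] for p in points) / len(points) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(data)
    between = within = 0.0
    for points in clusters.values():
        centroid = mean(points)
        # Between-cluster dispersion, weighted by cluster size.
        between += len(points) * sq_dist(centroid, overall)
        # Within-cluster dispersion around the cluster centroid.
        within += sum(sq_dist(p, centroid) for p in points)
    return (between / (k - 1)) / (within / (n - k))


data = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
score = calinski_harabasz(data, [0, 0, 1, 1])
```

Here the between-cluster dispersion is 100 and the within-cluster dispersion is 1, so the score is (100 / 1) / (1 / 2) = 200.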
- torchsom.utils.metrics.calculate_clustering_metrics(data, labels, som=None)
Calculate comprehensive clustering quality metrics.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
som (Optional[BaseSOM]) – SOM instance for topological metrics
- Returns:
Dictionary of clustering quality metrics
- Return type:
dict[str, float]
- torchsom.utils.metrics.calculate_davies_bouldin_score(data, labels)
Calculate Davies-Bouldin index using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Davies-Bouldin index (lower is better, >= 0)
- Return type:
float
- torchsom.utils.metrics.calculate_quantization_error(data, weights, distance_fn)
Calculate quantization error for a SOM.
- Parameters:
data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [row_neurons, col_neurons, num_features]
distance_fn (Callable) – Function to compute distances between data and weights
- Returns:
Average quantization error value
- Return type:
float
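Conceptually, the quantization error is the mean distance from each sample to its closest neuron. The sketch below shows this with nested lists and a fixed Euclidean distance standing in for distance_fn; the helper names are hypothetical and the code is not the library's implementation.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantization_error_sketch(data, weights):
    # Flatten the [rows][cols][features] grid into a list of neurons.
    neurons = [w for row in weights for w in row]
    # For each sample, take the distance to its best matching unit.
    total = sum(min(euclidean(x, w) for w in neurons) for x in data)
    return total / len(data)


weights = [[[0.0, 0.0], [10.0, 0.0]]]  # a 1 x 2 map
data = [[1.0, 0.0], [10.0, 2.0]]
error = quantization_error_sketch(data, weights)
```

The two samples lie at distances 1 and 2 from their BMUs, so the average is 1.5.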
- torchsom.utils.metrics.calculate_silhouette_score(data, labels)
Calculate silhouette score for clustering results using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Silhouette score (-1 to 1, higher is better)
- Return type:
float
- torchsom.utils.metrics.calculate_topographic_error(data, weights, distance_fn, topology='rectangular')
Calculate topographic error for a SOM.
- Parameters:
data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [row_neurons, col_neurons, num_features]
distance_fn (Callable) – Function to compute distances between data and weights
topology (str, optional) – Grid configuration. Defaults to “rectangular”.
- Returns:
Topographic error ratio
- Return type:
float
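The topographic error counts the samples whose best and second-best matching units are not neighbors on the grid. The sketch below covers the rectangular case only and makes two assumptions that may differ from the library: distances are Euclidean, and two neurons count as adjacent when their Chebyshev grid distance is 1 (the 8-neighborhood). The helper names are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def topographic_error_sketch(data, weights):
    # Pair every neuron with its (row, col) position; needs >= 2 neurons.
    cells = [((r, c), w) for r, row in enumerate(weights) for c, w in enumerate(row)]
    errors = 0
    for x in data:
        ranked = sorted(cells, key=lambda cell: euclidean(x, cell[1]))
        (r1, c1), (r2, c2) = ranked[0][0], ranked[1][0]
        # Assumption: adjacency means Chebyshev grid distance of 1.
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


weights = [[[0.0], [5.0], [1.0]]]  # a 1 x 3 map with a fold in it
data = [[0.4], [4.0]]
error = topographic_error_sketch(data, weights)
```

For the sample 0.4, the two closest neurons sit in columns 0 and 2, which are not adjacent; for 4.0 they sit in columns 1 and 2, which are. The resulting error is 0.5.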
- torchsom.utils.metrics.calculate_topological_clustering_quality(som, labels)
Calculate how well clusters respect SOM topological structure.
This metric measures the spatial coherence of clusters on the SOM grid. Higher values indicate that clusters are more spatially compact on the grid.
- Parameters:
som (BaseSOM) – Trained SOM instance
labels (torch.Tensor) – Cluster labels for neurons [n_neurons]
- Returns:
Topological clustering quality (0 to 1, higher is better)
- Return type:
float
Quality metrics for evaluating SOM training:
| Metric | Description |
|---|---|
| calculate_quantization_error | Average distance between input vectors and their Best Matching Units (BMUs) |
| calculate_topographic_error | Proportion of data vectors for which the first and second BMUs are not adjacent |