Utilities API
The utils module provides supporting functions for SOM training and analysis.
Distance Functions
Utility functions for distances.
Available distance functions:
| Function | Description |
|---|---|
| | Standard Euclidean distance (default) |
| | Cosine distance (1 - cosine similarity) |
| | Manhattan (L1) distance |
| | Chebyshev (L∞) distance |
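The four distances listed above can be illustrated with a minimal plain-Python sketch. The function names here are hypothetical and chosen for this example only; they are not the torchsom identifiers, and the library itself operates on `torch.Tensor` inputs rather than Python lists.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute coordinate difference (L-infinity)."""
    return max(abs(x - y) for x, y in zip(a, b))
```

For the points `[0.0, 0.0]` and `[3.0, 4.0]`, these give 5.0, 7.0, and 4.0 for the Euclidean, Manhattan, and Chebyshev distances respectively.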
Neighborhood Functions
Utility functions implementing the neighborhood kernels.
Available neighborhood functions:
| Function | Description |
|---|---|
| | Gaussian neighborhood (default, smooth) |
| | Mexican hat (Ricker wavelet, inhibitory surround) |
| | Bubble function (step function) |
| | Triangular function (linear decay) |
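To make the shapes of these kernels concrete, here is a hedged sketch of each as a function of the grid distance `d` to the BMU and a width `sigma`. The exact formulas and normalizations used by torchsom are not given on this page, so the textbook forms below are assumptions, and the function names are hypothetical.

```python
import math


def gaussian(d, sigma):
    """Smooth bell curve: 1 at the BMU, decaying with distance."""
    return math.exp(-(d ** 2) / (2 * sigma ** 2))


def mexican_hat(d, sigma):
    """Ricker wavelet (assumed unnormalized): positive center, negative surround."""
    r2 = (d / sigma) ** 2
    return (1 - r2) * math.exp(-r2 / 2)


def bubble(d, sigma):
    """Step function: full update within sigma, none outside."""
    return 1.0 if d <= sigma else 0.0


def triangular(d, sigma):
    """Linear decay from 1 at the BMU to 0 at distance sigma."""
    return max(0.0, 1.0 - d / sigma)
```

The negative values of `mexican_hat` beyond `d = sigma` are what the table calls the inhibitory surround: neurons in that band are pushed away from the input rather than toward it.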
Decay Functions
Utility functions implementing the decay schedules.
Available decay functions:
| Function | Description |
|---|---|
| | General asymptotic decay (default) |
| | Learning rate inverse decay to zero |
| | Learning rate linear decay to zero |
| | Sigma inverse decay to one |
| | Sigma linear decay to one |
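The schedules above can be sketched as follows. This page does not state the formulas, so the forms below are assumptions modeled on schedules commonly used in SOM implementations; the function names, the `max_iter / 2` and `max_iter / 100` constants, and the argument order are all hypothetical. Here `t` is the current iteration and `max_iter` the total number of iterations.

```python
def asymptotic_decay(value, t, max_iter):
    """General decay: value shrinks to a third of its start at t = max_iter."""
    return value / (1 + t / (max_iter / 2))


def lr_inverse_decay_to_zero(lr, t, max_iter):
    """Inverse decay; approaches but never exactly reaches zero."""
    c = max_iter / 100.0  # assumed constant controlling the decay speed
    return lr * c / (c + t)


def lr_linear_decay_to_zero(lr, t, max_iter):
    """Straight line from lr at t = 0 to exactly 0 at t = max_iter."""
    return lr * (1 - t / max_iter)


def sigma_inverse_decay_to_one(sigma, t, max_iter):
    """Inverse decay from sigma at t = 0 to 1 at t = max_iter."""
    c = (sigma - 1) / max_iter
    return sigma / (1 + t * c)


def sigma_linear_decay_to_one(sigma, t, max_iter):
    """Straight line from sigma at t = 0 to exactly 1 at t = max_iter."""
    return sigma + t * (1 - sigma) / max_iter
```

Decaying sigma to one rather than zero keeps the neighborhood at least one grid unit wide, so neighboring neurons still receive a non-negligible update at the end of training.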
Grid and Topology
Utility functions for grid operations.
- torchsom.utils.grid.adjust_meshgrid_topology(xx, yy, topology)
Adjust coordinates based on topology.
- Parameters:
xx (torch.Tensor) – Mesh grid of x coordinates
yy (torch.Tensor) – Mesh grid of y coordinates
topology (str) – SOM configuration, usually rectangular or hexagonal
- Returns:
Adjusted x and y mesh grids for a hexagonal topology.
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- torchsom.utils.grid.create_mesh_grid(x, y, device)
Create a mesh grid for neighborhood calculations.
The function returns two 2D tensors holding the x-coordinates and y-coordinates of a grid of shape (x, y). These are used to compute distance-based neighborhood functions in Self-Organizing Maps (SOMs).
- Parameters:
- Returns:
Two tensors (xx, yy) of shape (x, y), representing the x and y coordinates of the mesh grid.
- Return type:
Tuple[torch.Tensor, torch.Tensor]
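As an illustration of what a mesh grid of shape (x, y) looks like and how it feeds a neighborhood calculation, here is a plain-Python sketch. It is not the torchsom implementation (which returns tensors on a device), and the helper names `mesh_grid_sketch` and `squared_grid_distance` are hypothetical.

```python
def mesh_grid_sketch(x, y):
    """Return (xx, yy), each a nested list of shape (x, y)."""
    xx = [[float(i) for _ in range(y)] for i in range(x)]  # row index repeated along columns
    yy = [[float(j) for j in range(y)] for _ in range(x)]  # column index repeated along rows
    return xx, yy


def squared_grid_distance(xx, yy, bmu):
    """Squared grid distance from every neuron to the BMU position."""
    bi, bj = bmu
    return [
        [(xv - bi) ** 2 + (yv - bj) ** 2 for xv, yv in zip(x_row, y_row)]
        for x_row, y_row in zip(xx, yy)
    ]


xx, yy = mesh_grid_sketch(3, 4)
d2 = squared_grid_distance(xx, yy, bmu=(1, 2))
```

The resulting `d2` grid is zero at the BMU and grows with grid distance, which is the quantity a neighborhood function would take as input.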
Utility functions for topology.
- torchsom.utils.topology.get_all_neighbors_up_to_order(topology, max_order)
Get all neighbors from order 1 up to max_order.
- torchsom.utils.topology.get_hexagonal_offsets(neighborhood_order=1)
Get neighbor offset coordinates for hexagonal topology at any order.
Order n has 6*n elements.
- torchsom.utils.topology.get_rectangular_offsets(neighborhood_order=1)
Get neighbor offset coordinates for rectangular topology at any order.
- Parameters:
neighborhood_order (int, optional) – Order of neighborhood ring. Defaults to 1.
- Returns:
Coordinate offsets for rectangular grid
- Return type:
Notes
- Order 1: 8 neighbors
- Order 2: 16 neighbors
- Order 3: 24 neighbors
- Order 3+: All positions at Chebyshev distance (order-1)
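The neighbor counts of 8, 16, and 24 for a rectangular grid match the size of a square ring at a fixed Chebyshev distance from the center. The sketch below assumes the ring of order n is exactly the set of offsets at Chebyshev distance n; the function name is hypothetical, and the actual return format of `get_rectangular_offsets` is not shown on this page.

```python
def rectangular_ring_offsets(order=1):
    """(row, col) offsets whose Chebyshev distance from (0, 0) equals `order`."""
    return [
        (dr, dc)
        for dr in range(-order, order + 1)
        for dc in range(-order, order + 1)
        if max(abs(dr), abs(dc)) == order
    ]


# Ring sizes for the first three orders.
ring_sizes = [len(rectangular_ring_offsets(n)) for n in (1, 2, 3)]
```

Under this assumption a ring of order n contains 8 * n offsets, the rectangular counterpart of the 6 * n elements stated for the hexagonal topology.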
Initialization
Utility functions for initialization.
- torchsom.utils.initialization.initialize_weights(weights, data, mode='random', topology='rectangular', device='cpu')
Main function to initialize weights based on specified method.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
mode (str, optional) – Initialization method, “random” or “pca”. Defaults to “random”.
topology (str, optional) – Grid configuration, “rectangular” or “hexagonal”. Defaults to “rectangular”.
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
- Raises:
ValueError – If an invalid initialization mode is provided
- torchsom.utils.initialization.pca_init(weights, data, topology, device='cpu')
Initialize SOM weights using PCA for faster convergence.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor [batch_size, num_features]
topology (str) – Grid configuration, “rectangular” or “hexagonal”
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
- torchsom.utils.initialization.random_init(weights, data, device='cpu')
Initialize SOM weights by sampling random data points.
- Parameters:
weights (torch.Tensor) – Weight tensor to initialize [row_neurons, col_neurons, num_features]
data (torch.Tensor) – Input data tensor to sample from [batch_size, num_features]
device (str, optional) – Device for tensor computations. Defaults to “cpu”.
- Returns:
Initialized weights
- Return type:
Available initialization methods:
| Method | Description |
|---|---|
| | Random sampling from input data (default) |
| | PCA-based initialization for faster convergence |
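Random initialization by sampling data points can be sketched in plain Python as below. This is an illustration of the idea only: the helper name `random_init_sketch`, its `seed` argument, and the choice to sample with replacement are assumptions, and the torchsom functions operate on tensors instead.

```python
import random


def random_init_sketch(rows, cols, data, seed=None):
    """Build a rows x cols grid whose weight vectors are copies of random samples."""
    rng = random.Random(seed)
    # Sampling with replacement: the same data point may seed several neurons.
    return [[list(rng.choice(data)) for _ in range(cols)] for _ in range(rows)]


data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
weights = random_init_sketch(3, 4, data, seed=0)
```

Because every initial weight vector is an actual data point, the map starts inside the data distribution rather than at arbitrary positions in feature space.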
Metrics
Utility functions for metrics.
- torchsom.utils.metrics.calculate_calinski_harabasz_score(data, labels)
Calculate Calinski-Harabasz index using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Calinski-Harabasz index (higher is better, >= 0)
- Return type:
- torchsom.utils.metrics.calculate_clustering_metrics(data, labels, som=None)
Calculate comprehensive clustering quality metrics.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
som (Optional[BaseSOM]) – SOM instance for topological metrics
- Returns:
Dictionary of clustering quality metrics
- Return type:
- torchsom.utils.metrics.calculate_davies_bouldin_score(data, labels)
Calculate Davies-Bouldin index using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Davies-Bouldin index (lower is better, >= 0)
- Return type:
- torchsom.utils.metrics.calculate_quantization_error(data, weights, distance_fn)
Calculate quantization error for a SOM.
- Parameters:
data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [x, y, num_features]
distance_fn (Callable) – Function to compute distances between data and weights
- Returns:
Average quantization error value
- Return type:
- torchsom.utils.metrics.calculate_silhouette_score(data, labels)
Calculate silhouette score for clustering results using scikit-learn.
- Parameters:
data (torch.Tensor) – Input data [n_samples, n_features]
labels (torch.Tensor) – Cluster labels [n_samples]
- Returns:
Silhouette score (-1 to 1, higher is better)
- Return type:
- torchsom.utils.metrics.calculate_topographic_error(data, weights, distance_fn, topology='rectangular', pbc=False)
Calculate topographic error for a SOM.
- Parameters:
data (torch.Tensor) – Input data tensor [batch_size, num_features] or [num_features]
weights (torch.Tensor) – SOM weights [x, y, num_features]
distance_fn (Callable) – Function to compute distances between data and weights
topology (str, optional) – Grid configuration. Defaults to “rectangular”.
pbc (bool, optional) – Whether periodic boundary conditions are enabled. Defaults to False.
- Returns:
Topographic error ratio
- Return type:
- torchsom.utils.metrics.calculate_topological_clustering_quality(som, labels)
Calculate how well clusters respect SOM topological structure.
This metric measures the spatial coherence of clusters on the SOM grid. Higher values indicate that clusters are more spatially compact on the grid.
- Parameters:
som (BaseSOM) – Trained SOM instance
labels (torch.Tensor) – Cluster labels for neurons [n_neurons]
- Returns:
Topological clustering quality (0 to 1, higher is better)
- Return type:
Quality metrics for evaluating SOM training:
| Metric | Description |
|---|---|
| | Average distance between input vectors and their Best Matching Units (BMUs) |
| | Percentage of data vectors for which the first and second BMUs are not adjacent |
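The two definitions above can be made concrete with a small plain-Python sketch. It assumes a rectangular grid without periodic boundary conditions, treats two units as adjacent when they are within one step in both row and column (the 8-neighborhood), and returns the topographic error as a fraction between 0 and 1. All helper names are hypothetical; this is not the torchsom implementation.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranked_units(sample, weights, distance_fn):
    """(distance, (row, col)) for every neuron, closest first."""
    scored = [
        (distance_fn(sample, w), (i, j))
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    ]
    return sorted(scored, key=lambda pair: pair[0])


def quantization_error(data, weights, distance_fn):
    """Mean distance between each sample and its BMU."""
    return sum(ranked_units(s, weights, distance_fn)[0][0] for s in data) / len(data)


def topographic_error(data, weights, distance_fn):
    """Fraction of samples whose first and second BMUs are not grid neighbors."""
    errors = 0
    for s in data:
        ranked = ranked_units(s, weights, distance_fn)
        (r1, c1), (r2, c2) = ranked[0][1], ranked[1][1]
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)


# A 1 x 3 map with one feature per neuron.
weights = [[[0.0], [10.0], [1.0]]]
data = [[0.2], [9.0]]
qe = quantization_error(data, weights, euclidean)
te = topographic_error(data, weights, euclidean)
```

In this toy map the sample `[0.2]` has its two closest units at columns 0 and 2, which are not adjacent, while `[9.0]` has them at columns 1 and 2, which are. That gives a topographic error of 0.5 and a quantization error of about 0.6.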