NNSOM package#
Submodules#
NNSOM.plots module#
- class NNSOM.plots.SOMPlots(dimensions)#
Bases:
SOM
SOMPlots extends the SOM class by adding visualization capabilities to the Self-Organizing Map (SOM). It allows for the graphical representation of the SOM’s structure, the distribution of input data across its neurons, and various other analytical visualizations that aid in the interpretation of the SOM’s behavior and characteristics.
- dimensions#
The dimensions of the SOM grid.
- Type:
tuple
- plt_top()#
Plots the topology of the SOM using hexagonal units.
- plt_top_num()#
Plots the topology of the SOM with numbered neurons.
- hit_hist(x, textFlag)#
Plots a hit histogram showing how many data points are mapped to each neuron.
- gray_hist(x, perc)#
Plots a histogram with neurons colored in shades of gray based on a given percentage value.
- color_hist(x, avg)#
Plots a color-coded histogram based on the average values provided for each neuron.
- cmplx_hit_hist(x, perc_gb, clust, ind_missClass, ind21, ind12)#
Plots a complex hit histogram showing the distribution of data and misclassifications.
- plt_nc()#
Plots the neighborhood connections between the SOM neurons.
- neuron_dist_plot()#
Plots the distances between neurons to visualize the SOM’s topology.
- simple_grid(avg, sizes)#
Plots a simple hexagonal grid with varying colors and sizes based on provided data.
- setup_axes()#
Sets up the axes for plotting individual neuron statistics.
- plt_dist(dist)#
Plots distributions of values across the SOM neurons.
- plt_wgts()#
Plots the weights of the SOM neurons as line graphs.
- plt_pie(title, perc, *argv)#
Plots pie charts for each neuron to show data distribution in categories.
- plt_histogram(som, data)#
Plots histograms for each neuron to show the distribution of data.
- plt_boxplot(data)#
Plots boxplots for each neuron to show the distribution of data.
- plt_dispersion_fan_plot(data)#
Plots dispersion or fan plots for each neuron.
- plt_violin_plot(som, data)#
Plots violin plots for each neuron to show the distribution of data.
- plt_scatter(x, indices, clust, reg_line=True)#
Plots scatter graphs for each neuron to show the distribution of two variables.
- multiplot(plot_type, *args)#
Facilitates plotting of multiple types of graphs based on the plot_type argument.
- button_click_event(button_type, ax, neuron_ind, **kwargs)#
- cmplx_hit_hist(x, clust, perc, ind_missClass, ind21, ind12, mouse_click=False, **kwargs)#
Generates a complex hit histogram. It indicates the majority class in each cluster and how many inputs of the specific class fall in each cluster.
- Parameters:
x – array-like The input data to be clustered
clust – list List of indices of inputs that belong in each cluster
perc – array-like Percent of the specific class in each cluster
ind_missClass – array-like Indices of consistently misclassified inputs
ind21 – array-like Indices of false positive cases
ind12 – array-like Indices of false negative cases
- color_hist(x, avg, mouse_click=False, **kwargs)#
- component_planes(X)#
- component_positions(x)#
Plots the SOM weight vectors, the Iris dataset input vectors, and connects neighboring neurons.
- Parameters:
som – A trained SOM instance with attributes ‘w’ for weight vectors and ‘dimensions’ for grid dimensions.
X_scaled – The normalized Iris dataset input vectors.
- create_click_handler(button_type, ax, neuron_ind, **kwargs)#
- custom_cmplx_hit_hist(x, face_labels, edge_labels, edge_width, mouse_click=False, **kwargs)#
Generates a complex hit histogram. Users can specify the face color, edge width, and edge color for each neuron.
- x: array-like or sequence of vectors
The input data to be clustered
- face_labels: array-like
class labels to determine the face color of the hexagons
- edge_labels: array-like
class labels to determine the edge color of the hexagons
- edge_width: array-like
A list of edge widths standardised to the range 1 to 20. You can call get_edge_widths in NNSOM.utils to obtain the standardised edge widths. len(edge_width) must be equal to the number of neurons
- mouse_click: bool
If true, interactive plotting and sub-clustering are activated
- kwargs: dict
Additional arguments to be passed to the onpick function Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches, text
- determine_button_types(**kwargs)#
- gray_hist(x, perc, mouse_click=False, **kwargs)#
- hit_hist(x, textFlag=True, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate Hit Histogram
- Parameters:
x (array-like) – The input data to be clustered
textFlag (bool) – If true, the number of members of each cluster is printed on the cluster.
mouse_click (bool) – If true, interactive plotting and sub-clustering are activated
connect_pick_event (bool) – If true, the pick event is connected to the plot
kwargs (dict) – Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘labels’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Return type:
fig, ax, patches, text
- neuron_dist_plot(mouse_click=False, connect_pick_event=True, **kwargs)#
Generates a distance map. The gray hexagons represent cluster centers. The colors of the elongated hexagons between the cluster centers represent the distance between the centers. The darker the color, the larger the distance.
- Parameters:
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches, text
- onpick(event, hexagons, hexagon_to_neuron, **kwargs)#
Interactive plot function.
- Parameters:
event – event a mouse click event
hexagons – list a list of hexagons
hexagon_to_neuron – dict a dictionary mapping hexagons to neurons
**kwargs –
- Returns:
None
- plot(plot_type, data_dict=None, ind=None, target_class=None, use_add_array=False, **kwargs)#
Generic plot function. It generates a plot based on the plot type and the data provided.
- Parameters:
plot_type (str) – The type of plot to be generated: [“top”, “top_num”, “hit_hist”, “gray_hist”, “color_hist”, “complex_hist”, “nc”, “neuron_dist”, “simple_grid”, “stem”, “pie”, “wgts”, “hist”, “box”, “violin”, “scatter”, “component_positions”, “component_planes”]
data_dict (dict (optional)) – A dictionary containing the data to be plotted. The key is prefixed with the data type and the value is the data itself. {“data”, “target”, “clust”, “add_1d_array”, “add_2d_array”}
ind (int, str or array-like (optional)) – The indices of the data to be plotted.
target_class (int (optional)) – The target class to be plotted.
use_add_array (bool (optional)) – If true, the additional array is used.
**kwargs (dict) – Additional arguments to be passed to the interactive plot function.
- plot_box(ax, data, neuronNum)#
- plot_hist(ax, data, neuronNum)#
Helper function to plot a histogram in the interactive plots.
- Parameters:
ax –
data –
neuronNum –
- plot_pie(ax, data, neuronNum)#
Helper function to plot a pie chart in the interactive plots.
- Parameters:
ax –
data –
neuronNum –
- plot_scatter(ax, num1, num2, neuronNum)#
Helper function to display a scatter plot in the interactive plots.
- Parameters:
ax –
num1 –
num2 –
neuronNum –
- plot_stem(ax, align, height, neuronNum)#
- plot_violin(ax, data, neuronNum)#
- plt_boxplot(x, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate box plot for each neuron.
- Parameters:
x – array-like The input data to be plotted in box plot
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- plt_histogram(x, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate histogram for each neuron.
- Parameters:
x – array-like The input data to be plotted in histogram
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- plt_nc(mouse_click=False, connect_pick_event=True, **kwargs)#
Generates a neighborhood connection map. The gray hexagons represent cluster centers.
- Parameters:
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches
- plt_pie(x, s=None, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate pie plot for each neuron.
- Parameters:
x – Array or sequence of vectors. The wedge size
s – 1-D array-like, optional Scale the size of the pie chart according to the percent of the specific class in that cell. (0-100) (default: None)
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- plt_scatter(x, y, reg_line=True, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate Scatter Plot for Each Neuron.
- Parameters:
x – input data
indices – array-like indices e.g. (0, 1) or [0, 1]
clust – list of indices of input data for each cluster
reg_line – Flag
- Returns:
fig, ax, h_axes
- plt_stem(x, y, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate stem plot for each neuron.
- Parameters:
x – array-like The x-axis values (align)
y – array-like or sequence of vectors The y-axis values (height)
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- plt_top(mouse_click=False, connect_pick_event=True, **kwargs)#
Plots the topology of the SOM using hexagonal units.
- Parameters:
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches
- plt_top_num(mouse_click=False, connect_pick_event=True, **kwargs)#
Plots the topology of the SOM with numbered neurons.
- Parameters:
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches, text
- plt_violin_plot(x, mouse_click=False, connect_pick_event=True, **kwargs)#
Generate violin plot for each neuron.
- Parameters:
x – array-like The input data to be plotted in violin plot
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- plt_wgts(mouse_click=False, connect_pick_event=True, **kwargs)#
Generate line plot for each neuron.
- Parameters:
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, h_axes
- setup_axes()#
- simple_grid(avg, sizes, mouse_click=False, connect_pick_event=True, **kwargs)#
Basic hexagon grid plot. Colors are selected from the avg array. Sizes of inner hexagons are selected from the sizes array.
- Parameters:
avg – array-like Average values for each neuron
sizes – array-like Sizes of inner hexagons
mouse_click – bool If true, interactive plotting and sub-clustering are activated
connect_pick_event – bool If true, the pick event is connected to the plot
**kwargs – dict Additional arguments to be passed to the onpick function. Possible keys include: ‘data’, ‘clust’, ‘target’, ‘num1’, ‘num2’, ‘cat’, and ‘topn’
- Returns:
fig, ax, patches, cbar
- sub_clustering(data, neuron_ind)#
Helper function for the interactive plots that creates the sub-cluster.
- Parameters:
data –
neuron_ind –
- weight_as_image(rows=None, mouse_click=False, connect_pick_event=True, **kwargs)#
NNSOM.som module#
- class NNSOM.som.SOM(dimensions)#
Bases:
object
A class to represent a Self-Organizing Map (SOM), a type of artificial neural network trained using unsupervised learning to produce a two-dimensional, discretized representation of the input space of the training samples.
- dimensions#
The dimensions of the SOM grid. Determines the layout and number of neurons in the map.
- Type:
tuple, list, or array-like
- numNeurons#
The total number of neurons in the SOM, calculated as the product of the dimensions.
- Type:
int
- pos#
The positions of the neurons in the SOM grid.
- Type:
array-like
- neuron_dist#
The distances between neurons in the SOM.
- Type:
array-like
- w#
The weight matrix of the SOM, representing the feature vectors of the neurons.
- Type:
array-like
- sim_flag#
A flag indicating whether the SOM has been simulated or not.
- Type:
bool
- __init__(self, dimensions):
Initializes the SOM with the specified dimensions.
- init_w(self, x):
Initializes the weights of the SOM using principal components analysis on the input data x.
- sim_som(self, x):
Simulates the SOM with x as the input, determining which neurons are activated by the input vectors.
- train(self, x, init_neighborhood=3, epochs=200, steps=100):
Trains the SOM using the batch SOM algorithm on the input data x.
- quantization_error(self, dist)#
Calculate quantization error
- topological_error(self, data)#
Calculate 1st and 1st-2nd topological error
- distortion_error(self, data)#
Calculate distortion error
- save_pickle(self, filename, path, data_format='pkl'):
Saves the SOM object to a file using the pickle format.
- load_pickle(self, filename, path, data_format='pkl'):
Loads a SOM object from a file using the pickle format.
- cluster_data(x)#
Cluster the input data based on the trained SOM reference vectors.
- Parameters:
x (ndarray (normalized)) – The input data to be clustered.
- Returns:
clusters (list of lists) – A list containing sub-lists, where each sublist represents a cluster. The indices of the input data points belonging to the same cluster are stored in the corresponding sublist, sorted by their proximity to the cluster center.
cluster_distances (list of lists) – A list containing sub-lists, where each sublist represents the distances of the input data points to the corresponding cluster center, sorted in the same order as the indices in the clusters list.
max_cluster_distances (ndarray) – A list containing the maximum distance between each cluster center and the data points belonging to that cluster.
cluster_sizes (ndarray) – A list containing the number of data points in each cluster.
- Raises:
ValueError – If the SOM has not been trained.
ValueError – If the number of features in the input data and the SOM weights do not match.
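The four return values described above can be illustrated with a small stand-alone sketch. This is not the NNSOM implementation: the name `cluster_data_sketch`, the explicit `weights` argument, the Euclidean distance, and the `0.0` maximum for empty clusters are all assumptions made for illustration.

```python
import math


def cluster_data_sketch(x, weights):
    """Assign each row of x to its nearest weight vector (Euclidean distance assumed)."""
    n_neurons = len(weights)
    members = [[] for _ in range(n_neurons)]  # (distance, index) pairs per neuron
    for i, row in enumerate(x):
        dists = [math.dist(row, w) for w in weights]
        winner = min(range(n_neurons), key=dists.__getitem__)
        members[winner].append((dists[winner], i))

    clusters, cluster_distances = [], []
    for pairs in members:
        pairs.sort()  # closest data point first
        clusters.append([i for _, i in pairs])
        cluster_distances.append([d for d, _ in pairs])

    # Assumption: an empty cluster reports a maximum distance of 0.0.
    max_cluster_distances = [max(d) if d else 0.0 for d in cluster_distances]
    cluster_sizes = [len(c) for c in clusters]
    return clusters, cluster_distances, max_cluster_distances, cluster_sizes


weights = [(0.0, 0.0), (10.0, 10.0)]
x = [(1.0, 0.0), (9.0, 10.0), (0.0, 0.5), (10.0, 12.0)]
clusters, dists, max_dists, sizes = cluster_data_sketch(x, weights)
print(clusters)  # [[2, 0], [1, 3]]
print(sizes)     # [2, 2]
```

Each sublist in `clusters` lists input indices ordered from nearest to farthest, which is the ordering several NNSOM.utils helpers expect from their `clust` argument.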
- distortion_error(x)#
Calculate distortion
- init_w(x, norm_func=None)#
Initializes the weights of the SOM using principal components analysis (PCA) on the input data x.
- Parameters:
x (np.ndarray) – The input data used for weight initialization.
- load_pickle(filename, path, data_format='pkl')#
Load the SOM object from a file using pickle.
- Parameters:
filename (str) – The name of the file to load the SOM object from.
path (str) – The path to the file to load the SOM object from.
data_format (str) – The format to load the SOM object from. Must be one of: pkl
- Return type:
None
- normalize(x, norm_func=None)#
Normalize the input data using a custom function.
- Parameters:
x (array-like) – The input data to be normalized.
norm_func (callable, optional) – A custom normalization or standardization function to be applied to the input data. If provided, it should take the input data as its argument and return the preprocessed data. Default is None, in which case the input data is returned as-is.
- Returns:
x_preprocessed – The preprocessed input data.
- Return type:
array-like
- Raises:
Warning – If norm_func is None, a warning is raised to indicate the potential inefficiency in SOM training.
Examples
>>> import numpy as np
>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_extraction.text import TfidfVectorizer
>>> from sklearn.preprocessing import StandardScaler
>>> from NNSOM.som import SOM

>>> # Case 1: Tabular data (without normalization)
>>> iris = load_iris()
>>> X = iris.data
>>> som = SOM(dimensions=(5, 5))
>>> X_norm = som.normalize(X)
>>> print(np.allclose(np.transpose(X_norm), X))
True

>>> # Case 2: Image data (using custom normalization)
>>> image_data = np.random.randint(0, 256, size=(28, 28))
>>> som = SOM(dimensions=(10, 10))
>>> custom_norm_func = lambda x: x / 255  # Custom normalization function
>>> image_data_norm = som.normalize(image_data, norm_func=custom_norm_func)
>>> print(image_data_norm.min(), image_data_norm.max())
0.0 1.0

>>> # Case 3: Text data (without normalization)
>>> text_data = ["This is a sample text.", "Another example sentence."]
>>> vectorizer = TfidfVectorizer()
>>> tfidf_matrix = vectorizer.fit_transform(text_data)
>>> som = SOM(dimensions=(8, 8))
>>> text_data_norm = som.normalize(tfidf_matrix.toarray())
>>> print(np.allclose(np.transpose(text_data_norm), tfidf_matrix.toarray()))
True
- quantization_error(dist)#
Calculate quantization error
- save_pickle(filename, path, data_format='pkl')#
Save the SOM object to a file using pickle.
- Parameters:
filename (str) – The name of the file to save the SOM object to.
path (str) – The path to the file to save the SOM object to.
data_format (str) – The format to save the SOM object in. Must be one of: pkl
- Return type:
None
- sim_som(x)#
Simulates the SOM with x as the input, determining which neurons are activated by the input vectors.
- Parameters:
x (np.ndarray) – The input data to simulate the SOM with.
- Returns:
The simulated output of the SOM.
- Return type:
np.ndarray
- topological_error(x)#
Calculate topological error
- train(x, init_neighborhood=3, epochs=200, steps=100, norm_func=None)#
Trains the SOM using the batch SOM algorithm on the input data x.
- Parameters:
x (np.ndarray) – The input data to train the SOM with.
init_neighborhood (int, optional) – The initial neighborhood size.
epochs (int, optional) – The number of epochs to train for.
steps (int, optional) – The number of steps for training.
- Return type:
None
NNSOM.utils module#
- NNSOM.utils.cal_class_cluster_intersect(clust, *args)#
Calculate the intersection sizes of each class with each neuron cluster.
This function computes the size of the intersection between each given class (represented by arrays of indices) and each neuron cluster (represented by a list of lists of indices). The result is a 2D array where each row corresponds to a neuron cluster, and each column corresponds to one of the classes.
- Parameters:
clust (list of lists) – A collection of neuron clusters, where each neuron cluster is a list of indices.
*args (sequence of array-like) – A variable number of arrays, each representing a class with indices.
- Returns:
A 2D array where the entry at position (i, j) represents the number of indices in the j-th class that are also in the i-th neuron cluster.
- Return type:
numpy.ndarray
Examples
>>> import numpy as np
>>> from NNSOM.utils import cal_class_cluster_intersect
>>> clust = [[4, 5, 9], [1, 7], [2, 10, 11], [3, 6, 8]]
>>> ind1 = np.array([1, 2, 3])
>>> ind2 = np.array([4, 5, 6])
>>> ind3 = np.array([7, 8, 9])
>>> ind4 = np.array([10, 11, 12])
>>> cal_class_cluster_intersect(clust, ind1, ind2, ind3, ind4)
array([[0, 2, 1, 0],
       [1, 0, 1, 0],
       [1, 0, 0, 2],
       [1, 1, 1, 0]])
- NNSOM.utils.calculate_button_positions(num_buttons, sidebar_width)#
- NNSOM.utils.calculate_positions(dim)#
- NNSOM.utils.cart2pol(x, y)#
- NNSOM.utils.closest_class_cluster(cat_feature, clust)#
Returns the cluster array with the closest class for each cluster.
- Parameters:
cat_feature (array-like) – Categorical feature array.
clust (list) – A cluster array of indices sorted by distances.
- Returns:
closest_class – A cluster array with the closest class for each cluster.
- Return type:
numpy array
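One plausible reading of "closest class" is the class of the input nearest to each cluster center, which is the first index in each sublist when `clust` is sorted by distance. The sketch below follows that reading; the name `closest_class_sketch` and the `None` placeholder for empty clusters are assumptions, not the library's behavior.

```python
def closest_class_sketch(cat_feature, clust):
    """Return the class of the first (assumed nearest) member of each cluster."""
    closest = []
    for indices in clust:
        if len(indices) == 0:
            closest.append(None)  # assumption: placeholder for an empty cluster
        else:
            closest.append(cat_feature[indices[0]])
    return closest


labels = ["a", "b", "c", "a"]
clust = [[2, 0], [1], [], [3]]  # each sublist sorted nearest-first
print(closest_class_sketch(labels, clust))  # ['c', 'b', None, 'a']
```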
- NNSOM.utils.count_classes_in_cluster(cat_feature, clust)#
Count the occurrences of each class in each cluster using vectorized operations for efficiency.
- Parameters:
cat_feature (array-like) – Categorical feature array.
clust (list) – A list of arrays, each containing the indices of elements in a cluster.
- Returns:
cluster_counts – A 2D array with counts of each class in each cluster.
- Return type:
numpy array
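The shape of the result can be seen in a plain-Python sketch. It is written with explicit loops for readability rather than the vectorized operations mentioned above, and the name `count_classes_sketch` and the sorted column order are assumptions.

```python
def count_classes_sketch(cat_feature, clust):
    """Return (classes, counts); counts[i][j] is how often classes[j] occurs in cluster i."""
    classes = sorted(set(cat_feature))  # assumption: columns follow sorted class order
    column = {c: j for j, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in clust]
    for i, indices in enumerate(clust):
        for idx in indices:
            counts[i][column[cat_feature[idx]]] += 1
    return classes, counts


labels = ["a", "b", "a", "c", "b", "a"]
clust = [[0, 2, 5], [1, 3], [], [4]]
classes, counts = count_classes_sketch(labels, clust)
print(classes)  # ['a', 'b', 'c']
print(counts)   # [[3, 0, 0], [0, 1, 1], [0, 0, 0], [0, 1, 0]]
```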
- NNSOM.utils.create_buttons(fig, button_types)#
- NNSOM.utils.distances(pos)#
- NNSOM.utils.flatten(data)#
Recursively flattens a nested list structure of numbers into a single list.
- Parameters:
data – A number (int or float) or a nested list of numbers. The data to be flattened.
- Returns:
A list of numbers, where all nested structures in the input have been flattened into a single list.
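A minimal recursive sketch of this behavior is shown below. The name `flatten_sketch` is hypothetical, and wrapping a bare number in a one-element list is an assumption about how the scalar case is handled.

```python
def flatten_sketch(data):
    """Recursively flatten a number or nested list of numbers into a flat list."""
    if isinstance(data, (int, float)):
        return [data]  # assumption: a bare number becomes a one-element list
    flat = []
    for item in data:
        flat.extend(flatten_sketch(item))
    return flat


print(flatten_sketch([1, [2, [3, 4.5]], []]))  # [1, 2, 3, 4.5]
print(flatten_sketch(7))                       # [7]
```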
- NNSOM.utils.get_cluster_array(feature, clust)#
Returns a NumPy array of objects, each containing the feature values for each cluster.
- Parameters:
feature (array-like) – Feature array.
clust (list) – A list of cluster arrays, each containing indices sorted by distances.
- Returns:
cluster_array – A NumPy array where each element is an array of feature values for that cluster.
- Return type:
numpy.ndarray
- NNSOM.utils.get_cluster_avg(feature, clust)#
Returns the average value of a feature for each cluster.
- Parameters:
feature (array-like) – Feature array.
clust (list) – A list of cluster arrays, each containing indices sorted by distances.
- Returns:
cluster_avg – A cluster array with the average value of the feature for each cluster.
- Return type:
numpy array
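As an illustration of the per-cluster averaging, the following sketch computes the mean of a 1-D feature over each cluster's indices. The name `cluster_avg_sketch` and the choice of NaN for empty clusters are assumptions; the library may handle empty clusters differently.

```python
import math


def cluster_avg_sketch(feature, clust):
    """Average a 1-D feature over the indices of each cluster."""
    averages = []
    for indices in clust:
        if len(indices) == 0:
            averages.append(math.nan)  # assumption: empty clusters yield NaN
        else:
            averages.append(sum(feature[i] for i in indices) / len(indices))
    return averages


feature = [1.0, 2.0, 3.0, 4.0, 10.0]
clust = [[0, 1], [2, 3, 4], []]
avgs = cluster_avg_sketch(feature, clust)
print(avgs[0])  # 1.5
```

A per-neuron array of this kind is the sort of input that `color_hist` and `simple_grid` take as their `avg` argument.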
- NNSOM.utils.get_cluster_data(data, clust)#
For each cluster, extract the corresponding data points and return them in a list.
- Parameters:
data (numpy array) – The dataset from which to extract the clusters’ data.
clust (list of arrays) – A list where each element is an array of indices for data points in the corresponding cluster.
- Returns:
cluster_data_list – A list where each element is a numpy array containing the data points of a cluster.
- Return type:
list of numpy arrays
- NNSOM.utils.get_color_labels(clust, *listOfIndices)#
Generates color label for each cluster based on indices of classes.
- Parameters:
clust – sequence of vectors A sequence of vectors, each containing the indices of elements in a cluster.
*listOfIndices – 1-d array A list of indices where the specific class is present.
- NNSOM.utils.get_conf_indices(target, results, target_class)#
Get the indices of True Positive, True Negative, False Positive, and False Negative for a specific target class.
- Parameters:
target (array-like) – The true target values.
results (array-like) – The predicted values.
target_class (int) – The target class for which to get the confusion indices.
- Returns:
tp_index (numpy array) – Indices of True Positives.
tn_index (numpy array) – Indices of True Negatives.
fp_index (numpy array) – Indices of False Positives.
fn_index (numpy array) – Indices of False Negatives.
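The four index sets can be derived with a one-vs-rest comparison against `target_class`. The sketch below assumes that interpretation; `conf_indices_sketch` is a hypothetical name and it returns plain lists rather than numpy arrays.

```python
def conf_indices_sketch(target, results, target_class):
    """Split sample indices into TP, TN, FP, FN for one class (one-vs-rest assumed)."""
    tp, tn, fp, fn = [], [], [], []
    for i, (t, r) in enumerate(zip(target, results)):
        actual = t == target_class
        predicted = r == target_class
        if actual and predicted:
            tp.append(i)
        elif not actual and not predicted:
            tn.append(i)
        elif predicted:
            fp.append(i)  # predicted as the class, but actually another class
        else:
            fn.append(i)  # actually the class, but predicted as another
    return tp, tn, fp, fn


target = [0, 1, 1, 2, 1, 0]
results = [0, 1, 2, 1, 1, 1]
print(conf_indices_sketch(target, results, target_class=1))
# ([1, 4], [0], [3, 5], [2])
```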
- NNSOM.utils.get_dominant_class_error_types(dominant_classes, error_types)#
Map dominant class to the corresponding majority error type for each cluster, dynamically applying the correct error type based on the dominant class.
- Parameters:
dominant_classes – array-like (som.numNeurons, ) List of dominant class labels for each cluster. May contain NaN values.
error_types – list of array-like (numClasses, som.numNeurons) Variable number of arrays, each representing majority error types for each class.
- Returns:
array-like (som.numNeurons, ) List of majority error type for each cluster corresponding to the dominant class.
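The mapping can be pictured as a per-neuron lookup: for each cluster, take the row of `error_types` that belongs to its dominant class and read off that cluster's entry. The sketch below assumes class labels are integer row indices into `error_types` and that NaN entries pass through unchanged; the function name and the string codes in the example data are hypothetical.

```python
import math


def dominant_error_types_sketch(dominant_classes, error_types):
    """Pick, per neuron, the error type listed for that neuron's dominant class."""
    selected = []
    for neuron, cls in enumerate(dominant_classes):
        if isinstance(cls, float) and math.isnan(cls):
            selected.append(math.nan)  # assumption: NaN dominant class stays NaN
        else:
            # assumption: class labels are integer indices into error_types
            selected.append(error_types[int(cls)][neuron])
    return selected


dominant = [0, 1, math.nan, 1]
error_types = [
    ["tp", "fn", "tp", "tp"],  # hypothetical majority error types for class 0
    ["fp", "tp", "fn", "fn"],  # hypothetical majority error types for class 1
]
result = dominant_error_types_sketch(dominant, error_types)
print(result[0], result[1], result[3])  # tp tp fn
```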
- NNSOM.utils.get_edge_shape()#
- NNSOM.utils.get_edge_widths(indices, clust)#
Calculate edge width for each cluster based on the number of indices in the cluster.
- Parameters:
indices – 1-d array Array of indices for the specific class.
clust – sequence of vectors A sequence of vectors, each containing the indices of elements in a cluster.
- Returns:
lwidth – Array of edge widths for each cluster.
- Return type:
1-d array
- NNSOM.utils.get_global_min_max(data)#
Finds the global minimum and maximum values in a nested list structure.
This function flattens the input data into a single list and then determines the minimum and maximum values.
- Parameters:
data – A nested list of integers. The structure can be of any depth.
- Returns:
A tuple (min_value, max_value) where min_value is the minimum value in the data, and max_value is the maximum value.
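The flatten-then-reduce approach described above can be sketched as follows. The name `global_min_max_sketch` is hypothetical, and the sketch uses an explicit stack instead of recursion, which is an implementation choice rather than a claim about the library.

```python
def global_min_max_sketch(data):
    """Return (min, max) over every number in an arbitrarily nested list."""
    stack, values = [data], []
    while stack:
        item = stack.pop()
        if isinstance(item, list):
            stack.extend(item)  # descend into nested lists
        else:
            values.append(item)
    # min() and max() raise ValueError if no numbers were found
    return min(values), max(values)


print(global_min_max_sketch([[3, [7, -2]], [5], [[[11]]]]))  # (-2, 11)
```

A shared global range of this kind is useful when several per-neuron subplots should use the same axis limits.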
- NNSOM.utils.get_hexagon_shape()#
- NNSOM.utils.get_ind_misclassified(target, prediction)#
Get the indices of misclassified items.
- Parameters:
target (array-like) – The true target values.
prediction (array-like) – The predicted values.
- Returns:
misclassified_indices – List of indices of misclassified items.
- Return type:
list
- NNSOM.utils.get_perc_cluster(cat_feature, target, clust)#
Return cluster array with the percentage of a specific target class in each cluster.
- Parameters:
cat_feature (array-like) – Categorical feature array.
target (int or str) – Target class to calculate the percentage.
clust (list) – A cluster array of indices sorted by distances.
- Returns:
cluster_array – A cluster array with the percentage of target class.
- Return type:
numpy array
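A small sketch of the per-cluster percentage follows. It assumes a 0 to 100 scale and reports 0.0 for empty clusters; both are assumptions, as is the name `perc_cluster_sketch`.

```python
def perc_cluster_sketch(cat_feature, target, clust):
    """Percentage (0-100 assumed) of each cluster's members whose class equals target."""
    percentages = []
    for indices in clust:
        if len(indices) == 0:
            percentages.append(0.0)  # assumption: empty clusters report 0
        else:
            hits = sum(1 for i in indices if cat_feature[i] == target)
            percentages.append(100.0 * hits / len(indices))
    return percentages


labels = ["setosa", "virginica", "setosa", "setosa", "versicolor"]
clust = [[0, 1], [2, 3], [4], []]
print(perc_cluster_sketch(labels, "setosa", clust))  # [50.0, 100.0, 0.0, 0.0]
```

An array like this is the kind of value passed as `perc` to `gray_hist`.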
- NNSOM.utils.get_perc_misclassified(target, prediction, clust)#
Calculate the percentage of misclassified items in each cluster and return as a numpy array.
- Parameters:
target (array-like) – The true target values.
prediction (array-like) – The predicted values.
clust (array-like) – List of arrays, each containing the indices of elements in a cluster.
- Returns:
proportion_misclassified – Percentage of misclassified items in each cluster.
- Return type:
numpy array
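The calculation can be sketched in two steps: find the misclassified indices, then count how many fall in each cluster. The name `perc_misclassified_sketch`, the 0 to 100 scale, and the 0.0 value for empty clusters are assumptions.

```python
def perc_misclassified_sketch(target, prediction, clust):
    """Percentage (0-100 assumed) of misclassified members in each cluster."""
    wrong = {i for i, (t, p) in enumerate(zip(target, prediction)) if t != p}
    percentages = []
    for indices in clust:
        if len(indices) == 0:
            percentages.append(0.0)  # assumption: empty clusters report 0
        else:
            n_wrong = sum(1 for i in indices if i in wrong)
            percentages.append(100.0 * n_wrong / len(indices))
    return percentages


target = [0, 1, 1, 0, 1, 0]
prediction = [0, 1, 0, 0, 0, 1]
clust = [[0, 1], [2, 3, 4, 5], []]
print(perc_misclassified_sketch(target, prediction, clust))  # [0.0, 75.0, 0.0]
```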
- NNSOM.utils.majority_class_cluster(cat_feature, clust)#
Returns the cluster array with the majority class for each cluster.
- Parameters:
cat_feature (array-like) – Categorical feature array.
clust (list) – A cluster array of indices sorted by distances.
- Returns:
majority_class – A cluster array with the majority class
- Return type:
numpy array
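The majority vote per cluster can be sketched with `collections.Counter`. The name `majority_class_sketch` is hypothetical, and the `None` placeholder for empty clusters is an assumption (the library may use NaN instead). Ties resolve to the class seen first within the cluster, which is a property of `Counter.most_common`, not a documented NNSOM rule.

```python
from collections import Counter


def majority_class_sketch(cat_feature, clust):
    """Return the most frequent class among the members of each cluster."""
    majority = []
    for indices in clust:
        if len(indices) == 0:
            majority.append(None)  # assumption: placeholder for an empty cluster
        else:
            counts = Counter(cat_feature[i] for i in indices)
            majority.append(counts.most_common(1)[0][0])
    return majority


labels = [0, 1, 1, 2, 2, 2, 0]
clust = [[0, 1, 2], [3, 4, 5, 6], []]
print(majority_class_sketch(labels, clust))  # [1, 2, None]
```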
- NNSOM.utils.normalize_position(position)#
- NNSOM.utils.pol2cart(theta, rho)#
- NNSOM.utils.preminmax(p)#
- NNSOM.utils.rotate_xy(x1, y1, angle)#
- NNSOM.utils.spread_positions(position, positionMean, positionBasis)#