hots.clustering
Provides clustering algorithms and all clustering-related methods. The available clustering algorithms are: k-means, hierarchical, spectral, and custom spectral.
- hots.clustering.build_adjacency_matrix(labels_)
Build the adjacency matrix of the clustering.
- Parameters:
labels_ (_type_) – _description_
- Returns:
_description_
- Return type:
np.array
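The following is a minimal sketch of one common way to turn cluster labels into an adjacency matrix, assuming two individuals are adjacent when they share a label and that self-loops are excluded. The helper name `adjacency_from_labels` is hypothetical and this is not necessarily how `build_adjacency_matrix` is implemented.

```python
import numpy as np


def adjacency_from_labels(labels):
    """Return a 0/1 matrix with 1 where two individuals share a cluster label."""
    labels = np.asarray(labels)
    # Broadcasting compares every label with every other label.
    same = labels[:, None] == labels[None, :]
    adj = same.astype(int)
    # Assumption: an individual is not adjacent to itself.
    np.fill_diagonal(adj, 0)
    return adj


adj = adjacency_from_labels([0, 0, 1, 1, 0])
print(adj)
```

The result is symmetric, so either triangle carries the full co-membership information.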
- hots.clustering.build_matrix_indiv_attr(df)
Build the entire clustering matrix.
- Parameters:
df (pd.DataFrame) – _description_
- Returns:
_description_
- Return type:
Tuple[pd.DataFrame, Dict]
- hots.clustering.build_similarity_matrix(df)
Build a similarity matrix for the clustering.
- Parameters:
df (pd.DataFrame) – _description_
- Returns:
_description_
- Return type:
np.array
- hots.clustering.change_clustering(mvg_containers, df_clust, labels_, dict_id_c, tol_open_clust)
Adjust the clustering by moving the given individuals to their closest cluster.
- Parameters:
mvg_containers (List) – _description_
df_clust (pd.DataFrame) – _description_
labels_ (List) – _description_
dict_id_c (Dict) – _description_
tol_open_clust (float) – _description_
- Returns:
_description_
- Return type:
Tuple[pd.DataFrame, List, int]
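As an illustration of "move to the closest cluster", here is a minimal sketch that reassigns a set of moving individuals to the cluster with the nearest mean profile, using Euclidean distance. The function name `reassign_to_closest` and its arguments are hypothetical. The sketch deliberately ignores `tol_open_clust` and `dict_id_c`, whose exact roles are not described here.

```python
import numpy as np


def reassign_to_closest(values, labels, profiles, moving):
    """Move each index in `moving` to the cluster whose mean profile is nearest.

    Returns the updated labels and the number of individuals that changed cluster.
    """
    values = np.asarray(values, dtype=float)
    profiles = np.asarray(profiles, dtype=float)
    labels = np.array(labels)  # copy, so the caller's labels are left untouched
    n_moved = 0
    for i in moving:
        # Distance from individual i to every cluster profile.
        dists = np.linalg.norm(profiles - values[i], axis=1)
        best = int(np.argmin(dists))
        if best != labels[i]:
            labels[i] = best
            n_moved += 1
    return labels.tolist(), n_moved


values = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [4.9, 5.0]])
profiles = np.array([[0.0, 0.0], [5.0, 5.0]])
new_labels, n_moved = reassign_to_closest(values, [0, 1, 1, 1], profiles, moving=[1])
print(new_labels, n_moved)
```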
- hots.clustering.change_clustering_maxkcut(conflict_graph, df_clust, labels_, dict_id_c)
Change the current clustering with max-k-cut on moving containers.
- Parameters:
conflict_graph (nx.Graph) – _description_
df_clust (pd.DataFrame) – _description_
labels_ (List) – _description_
dict_id_c (Dict) – _description_
- Returns:
_description_
- Return type:
Tuple[pd.DataFrame, List, int]
- hots.clustering.check_container_deviation(working_df, labels_, profiles_, dict_id_c)
Check the deviation of each container from its cluster.
- Parameters:
working_df (pd.DataFrame) – _description_
labels_ (List) – _description_
profiles_ (np.array) – _description_
dict_id_c (Dict) – _description_
- hots.clustering.compute_distance_cluster_r(p, r, mu_r, u, dp)
Compute distance between individual p and cluster r.
- Parameters:
p (int) – _description_
r (int) – _description_
mu_r (float) – _description_
u (np.array) – _description_
dp (float) – _description_
- Returns:
_description_
- Return type:
float
- hots.clustering.compute_mu_r(w, d, labels_, r, u)
Compute center of cluster r.
- Parameters:
w (np.array) – _description_
d (np.array) – _description_
labels_ (List) – _description_
r (int) – _description_
u (np.array) – _description_
- Returns:
_description_
- Return type:
float
- hots.clustering.eval_clustering(df_clust, w, dict_id_c)
Evaluate the clustering with ICS and ICD.
- Parameters:
df_clust (pd.DataFrame) – _description_
w (np.array) – _description_
dict_id_c (Dict) – _description_
- Returns:
_description_
- Return type:
Tuple[float, float]
- hots.clustering.get_cluster_balance(df_clust)
Display size of each cluster.
- Parameters:
df_clust (pd.DataFrame) – _description_
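Cluster sizes can be counted directly from a label vector. The sketch below assumes the labels are available as a plain sequence; the helper name `cluster_sizes` is hypothetical and the real function takes a DataFrame instead.

```python
import numpy as np


def cluster_sizes(labels):
    """Map each cluster label to the number of individuals it contains."""
    values, counts = np.unique(labels, return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}


sizes = cluster_sizes([0, 2, 1, 0, 0, 2])
for cluster, size in sizes.items():
    print(f"cluster {cluster}: {size} individuals")
```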
- hots.clustering.get_cluster_mean_profile(df_clust)
Compute the mean profile of each cluster.
- Parameters:
df_clust (pd.DataFrame) – _description_
- Returns:
_description_
- Return type:
np.array
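A mean profile is the element-wise average of the time series belonging to one cluster. The sketch below assumes one row per individual and one column per time step, with labels passed separately; `mean_profiles` is a hypothetical helper, not the library function.

```python
import numpy as np


def mean_profiles(values, labels):
    """Return one averaged row per cluster, ordered by sorted cluster label."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    # Average the rows of each cluster column by column.
    return np.vstack([values[labels == c].mean(axis=0) for c in clusters])


values = np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [10.0, 10.0, 10.0]])
profiles = mean_profiles(values, [0, 0, 1])
print(profiles)
```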
- hots.clustering.get_cluster_variance(profiles_)
Compute the variance of each cluster.
- Parameters:
profiles_ (np.array) – _description_
- Returns:
_description_
- Return type:
np.array
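One plausible reading, given that the function receives the cluster profiles, is a variance taken along the time axis of each profile. The sketch below follows that assumption and uses the population variance (`ddof=0`); the helper name `profile_variances` is hypothetical.

```python
import numpy as np


def profile_variances(profiles):
    """Return the variance over time of each cluster profile (one value per row)."""
    profiles = np.asarray(profiles, dtype=float)
    # Population variance (ddof=0) along the time axis.
    return profiles.var(axis=1)


variances = profile_variances([[2.0, 3.0, 4.0], [10.0, 10.0, 10.0]])
print(variances)
```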
- hots.clustering.get_distance_cluster(instance, cluster_centers_)
Compute the distance between each pair of clusters.
- Parameters:
instance (Instance) – _description_
cluster_centers_ (np.array) – _description_
- Returns:
_description_
- Return type:
np.array
- hots.clustering.get_distance_container_cluster(conso_cont, profile)
Compute the distance between the container profile and its cluster’s mean profile.
- Parameters:
conso_cont (np.array) – _description_
profile (np.array) – _description_
- Returns:
_description_
- Return type:
float
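A minimal sketch of such a distance, assuming both arguments are time series of equal length and that the Euclidean norm is the metric (the metric actually used is not stated here). The helper name `profile_distance` is hypothetical.

```python
import numpy as np


def profile_distance(conso_cont, profile):
    """Euclidean distance between a container's consumption and a mean profile."""
    conso_cont = np.asarray(conso_cont, dtype=float)
    profile = np.asarray(profile, dtype=float)
    return float(np.linalg.norm(conso_cont - profile))


distance = profile_distance([3.0, 4.0, 0.0], [0.0, 0.0, 0.0])
print(distance)  # 5.0
```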
- hots.clustering.get_far_container(c1, c2, df_clust, profiles)
Get whichever of containers c1 and c2 is farthest from the profile.
- Parameters:
c1 (str) – _description_
c2 (str) – _description_
df_clust (pd.DataFrame) – _description_
profiles (np.array) – _description_
- Returns:
_description_
- Return type:
str
- hots.clustering.get_silhouette(df_clust, labels_)
Get the Silhouette score from clustering.
- Parameters:
df_clust (pd.DataFrame) – _description_
labels_ (List) – _description_
- Returns:
_description_
- Return type:
float
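For reference, the Silhouette score of an individual is (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is its mean distance to the nearest other cluster. The sketch below computes the average score with NumPy only, assuming Euclidean distance, at least two clusters, and a score of 0 for singleton clusters. It illustrates the metric itself, and `silhouette_score` here is a hypothetical local helper, not the library's code.

```python
import numpy as np


def silhouette_score(values, labels):
    """Mean Silhouette coefficient over all individuals (Euclidean distance)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise distance matrix.
    dist = np.linalg.norm(values[:, None, :] - values[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(values))
    for i in range(len(values)):
        own = labels == labels[i]
        n_own = int(own.sum())
        if n_own == 1:
            continue  # convention: singleton clusters score 0
        # dist[i, i] is 0, so dividing by n_own - 1 excludes the point itself.
        a = dist[i, own].sum() / (n_own - 1)
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


score = silhouette_score([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])
print(round(score, 4))
```

Values close to 1 indicate compact, well-separated clusters; values near 0 or below suggest overlapping or misassigned individuals.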
- hots.clustering.get_sum_cluster_variance(profiles_, vars_)
Compute the matrix of summed variances for each pair of clusters.
- Parameters:
profiles_ (np.array) – _description_
vars_ (np.array) – _description_
- Returns:
_description_
- Return type:
np.array
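Given one variance per cluster, the pairwise sums can be built in a single broadcasting step. This sketch assumes entry (i, j) is simply the variance of cluster i plus the variance of cluster j; the helper name `pairwise_variance_sums` is hypothetical.

```python
import numpy as np


def pairwise_variance_sums(variances):
    """Return S with S[i, j] = variances[i] + variances[j]."""
    variances = np.asarray(variances, dtype=float)
    # A column vector plus a row vector broadcasts to a full square matrix.
    return variances[:, None] + variances[None, :]


sums = pairwise_variance_sums([1.0, 2.0, 4.0])
print(sums)
```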
- hots.clustering.hierarchical_clustering(data, k)
Perform hierarchical ascendant (agglomerative) clustering.
- Parameters:
data (pd.DataFrame) – _description_
k (int) – _description_
- Returns:
_description_
- Return type:
List
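To make the idea concrete, here is a naive agglomerative sketch: every individual starts in its own cluster and the two closest clusters are merged until k remain. It assumes Euclidean distance and average linkage, which may differ from the linkage used by `hierarchical_clustering`; the function name `agglomerative` is hypothetical and the loop is written for readability, not speed.

```python
import numpy as np


def agglomerative(values, k):
    """Naive average-linkage agglomerative clustering; returns a list of labels."""
    values = np.asarray(values, dtype=float)
    dist = np.linalg.norm(values[:, None, :] - values[None, :, :], axis=2)
    clusters = [[i] for i in range(len(values))]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(values), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels.tolist()


labels = agglomerative([[0.0], [1.0], [10.0], [11.0]], k=2)
print(labels)
```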
- hots.clustering.k_means(data, k)
Perform the K-means clustering.
- Parameters:
data (pd.DataFrame) – _description_
k (int) – _description_
- Returns:
_description_
- Return type:
List
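The standard k-means procedure (Lloyd's algorithm) alternates between assigning each individual to its nearest center and recomputing each center as the mean of its members. The sketch below is a plain NumPy version of that textbook procedure with random initialization. The name `lloyd_k_means` and the `n_iter` and `seed` parameters are assumptions for illustration, and the library function may rely on a different implementation.

```python
import numpy as np


def lloyd_k_means(values, k, n_iter=100, seed=0):
    """Textbook Lloyd's algorithm; returns (labels, centers)."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct individuals picked at random.
    centers = values[rng.choice(len(values), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest center for every individual.
        dist = np.linalg.norm(values[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Update step: mean of the members, keeping the old center if a cluster is empty.
        new_centers = np.vstack([
            values[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.tolist(), centers


km_labels, km_centers = lloyd_k_means(
    [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]], k=2
)
print(km_labels)
```

Label numbering depends on the random initialization, so only the grouping is meaningful, not which cluster is called 0.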
- hots.clustering.matrix_line(args)
Build one line for clustering matrix.
- Parameters:
args (Tuple[str, pd.DataFrame]) – _description_
- Returns:
_description_
- Return type:
Tuple[int, Dict]
- hots.clustering.p_dist(data, metric='euclidean')
Compute distances between each data point pair.
- Parameters:
data (pd.DataFrame) – _description_
metric (str, optional) – _description_, defaults to ‘euclidean’
- Returns:
_description_
- Return type:
np.array
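Pairwise distances can be computed with broadcasting alone. The sketch below returns a full square matrix and supports two metrics as an illustration. Whether `p_dist` returns a square or a condensed matrix, and which metric names it accepts beyond 'euclidean', is not stated here, so the helper `pairwise_distances` and its 'cityblock' option are assumptions.

```python
import numpy as np


def pairwise_distances(values, metric="euclidean"):
    """Return the square matrix of distances between all pairs of rows."""
    values = np.asarray(values, dtype=float)
    # diff[i, j] is the vector from row j to row i.
    diff = values[:, None, :] - values[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if metric == "cityblock":
        return np.abs(diff).sum(axis=2)
    raise ValueError(f"unsupported metric: {metric!r}")


points = [[0.0, 0.0], [3.0, 4.0]]
euclid = pairwise_distances(points)
manhattan = pairwise_distances(points, metric="cityblock")
print(euclid[0, 1], manhattan[0, 1])  # 5.0 7.0
```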
- hots.clustering.perform_clustering(data, algo, k)
Call the specified method to perform clustering.
- Parameters:
data (pd.DataFrame) – _description_
algo (str) – _description_
k (int) – _description_
- Returns:
_description_
- Return type:
Callable[[pd.DataFrame, int], List]
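Dispatching on an algorithm name is often done with a dictionary that maps names to functions sharing the same `(data, k)` signature. The sketch below shows that pattern with a toy registry. The `dispatch_clustering` helper, the `registry` argument and the 'round_robin' entry are all hypothetical, and the accepted values of `algo` in the library are not listed here.

```python
def dispatch_clustering(data, algo, k, registry):
    """Look up `algo` in `registry` and call the matching function with (data, k)."""
    try:
        method = registry[algo]
    except KeyError:
        raise ValueError(f"unknown clustering algorithm: {algo!r}") from None
    return method(data, k)


def round_robin(data, k):
    """Toy stand-in for a real algorithm: assign individuals to clusters in turn."""
    return [i % k for i in range(len(data))]


registry = {"round_robin": round_robin}
toy_labels = dispatch_clustering(["a", "b", "c", "d", "e"], "round_robin", 2, registry)
print(toy_labels)
```

Raising a `ValueError` for an unknown name keeps a typo in a configuration file from silently falling back to a default algorithm.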
- hots.clustering.perso_spectral_clustering(data, k)
Perform a customized version of spectral clustering.
- Parameters:
data (pd.DataFrame) – _description_
k (int) – _description_
- Returns:
_description_
- Return type:
np.array
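For background, standard spectral clustering builds a similarity graph, forms its Laplacian, and clusters the individuals using the eigenvectors with the smallest eigenvalues. The sketch below shows the two-cluster case only, splitting on the sign of the second eigenvector (the Fiedler vector) of the unnormalized Laplacian. The Gaussian similarity, the `sigma` parameter and the name `spectral_bipartition` are assumptions; this is the textbook method, not the customized variant implemented by `perso_spectral_clustering`.

```python
import numpy as np


def spectral_bipartition(values, sigma=1.0):
    """Split the individuals in two using the sign of the Fiedler vector."""
    values = np.asarray(values, dtype=float)
    sq_dist = ((values[:, None, :] - values[None, :, :]) ** 2).sum(axis=2)
    # Gaussian similarity graph without self-loops.
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(int).tolist()


sp_labels = spectral_bipartition([[0.0], [1.0], [10.0], [11.0]], sigma=5.0)
print(sp_labels)
```

For more than two clusters, the usual extension is to run k-means on the rows of the first k eigenvectors. The sign of an eigenvector is arbitrary, so which group receives label 0 can vary.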