hots.plugins.clustering.*
The clustering logic is implemented by several plugins in the
hots.plugins.clustering package.
Clustering builder utilities for HOTS.
- hots.plugins.clustering.builder.assign_new_containers_to_nearest_cluster(clust_mat: DataFrame, label_col: str = 'cluster') → DataFrame
For any row with cluster == -1, assign it to the cluster of its nearest existing container.
Mutates and returns clust_mat.
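The behaviour described above can be sketched as follows. This is an illustrative reimplementation, not the HOTS source: the name assign_unlabelled_rows is hypothetical, and it assumes that "nearest" means Euclidean distance and that every column other than the label column is a feature.

```python
import numpy as np
import pandas as pd


def assign_unlabelled_rows(clust_mat: pd.DataFrame, label_col: str = "cluster") -> pd.DataFrame:
    """Give each row labelled -1 the label of its nearest labelled row (hypothetical helper)."""
    # Assumption: all non-label columns are numeric features.
    feature_cols = [c for c in clust_mat.columns if c != label_col]
    known = clust_mat[clust_mat[label_col] != -1]
    unknown = clust_mat[clust_mat[label_col] == -1]
    if known.empty or unknown.empty:
        return clust_mat

    known_x = known[feature_cols].to_numpy(dtype=float)
    known_labels = known[label_col].to_numpy()
    for idx, row in unknown[feature_cols].iterrows():
        # Assumption: Euclidean distance to every already-labelled container.
        dists = np.linalg.norm(known_x - row.to_numpy(dtype=float), axis=1)
        clust_mat.loc[idx, label_col] = known_labels[int(np.argmin(dists))]
    return clust_mat  # same object, modified in place


demo = pd.DataFrame(
    {"t0": [0.0, 0.1, 5.0, 4.8], "t1": [0.0, 0.2, 5.0, 5.1], "cluster": [0, -1, 1, -1]},
    index=["c1", "c2", "c3", "c4"],
)
result = assign_unlabelled_rows(demo)
```

Because the frame is modified in place, callers that need the original labels should pass a copy.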
- hots.plugins.clustering.builder.build_adjacency_matrix(labels_)
Build the adjacency matrix of a clustering.
- Parameters:
labels_ (List) – List of cluster labels assigned to individuals
- Returns:
Adjacency matrix
- Return type:
np.array
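A common way to derive an adjacency matrix from cluster labels is sketched below. It assumes that entry (i, j) is 1 when individuals i and j share a cluster and 0 otherwise, with the diagonal included. The documentation above does not state the exact convention HOTS uses, and adjacency_from_labels is a hypothetical name.

```python
import numpy as np


def adjacency_from_labels(labels):
    """Return a 0/1 matrix marking pairs that share a cluster label (hypothetical helper)."""
    labels = np.asarray(labels)
    # Broadcasting compares every label with every other label.
    return (labels[:, None] == labels[None, :]).astype(int)


adj = adjacency_from_labels([0, 1, 0])
```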
- hots.plugins.clustering.builder.build_matrix_indiv_attr(df: DataFrame, tick_field: str, indiv_field: str, metrics: list, id_map: dict) → DataFrame
Build a container×time matrix from an individual‐level DataFrame.
- hots.plugins.clustering.builder.build_post_clust_matrices(clust_mat)
Build result clustering dataframes and matrices to be used.
- hots.plugins.clustering.builder.build_pre_clust_matrices(df, tick_field, indiv_field, metrics, id_map, clustering, new_containers: bool = False)
Build period clustering dataframes and matrices to be used.
- hots.plugins.clustering.builder.build_similarity_matrix(mat: DataFrame) → DataFrame
Compute the pairwise Euclidean distance matrix from the input matrix.
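As an illustration of the computation, a pairwise Euclidean distance matrix over the rows of a DataFrame can be sketched as below. The function name pairwise_euclidean is hypothetical, and treating rows as containers is an assumption.

```python
import numpy as np
import pandas as pd


def pairwise_euclidean(mat: pd.DataFrame) -> pd.DataFrame:
    """Return the row-by-row Euclidean distance matrix (hypothetical helper)."""
    x = mat.to_numpy(dtype=float)
    # diff[i, j, :] holds the difference between row i and row j.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return pd.DataFrame(dist, index=mat.index, columns=mat.index)


points = pd.DataFrame({"t0": [0.0, 3.0], "t1": [0.0, 4.0]}, index=["c1", "c2"])
dist = pairwise_euclidean(points)
```

The result is symmetric with a zero diagonal, so smaller values indicate more similar rows.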
- hots.plugins.clustering.builder.build_var_delta_cluster_matrix(df_clust, cluster_var_matrix, *, zero_diag=True)
Build variance of deltas matrix from cluster.
- hots.plugins.clustering.builder.change_clustering(mvg_containers, clustering, dict_id_c: dict, tol_open_clust: float | None = None)
Reassign each container in mvg_containers to the closest existing cluster (by Euclidean distance to the cluster mean profile).
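The reassignment rule can be sketched as follows. This is a simplified illustration rather than the HOTS implementation: reassign_to_nearest_mean is a hypothetical name, the tol_open_clust parameter is not modelled, and computing the cluster means without the moving containers is an assumption made here.

```python
import numpy as np
import pandas as pd


def reassign_to_nearest_mean(df_clust, moving, cluster_col="cluster"):
    """Move each container in `moving` to the cluster with the closest mean profile."""
    feature_cols = [c for c in df_clust.columns if c != cluster_col]
    out = df_clust.copy()
    # Assumption: moving containers are excluded when computing the means.
    means = out.drop(index=list(moving)).groupby(cluster_col)[feature_cols].mean()
    for cid in moving:
        x = out[feature_cols].loc[cid].to_numpy(dtype=float)
        dists = np.linalg.norm(means.to_numpy() - x, axis=1)
        out.loc[cid, cluster_col] = means.index[int(np.argmin(dists))]
    return out


demo = pd.DataFrame(
    {
        "t0": [0.0, 0.2, 5.0, 5.2, 4.9],
        "t1": [0.0, 0.0, 5.0, 5.0, 5.1],
        "cluster": [0, 0, 1, 1, 0],
    },
    index=["c1", "c2", "c3", "c4", "c5"],
)
updated = reassign_to_nearest_mean(demo, moving=["c5"])
```

Here c5 starts in cluster 0 but lies next to the members of cluster 1, so it is moved there.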
- hots.plugins.clustering.builder.cluster_mean_profile(df_clust: DataFrame, cluster_col: str = 'cluster') → ndarray
Compute the mean profile of each cluster.
- hots.plugins.clustering.builder.dist_from_mean(df_clust, profiles, cid: str) → float
Return the distance from container cid to the mean profile of its cluster.
- hots.plugins.clustering.builder.get_far_container(c1, c2, df_clust: DataFrame, profiles: ndarray) → str
Return c1 if it’s farther from its cluster mean than c2 is, else return c2.
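The comparison can be sketched as below. The helper names are hypothetical, and the sketch assumes that row k of profiles is the mean profile of cluster k and that distance is Euclidean. Neither assumption is confirmed by the documentation above.

```python
import numpy as np
import pandas as pd


def distance_to_own_mean(df_clust, profiles, cid, cluster_col="cluster"):
    """Euclidean distance from container `cid` to its cluster's mean profile."""
    feature_cols = [c for c in df_clust.columns if c != cluster_col]
    k = int(df_clust.loc[cid, cluster_col])  # assumption: label k indexes profiles[k]
    x = df_clust[feature_cols].loc[cid].to_numpy(dtype=float)
    return float(np.linalg.norm(x - profiles[k]))


def farther_container(c1, c2, df_clust, profiles):
    """Return c1 only if it is strictly farther from its mean than c2; otherwise c2."""
    d1 = distance_to_own_mean(df_clust, profiles, c1)
    d2 = distance_to_own_mean(df_clust, profiles, c2)
    return c1 if d1 > d2 else c2


df = pd.DataFrame(
    {"t0": [0.0, 2.0, 10.0], "t1": [0.0, 0.0, 10.0], "cluster": [0, 0, 1]},
    index=["a", "b", "c"],
)
profiles = np.array([[0.5, 0.0], [10.0, 10.0]])
far = farther_container("a", "b", df, profiles)
```

With these values, a is 0.5 away from its mean and b is 1.5 away, so b is returned. On a tie the second argument wins.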
- hots.plugins.clustering.builder.pairwise_sum_profile_var(profiles: ndarray) → ndarray
Compute a matrix holding, for each pair of clusters, the variance of the sum of their profiles.
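One plausible reading of this computation is sketched below: entry (i, j) is the variance over time of profiles[i] + profiles[j]. The name pairwise_sum_var is hypothetical, and the use of population variance (ddof=0) is an assumption.

```python
import numpy as np


def pairwise_sum_var(profiles):
    """Variance over time of the summed profile for every pair of clusters."""
    profiles = np.asarray(profiles, dtype=float)
    # sums[i, j, :] is the element-wise sum of profile i and profile j.
    sums = profiles[:, None, :] + profiles[None, :, :]
    return sums.var(axis=-1)  # assumption: population variance (ddof=0)


var_mat = pairwise_sum_var([[0.0, 2.0], [2.0, 0.0]])
```

In this example the two profiles are complementary: their sum is flat, so the off-diagonal variance is 0, while each profile summed with itself has variance 4.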
Clustering plugin: mini‐batch KMeans streaming.
- class hots.plugins.clustering.kmeans.StreamKMeans(params: dict[str, Any], instance)
Bases: ClusteringPlugin
StreamKMeans plugin using scikit‐learn’s MiniBatchKMeans.
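For context, streaming use of scikit-learn's MiniBatchKMeans typically relies on partial_fit, which updates the centroids one batch at a time. The sketch below shows that pattern on synthetic data. It does not reproduce the StreamKMeans plugin, whose parameters and internals are not described here.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
model = MiniBatchKMeans(n_clusters=2, n_init=1, random_state=0)

# Feed the model one window of data at a time, as a stream would.
for _ in range(5):
    low = rng.normal(loc=0.0, scale=0.1, size=(10, 3))
    high = rng.normal(loc=100.0, scale=0.1, size=(10, 3))
    model.partial_fit(np.vstack([low, high]))

# Label new observations with the centroids learned so far.
labels = model.predict(np.array([[0.0, 0.0, 0.0], [100.0, 100.0, 100.0]]))
```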
Clustering plugin: agglomerative (hierarchical) clustering.
- class hots.plugins.clustering.hierarchical.HierarchicalClustering(parameters: dict, instance)
Bases: ClusteringPlugin
Hierarchical clustering plugin using SciPy linkage.
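A minimal example of agglomerative clustering with SciPy is sketched below: linkage builds the merge tree and fcluster cuts it into flat clusters. The Ward method and the cut into two clusters are choices made for this illustration, not settings taken from the plugin.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Four container profiles forming two obvious groups.
profiles = np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]])

# Each row of Z records one merge step: the two merged nodes, their distance, and the new size.
Z = linkage(profiles, method="ward")

# Cut the tree so that at most two flat clusters remain (labels start at 1).
labels = fcluster(Z, t=2, criterion="maxclust")
```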
Clustering plugin: spectral clustering with precomputed affinity.
- class hots.plugins.clustering.spectral.SpectralClustering(parameters: dict, instance)
Bases: ClusteringPlugin
Spectral clustering plugin using a precomputed similarity matrix.
Clustering plugin: custom spectral clustering for HOTS.
- class hots.plugins.clustering.custom_spectral.CustomSpectralClustering(parameters: dict, instance)
Bases: ClusteringPlugin
Custom spectral clustering plugin using normalized Laplacian.
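For readers unfamiliar with the term, the symmetric normalized Laplacian of an affinity matrix W with degree matrix D is L = I - D^(-1/2) W D^(-1/2). The sketch below computes it with NumPy. It shows the standard definition only; whether the plugin builds its Laplacian exactly this way is not stated here, and normalized_laplacian is a hypothetical name.

```python
import numpy as np


def normalized_laplacian(affinity):
    """Return I - D^(-1/2) W D^(-1/2) for a symmetric affinity matrix W."""
    w = np.asarray(affinity, dtype=float)
    degree = w.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0  # isolated nodes keep a zero scaling factor
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return np.eye(len(w)) - inv_sqrt[:, None] * w * inv_sqrt[None, :]


# Two disconnected pairs of nodes: {0, 1} and {2, 3}.
affinity = np.array(
    [
        [0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0, 0.0],
    ]
)
lap = normalized_laplacian(affinity)
eigvals = np.linalg.eigvalsh(lap)  # sorted in ascending order
```

The number of eigenvalues equal to zero matches the number of connected components (two here), which is the property spectral clustering exploits when it embeds points using the eigenvectors of the smallest eigenvalues.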