Source: r-cran-genieclust
Section: gnu-r
Priority: optional
Maintainer: Debian R Packages Maintainers
Uploaders: Andreas Tille
Vcs-Browser: https://salsa.debian.org/r-pkg-team/r-cran-genieclust
Vcs-Git: https://salsa.debian.org/r-pkg-team/r-cran-genieclust.git
Homepage: https://cran.r-project.org/package=genieclust
Standards-Version: 4.6.2
Rules-Requires-Root: no
Build-Depends: debhelper-compat (= 13), dh-r, r-base-dev, r-cran-rcpp (>= 1.0.4)
Testsuite: autopkgtest-pkg-r

Package: r-cran-genieclust
Architecture: any
Depends: ${R:Depends}, ${shlibs:Depends}, ${misc:Depends}
Recommends: ${R:Recommends}
Suggests: ${R:Suggests}
Description: GNU R Genie++ Hierarchical Clustering Algorithm with Noise Points Detection
 A retake on the Genie algorithm - a robust hierarchical clustering method
 (Gagolewski, Bartoszuk, Cena, 2016). Now faster and more memory efficient;
 determining the whole hierarchy for datasets of 10M points in
 low-dimensional Euclidean spaces or 100K points in high-dimensional ones
 takes only 1-2 minutes. Allows clustering with respect to mutual
 reachability distances so that it can act as a noise point detector or a
 robustified version of 'HDBSCAN*' (that is able to detect a predefined
 number of clusters and hence does not depend on the somewhat fragile
 'eps' parameter).
 .
 The package also features an implementation of economic inequity indices
 (the Gini and Bonferroni indices) and external cluster validity measures
 (partition similarity scores; e.g., the adjusted Rand, Fowlkes-Mallows,
 adjusted mutual information, and pair sets indices).
 .
 See also the 'Python' version of 'genieclust' available on 'PyPI', which
 supports sparse data, more metrics, and even larger datasets.