<a href="https://quitefastmst.gagolewski.com"><img src="https://www.gagolewski.com/_static/img/quitefastmst.png" align="right" height="128" width="128" /></a>
- # [**quitefastmst**](https://quitefastmst.gagolewski.com/) Package for R and Python
+ # [*quitefastmst*](https://quitefastmst.gagolewski.com/) Package for R and Python

- **TO DO: this package is a work in progress**
+ **TO DO: this package is a work in progress** – **check back later**

- **check back later**

-
-
-
- ### *quitefastmst*: Euclidean and Mutual Reachability Distance Minimum Spanning Trees
+ ### *quitefastmst*: Euclidean and Mutual Reachability Minimum Spanning Trees


![quitefastmst for Python](https://github.com/gagolews/quitefastmst/workflows/quitefastmst%20for%20Python/badge.svg)
![quitefastmst for R](https://github.com/gagolews/quitefastmst/workflows/quitefastmst%20for%20R/badge.svg)


- > A comprehensive tutorial, benchmarks, and a reference manual is available
- at <https://quitefastmst.gagolewski.com/>.


+ Package **features**:

- ## Author and Contributors
+ * [Euclidean Minimum Spanning Trees](https://en.wikipedia.org/wiki/Euclidean_minimum_spanning_tree)
+ using single-, sesqui-, and dual-tree Borůvka algorithms – quite fast
+ in spaces of low intrinsic dimensionality,

- **Author and Maintainer**: [Marek Gagolewski](https://www.gagolewski.com/)
+ * support for mutual reachability distances based on the Euclidean metric
+ (like in the HDBSCAN\* algorithm; see Campello, Moulavi, Sander, 2013),

+ * Euclidean nearest neighbours with nicely-optimised K-d trees,

- ## About
+ * relatively fast fallback algorithms for spaces of higher dimensionality,

- TO DO
+ * supports multiprocessing via OpenMP.
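
To make the first two features in the list above concrete, here is a minimal, dependency-free Python sketch of what a Euclidean or mutual reachability minimum spanning tree is. It is a brute-force O(n²) Jarník-Prim loop, not the package's tree-based Borůvka implementation and not its API. The names `core_distances`, `mst_prim`, and the parameter `k` are hypothetical, and the core distance convention used here (distance to the `k`-th nearest neighbour, the point itself excluded) is an assumption; HDBSCAN\* implementations differ on whether the point counts as its own neighbour.

```python
import math


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (itself excluded).

    Assumed convention for illustration only; requires 1 <= k <= len(points) - 1.
    """
    out = []
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        out.append(d[k - 1])
    return out


def mst_prim(points, k=0):
    """Brute-force Jarnik-Prim MST over all pairs of points.

    k == 0: plain Euclidean distances.
    k >= 1: mutual reachability distances max(d(i, j), core(i), core(j)).
    Returns a list of (i, j, weight) edges, len(points) - 1 of them.
    """
    n = len(points)
    core = core_distances(points, k) if k > 0 else [0.0] * n

    def dist(i, j):
        return max(math.dist(points[i], points[j]), core[i], core[j])

    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the vertex outside the tree that is cheapest to attach
        i = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[i] = True
        if parent[i] >= 0:
            edges.append((parent[i], i, best[i]))
        for j in range(n):
            if not in_tree[j]:
                d = dist(i, j)
                if d < best[j]:
                    best[j] = d
                    parent[j] = i
    return edges


pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
print(mst_prim(pts))       # [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 4.0)]
print(mst_prim(pts, k=2))  # same number of edges, larger weights
```

With `k >= 1`, every edge weight is at least the core distance of both endpoints, which is why points in sparse regions end up attached to the tree by long edges.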

- Package features:

- * Euclidean Minimum Spanning Trees using single, sesqui, and
- dual-tree Boruvka algorithms – fast in spaces of low intrinsic dimensionality
+ Refer to the package **homepage** at <https://quitefastmst.gagolewski.com/>
+ for the reference manual, tutorials, examples, and benchmarks.

- * support for mutual reachability distances based on the Euclidean metric
- (like in the HDBSCAN\* algorithm)
+ **Author and maintainer**: [Marek Gagolewski](https://www.gagolewski.com/)

- * Euclidean nearest neighbours with nicely-optimised K-d trees (support
- parallel processing)

- * fallback algorithms for spaces of higher dimensionality
+ Possible applications in data analysis:
+ clustering (HDBSCAN\*, Genie, Single linkage, etc.),
+ classification and regression (k-nearest neighbours),
+ outlier and noise point detection, and many more.
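
As an illustration of the clustering application mentioned above, single linkage clusters can be read off a minimum spanning tree by dropping its heaviest edges. The sketch below assumes the tree is given as a plain list of `(i, j, weight)` tuples; that format and the function name `single_linkage_labels` are hypothetical and are not taken from the package.

```python
def single_linkage_labels(n, mst_edges, n_clusters):
    """Cut the (n_clusters - 1) heaviest MST edges and label the components.

    n          -- number of points (vertices 0..n-1)
    mst_edges  -- list of (i, j, weight) tuples forming a spanning tree
    n_clusters -- desired number of clusters, 1 <= n_clusters <= n
    Labels are numbered 0, 1, ... in order of first appearance.
    """
    parent = list(range(n))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # keep the n - n_clusters lightest edges, i.e. drop the heaviest ones
    kept = sorted(mst_edges, key=lambda e: e[2])[: n - n_clusters]
    for i, j, _ in kept:
        parent[find(i)] = find(j)

    roots = {}
    return [roots.setdefault(find(v), len(roots)) for v in range(n)]


edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 4.0)]
print(single_linkage_labels(4, edges, 2))  # [0, 0, 0, 1]
```

If several edges share the same weight, which of them gets cut depends on the sort order, so the partition is not necessarily unique in that case.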



+ ## How to Install

- ## Examples, Tutorials, and Documentation
+ ### Python Version

- TO DO
+ To install from [PyPI](https://pypi.org/project/quitefastmst), call:

- *To learn more about R, check out Marek's open-access (free!) textbook*
- [Deep R Programming](https://deepr.gagolewski.com/).
+ ```bash
+ pip3 install quitefastmst  # python3 -m pip install quitefastmst
+ ```

- *To learn more about Python, check out Marek's recent open-access (free!) textbook*
+ *To learn more about Python, check out my open-access textbook*
[Minimalist Data Wrangling in Python](https://datawranglingpy.gagolewski.com/).



- ## How to Install
-
- ### Python Version
-
- TO DO
-
-
### R Version

- TO DO
+ To install from [CRAN](https://CRAN.R-project.org/package=quitefastmst), call:

+ ```r
+ install.packages("quitefastmst")
+ ```
+
+ *To learn more about R, check out my open-access textbook*
+ [Deep R Programming](https://deepr.gagolewski.com/).



### Other

- The core functionality is implemented in the form of a header-only
- C++ library. It can thus be easily adapted for use in other projects.
+ The core functionality is implemented in the form of a C++ library.
+ It can thus be easily adapted for use in other projects.

New contributions are welcome, e.g., Julia, Matlab/GNU Octave wrappers.



-
## License

Copyright (C) 2025–2025 Marek Gagolewski <https://www.gagolewski.com/>
@@ -100,56 +98,58 @@ received a copy of the License along with this program. If not, see

## References

- Borůvka O., O jistém problému minimálním,
- Práce Moravské Přírodovědecké Společnosti 3, 1926, 37–58.
+ Borůvka, O., O jistém problému minimálním,
+ *Práce Moravské Přírodovědecké Společnosti* **3**, 1926, 37–58.

- Bentley J.L., Multidimensional binary search trees used for associative
- searching, Communications of the ACM 18(9), 509–517, 1975,
- DOI:10.1145/361002.361007.
+ Bentley, J.L., Multidimensional binary search trees used for associative
+ searching, *Communications of the ACM* **18**(9), 509–517, 1975,
+ [DOI: 10.1145/361002.361007](https://doi.org/10.1145/361002.361007).

- Campello R.J.G.B., Moulavi D., Zimek A., Sander J., Hierarchical
+ Campello, R.J.G.B., Moulavi, D., Zimek, A., Sander, J., Hierarchical
density estimates for data clustering, visualization, and outlier detection,
- ACM Transactions on Knowledge Discovery from Data (TKDD) 10(1),
- 2015, 1–51, DOI:10.1145/2733381.
+ *ACM Transactions on Knowledge Discovery from Data (TKDD)* **10**(1),
+ 2015, 1–51, [DOI: 10.1145/2733381](https://doi.org/10.1145/2733381).

- Campello R.J.G.B., Moulavi D., Sander J.,
+ Campello, R.J.G.B., Moulavi, D., Sander, J.,
Density-based clustering based on hierarchical density estimates,
*Lecture Notes in Computer Science* **7819**, 2013, 160–172.
[DOI: 10.1007/978-3-642-37456-2_14](https://doi.org/10.1007/978-3-642-37456-2_14).

- Gagolewski M., Cena A., Bartoszuk M., Brzozowski L.,
+ Gagolewski, M., Cena, A., Bartoszuk, M., Brzozowski, L.,
Clustering with minimum spanning trees: How good can it be?,
*Journal of Classification* **42**, 2025, 90–112.
[DOI: 10.1007/s00357-024-09483-1](https://doi.org/10.1007/s00357-024-09483-1).

- Gagolewski M., A framework for benchmarking clustering algorithms,
+ Gagolewski, M., A framework for benchmarking clustering algorithms,
*SoftwareX* **20**, 2022, 101270.
[DOI: 10.1016/j.softx.2022.101270](https://doi.org/10.1016/j.softx.2022.101270).
<https://clustering-benchmarks.gagolewski.com/>.

- Jarník V., O jistém problému minimálním,
- Práce Moravské Přírodovědecké Společnosti 6, 1930, 57–63.
+ Jarník, V., O jistém problému minimálním,
+ *Práce Moravské Přírodovědecké Společnosti* **6**, 1930, 57–63.
+
+ Maneewongvatana, S., Mount, D.M., It's okay to be skinny, if your friends
+ are fat, *The 4th CGC Workshop on Computational Geometry*, 1999.

- Maneewongvatana S., Mount D.M., It's okay to be skinny, if your friends
- are fat, The 4th CGC Workshop on Computational Geometry, 1999.
+ March, W.B., Parikshit, R., Gray, A.G., Fast Euclidean minimum spanning
+ tree: Algorithm, analysis, and applications,
+ *Proc. 16th ACM SIGKDD Intl. Conf. Knowledge Discovery and Data Mining (KDD '10)*, 2010, 603–612.

- March W.B., Parikshit R., Gray A.G., Fast Euclidean minimum spanning
- tree: Algorithm, analysis, and applications, Proc. 16th ACM SIGKDD Intl.
- Conf. Knowledge Discovery and Data Mining (KDD '10), 2010, 603–612.
Olson C.F., Parallel algorithms for hierarchical clustering,
- Parallel Computing 21(8), 1995, 1313–1325.
+ *Parallel Computing* **21**(8), 1995, 1313–1325.

- McInnes L., Healy J., Accelerated hierarchical density-based
- clustering, IEEE Intl. Conf. Data Mining Workshops (ICMDW), 2017, 33–42,
- DOI:10.1109/ICDMW.2017.12.
+ McInnes, L., Healy, J., Accelerated hierarchical density-based
+ clustering, *IEEE Intl. Conf. Data Mining Workshops (ICMDW)*, 2017, 33–42,
+ [DOI: 10.1109/ICDMW.2017.12](https://doi.org/10.1109/ICDMW.2017.12).

- Prim R., Shortest connection networks and some generalizations,
- The Bell System Technical Journal 36(6), 1957, 1389–1401.
+ Prim, R., Shortest connection networks and some generalizations,
+ *The Bell System Technical Journal* **36**(6), 1957, 1389–1401.

- Sample N., Haines M., Arnold M., Purcell T., Optimizing search
- strategies in K-d Trees, 5th WSES/IEEE Conf. on Circuits, Systems,
- Communications & Computers (CSCC'01), 2001.
+ Sample, N., Haines, M., Arnold, M., Purcell, T.,
+ Optimizing search strategies in K-d Trees,
+ *5th WSES/IEEE Conf. on Circuits, Systems, Communications & Computers (CSCC'01)*,
+ 2001.


- See the package's [homepage](https://quitefastmst.gagolewski.com/) for more
- references.
+ See the package's [homepage](https://quitefastmst.gagolewski.com/)
+ for more references.