Commit eb8f0d2

docs:First complete version.
1 parent deba49a commit eb8f0d2

2 files changed: +128 -2 lines changed

README-work.md

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+# Math::DistanceFunctions::Native
+
+[![Actions Status](https://github.com/antononcube/Raku-LibAccelerate-DistanceFunctions/actions/workflows/linux.yml/badge.svg)](https://github.com/antononcube/Raku-LibAccelerate-DistanceFunctions/actions)
+[![Actions Status](https://github.com/antononcube/Raku-LibAccelerate-DistanceFunctions/actions/workflows/macos.yml/badge.svg)](https://github.com/antononcube/Raku-LibAccelerate-DistanceFunctions/actions)
+
+[![License: Artistic-2.0](https://img.shields.io/badge/License-Artistic%202.0-0298c3.svg)](https://opensource.org/licenses/Artistic-2.0)
+
+Raku package with distance functions implemented in C.
+Apple's Accelerate library is used if available.
+
+The primary motivation for making this library is to have fast sorting and nearest-neighbor computations
+over collections of LLM-embedding vectors.
+
+------
+
+## Usage examples
+
+
+### Regular vectors
+
+Make a largish collection of large vectors and compute the Euclidean distance from each of them to a search vector:
+
+```perl6
+use Math::DistanceFunctions::Native;
+
+my @vecs = (^1000).map({ (^1000).map({1.rand}).cache.Array }).Array;
+my @searchVector = (^1000).map({1.rand});
+
+my $start = now;
+my @dists = @vecs.map({ euclidean-distance($_, @searchVector)});
+my $tend = now;
+say "Total time of computing {@vecs.elems} distances: {round($tend - $start, 10 ** -6)} s";
+say "Average time of a single distance computation: {($tend - $start) / @vecs.elems} s";
+```
+
+### `CArray` vectors
+
+Use `CArray` vectors instead:
+
+```perl6
+use NativeCall;
+my @cvecs = @vecs.map({ CArray[num64].new($_) });
+my $cSearchVector = CArray[num64].new(@searchVector);
+
+$start = now;
+my @cdists = @cvecs.map({ euclidean-distance($_, $cSearchVector)});
+$tend = now;
+say "Total time of computing {@cvecs.elems} distances: {round($tend - $start, 10 ** -6)} s";
+say "Average time of a single distance computation: {($tend - $start) / @cvecs.elems} s";
+```
+
+That is, we get a ≈ 200-fold speed-up by using `CArray` vectors with the functions of this package.
+
+### Edit distance
+
+Loading this package also loads the C-implemented function `edit-distance` of
+["Math::DistanceFunctions::Edit"](https://github.com/antononcube/Raku-Math-DistanceFunctions-Edit).
+Here is a usage example:
+
+```perl6
+edit-distance('racoon', 'raccoon')
+```

README.md

Lines changed: 66 additions & 2 deletions
@@ -5,5 +5,69 @@
 
 [![License: Artistic-2.0](https://img.shields.io/badge/License-Artistic%202.0-0298c3.svg)](https://opensource.org/licenses/Artistic-2.0)
 
-Raku package with distance functions implemented using Apple's Accelerate library.
-Generic C implementations are also provided.
+Raku package with distance functions implemented in C.
+Apple's Accelerate library is used if available.
+
+The primary motivation for making this library is to have fast sorting and nearest-neighbor computations
+over collections of LLM-embedding vectors.
+
+------
+
+## Usage examples
+
+
+### Regular vectors
+
+Make a largish collection of large vectors and compute the Euclidean distance from each of them to a search vector:
+
+```perl6
+use Math::DistanceFunctions::Native;
+
+my @vecs = (^1000).map({ (^1000).map({1.rand}).cache.Array }).Array;
+my @searchVector = (^1000).map({1.rand});
+
+my $start = now;
+my @dists = @vecs.map({ euclidean-distance($_, @searchVector)});
+my $tend = now;
+say "Total time of computing {@vecs.elems} distances: {round($tend - $start, 10 ** -6)} s";
+say "Average time of a single distance computation: {($tend - $start) / @vecs.elems} s";
+```
+```
+# Total time of computing 1000 distances: 0.63326 s
+# Average time of a single distance computation: 0.0006332598499999999 s
+```
+
+### `CArray` vectors
+
+Use `CArray` vectors instead:
+
+```perl6
+use NativeCall;
+my @cvecs = @vecs.map({ CArray[num64].new($_) });
+my $cSearchVector = CArray[num64].new(@searchVector);
+
+$start = now;
+my @cdists = @cvecs.map({ euclidean-distance($_, $cSearchVector)});
+$tend = now;
+say "Total time of computing {@cvecs.elems} distances: {round($tend - $start, 10 ** -6)} s";
+say "Average time of a single distance computation: {($tend - $start) / @cvecs.elems} s";
+```
+```
+# Total time of computing 1000 distances: 0.002994 s
+# Average time of a single distance computation: 2.994124e-06 s
+```
+
+That is, we get a ≈ 200-fold speed-up by using `CArray` vectors with the functions of this package.
+
+### Edit distance
+
+Loading this package also loads the C-implemented function `edit-distance` of
+["Math::DistanceFunctions::Edit"](https://github.com/antononcube/Raku-Math-DistanceFunctions-Edit).
+Here is a usage example:
+
+```perl6
+edit-distance('racoon', 'raccoon')
+```
+```
+# 1
+```
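
Note (not part of the commit): to sanity-check the values returned by `euclidean-distance` in the README examples, a plain-Raku reference computation can be sketched as below. The helper name `euclidean-reference` is hypothetical and does not come from the package. The sketch assumes both arguments are ordinary positional containers of equal length, and it makes no claim about how the package's C code is written.

```perl6
# Hypothetical helper, NOT part of Math::DistanceFunctions::Native:
# a plain-Raku Euclidean distance, useful only for cross-checking results.
sub euclidean-reference(@a, @b) {
    die "Vectors must have the same length" unless @a.elems == @b.elems;
    # Pairwise differences (Z-), squared, summed, then square-rooted.
    (@a Z- @b).map(* ** 2).sum.sqrt
}

say euclidean-reference([0, 0], [3, 4]);   # 5
```

Comparing a few entries of `@dists` against this helper with the approximate-equality operator `=~=` is one way to check that the native and plain-Raku results agree up to floating-point tolerance.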