doc/modules/ROOT/pages/benchmarks.adoc (58 additions, 57 deletions)
@@ -12,10 +12,10 @@ This section describes a range of performance benchmarks that have been run comp
12
12
13
13
The values in the ratio column are how many times longer running a specific operation takes in comparison to the same operation with a `double`.
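As a minimal sketch of how a ratio value can be reproduced from two runtime values, the helper below divides a type's runtime by the baseline runtime. The name `runtime_ratio` is hypothetical and is not part of the library or its benchmark harness.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the library): ratio of a type's runtime
// to the baseline runtime. Both runtimes must be in the same unit, so the
// unit cancels and the result is dimensionless.
inline double runtime_ratio(std::uint64_t type_runtime, std::uint64_t baseline_runtime)
{
    assert(baseline_runtime != 0);
    return static_cast<double>(type_runtime) / static_cast<double>(baseline_runtime);
}
```

For instance, a runtime of 2,134,312 against a baseline of 103,796 (values taken from one of the tables on this page) gives a ratio of roughly 20.563, and the baseline against itself gives 1.000.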
14
14
15
-
IMPORTANT: On nearly all platforms there is hardware support for binary floating point math, so we are comparing hardware to software runtimes; *Decimal will be slower*
15
+
IMPORTANT: Nearly all platforms have hardware support for binary floating-point math, so these benchmarks compare hardware runtimes to software runtimes; *Decimal will be slower*.
16
16
17
-
NOTE: Both the results from Intel and GCC types are from very close, but not identical benchmark routines since they are written in C instead of C\++.
18
-
We assume they are close enough, and the differences between the C and C++ compilers are small enough, for fair comparison
17
+
NOTE: The results for the Intel and GCC types come from very similar, but not identical, benchmark routines, since those routines are written in C instead of pass:[C++].
18
+
We assume the routines are close enough, and the differences between the C and pass:[C++] compilers small enough, for a fair comparison.
19
19
20
20
== How to run the Benchmarks
21
21
[#run_benchmarks_]
@@ -40,21 +40,22 @@ This is repeated 5 times to generate stable results.
40
40
41
41
=== Basic Mathematical Operations
42
42
43
-
The benchmark for these operations generates a random vector containing 20,000,000 elements and does operations `+`, `-`, `*`, `/` between `vec[i] and vec[i + 1]`.
44
-
This is repeated 5 times to generate stable results.
43
+
The benchmark for these operations generates a random vector containing 20 million elements and applies the operations `+`, `-`, `*`, and `/` between `vec[i]` and `vec[i + 1]`.
44
+
This is repeated five times to generate stable results.
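The procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual benchmark harness: the names `make_random_vector` and `time_adjacent_pairs`, the fixed seed, the value range, and the use of `std::chrono::steady_clock` are all hypothetical choices.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: fill a vector with random positive values.
// A fixed seed keeps repeated runs comparable on the same standard library.
inline std::vector<double> make_random_vector(std::size_t n, unsigned seed = 42U)
{
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> dist(1.0, 1.0e6);
    std::vector<double> v(n);
    for (auto& x : v) { x = dist(gen); }
    return v;
}

// Hypothetical helper: apply `op` to every adjacent pair (vec[i], vec[i + 1]),
// repeat the whole pass `runs` times, and return the elapsed microseconds.
// Results are accumulated into `sink` so the compiler cannot discard the loop.
template <typename T, typename Op>
long long time_adjacent_pairs(const std::vector<T>& vec, Op op, int runs, T& sink)
{
    assert(vec.size() >= 2);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < runs; ++r)
    {
        for (std::size_t i = 0; i + 1 < vec.size(); ++i)
        {
            sink += op(vec[i], vec[i + 1]);
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A call such as `time_adjacent_pairs(vec, [](double a, double b) { return a + b; }, 5, sink)` would time addition over five passes; swapping the lambda covers `-`, `*`, and `/`, and swapping the element type would cover the decimal types.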
45
45
46
46
=== `<charconv>`
47
47
48
-
Parsing and serializing number exactly is one of the key features of decimal floating point types, so we must compare the performance of `<charconv>`. For all the following the results compare against STL provided `<charconv>` for 20,000,000 conversions.
49
-
Since `<charconv>` is fully implemented in software for each type the performance gap between built-in `float` and `double` vs `decimal32_t` and `decimal64_t` is significantly smaller (or the decimal performance is better) than the hardware vs software performance gap seen above for basic operations.
48
+
Parsing and serializing numbers exactly is one of the key features of decimal floating point types, so we must compare the performance of `<charconv>`.
49
+
All of the following results compare against the STL-provided `<charconv>` for 20 million conversions.
50
+
Since `<charconv>` is fully implemented in software for each type, the performance gap between built-in `float` and `double` vs. `decimal32_t` and `decimal64_t` is significantly smaller (or the decimal performance is better) than the hardware vs. software performance gap seen above for basic operations.
50
51
51
-
To run these benchmarks yourself you will need a compiler with complete implementation of `<charconv>` and to run the benchmarks under C++17 or higher.
52
-
At the time of writing this is limited to:
52
+
To run these benchmarks yourself, you will need a compiler with a complete implementation of `<charconv>`, and you must build the benchmarks as pass:[C++]17 or higher.
53
+
At the time of writing, this is limited to:
53
54
54
55
- GCC 11 or newer
55
56
- MSVC 19.24 or newer
56
57
57
-
These benchmarks are automatically disabled if your compiler does not provide feature complete `<charconv>` or if the language standard is set to C++14.
58
+
These benchmarks are automatically disabled if your compiler does not provide a feature-complete `<charconv>` or if the language standard is set to pass:[C++]14.
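One way to detect a usable floating-point `<charconv>` at compile time is the standard feature-test macro `__cpp_lib_to_chars`. The sketch below is an assumption about how such a guard could look, not the check the benchmarks actually use; `has_full_charconv` and `round_trips` are hypothetical names.

```cpp
#include <cassert>
#include <charconv>
#include <system_error>

// Assumption: treat the standard feature-test macro as the signal that
// floating-point to_chars / from_chars are available.
#if defined(__cpp_lib_to_chars)
constexpr bool has_full_charconv = true;
#else
constexpr bool has_full_charconv = false;
#endif

// Hypothetical helper: serialize `value` with std::to_chars, parse it back
// with std::from_chars, and report whether the exact same value is recovered.
// Returns false when floating-point <charconv> is unavailable.
inline bool round_trips(double value)
{
#if defined(__cpp_lib_to_chars)
    char buf[64];
    const auto written = std::to_chars(buf, buf + sizeof(buf), value);
    if (written.ec != std::errc{}) { return false; }

    double parsed = 0.0;
    const auto read = std::from_chars(buf, written.ptr, parsed);
    return read.ec == std::errc{} && read.ptr == written.ptr && parsed == value;
#else
    (void)value;
    return false;
#endif
}
```

A benchmark driver could wrap each conversion benchmark in the same `#if` so that it compiles to nothing on toolchains without support.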
58
59
59
60
[#x64_linux_benchmarks]
60
61
== x64 Linux
@@ -394,23 +395,23 @@ Intel Compiler:
394
395
| 103,796
395
396
| 1.000
396
397
| `decimal32_t`
397
-
| 2,125,437
398
-
| 20.477
398
+
| 2,134,312
399
+
| 20.563
399
400
| `decimal64_t`
400
-
| 5,973,337
401
-
| 57.549
401
+
| 5,399,276
402
+
| 52.018
402
403
| `decimal128_t`
403
-
| 9,482,403
404
-
| 91.356
404
+
| 10,012,578
405
+
| 96.464
405
406
| `decimal_fast32_t`
406
-
| 1,011,695
407
-
| 9.747
407
+
| 1,558,774
408
+
| 15.018
408
409
| `decimal_fast64_t`
409
-
| 2,138,793
410
-
| 20.606
410
+
| 1,597,873
411
+
| 15.394
411
412
| `decimal_fast128_t`
412
-
| 8,277,721
413
-
| 79.750
413
+
| 8,105,004
414
+
| 78.086
414
415
| Intel `Decimal32`
415
416
| 1,561,213
416
417
| 15.041
@@ -436,20 +437,20 @@ GCC:
436
437
| 2,396,732
437
438
| 29.708
438
439
| `decimal64_t`
439
-
| 4,824,865
440
-
| 59.805
440
+
| 4,021,720
441
+
| 49.850
441
442
| `decimal128_t`
442
-
| 10,751,669
443
-
| 133.270
443
+
| 10,677,625
444
+
| 132.352
444
445
| `decimal_fast32_t`
445
-
| 1,103,023
446
-
| 13.672
446
+
| 1,083,011
447
+
| 13.424
447
448
| `decimal_fast64_t`
448
-
| 2,384,925
449
-
| 29.562
449
+
| 1,851,520
450
+
| 22.950
450
451
| `decimal_fast128_t`
451
-
| 8,332,936
452
-
| 103.289
452
+
| 8,121,160
453
+
| 100.664
453
454
| GCC `_Decimal32`
454
455
| 5,082,812
455
456
| 63.002
@@ -781,23 +782,23 @@ Run using an Intel i9-11900k chipset running Windows 11 and Visual Studio 17.14.
781
782
| 89,317
782
783
| 1.000
783
784
| `decimal32_t`
784
-
| 3,402,467
785
-
| 38.094
785
+
| 3,048,254
786
+
| 34.128
786
787
| `decimal64_t`
787
-
| 4,663,830
788
-
| 52.217
788
+
| 3,282,819
789
+
| 36.755
789
790
| `decimal128_t`
790
-
| 18,167,111
791
-
| 203.400
791
+
| 16,648,799
792
+
| 186.401
792
793
| `decimal_fast32_t`
793
-
| 2,363,121
794
-
| 26.458
794
+
| 2,059,743
795
+
| 23.061
795
796
| `decimal_fast64_t`
796
-
| 6,578,828
797
-
| 73.657
797
+
| 5,105,018
798
+
| 57.156
798
799
| `decimal_fast128_t`
799
-
| 12,341,026
800
-
| 138.171
800
+
| 11,587,763
801
+
| 129.737
801
802
|===
802
803
803
804
=== `from_chars`
@@ -1120,23 +1121,23 @@ Run using a Macbook pro with M4 Max chipset running macOS Sequoia 15.5 and homeb
1120
1121
| 17,145
1121
1122
| 1.000
1122
1123
| `decimal32_t`
1123
-
| 1,705,827
1124
-
| 97.951
1124
+
| 1,732,611
1125
+
| 101.056
1125
1126
| `decimal64_t`
1126
-
| 3,912,831
1127
-
| 224.682
1127
+
| 3,558,094
1128
+
| 207.529
1128
1129
| `decimal128_t`
1129
-
| 8,727,582
1130
-
| 501.153
1130
+
| 8,985,521
1131
+
| 524.090
1131
1132
| `decimal_fast32_t`
1132
-
| 1,054,418
1133
-
| 60.547
1133
+
| 1,075,184
1134
+
| 62.711
1134
1135
| `decimal_fast64_t`
1135
-
| 2,404,072
1136
-
| 138.046
1136
+
| 2,027,533
1137
+
| 118.258
1137
1138
| `decimal_fast128_t`
1138
-
| 7,981,650
1139
-
| 458.320
1139
+
| 7,583,016
1140
+
| 442.287
1140
1141
|===
1141
1142
1142
1143
=== `from_chars`
@@ -1249,7 +1250,7 @@ Run using a Macbook pro with M4 Max chipset running macOS Sequoia 15.5 and homeb