speedy-int128 ~main
An experiment to speed up the 128-bit integer type
Manual usage
Put the following dependency into your project's dependencies section:
speedy-int128
This package is technically a fork of std.int128 with inline LLVM IR added for the LDC compiler to make it faster at handling 128-bit integers. This makes it as fast as Clang, because Clang was used as a "donor" of the LLVM IR code via a simple script.
This package also provides 128-bit arithmetic on old versions of DMD, GDC and LDC that don't yet have the standard std.int128 module.
Finally, a one-liner variant is provided for use on programming competition websites.
Example
/+dub.sdl: dependency "speedy-int128" version="~>0.1.0" +/
import speedy.int128; // instead of "std.int128"
import std.stdint, std.stdio, std.range, std.algorithm;

// https://lemire.me/blog/2019/03/19/the-fastest-conventional-random-number-generator-that-can-pass-big-crush/
uint64_t lehmer64() {
  static Int128 g_lehmer64_state = Int128(1L); /* bad seed */
  g_lehmer64_state *= 0xda942042e4dd58b5;
  return g_lehmer64_state.data.hi;
}

void main() {
  1_000_000_000.iota.map!(i => lehmer64).sum.writeln;
}
Install the DUB package manager and run the example in a script-like fashion:
$ dub example.d
Or compile an optimized binary using the LDC compiler:
$ dub build --build release --single --compiler=ldc2 example.d
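To make it clearer what `g_lehmer64_state.data.hi` holds after the multiplication, the following sketch performs the same generator step using only 64-bit arithmetic. It assumes an unsigned 128-bit state, as in the linked blog post. The names `mulhi` and `Lehmer64Manual` are hypothetical helpers introduced for illustration; they are not part of speedy.int128 or std.int128.

```d
import std.stdio : writefln;

enum ulong multiplier = 0xda942042e4dd58b5;

/// High 64 bits of the unsigned 128-bit product a * b,
/// computed with the schoolbook method on 32-bit halves.
ulong mulhi(ulong a, ulong b) pure nothrow @nogc @safe
{
    immutable ulong aLo = a & 0xFFFF_FFFF, aHi = a >> 32;
    immutable ulong bLo = b & 0xFFFF_FFFF, bHi = b >> 32;

    immutable ulong ll = aLo * bLo;
    immutable ulong lh = aLo * bHi;
    immutable ulong hl = aHi * bLo;
    immutable ulong hh = aHi * bHi;

    // Sum of the three terms that overlap bits 32..63; its upper part carries into the high word.
    immutable ulong mid = (ll >> 32) + (lh & 0xFFFF_FFFF) + (hl & 0xFFFF_FFFF);
    return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

/// Hypothetical helper: the 128-bit state kept as two 64-bit words.
struct Lehmer64Manual
{
    ulong hi = 0;
    ulong lo = 1; // same (bad) seed as in the example above

    ulong next() pure nothrow @nogc @safe
    {
        // (hi * 2^64 + lo) * multiplier mod 2^128:
        // the new high word must be computed from the old low word.
        hi = hi * multiplier + mulhi(lo, multiplier);
        lo *= multiplier;
        return hi;
    }
}

void main()
{
    Lehmer64Manual rng;
    foreach (i; 0 .. 3)
        writefln("%016x", rng.next());
}
```

The first call returns 0, because the seed 1 multiplied by the constant still fits entirely in the low 64-bit word.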
Performance
Benchmarks are run as part of CI using the benchmark.d / benchmark.c test programs. The optimization options are whatever the DUB tool considers default for producing release builds. Some examples:
<details> <summary>GitHub Actions CI, Linux x86_64, Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz</summary>
https://github.com/ssvb/speedy-int128/actions/runs/3859195372/jobs/6578500703
test program | language | compiler | 64-bit | 32-bit | notes |
---|---|---|---|---|---|
benchmark.d | D | DMD 2.100.2 | 2999 ms | 10755 ms | std.int128 |
benchmark.d | D | GDC 12.1.0 | 2943 ms | - | std.int128 |
benchmark.d | D | LDC 1.30.0 | 1930 ms | 5765 ms | std.int128 |
benchmark.c | C/C++ | Clang 14.0.0 | 468 ms | - | -O3 |
benchmark.d | D | LDC 1.30.0 | 402 ms | 3582 ms | speedy.int128 v0.1.0 |
benchmark.c | C/C++ | GCC 11.3.0 | 393 ms | - | -O3 |
</details>
<details> <summary>GitHub Actions CI, Linux x86_64, Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz</summary>
https://github.com/ssvb/speedy-int128/actions/runs/3859220724/jobs/6578545848
test program | language | compiler | 64-bit | 32-bit | notes |
---|---|---|---|---|---|
benchmark.d | D | DMD 2.100.2 | 3854 ms | 11125 ms | std.int128 |
benchmark.d | D | GDC 12.1.0 | 3753 ms | - | std.int128 |
benchmark.d | D | LDC 1.30.0 | 2735 ms | 6068 ms | std.int128 |
benchmark.c | C/C++ | Clang 14.0.0 | 1885 ms | - | -O3 |
benchmark.d | D | LDC 1.30.0 | 1801 ms | 4011 ms | speedy.int128 v0.1.0 |
benchmark.c | C/C++ | GCC 11.3.0 | 1792 ms | - | -O3 |
</details>
<details> <summary>BuildJet CI, Linux aarch64, ARM Neoverse-N1</summary>
https://github.com/ssvb/speedy-int128/actions/runs/3859220721/jobs/6578545846
test program | language | compiler | 64-bit | 32-bit | notes |
---|---|---|---|---|---|
benchmark.d | D | GDC 12.1.0 | 2867 ms | - | std.int128 |
benchmark.d | D | LDC 1.30.0 | 1657 ms | - | std.int128 |
benchmark.d | D | LDC 1.28.0 | 941 ms | 12739 ms | speedy.int128 v0.1.0 |
benchmark.d | D | LDC 1.30.0 | 934 ms | - | speedy.int128 v0.1.0 |
benchmark.c | C/C++ | Clang 14.0.0 | 922 ms | - | -O3 |
benchmark.c | C/C++ | GCC 11.2.0 | 898 ms | - | -O3 |
</details>
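For a rough local measurement in the same spirit as the tables above, a timing loop along the following lines could be used. This is only a sketch and not the project's benchmark.d: it reuses the `lehmer64` function from the example, and `Timing` and `timeLehmer` are hypothetical names chosen for illustration. The iteration count is arbitrary.

```d
/+dub.sdl: dependency "speedy-int128" version="~>0.1.0" +/
import speedy.int128; // swap for "std.int128" to compare
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writefln;

ulong lehmer64()
{
    static Int128 state = Int128(1L); // same (bad) seed as in the example
    state *= 0xda942042e4dd58b5;
    return state.data.hi;
}

/// Hypothetical result type: a checksum (so the loop cannot be optimized away)
/// and the elapsed wall-clock time in milliseconds.
struct Timing
{
    ulong checksum;
    long msecs;
}

Timing timeLehmer(size_t iterations)
{
    auto sw = StopWatch(AutoStart.yes);
    ulong checksum = 0;
    foreach (_; 0 .. iterations)
        checksum += lehmer64();
    sw.stop();
    return Timing(checksum, sw.peek.total!"msecs");
}

void main()
{
    immutable t = timeLehmer(100_000_000);
    writefln("checksum=%d, elapsed=%d ms", t.checksum, t.msecs);
}
```

Building it with the release command shown in the Example section matters here, since debug builds would mostly measure the lack of optimization.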
Use on programming competition websites
Programming competition websites, such as Codeforces and AtCoder, allow solutions to be submitted in the D language. But their compilers are typically very old and are installed without any third-party libraries, so DUB packages can't be used there in the normal way. Another challenge is that each solution has to be submitted as a single source file with a certain size limit (only 65535 bytes on Codeforces!).
The onelinerizer.rb script can be used to compress the original 42K of D code into a single 16K line by removing comments, extra whitespace and unittests. The result is speedyint128oneliner.d, which can be copy-pasted into the source code, replacing the "import speedy.int128;" line.
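As a rough illustration of the compression idea, the D sketch below strips `//` line comments and joins the remaining lines into one. It is not the actual onelinerizer.rb (which is a Ruby script and also removes unittests), and `onelinerize` is a hypothetical name. It assumes the input contains no block comments and no `//` inside string literals, so a URL inside a string would be truncated.

```d
import std.algorithm : filter, map;
import std.array : join;
import std.stdio : writeln;
import std.string : indexOf, lineSplitter, strip;

/// Naive one-liner conversion: drop "//" comments, trim each line,
/// skip empty lines and join the rest with single spaces.
/// Assumption: "//" never appears inside a string literal.
string onelinerize(string source)
{
    return source
        .lineSplitter
        .map!((line) {
            immutable cut = line.indexOf("//");
            return (cut >= 0 ? line[0 .. cut] : line).strip;
        })
        .filter!(line => line.length > 0)
        .join(" ");
}

void main()
{
    immutable sample = "int a = 1; // first\n\n    int b = 2;\n";
    writeln(onelinerize(sample)); // prints "int a = 1; int b = 2;"
}
```

A real tool needs a proper tokenizer to handle string literals, nested `/+ +/` comments and `unittest` blocks safely, which is why this sketch only works under the stated assumptions.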
- Repository: ssvb/speedy-int128
- License: BSL-1.0
- Dependencies: none
- Versions: 0.1.0 (2023-Jan-07), ~wip8 (2023-Jan-04), ~wip77 (2023-Jan-06), ~wip5 (2023-Jan-06), ~wip1 (2023-Jan-06)