Hacker News — vinext + Cloudflare Workers

▲Show HN: Rscrypto, pure-Rust crypto with industry leading public benches (github.com)

27 points by LoadingALIAS 11 hours ago | 10 comments

dave_universetf 9 hours ago [-]

The readme has strong LLM smells. Was the code written by an LLM as well?

What is your experience with cryptographic engineering, in particular avoiding common implementation pitfalls that bite first-time implementers of cryptographic primitives?

Are the primitives tested against Wycheproof vectors, and proofed against the common implementation mistakes they document?

tux3 8 hours ago [-]

Yeah, spot on. This is what the code looks like: https://github.com/loadingalias/rscrypto/blob/4e24772a54fef3...

Look at these section comments that LLMs love ("// ─── Rotation helpers ────")

Now you sometimes see these section comments in legacy codebases that have very long files. What you don't see people use is U+2500 BOX DRAWINGS LIGHT HORIZONTAL unicode characters padded out just right to look pretty. We humans have regular keyboards, but these AIs are trained to output emojis and pretty unicode.

LoadingALIAS 5 hours ago [-]

The documentation and to a large extent commenting, auditing, and almost every markdown file was likely generated with an LLM. Do not mistake that for competence or quality.

This is a pre-v1 codebase. I'm looking for bench-methodology failures; I'm looking for API issues and/or code smells. I'm looking for ASM/SIMD weak points and/or testing issues.

Over time, as I have the capacity, I will almost certainly clean up anything that's just not necessary. Having said that, if something feels clean and it was done by an LLM in my harness/workflow - I'm 100% happy to leave it.

Please, dig into the code. Let me know what you see. Thanks.

LoadingALIAS 5 hours ago [-]

Yeah, all fair questions.

To address the LLM question - almost all MD files in the codebase were built around the codebase by an LLM. I simply don't have the time; this project is a side project and not my main squeeze. This is also a pre-v1 codebase; I will have time soon enough to address anything overly 'LLM' flavored.

My experience covers nearly two decades in one way or another. Having said that, I've never felt like I had the time, nor the need, for rscrypto. The last year was different; I genuinely needed this myself for my actual work. I have worked on rscrypto in part for a year. This isn't like a whimsical LLM codebase or some vibe coded junk.

I use LLMs in my workflows every single day and have for the better part of two years; I gain more trust in them almost weekly, too. I feel like there isn't an engineer on Earth who can say otherwise and if there is... I'd probably argue with them against integrating LLMs into their tooling in some way.

Finally, the actual important question... not all primitives are tested against Wycheproof vectors yet. RSA - yes; the whole crate, not yet. Again, it's just a time thing. I've used official RFC/NIST vectors, RustCrypto/oracle differential tests, proptests, fuzz corpus replay, Miri where applicable, and backend-vs-portable equivalence tests to cover the rest of the codebase.
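A minimal sketch of what a backend-vs-portable equivalence test can look like. This is not rscrypto's test code: `checksum_portable` and `checksum_unrolled` are hypothetical stand-ins for a reference path and an accelerated path, and the xorshift generator is only there to keep the example dependency-free (a real suite would more likely use proptest or a fuzzer).

```rust
/// Reference path: one byte at a time (hypothetical stand-in for a portable kernel).
fn checksum_portable(data: &[u8]) -> u32 {
    let mut acc = 0u32;
    for &b in data {
        acc = acc.wrapping_mul(31).wrapping_add(u32::from(b));
    }
    acc
}

/// "Accelerated" path: four bytes per step, using precomputed powers of 31
/// (hypothetical stand-in for a SIMD/ASM kernel).
fn checksum_unrolled(data: &[u8]) -> u32 {
    let mut acc = 0u32;
    let mut chunks = data.chunks_exact(4);
    for c in &mut chunks {
        acc = acc
            .wrapping_mul(923_521) // 31^4
            .wrapping_add(u32::from(c[0]).wrapping_mul(29_791)) // 31^3
            .wrapping_add(u32::from(c[1]).wrapping_mul(961)) // 31^2
            .wrapping_add(u32::from(c[2]).wrapping_mul(31))
            .wrapping_add(u32::from(c[3]));
    }
    // Tail bytes fall back to the byte-wise recurrence.
    for &b in chunks.remainder() {
        acc = acc.wrapping_mul(31).wrapping_add(u32::from(b));
    }
    acc
}

/// Tiny deterministic PRNG so failures are reproducible from the case index.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Runs both paths on random inputs of random length; returns the first failing case.
fn differential_check(cases: usize) -> Result<(), usize> {
    let mut seed = 0x9E37_79B9_7F4A_7C15u64;
    for case in 0..cases {
        let len = (xorshift64(&mut seed) % 257) as usize;
        let data: Vec<u8> = (0..len).map(|_| xorshift64(&mut seed) as u8).collect();
        if checksum_portable(&data) != checksum_unrolled(&data) {
            return Err(case);
        }
    }
    Ok(())
}
```

The random lengths matter as much as the random bytes: accelerated kernels tend to go wrong at block boundaries and in tail handling, so lengths that are not a multiple of the block size need to be exercised.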

Also, “proofed” is too strong a word for test vectors, IMO. Wycheproof is regression evidence against known bug classes, not a proof of cryptographic correctness.

Nevertheless, it's a valid point and it's covered in my backlog as of like a month ago.

sevenoftwelve 9 hours ago [-]

Hi @LoadingAlias,

> Constant-time MAC, AEAD, and signature verification.

That sounds suspiciously incomplete to me.

Which cryptographic algorithms in the library are currently not implemented in constant time?

Where did the speedup come from? How were these optimizations achieved?

What motivated you to write the library? Why not contribute to existing rust crypto libraries instead? How is the work financed?

What peer review strategy are you following with the library? Who else but yourself has verified this code?

sevenoftwelve 9 hours ago [-]

Why do the different sha2 variants not share code? This seems like a lot of opportunities for small mistakes/discrepancies; especially considering the many architectures.

Was any of this generated using AI?

LoadingALIAS 49 minutes ago [-]

The SHA2 variants DO share the compression layer where I felt it mattered:

- SHA-224 uses the SHA-256 compression kernels w/ different IV/output truncation.
- SHA-384 and SHA-512/256 use the SHA-512 compression kernels w/ different IV/output truncation.
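To illustrate the sharing described above, here is a minimal one-shot sketch in which SHA-224 and SHA-256 run through the same compression function and differ only in IV and output length. The constants are the standard FIPS 180-4 values; the function names (`compress`, `sha2_32`, `hex`) are illustrative and are not taken from rscrypto.

```rust
const K: [u32; 64] = [
    0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
    0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
    0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
    0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
    0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
    0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
    0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
    0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
];
const IV_256: [u32; 8] = [
    0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a, 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
];
const IV_224: [u32; 8] = [
    0xc1059ed8, 0x367cd507, 0x3070dd17, 0xf70e5939, 0xffc00b31, 0x68581511, 0x64f98fa7, 0xbefa4fa4,
];

/// Shared compression function: folds one 64-byte block into `state`.
fn compress(state: &mut [u32; 8], block: &[u8]) {
    let mut w = [0u32; 64];
    for (i, word) in block.chunks_exact(4).enumerate() {
        w[i] = u32::from_be_bytes([word[0], word[1], word[2], word[3]]);
    }
    for i in 16..64 {
        let s0 = w[i - 15].rotate_right(7) ^ w[i - 15].rotate_right(18) ^ (w[i - 15] >> 3);
        let s1 = w[i - 2].rotate_right(17) ^ w[i - 2].rotate_right(19) ^ (w[i - 2] >> 10);
        w[i] = w[i - 16].wrapping_add(s0).wrapping_add(w[i - 7]).wrapping_add(s1);
    }
    let [mut a, mut b, mut c, mut d, mut e, mut f, mut g, mut h] = *state;
    for i in 0..64 {
        let s1 = e.rotate_right(6) ^ e.rotate_right(11) ^ e.rotate_right(25);
        let ch = (e & f) ^ (!e & g);
        let t1 = h.wrapping_add(s1).wrapping_add(ch).wrapping_add(K[i]).wrapping_add(w[i]);
        let s0 = a.rotate_right(2) ^ a.rotate_right(13) ^ a.rotate_right(22);
        let maj = (a & b) ^ (a & c) ^ (b & c);
        let t2 = s0.wrapping_add(maj);
        h = g; g = f; f = e; e = d.wrapping_add(t1);
        d = c; c = b; b = a; a = t1.wrapping_add(t2);
    }
    for (s, v) in state.iter_mut().zip([a, b, c, d, e, f, g, h]) {
        *s = s.wrapping_add(v);
    }
}

/// One-shot digest used by both variants: only the IV and output length differ.
fn sha2_32(iv: [u32; 8], out_len: usize, msg: &[u8]) -> Vec<u8> {
    let mut state = iv;
    // Padding: 0x80, zeros up to 56 mod 64, then the 64-bit big-endian bit length.
    let mut padded = msg.to_vec();
    padded.push(0x80);
    while padded.len() % 64 != 56 {
        padded.push(0);
    }
    padded.extend_from_slice(&((msg.len() as u64) * 8).to_be_bytes());
    for block in padded.chunks_exact(64) {
        compress(&mut state, block);
    }
    let mut out: Vec<u8> = state.iter().flat_map(|w| w.to_be_bytes()).collect();
    out.truncate(out_len); // SHA-224 keeps the first 28 bytes
    out
}

fn sha256(msg: &[u8]) -> Vec<u8> { sha2_32(IV_256, 32, msg) }
fn sha224(msg: &[u8]) -> Vec<u8> { sha2_32(IV_224, 28, msg) }

/// Lowercase hex encoding, for comparing against published test vectors.
fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}
```

The same pattern applies to the SHA-512 family with 64-bit words and 128-byte blocks. Copying the whole message for padding keeps the sketch short; a real implementation would buffer incrementally.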

There IS some duplicated wrapper/finalization/state code per public type, and I agree that is probably the first place where small discrepancies/mistakes can creep in over time. I appreciate you pointing it out; I've added it to the backlog and will look it over as soon as possible. The reason it exists today is more about keeping monomorphized public types simple. I’m not religious about it; if I can reduce that wrapper duplication w/o making the dispatch/type story worse, I should - and I will.

The guardrail is that SHA2 has official vectors + differential/proptest coverage against the sha2 crate for one-shot and streaming paths.
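A sketch of the one-shot vs streaming check, under stated assumptions: `ToyHasher` is a made-up, non-cryptographic block hasher that exists only to give the buffering logic something to get wrong. The shape of the test (every possible split point must produce the same output as a single `update`) is the part that carries over to a real hash.

```rust
/// Toy 8-byte-block hasher (NOT cryptographic; illustration only).
struct ToyHasher {
    state: u64,
    buf: [u8; 8],
    buf_len: usize,
    total: u64,
}

impl ToyHasher {
    fn new() -> Self {
        Self { state: 0xcbf2_9ce4_8422_2325, buf: [0; 8], buf_len: 0, total: 0 }
    }

    /// Mixes one full block into the state.
    fn absorb_block(&mut self, block: [u8; 8]) {
        self.state = (self.state ^ u64::from_le_bytes(block))
            .wrapping_mul(0x0000_0100_0000_01b3)
            .rotate_left(23);
    }

    /// Streaming entry point: buffers partial blocks across calls.
    fn update(&mut self, mut data: &[u8]) {
        self.total += data.len() as u64;
        while !data.is_empty() {
            let take = (8 - self.buf_len).min(data.len());
            self.buf[self.buf_len..self.buf_len + take].copy_from_slice(&data[..take]);
            self.buf_len += take;
            data = &data[take..];
            if self.buf_len == 8 {
                let block = self.buf;
                self.absorb_block(block);
                self.buf_len = 0;
            }
        }
    }

    /// Zero-pads the tail, absorbs it, and mixes in the total length.
    fn finalize(mut self) -> u64 {
        for b in &mut self.buf[self.buf_len..] {
            *b = 0;
        }
        let block = self.buf;
        self.absorb_block(block);
        self.state ^ self.total
    }
}

/// One-shot path: a single `update` over the whole input.
fn one_shot(data: &[u8]) -> u64 {
    let mut h = ToyHasher::new();
    h.update(data);
    h.finalize()
}

/// Checks that every two-way split of `data` matches the one-shot result.
fn streaming_matches_one_shot(data: &[u8]) -> bool {
    let expected = one_shot(data);
    (0..=data.len()).all(|split| {
        let mut h = ToyHasher::new();
        h.update(&data[..split]);
        h.update(&data[split..]);
        h.finalize() == expected
    })
}
```

A property test would generalize this to arbitrary chunkings, and a differential test would additionally compare the result against an independent implementation.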

Yes, I use an LLM daily and have for a few years now. It's used as an assistant during parts of the project, especially for drafting, refactoring passes, test scaffolding, and review prompts. I use an LLM to write markdown files for the public - it's not something I'm great at. I do not treat generated code as trusted... in fact, it's the exact opposite. It has to compile, pass vectors/differentials/fuzz/Miri where applicable, and survive manual review. Also, this is crypto, the tests are not decoration; they are the bar before code counts. I know that our industry is drowning in vibe-coded nonsense; this is not that. This is like a year of my life... and maintaining it for many years to come.

A final point I wanted to leave... this is pre-v1. The point of sharing today was to get people to dig into it and find the problems. If there are other issues, inefficiencies, or smells you find - please, share them. Thank you!

CodesInChaos 9 hours ago [-]

"Constant-time signature verification" stands out, since unlike signature creation, verification doesn't involve secrets, and thus doesn't require constant-time in most threat models.
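For readers unfamiliar with the term, a minimal sketch of a constant-time comparison, the kind that matters when checking a MAC or AEAD tag, where the expected tag is derived from a secret key. This is a generic illustration, not rscrypto's code. `std::hint::black_box` is only a best-effort barrier against the optimizer, not a guarantee; production code usually relies on a reviewed crate such as `subtle` instead.

```rust
use std::hint::black_box;

/// Compares two byte strings without exiting early on the first mismatch.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    // Lengths are treated as public; only the contents are compared uniformly.
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences with OR so every byte is always visited.
        diff |= x ^ y;
    }
    black_box(diff) == 0
}
```

A plain `a == b` may return at the first differing byte, which can let an attacker who can time many forgery attempts recover a valid tag byte by byte. Signature verification compares against values computed from public data, which is why it is usually not held to the same requirement.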

LoadingALIAS 46 minutes ago [-]

[dead]

LoadingALIAS 5 hours ago [-]

Hey! Thank you for taking a second. Really, I appreciate it. So... fair criticism. The constant-time line is too compressed and should probably be replaced w/ some kind of matrix.

I ask you to give me a few hours. I'm not able to like devote the time to the comments that it deserves. I'm nearly home, give me a bit, please.

Thanks!

LoadingALIAS 55 minutes ago [-]

[dead]

LoadingALIAS 10 hours ago [-]

I've built rscrypto because crypto kept being where my Rust database stopped being portable: different stack on the server, different target story on WASM, different answer on RISC-V/POWER/IBM Z, and a different audit surface every time I added a primitive. The supply chain risk, given the landscape we're in today, was too high.

v0.3.1 is one feature-selected crate. Leaf features when you need one primitive (`sha2`, `rsa`, `aes-gcm`, `ed25519`, etc.) or `full` for the stack. Scope includes SHA-2/3, SHAKE, cSHAKE256, BLAKE2, BLAKE3, Ascon hash/XOF, XXH3, RapidHash, CRCs, HMAC, KMAC256, HKDF, PBKDF2, Argon2, scrypt, PHC strings, RSA, Ed25519, X25519, AES-128/256-GCM, AES-128/256-GCM-SIV, ChaCha20-Poly1305, XChaCha20-Poly1305, AEGIS-256, and Ascon-AEAD128.

The primitive stack has zero default deps and no C-libs or FFI. Optional `getrandom`, `serde`, and `rayon` features stay out until enabled.

The current bench evidence is across nine Linux runners (Intel Sapphire Rapids, Intel Ice Lake, AMD Zen4, AMD Zen5, Graviton3, Graviton4, IBM Z/s390x, IBM POWER10/ppc64le, RISE RISC-V) and my local Apple MBP M1.

Linux vs. fastest-external: 3,545 wins and 5,210 wins-or-ties out of 5,832 comparisons, 1.61x geomean.

MBP M1 vs fastest-external: 235 wins and 450 wins-or-ties out of 463 comparisons, 1.25x geomean.
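For reference, a sketch of how a geomean and a win/tie tally over per-benchmark speedup ratios can be computed. The ratio convention (competitor time divided by own time, so values above 1.0 are wins) and the `tie_band` tolerance are assumptions for illustration; the actual methodology is whatever the repo's benchmark overview defines.

```rust
/// Geometric mean of speedup ratios: exp(mean(ln(r))).
/// Returns None for empty input or non-positive / non-finite ratios.
fn geomean(ratios: &[f64]) -> Option<f64> {
    if ratios.is_empty() || ratios.iter().any(|&r| !(r.is_finite() && r > 0.0)) {
        return None;
    }
    let log_sum: f64 = ratios.iter().map(|r| r.ln()).sum();
    Some((log_sum / ratios.len() as f64).exp())
}

/// Counts (wins, ties, losses), treating ratios within `tie_band` of 1.0 as ties.
/// `tie_band` is a hypothetical tolerance, e.g. 0.05 for +/- 5%.
fn tally(ratios: &[f64], tie_band: f64) -> (usize, usize, usize) {
    let mut counts = (0, 0, 0);
    for &r in ratios {
        if r > 1.0 + tie_band {
            counts.0 += 1;
        } else if r >= 1.0 - tie_band {
            counts.1 += 1;
        } else {
            counts.2 += 1;
        }
    }
    counts
}
```

The geometric mean is the usual choice for ratios because a 2x win and a 0.5x loss cancel out to 1.0, whereas an arithmetic mean would report 1.25.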

BLAKE3 large inputs (`>=64 KiB`) are 2.31x geomean improvement across Linux vs the official `blake3` crate and 1.80x on MBP M1.

While it's not universally faster - it's incredibly close. Current weak spots include PBKDF2-SHA256 setup at `iters=1`, X25519 DH, RSA verification on Arm/RISC-V, small-message AEAD rows, MBP M1 BLAKE3 64 KiB rows, HMAC-SHA256 bulk pressure against `aws-lc-rs`, and SHA3-256 streaming on Apple Silicon. The `./benchmark_results/OVERVIEW.md` lists the losses next to the wins in more detail.

Trust, Testing, Etc: portable Rust is the byte-for-byte authority. SIMD/ASM paths are accelerators and are differential tested against the portable path. MAC, AEAD, and signature comparisons are constant-time. Secret-bearing types zeroize on drop. I've got a pretty thorough Miri and Fuzzer testing gate setup, too. The RSA impl has its own CI gate. Codecov = 73.06, fuzzing included.
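A minimal sketch of the zeroize-on-drop idea, assuming a hypothetical `SecretKey` wrapper; it is not rscrypto's type. Volatile writes plus a compiler fence are a common way to discourage the optimizer from removing stores to memory that is about to be freed. The widely used `zeroize` crate packages the same approach.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical secret-bearing type that wipes its bytes when dropped.
struct SecretKey {
    bytes: [u8; 32],
}

impl SecretKey {
    fn new(bytes: [u8; 32]) -> Self {
        Self { bytes }
    }

    /// Overwrites the key material with zeros.
    fn wipe(&mut self) {
        for b in self.bytes.iter_mut() {
            // SAFETY: `b` is a valid, aligned, exclusive reference to a u8.
            // The volatile write keeps the store from being elided as dead.
            unsafe { ptr::write_volatile(b, 0) };
        }
        // Prevents the compiler from reordering later operations before the wipe.
        compiler_fence(Ordering::SeqCst);
    }

    /// Test helper: true once every byte is zero.
    fn is_wiped(&self) -> bool {
        self.bytes.iter().all(|&b| b == 0)
    }
}

impl Drop for SecretKey {
    fn drop(&mut self) {
        self.wipe();
    }
}
```

This only clears the final resting place of the value. Moves and copies made earlier (for example, the array passed into `new`) can leave other copies on the stack, so it reduces exposure rather than eliminating it.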

This is not FIPS 140-3 validated, not a TLS stack, not a key store, and not third-party audited yet. I am genuinely interested in a third-party audit and would LOVE to plan for FIPS 140-3 validation, but it's just out of my reach right now.

The codebase/lib is obviously pre-v1 and I'm asking for public review while API changes are still relatively cheap.

Repo: https://github.com/loadingalias/rscrypto

Crate: https://crates.io/crates/rscrypto

Benches: https://github.com/loadingalias/rscrypto/blob/main/benchmark...

Migration Guides: https://github.com/loadingalias/rscrypto/blob/main/docs/migr...

Me: https://x.com/loadingalias

If you're testing, benching, etc. and happen to stumble across inconsistencies, vulnerabilities, etc. - please just reach out directly via 'X' or use GitHub's Vulnerability Reporting. There are a decent number of people already using the library.

Also, the 'fastest-external' competitors for perf comparisons are almost always one of the following: aws-lc-rs, ring, RustCrypto, dryoc, OpenSSL, Blake3 and/or one of the many 'crc-fast/fast-crc' crate variations. I benched these external crates against each other in the beginning to trace the most performant before hunting inefficiency and cutting out any external deps/c-libs. So, if the benches show a 2x geomean over Blake3... that means it's over the fastest implementation of Blake3 I could find and bench publicly.
