Make age parallel #109
Comments
The problem is that it isn't feasible to do that without overhauling Go's cryptography libraries (and it might be unsafe; I don't know enough about goroutine security to say for sure). The only functions in age that actually handle the plaintext are EncryptOAEP/DecryptOAEP from crypto/rsa and Seal/Open from x/crypto/chacha20poly1305, neither of which is parallel. Both could be parallelized, but RSA generally hasn't been because it needs a parallel-friendly modular exponentiation function. ChaCha is fairly easy to parallelize, but Go's implementation is handwritten assembly using vector instructions when available (unless you're using a purego build, gccgo, or an uncommon CPU architecture). I have a feeling that probably outperforms a goroutine version, but maybe not.
@RKinsey I'm not sure if this argument actually holds.
Yeah, @RKinsey was talking about the key-wrapping phase. The actual symmetric stream encryption is where the bulk of age's work happens (at least for larger files), and it looks like it could be parallelized. The stream is divided into fixed-size chunks of 64 KiB, and each chunk uses the same encryption key but, of course, a different nonce. The nonce is calculated from the chunk number. It's a seekable stream and thus, in theory, easy to parallelize. In practice, though, the code would be more complex than it currently is, so it would need a pretty good test suite.
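To make the chunking argument concrete, here is a minimal sketch of per-chunk parallel sealing. It is not age's code. The helper names (nonceFor, newAEAD, sealChunks, openChunks) are hypothetical, the nonce layout (an 11-byte big-endian counter plus a final-chunk flag byte) is an assumption of this sketch, and AES-256-GCM from the standard library stands in for ChaCha20-Poly1305 so the example runs without external modules. Both satisfy cipher.AEAD with a 12-byte nonce.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"encoding/binary"
	"fmt"
	"sync"
)

// chunkSize is the fixed plaintext chunk size (64 KiB) discussed above.
const chunkSize = 64 * 1024

// nonceFor derives a 12-byte nonce from the chunk index. The layout
// (11-byte big-endian counter plus a final-chunk flag) is an assumption
// made for this sketch.
func nonceFor(index uint64, last bool) []byte {
	nonce := make([]byte, 12)
	binary.BigEndian.PutUint64(nonce[3:11], index)
	if last {
		nonce[11] = 1
	}
	return nonce
}

// newAEAD returns AES-256-GCM as a standard-library stand-in for
// ChaCha20-Poly1305.
func newAEAD(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealChunks encrypts every chunk independently, running at most
// `workers` goroutines at a time. Output order matches input order.
func sealChunks(aead cipher.AEAD, plaintext []byte, workers int) [][]byte {
	n := (len(plaintext) + chunkSize - 1) / chunkSize
	if n == 0 {
		n = 1 // empty input still yields one (final) chunk
	}
	out := make([][]byte, n)
	sem := make(chan struct{}, workers) // bounds the number of live goroutines
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		start := i * chunkSize
		end := start + chunkSize
		if end > len(plaintext) {
			end = len(plaintext)
		}
		wg.Add(1)
		sem <- struct{}{}
		go func(i int, chunk []byte) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own slot, so no lock is needed.
			out[i] = aead.Seal(nil, nonceFor(uint64(i), i == n-1), chunk, nil)
		}(i, plaintext[start:end])
	}
	wg.Wait()
	return out
}

// openChunks decrypts the chunks serially and concatenates the plaintext.
func openChunks(aead cipher.AEAD, chunks [][]byte) ([]byte, error) {
	var plain []byte
	for i, c := range chunks {
		p, err := aead.Open(nil, nonceFor(uint64(i), i == len(chunks)-1), c, nil)
		if err != nil {
			return nil, err
		}
		plain = append(plain, p...)
	}
	return plain, nil
}

func main() {
	key := make([]byte, 32) // all-zero demo key; never do this in practice
	aead, err := newAEAD(key)
	if err != nil {
		panic(err)
	}
	msg := make([]byte, 3*chunkSize+100)
	chunks := sealChunks(aead, msg, 4)
	plain, err := openChunks(aead, chunks)
	fmt.Println(len(chunks), len(plain) == len(msg), err)
}
```

Because each nonce depends only on the chunk index, the ciphertext is the same whether one worker or many are used. A real streaming version would also have to bound memory and write chunks out in order, which is where the extra complexity and testing burden would come from.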
Just running chacha20-poly1305 in parallel for a few blocks easily more than doubles the speed. My own tool is written in Python and does 2.2 GB/s encryption and decryption (using 4 threads for chacha, otherwise single-threaded). It is a shame that the crypto libraries don't offer threaded implementations of these algorithms. This is on a machine where age does 1 GB/s and rage only 400 MB/s.
@Tronic are you using the latest rage? The speed difference should be minimal right now.
@paulmillr rage 0.7.0 on Windows.
If you encrypt files on a machine with tons of RAM and cores, age isn't any faster than on a basic slow PC.
I think it would be great to utilize resources when they're available.
Tried this on Linux via piping and via -i -o, and saw only a tiny load on one core.