Reputation: 6570
I am encrypting text like this (node.js):
var crypto = require("crypto")
var text = "holds a long string..."
var cipher = crypto.createCipher("aes128", "somepassword")
var crypted = cipher.update(text, 'utf8', 'hex')
crypted += cipher.final('hex');
If I save text to a file directly, it is N bytes. If I save crypted, the file size is about N * 2 bytes.
Is there any way to make the crypted text as close to N bytes as possible?
Upvotes: 3
Views: 6494
Reputation: 61952
A modern cipher like AES works on binary data. When you encrypt character data, it is first transformed into a binary representation. This is basically what UTF-8 encoding does. After encryption, you get arbitrary binary data out, which is not necessarily a valid UTF-8 encoding (almost all encodings have a special structure) when you try to decode it.
If you omit the output_encoding from Cipher#update and Cipher#final, you get a Buffer, which you can concatenate or write to a file. It manages the data in a binary format, but defaults to hex when printed. When you write the Buffers to a file, the file size will be close to the plaintext size, but it will never be exactly equal to it.
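As a rough sketch of the Buffer approach described above (the key, the IV and the "aes-128-cbc" choice are my own assumptions for illustration, not something from the question):

```javascript
const crypto = require('crypto');

// Hypothetical key and IV for illustration only; real ones must be
// generated, stored and transported properly.
const key = crypto.randomBytes(16); // AES-128 key
const iv = crypto.randomBytes(16);  // one fresh IV per message

function encryptToBuffer(text) {
  const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
  // No output encoding given: update() and final() both return Buffers.
  return Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
}

const text = 'holds a long string...';
const crypted = encryptToBuffer(text);

console.log('plaintext bytes: ', Buffer.byteLength(text, 'utf8'));
console.log('ciphertext bytes:', crypted.length);
console.log('hex characters:  ', crypted.toString('hex').length);
```

The Buffer can be handed to fs.writeFileSync as-is; the hex string is what doubles the size.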
AES is a block cipher and can only encrypt a single block of exactly 16 bytes. A mode of operation like ECB or CBC enables you to encrypt multiple blocks. Finally, a padding scheme like the default PKCS#7 padding enables you to encrypt texts of arbitrary length. This padding always adds some bytes before the actual encryption. To be precise, it adds from 1 to 16 bytes.
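To make the "1 to 16 bytes" concrete, here is a small sketch that compares a length formula against what Node actually produces. The helper names paddedLength and cbcCiphertextLength are made up for this example, and CBC is just one mode that uses the default padding:

```javascript
const crypto = require('crypto');

// PKCS#7 with 16-byte blocks: round up to the next multiple of 16,
// adding a full extra block when the input is already aligned.
function paddedLength(n) {
  return (Math.floor(n / 16) + 1) * 16;
}

// Encrypts n zero bytes with a throwaway key/IV and reports the output size.
function cbcCiphertextLength(n) {
  const key = crypto.randomBytes(16);
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
  return Buffer.concat([cipher.update(Buffer.alloc(n)), cipher.final()]).length;
}

for (const n of [0, 1, 15, 16, 17, 32]) {
  console.log(n, '->', cbcCiphertextLength(n), '(formula:', paddedLength(n) + ')');
}
```

Note that a 16-byte input becomes 32 bytes, because the padding can never be empty.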
You can use cipher.setAutoPadding(false) to prevent padding, but then you will need to pad yourself. You could also use a streaming mode like CTR ("aes-128-ctr"), but then you need to provide a unique nonce for it to have any security (commonly 12 bytes, filled up with a 4-byte counter to the 16-byte IV that Node.js expects for this mode). This nonce doesn't have to be secret, but you have to transport it to the decrypter.
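A minimal sketch of the CTR variant, assuming a random 12-byte nonce that is prepended to the ciphertext (encryptCtr and decryptCtr are hypothetical helpers, and the nonce-then-ciphertext layout is just one possible convention):

```javascript
const crypto = require('crypto');

function encryptCtr(key, plaintext) {
  // Node.js wants a 16-byte IV for aes-128-ctr: 12-byte nonce followed by
  // a 4-byte block counter that starts at zero.
  const nonce = crypto.randomBytes(12);
  const iv = Buffer.concat([nonce, Buffer.alloc(4)]);
  const cipher = crypto.createCipheriv('aes-128-ctr', key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  // Prepend the nonce so the decrypter can rebuild the IV.
  return Buffer.concat([nonce, body]);
}

function decryptCtr(key, message) {
  const iv = Buffer.concat([message.subarray(0, 12), Buffer.alloc(4)]);
  const decipher = crypto.createDecipheriv('aes-128-ctr', key, iv);
  return Buffer.concat([decipher.update(message.subarray(12)), decipher.final()]);
}

const key = crypto.randomBytes(16); // hypothetical key for the demo
const plaintext = Buffer.from('holds a long string...', 'utf8');
const message = encryptCtr(key, plaintext);
console.log(plaintext.length, 'plaintext bytes ->', message.length, 'stored bytes');
```

The ciphertext itself has the plaintext's length, so the only overhead is the 12 bytes of nonce. CTR on its own gives no protection against tampering.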
In the end it is really not possible for the ciphertext to be exactly the same size as the plaintext. There is always something that inflates the ciphertext.
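Never use crypto.createCipher. It derives both the key and the IV deterministically from the password, so the same plaintext always encrypts to the same ciphertext. You need to use a randomized cipher to get semantic security. Use crypto.createCipheriv with a fresh and random IV. For CTR mode, the IV must be unique and for CBC mode, it must be unpredictable.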
Always use authenticated encryption. It enables you to detect wrong keys and (malicious) tampering of ciphertexts. Here's an example with AES-GCM.
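For reference, a minimal AES-GCM sketch could look like the following. The function names and the nonce | tag | ciphertext layout are assumptions made for this example, not a standard format:

```javascript
const crypto = require('crypto');

function encryptGcm(key, text) {
  const iv = crypto.randomBytes(12); // fresh nonce for every message
  const cipher = crypto.createCipheriv('aes-128-gcm', key, iv);
  const body = Buffer.concat([cipher.update(text, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();   // 16 bytes by default
  return Buffer.concat([iv, tag, body]);
}

function decryptGcm(key, message) {
  const iv = message.subarray(0, 12);
  const tag = message.subarray(12, 28);
  const decipher = crypto.createDecipheriv('aes-128-gcm', key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the key is wrong or the data was modified.
  const plain = Buffer.concat([decipher.update(message.subarray(28)), decipher.final()]);
  return plain.toString('utf8');
}

const key = crypto.randomBytes(16); // hypothetical key for the demo
const message = encryptGcm(key, 'holds a long string...');
console.log(decryptGcm(key, message));
```

With this layout the stored message is the plaintext length plus 28 bytes (12 for the nonce, 16 for the tag), which is the price for detecting wrong keys and tampering.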
Upvotes: 4
Reputation: 17157
The problem is your 'hex' encoding. Basically you advise the cipher to read text using the utf8 encoding and to write the ciphertext using the hex encoding.
Hexadecimal encoding uses 2 bytes to represent 1 actual byte, thus you get a file size approximately twice the size of your plain text.
The solution is to use a more efficient coding for your ciphertext which is still able to hold all possible byte values, which rules out a simple string. Try:
var crypted = cipher.update(text, 'utf8', 'base64');
crypted += cipher.final('base64');
This will encode the ciphertext as a base64 encoded string.
I have created an online example, the results are:
text: 488890
crypted hex length: 977792, ratio: 2.0000245453987606
crypted base64 length: 651864, ratio: 1.3333551514655648
Security Announcement: Don't use this key/IV generation in production. I would strongly advise using a different IV for each encryption, via crypto.createCipheriv(algorithm, key, iv). But for demo purposes this is fine.
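If you want to check the ratios above without the online example, a small sketch like this reproduces them. It uses random bytes as a stand-in for real ciphertext, and encodedLengths is a made-up helper; 488896 is the 488890-byte text padded up to a multiple of 16:

```javascript
const crypto = require('crypto');

// Returns the size of the same bytes as raw data, as hex and as base64.
function encodedLengths(nBytes) {
  const data = crypto.randomBytes(nBytes); // stand-in for ciphertext
  return {
    raw: data.length,
    hex: data.toString('hex').length,       // 2 characters per byte
    base64: data.toString('base64').length  // 4 characters per 3 bytes
  };
}

const sizes = encodedLengths(488896);
console.log('raw:', sizes.raw);
console.log('hex:', sizes.hex, 'ratio:', sizes.hex / 488890);
console.log('base64:', sizes.base64, 'ratio:', sizes.base64 / 488890);
```

Base64 still costs about a third extra, so writing the raw Buffer is the only way to avoid encoding overhead entirely.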
Upvotes: 5