EnRUPT specification has been updated to reflect a change to one of the recommended modes of operation. We have found the new unkeyed unrandomised stream hashing mode proposed in the original paper to be insecure and therefore not suitable for irRUPT. This weakness is purely structural. It does not depend on the underlying round function or structure of the underlying authenticated stream cipher. A paper describing the vulnerability of this mode is on the way. In the mean time EnRUPT specification was updated with the most optimal solution we see. No other modes of operation are affected.
There are two ways to construct stream hashing using (ir1) round function, which would make irRUPT resistant against the attack we have discovered. One is to keep it as it is while re-sealing the state as soon as a half of it is filled with data, not after all of it is filled [i.e. between every xw/2 words of input]. This solution unfortunately comes at a high cost: irRUPT would be 5 times slower than mcRUPT or RUPT stream cipher. The second solution is simply to slow down the hashing process 3 times, which most probably no longer requires the frequent sealing stages at all.
“Most probably” because there is simply no cryptologic theory to rely on, but only our best judgement and our own attempts to analyse it. Stream hashing is a new research area and should not be used without caution and thorough cryptanalysis. We cannot guarantee with as much certainty as for the rest of EnRUPT that this new hashing mode is secure. Nobody can. Digital cryptology is a new and highly under-researched area. There are currently no guaranteed well-studied secure hashing modes. Therefore any applications requiring secure hashing should either use randomization or rely on the better-studied block hashing modes such as the strengthened Merkle-Damgård used for mdRUPT.
Unkeyed irRUPT mode with a random IV [same as mcRUPT] is as fast as RUPT stream cipher and thanks to its final secure sealing process [as secure as RUPT stream cipher], it can provide sufficient preimage resistance, while all the applications that can rely on random nonces in their hashing processes [either providing them themselves or relying on trusted third parties calculating and digitally signing those hashes] can benefit from the high speed and high collision resistance with much shorter hash values and 3 times faster hashing also requiring less memory. For instance, a 96-bit IV combined with a 96-bit randomised hash value provides the same 96-bit secure collision resistance as a 192-bit hash and 96-bit secure preimage resistance. The only limitation is that it can be used only in the applications providing a trusted digital signature.
To end the started confusion about EnRUPT, it is not just a tiny block cipher like the TEA, XTEA or XXTEA. EnRUPT round function can also be used continuously in different stream modes. For example, RUPT-256 stream cipher can be implemented in plain C as follows:
#define sw 8 // 256-bit security
#define xw (4*sw) // means 1024-bit state
// one (ir1) round: same as (er1), only adding the word
// half of the state away to the 32-bit accumulator d
// added once on every round instead of the key word
#define ir1(p) (f=rotr32(2*x[(r-1)%xw]^x[(r+1)%xw]^d^r,8)*9,\
x[r%xw]^=f,d^=f^p^x[(xw/2+r++)%xw])

// the complete [double] stream cipher round,
// returning accumulator d value as keystream
#define rupt(p) (ir1(0), ir1(p))
// Complete RUPT
{
// [default: 256-bit] key followed by [default: 256-bit] IV
u32 key[key_words], iv[iv_words];
// secret state: 4*security, accumulator and round index
u32 x[xw], d, r;
// local variables
u32 i, t;
// plaintext <=> ciphertext
u32 text[data_words];
...
// initialise, load the key, the IV, seal
for (i = 0; i < xw; i++) x[i] = 0; d = 0; r = 1;
for (i = 0; i < key_words; i++) d = rupt(0) ^ key[i];
for (i = 0; i < iv_words; i++) d = rupt(0) ^ iv[i];
for (i = 0; i < 2*xw; i++) rupt(0);
...
// to encrypt or decrypt:
for (i = 0; i < data_words; i++) text[i] ^= rupt(0);
}
That’s all. The above code is complete RUPT [EnRUPT used as a stream cipher]. Its speed is about the same [so far] as Salsa20/12 on Core 2 Duo with a significantly smaller and simpler code. That is about 3 times faster than RC4, while fitting in half of its memory even for 256-bit security and without the pain of implementing RC4 in ASIC/FPGA or Salsa20 on 8-bit processors. Any PRNG that produces 32-bit numbers can be quickly and easily replaced with RUPT to provide top speed and perfect randomness of a secure stream cipher at the same time.
Note: the only advantage of Salsa20 is its operation in CTR mode, which allows fast forwarding to any block in the stream. If it is ever required, RUPT can also be used in CTR mode, but with any desired block size. The per-block counter loading/sealing overhead is insignificant for sufficiently large blocks that are encrypted only one word at a time anyway.
I was just told by AkKort that his unrolled EnRUPT-128-128 in MSVC 2005 is 40% faster than AES-128, a simple loop is 5% faster and a size-optimized 32-bit Intel Assembly implementation occupies 66 bytes and is only 15% slower than AES-128.
I don’t know yet what processor it was measured on or what the actual numbers of clock cycles are, but it sounds exciting! I can’t wait to see the speeds of all kinds of implementations of EnRUPT vs. AES vs. RC4 on all kinds of processors and hardware platforms…
PS: The stream RUPT/aeRUPT/mcRUPT modes are supposed to be about 3 times faster than the block enRUPT/mdRUPT modes…
What’s up with this patent application? It clearly seems to cover a large number of ciphers: Dragon, Salsa20, ChaCha, Trivium, XXTEA, EnRUPT and possibly others that employ the same feedback structure:
x[i] = f(x[i-A], x[i], x[i+B]);where A and B are two positive constants offsetting the index by no more than a half of the block/state and where x[i] is more than one bit wide. In other words, all the ciphers that update each bit/word by a function of at least two of its neighbors from both sides, regardless if the bits/words are updated all simultaneously, in groups or one by one.
The DES was patented, the IDEA is patented, the SHA-2 is patented and the IBM and Hitachi hold patents for the ADD-XOR-ROL ciphers, none of which seem to harm the wide-spread use of any of those ciphers and hash functions. So crypto patents may not be a problem for the cipher’s acceptance after all, at least not as much as some of the patent-hating academics are claiming it to be…
However, I do have some explaining to do. First of all, the first provisional application for it was filed on the 5th of November 2004, that is six days before both Salsa20 and Dragon were first published on the 11th of November 2004. AFAIK, the only cipher with the above structure that was published before that date is XXTEA, which I honestly was not aware of before filing that application.
I am no patent attorney, but as a cryptographer who may be called in as a consultant, I can guarantee that it can be easily shown to the court that it is not obvious how the use of other distances could be beneficial or that it could be trivially derived from the XXTEA structure that uses only the two nearest neighbors. Why not? Because it could be more secure and it could be totally insecure. It is the security that is the whole purpose of a cryptographic component and is the reason why they are designed the way they are. For a cipher or any other cryptographic component to be insecure is like for an airplane or any other flying machine to not be able to lift off the ground. It cannot be called a flying machine if it can’t fly. In this case, security of one is not directly obvious from the other, however small the differences may be, and if you are calling it “cryptographic”, you are automatically claiming cryptographic security.
So if this patent is granted, I anticipate some difficulties for all the ciphers with the above structure. However, all the constructions of this one particular form including EnRUPT are certainly off the hook since XXTEA is prior art:
x[i] ±= f(x[i-1], x[i+1]);All the similar-looking “incomplete” word-based designs like Rabbit, all the known block chaining mechanisms and T-Functions published before this patent application was filed update the current word only by the neighbours located on one side of the current bit/word being updated. Even Prof. Bernstein’s SURF, the predecessor of Salsa20, updates each word by only one of its neighbors. All the “complete” Feistel-Network ciphers including all the “complete” source-heavy UFNs like MD5, SHA, etc. may appear somewhat similar, but in them the entire block/state is involved in the feedback. All such constructions are outside the scope of this patent application.
One may argue that for instance Salsa20 can also be viewed as a square, but that is a weak argument that may or may not hold. The word indexes in Salsa20 are quite clear, with certain well-defined distances between each other… It can be twisted to look like any shape with word shuffling, but that does not change its structure. Patent claims can easily cover something for what it is, but they cannot be invalidated by how else one can see it. A microchip can be viewed as a piece of adulterated sand. Nevertheless, every microchip has many patents protecting the technology in it quite successfully.
I cannot predict if this patent application will be granted, how or if it will ever be used against any cipher, or what the outcome may be, but one thing is certain: EnRUPT is identical in its structure to the XXTEA, which is obviously prior art.
To everyone else, good luck!
I guess the easiest way to understand and implement EnRUPT32 as a block cipher would be something like this:
#define er1(k) (rotr(2*x[(r-1)%xw]^x[(r+1)%xw]^k^r,8)*9^k)
enRUPT (u32 *x, const u32 xw, u32 *key, const u32 kw)
{
u32 r, s=4, n=s*(2*xw+kw);
for (r=1; r<=n; r++) x[r%xw] ^= er1(key[r%kw]);
}
unRUPT (u32 *x, const u32 xw, u32 *key, const u32 kw)
{
u32 r, s=4, n=s*(2*xw+kw);
for (r=n; r ; r--) x[r%xw] ^= er1(key[r%kw]);
}
Yes. This is it. A complete block cipher. There is nothing more to it. If it is used with arbitrary but constant 2n block sizes, it should be almost as fast as an unrolled implementation, almost as fast as the AES for 4-word blocks and 4-word keys.
Another cycle of life has brought to everyone’s attention once again the fact that RAM is not erased instantly on power down, that its contents can be read by the attacker and that the pre-computed key schedules are sitting ducks.
What can I say? Yet another reason to use ciphers without pre-computed key schedules such as XTEA, XXTEA or EnRUPT. Random keys are hard to recognize in memory, but the key schedules with their well-known internal dependencies can be easily identified and recovered even after a major data loss caused by cutting the power. It got proven once again and is not going to change any time soon.
Without a pre-computed key schedule, most block ciphers including Twofish and the AES will be horribly slow and stream ciphers are not a good choice for disk encryption for a lot of other reasons. Out of all the block ciphers only XTEA, XXTEA and EnRUPT are fast without a pre-computed key schedule. They do not waste any time on the key schedule at all. Zero clock cycles, no additional memory and no help to the attacker in finding or recovering the secret keys in RAM after a cold boot.
I totally disagree with the authors of the paper on the countermeasures. Hashing memory or increasing complexity of the key schedule to prevent these attacks is insanity. I do not see why it is so hard to erase the encryption key and especially the pre-computed key schedule when the screen saver kicks in, since the user will have to re-authenticate anyway.
512-byte disk sectors encrypted with EnRUPT with an arbitrarily large random key are safe against cold boot attacks without any additional countermeasures. Loss of 16 bits of a 512-bit key for instance (3.1%) means natural 100-bit security against cold boot attacks even if the attacker knows precisely where the key is stored in memory, otherwise add roughly 30 bits on top of that. Yes. Simpler is better.
It is official, lads! The first specification of EnRUPT32 has been published at SASC-2008.
Download EnRUPT32 Specification PDF
View EnRUPT32 Specification in HTML
Download the latest C source code ZIP
Looks like there is still a lot of work to do before it can be submitted to the NIST competition – fast software implementations, basic cryptanalysis, ambiguities to fix, open questions to answer… But I doubt that I will change anything in the design unless someone points out a flaw in it or a way to improve its performance without sacrificing its simplicity or its flexibility or its scalability.
Suggestions and contributions are very welcome of course.