To end the started confusion about EnRUPT, it is not just a tiny block cipher like the TEA, XTEA or XXTEA. EnRUPT round function can also be used continuously in different stream modes. For example, RUPT-256 stream cipher can be implemented in plain C as follows:
#define xw 32 // 256-bit security => 1024-bit state
// one (ir1) round: same as (er1), only adding the word
// half of the state away to the 32-bit accumulator d
// added once on every round instead of the key word
#define ir1() (t=rotr32(x[(r-1)%xw]*2^x[(r+1)%xw]^d^r,8)*9,\
d^=t^x[(r+xw/2)%xw],x[r%xw]^=t,r++)

// the complete [double] stream cipher round,
// returning accumulator d value as keystream
#define rupt() (ir1(), ir1(), d)
// Complete RUPT
{
// [default: 256-bit] key followed by [default: 256-bit] IV
u32 key[key_words], iv[iv_words];
// secret state: 4*security, accumulator and round index
u32 x[xw], d, r;
// local variables
u32 i, t;
// plaintext <=> ciphertext
u32 text[data_words];
...
// initialise, load the key, the IV, seal
for (i = 0; i < xw; i++) x[i] = 0; d = 0; r = 1;
for (i = 0; i < key_words; i++) d = rupt() ^ key[i];
for (i = 0; i < iv_words; i++) d = rupt() ^ iv[i];
for (i = 0; i < 2*xw; i++) rupt();
...
// to encrypt or decrypt:
for (i = 0; i < data_words; i++) text[i] ^= rupt();
}
That’s all. The above code is complete RUPT [EnRUPT used as a stream cipher]. Its speed is about the same [so far] as Salsa20/12 on Core 2 Duo with a significantly smaller and simpler code. That is about 3 times faster than RC4, while fitting in half of its memory even for 256-bit security and without the pain of implementing RC4 in ASIC/FPGA or Salsa20 on 8-bit processors. Any PRNG that produces 32-bit numbers can be quickly and easily replaced with RUPT to provide top speed and perfect randomness of a secure stream cipher at the same time.
Note: the only advantage of Salsa20 is its operation in CTR mode, which allows fast forwarding to any block in the stream. If it is ever required, RUPT can also be used in CTR mode, but with any desired block size. The per-block counter loading/sealing overhead is insignificant for sufficiently large blocks that are encrypted only one word at a time anyway.
I was just told by AkKort that his unrolled EnRUPT-128-128 in MSVC 2005 is 40% faster than AES-128, a simple loop is 5% faster and a size-optimized 32-bit Intel Assembly implementation occupies 66 bytes and is only 15% slower than AES-128.
I don’t know yet what processor it was measured on or what the actual numbers of clock cycles are, but it sounds exciting! I can’t wait to see the speeds of all kinds of implementations of EnRUPT vs. AES vs. RC4 on all kinds of processors and hardware platforms…
PS: The stream RUPT/aeRUPT/mcRUPT modes are supposed to be about 3 times faster than the block enRUPT/mdRUPT modes…
What’s up with this patent application? It clearly seems to cover a large number of ciphers: Dragon, Salsa20, ChaCha, Trivium, XXTEA, EnRUPT and possibly others that employ the same feedback structure:
x[i] = f(x[i-A], x[i], x[i+B]); where A and B are two positive constants offsetting the index by no more than a half of the block/state. In other words, all the ciphers that update each bit/word by a function of at least two of its neighbors from both sides, regardless if the bits/words are updated all simultaneously, in groups or one by one.
The DES was patented, the IDEA is patented, the SHA-2 is patented and the IBM and Hitachi hold patents for the ADD-XOR-ROL ciphers, none of which seem to harm the wide-spread use of any of those ciphers and hash functions. So crypto patents may not be a problem for the cipher’s acceptance after all, at least not as much as some of the patent-hating academics are claiming it to be…
However, I do have some explaining to do. First of all, the first provisional application for it was filed on the 5th of November 2004, that is six days before both Salsa20 and Dragon were first published on the 11th of November 2004. AFAIK, the only cipher with the above structure that was published before that date is XXTEA, which I honestly was not aware of before filing that application.
I am no patent attorney, but as a cryptographer who may be called in as a consultant, I can guarantee that it can be easily shown to the court that it is not obvious how the use of other distances could be beneficial or that it could be trivially derived from the XXTEA structure that uses only the two nearest neighbors. Why not? Because it could be more secure and it could be totally insecure. It is the security that is the whole purpose of a cryptographic component and is the reason why they are designed the way they are. For a cipher or any other cryptographic component to be insecure is like for an airplane or any other flying machine to not be able to lift off the ground. It cannot be called a flying machine if it can’t fly. In this case, security of one is not directly obvious from the other, however small the differences may be, and if you are calling it “cryptographic”, you are automatically claiming cryptographic security.
So if this patent is granted, I anticipate some difficulties for all the ciphers with the above structure. However, all the constructions of this one particular form including EnRUPT are certainly off the hook since XXTEA is prior art:
x[i] ±= f(x[i-1], x[i+1]); All the similar-looking “incomplete” designs like Rabbit, all the known FSRs, block chaining mechanisms and T-Functions published before this patent application was filed update the current word only by the neighbors located on one side of the current bit/word being updated. Even Prof. Bernstein’s SURF, the predecessor of Salsa20, updates each word by only one of its neighbors. All the “complete” Feistel-Network ciphers including all the “complete” source-heavy UFNs like MD5, SHA, etc. may appear somewhat similar, but in them the entire block/state is involved in the feedback. All such constructions are outside the scope of this patent application.
One may argue that for instance Salsa20 can also be viewed as a square, but that is a weak argument that may or may not hold. The word indexes in Salsa20 are quite clear, with certain well-defined distances between each other… It can be twisted to look like any shape with word shuffling, but that does not change its structure. Patent claims can easily cover something for what it is, but they cannot be invalidated by how else one can see it. A microchip can be viewed as a piece of adulterated sand. Nevertheless, every microchip has many patents protecting the technology in it quite successfully.
I cannot predict if this patent application will be granted, how or if it will ever be used against any cipher, or what the outcome may be, but one thing is certain: EnRUPT is identical in its structure to the XXTEA, which is obviously prior art.
To everyone else, good luck!
Another cycle of life has brought to everyone’s attention once again the fact that RAM is not erased instantly on power down, that its contents can be read by the attacker and that the pre-computed key schedules are sitting ducks.
What can I say? Yet another reason to use ciphers without pre-computed key schedules such as XTEA, XXTEA or EnRUPT. Random keys are hard to recognize in memory, but the key schedules with their well-known internal dependencies can be easily identified and recovered even after a major data loss caused by cutting the power. It got proven once again and is not going to change any time soon.
Without a pre-computed key schedule, most block ciphers including Twofish and the AES will be horribly slow and stream ciphers are not a good choice for disk encryption for a lot of other reasons. Out of all the block ciphers only XTEA, XXTEA and EnRUPT are fast without a pre-computed key schedule. They do not waste any time on the key schedule at all. Zero clock cycles, no additional memory and no help to the attacker in finding or recovering the secret keys in RAM after a cold boot.
I totally disagree with the authors of the paper on the countermeasures. Hashing memory or increasing complexity of the key schedule to prevent these attacks is insanity. I do not see why it is so hard to erase the encryption key and especially the pre-computed key schedule when the screen saver kicks in, since the user will have to re-authenticate anyway.
512-byte disk sectors encrypted with EnRUPT with an arbitrarily large random key are safe against cold boot attacks without any additional countermeasures. Loss of 16 bits of a 512-bit key for instance (3.1%) means natural 100-bit security against cold boot attacks even if the attacker knows precisely where the key is stored in memory, otherwise add roughly 30 bits on top of that. Yes. Simpler is better.
It is official, lads! The first specification of EnRUPT32 has been published at SASC-2008.
Download EnRUPT32 Specification PDF
View EnRUPT32 Specification in HTML
Download the latest C source code ZIP
Looks like there is still a lot of work to do before it can be submitted to the NIST competition – fast software implementations, basic cryptanalysis, ambiguities to fix, open questions to answer… But I doubt that I will change anything in the design unless someone points out a flaw in it or a way to improve its performance without sacrificing its simplicity or its flexibility or its scalability.
Suggestions and contributions are very welcome of course.