<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Advogato blog for randombit</title>
    <link>http://www.advogato.org/person/randombit/</link>
    <description>Advogato blog for randombit</description>
    <language>en-us</language>
    <generator>mod_virgule</generator>
    <pubDate>Sun, 19 May 2013 22:38:43 GMT</pubDate>
    <item>
      <pubDate>Thu, 8 Jul 2010 14:11:03 GMT</pubDate>
      <title>The ChaoCipher</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=23</link>
      <guid>http://www.randombit.net/bitbashing/crypto/chao_cipher.html</guid>
      <description>
&lt;p&gt;ChaoCipher was a rotor cipher invented in 1918 by John Byrne but
kept secret by him. Just recently, the Byrne family has donated his
papers and work to the &lt;a href="http://www.nsa.gov/about/cryptologic_heritage/museum/" &gt;National
Cryptologic Museum&lt;/a&gt; (a fascinating place - you can even use an old
Cray as a couch à la Sneakers), and with some work Moshe Rubin has
figured out the full system and &lt;a href="http://www.mountainvistasoft.com/chaocipher/ActualChaocipher/Chaocipher-Revealed-Algorithm.pdf" &gt;described it in a new paper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I woke up around 4:30 this morning and had a little time to kill,
so I whipped up a quick &lt;a href="http://www.advogato.org/code/chao_cipher.py" &gt;implementation&lt;/a&gt; of the ChaoCipher in Python.
It's quite a bit shorter than the Perl implementation given in the
appendix of Moshe's paper, though it doesn't do file I/O; it just
encrypts and then decrypts the example given in the paper to verify
that things seem to be working OK. I'm sure the current implementation
could be made somewhat faster by avoiding list manipulation and
instead moving the zenith and nadir points, though some swapping and
moving seems unavoidable.&lt;/p&gt;

&lt;p&gt;The key consists of a pair of permutations on the 26-character
alphabet - each permutation is worth roughly 88 bits, so that's a key
of up to roughly 177 bits! I wonder if there were key
length export restrictions in place in 1918...&lt;/p&gt;
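
&lt;p&gt;As a rough check on the key size, here is a minimal Python sketch
that counts the bits in one permutation of a 26-letter alphabet and in
an independent pair. The function name is made up for illustration, and
treating every pair of alphabets as a distinct key is an
assumption.&lt;/p&gt;

&lt;pre&gt;
```python
import math

def permutation_bits(alphabet_size):
    """Return log2(alphabet_size!), the bits needed to pick one permutation."""
    return math.log2(math.factorial(alphabet_size))

single = permutation_bits(26)   # one alphabet
pair = 2 * single               # two independent alphabets (assumption)
print(round(single, 1), round(pair, 1))
```
&lt;/pre&gt;

&lt;p&gt;One alphabet comes out at a little over 88 bits, which is where a
figure near 90 bits comes from; counting both alphabets independently
doubles that.&lt;/p&gt;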
</description>
    </item>
    <item>
      <pubDate>Wed, 20 Jan 2010 03:10:24 GMT</pubDate>
      <title>Hey Kid, Need a Crypto Card?</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=22</link>
      <guid>http://www.randombit.net/bitbashing/security/hey_kid_need_a_crypto_card.html</guid>
      <description>
&lt;p&gt;I am currently in possession of a large number of things I really
don't need to have around, including, because I'm that kind of weirdo,
a couple of PCI crypto cards - an AEP2000 (donated to me by AEP so I
could write drivers for &lt;a href="http://botan.randombit.net" &gt;botan&lt;/a&gt;
for it) and a Hifn 7811 (an ebay impulse buy).&lt;/p&gt;

&lt;p&gt;The AEP2000 card is basically a modular exponentiator engine - the
2000 in the name refers to the number of 1024-bit private key RSA
operations it can perform per second (so, 4000 512-bit exponentiations
per second, since they were counting CRT optimizations when they made
the model numbers), and you can use moduli up to 2048 bits. In testing I
found that it could indeed reach 2000 ops per second in
practice. There are open source Linux drivers available for this card,
but nobody has ever updated them for anything past a 2.4 kernel, and
it doesn't seem like they are SMP safe either. Since I don't have the
time (or inclination) to update and fix them, I would rather give the
card away to an open source developer who can make use of it
somehow.&lt;/p&gt;

&lt;p&gt;The Hifn 7811 offers symmetric encryption (DES, RC4, possibly
AES?), MD5 and SHA-1 hashing, and a hardware PRNG. It is similar, but
not identical, to the &lt;a href="http://www.soekris.com/vpn1401.htm" &gt;
Soekris vpn1401&lt;/a&gt;. There are drivers for this card included in the
Linux kernel, but only 32-bit kernels are supported (I asked Evgeniy
Polyakov, the author of the driver, about this, and he indicated it is
a hardware limitation). The only 32-bit machines I have left are
laptops and netbooks, which obviously can't really take a PCI card,
leaving me with a card with no place to go.&lt;/p&gt;

&lt;p&gt;If you would like to play with either (or both) of these cards,
drop me an email and let me know. I would likely give preference to
someone who will be using them to support an open source project, but
feel free to contact me even if this is not the case; mostly I'd like
to give them a good home where they will be doing something
useful.&lt;/p&gt;
</description>
    </item>
    <item>
      <pubDate>Mon, 18 Jan 2010 18:07:41 GMT</pubDate>
      <title>Reality and Politics Do Not Mix</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=21</link>
      <guid>http://www.randombit.net/bitbashing/politics/acceptable_deaths.html</guid>
      <description>
&lt;blockquote&gt;
For obvious reasons, politicians and other policy makers generally
avoid discussing what ought to be considered an "acceptable" number of
traffic deaths, or murders, or suicides, let alone what constitutes an
acceptable level of terrorism. Even alluding to such concepts would
require treating voters as adults - something which at present seems to
be considered little short of political suicide.
&lt;/blockquote&gt;

- from &lt;a href="http://online.wsj.com/article/SB10001424052748704130904574644651587677752.html" &gt;
Undressing the Terror Threat&lt;/a&gt;, Paul Campos
</description>
    </item>
    <item>
      <pubDate>Tue, 24 Nov 2009 15:22:52 GMT</pubDate>
      <title>Using std::async for easy parallel computations</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=20</link>
      <guid>http://www.randombit.net/bitbashing/programming/parallel_function_invocation_using_std_async.html</guid>
      <description>
&lt;p&gt;C++0x, the next major revision of C++, includes a number of new
language and library facilities that I am greatly looking forward to,
including a standard thread interface. Initially the agenda for C++0x
had included facilities built on threads, such as a thread pool, but as
part of the so-called 'Kona compromise' (named after the location of
the committee meeting where the compromise was made) all but the most
basic facilities were deferred for a later revision.&lt;/p&gt;

&lt;p&gt;However, there were many requests for a simple facility for creating
an asynchronous function call, and a function for this, named
&lt;tt&gt;std::async&lt;/tt&gt;, was voted in at the last meeting.
&lt;tt&gt;std::async&lt;/tt&gt; is a rather blunt tool; it spawns a new thread
(though wording is included which would allow an implementation to
spawn threads in a fixed-size thread pool to reduce thread creation
overhead and reduce hardware oversubscription) and returns a "future"
representing the return value of the function. A future is a
placeholder for a value which can be passed around the program, and
if and when the value is actually needed, it can be retrieved from the
future; the get operation might block if the value has not yet
been computed. In C++0x the future/promise system is primarily
intended for use with threads, but there doesn't seem to be any reason
a system for distributed RPC (à la &lt;a href="http://www.erights.org/elang/" &gt;E's&lt;/a&gt; Pluribus protocol) could not
provide an interface using the same classes.&lt;/p&gt;

&lt;p&gt;An operation which felt like easy low-hanging fruit for parallel
invocation is RSA's decrypt/sign operation. Mathematically, when one
signs a message using RSA, the message representation (usually a hash
function output plus some specialized padding) is converted to an
integer, and then raised to the power of &lt;tt&gt;d&lt;/tt&gt;, the RSA private
key, modulo another number. Both of these numbers are relatively
large, typically 300 to 600 digits long. A well known trick which
takes advantage of the underlying structure allows one to instead
compute two modular exponentiations, both using numbers about half the
size of &lt;tt&gt;d&lt;/tt&gt;, and combine them using the Chinese Remainder
Theorem (thus this optimization is often called RSA-CRT). The two
computations are both still quite intensive, and since they are
independent it seemed reasonable to try computing them in parallel.
Running one of the two exponentiations in a different thread showed an
immediate doubling in speed for RSA signing on a multicore machine! Other
mathematically intensive algorithms that offer some amount of parallel
computation, including DSA and ElGamal, also showed nice improvements.
&lt;/p&gt;
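
&lt;p&gt;To make the structure concrete, here is a minimal Python sketch of
the same idea, using toy textbook parameters (p = 61, q = 53, e = 17)
and no padding. None of these names or numbers come from botan; they
are assumptions chosen for illustration. Python's
&lt;tt&gt;concurrent.futures&lt;/tt&gt; stands in for the C++0x future, and because
of CPython's global interpreter lock this shows the shape of the
computation and is unlikely to show the speedup.&lt;/p&gt;

&lt;pre&gt;
```python
from concurrent.futures import ThreadPoolExecutor

# Toy parameters, assumed for illustration only (far too small for real use).
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)

# Precomputed CRT values: two half-size exponents and q^-1 mod p.
d1 = d % (p - 1)
d2 = d % (q - 1)
q_inv = pow(q, -1, p)

def sign_crt(m):
    """Compute m^d mod n as two half-size exponentiations joined by CRT."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future_j1 = pool.submit(pow, m, d1, p)   # runs on a worker thread
        j2 = pow(m, d2, q)                       # runs on the calling thread
        j1 = future_j1.result()                  # blocks until j1 is ready
    # Garner's recombination: the result is j1 mod p and j2 mod q.
    h = (q_inv * (j1 - j2)) % p
    return j2 + h * q

m = 65
sig = sign_crt(m)
assert sig == pow(m, d, n)      # matches the direct exponentiation
assert pow(sig, e, n) == m      # and verifies under the public exponent
```
&lt;/pre&gt;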

&lt;p&gt;As &lt;tt&gt;std::async&lt;/tt&gt; is not included in GCC 4.5, I wrote a simple
clone of it. This version does not offer thread pooling or the option
of telling the runtime to run the function on the same thread; it is
mostly a 'proof of concept' version I'm using until GCC includes the
real deal in libstdc++. Here is the code:&lt;/p&gt;

&lt;pre&gt;
#include &amp;lt;future&amp;gt;
#include &amp;lt;thread&amp;gt;

template&amp;lt;typename F&amp;gt;
auto std_async(F f) -&amp;gt; std::unique_future&amp;lt;decltype(f())&amp;gt;
   {
   typedef decltype(f()) result_type;
   std::packaged_task&amp;lt;result_type ()&amp;gt; task(std::move(f));
   std::unique_future&amp;lt;result_type&amp;gt; future = task.get_future();
   std::thread thread(std::move(task));
   thread.detach();
   return future;
   }
&lt;/pre&gt;

&lt;p&gt;The highly curious &lt;tt&gt;auto&lt;/tt&gt; return type of &lt;tt&gt;std_async&lt;/tt&gt;
uses C++0x's new function declaration syntax; ordinarily there is
no reason to use it but here we want to specify that the function
returns a &lt;tt&gt;unique_future&lt;/tt&gt; parameterized by whatever it is
that &lt;tt&gt;f&lt;/tt&gt; returns. Since &lt;tt&gt;f&lt;/tt&gt; can't be referred to until
it has been mentioned as the name of an argument, the return type
has to come after the parameter list.&lt;/p&gt;

&lt;p&gt;Unlike the version of &lt;tt&gt;std::async&lt;/tt&gt; that was finally voted
in, &lt;tt&gt;std_async&lt;/tt&gt; assumes its argument takes no arguments (one of
the original proposals for &lt;tt&gt;std::async&lt;/tt&gt; used a similar
interface). This would be highly inconvenient except for the
assistance of C++0x's lambdas, which allow us to pack everything
together. For instance here is the code for RSA signing, which
packages up one half of the computation in a 0-ary lambda function:&lt;/p&gt;

&lt;pre&gt;
   auto future_j1 = std_async([&amp;]() { return powermod_d1_p(i); });
   BigInt j2 = powermod_d2_q(i);
   BigInt j1 = future_j1.get();
   // Now combine j1 and j2 using CRT
&lt;/pre&gt;

&lt;p&gt;Using C++0x's &lt;tt&gt;std::bind&lt;/tt&gt; instead of a lambda here should
work as well, but I ran into a problem with that in the 4.5 snapshot I'm
using; the current implementation follows the TR1 style of requiring
&lt;tt&gt;result_type&lt;/tt&gt; typedefs which will not be necessary in C++0x
thanks to &lt;tt&gt;decltype&lt;/tt&gt;. Since the actual &lt;tt&gt;std::async&lt;/tt&gt; can
take an arbitrary number of arguments, the declaration of
&lt;tt&gt;future_j1&lt;/tt&gt; will eventually change to simply:&lt;/p&gt;

&lt;pre&gt;
   auto future_j1 = std::async(powermod_d1_p, i);
&lt;/pre&gt;

&lt;p&gt;The implementation of &lt;tt&gt;std_async&lt;/tt&gt; may strike you as
excessively C++0x-ish, for instance by using &lt;tt&gt;decltype&lt;/tt&gt; instead
of TR1's &lt;tt&gt;result_of&lt;/tt&gt; metaprogramming function. Part of this is
due to current limitations of GCC and/or libstdc++; the version of
&lt;tt&gt;result_of&lt;/tt&gt; in 4.5's libstdc++ does not understand lambda
functions (C++0x's &lt;tt&gt;result_of&lt;/tt&gt; is guaranteed to get this right,
because it itself uses &lt;tt&gt;decltype&lt;/tt&gt;, but apparently libstdc++
hasn't changed to use this yet).&lt;/p&gt;

&lt;p&gt;Overall I'm pretty happy with C++0x as an evolution of C++98 for
systems programming tasks. Though I am certainly interested to see how
Thompson and Pike's &lt;a href="http://golang.org/" &gt;Go&lt;/a&gt; works out;
now that &lt;a href="http://bitc-lang.org" &gt;BitC&lt;/a&gt; is more or less
dead after the departure of its designers to Microsoft, Go seems to be
the only game in town in terms of new systems programming languages.&lt;/p&gt;
</description>
    </item>
    <item>
      <pubDate>Tue, 24 Nov 2009 00:11:15 GMT</pubDate>
      <title>24 Nov 2009</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=19</link>
      <guid>http://www.randombit.net/bitbashing/programming/convert_line_endings_in_innosetup.html</guid>
      <description>&lt;p&gt;I recently packaged &lt;a href="http://botan.randombit.net/" &gt;botan&lt;/a&gt; for Windows using &lt;a href="http://www.jrsoftware.org/isinfo.php" &gt;InnoSetup&lt;/a&gt;, an open source
installation creator. Overall I was pretty pleased with it - it seems
to do everything I need it to do without much of a hassle, and I'll
probably use it in the future if I need to package other programs or
tools for Windows.&lt;/p&gt;

&lt;p&gt;After I got the basic package working, a nit I wanted to deal with
was converting the line endings of all the header files and plain-text
documentation (readme, license file, etc) to use Windows line
endings. While many Windows programs, including Wordpad and Visual
Studio, can deal with files with Unix line endings, not all do, and it
seemed like it would be a nice touch if the files were not completely
unreadable if opened in Notepad.&lt;/p&gt;

&lt;p&gt;There is no built-in support for this, but InnoSetup includes a
scripting facility (using Pascal!) with hooks that can be called
at various points in the installation process, such as immediately
after a file is installed, which handles this sort of problem
perfectly. So all that was required was to learn enough Pascal to
write the function. I've included it below to help anyone who might be
searching for a similar facility:&lt;/p&gt;

&lt;pre&gt;
[Code]
const
   LF = #10;
   CR = #13;
   CRLF = CR + LF;

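// Runs as an AfterInstall hook; rewrites the just-installed file in place.
procedure ConvertLineEndings();
  var
     FilePath : String;
     FileContents : String;
begin
   // CurrentFileName still contains constants such as {app}; expand them.
   FilePath := ExpandConstant(CurrentFileName);
   LoadStringFromFile(FilePath, FileContents);
   // Replace every LF with CRLF.
   StringChangeEx(FileContents, LF, CRLF, False);
   SaveStringToFile(FilePath, FileContents, False);
end;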
&lt;/pre&gt;

&lt;p&gt;Adding the hook with &lt;tt&gt;AfterInstall: ConvertLineEndings&lt;/tt&gt;
caused this function to run on each of my text and include files.&lt;/p&gt;
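
&lt;p&gt;One caveat, sketched here in Python rather than Pascal: a plain LF
to CRLF replacement will turn a line that already ends in CRLF into
CR CR LF. The helper below is hypothetical and not part of InnoSetup;
it normalizes to LF first so it is safe to apply to files with mixed or
already-converted endings.&lt;/p&gt;

&lt;pre&gt;
```python
def to_crlf(data):
    """Convert line endings in a bytes object to CRLF, idempotently."""
    unix = data.replace(b"\r\n", b"\n")   # collapse any existing CRLF to LF
    return unix.replace(b"\n", b"\r\n")   # then expand every LF to CRLF

sample = b"one\ntwo\r\nthree\n"
converted = to_crlf(sample)
print(converted)
```
&lt;/pre&gt;

&lt;p&gt;The same two-step replacement should be possible in the Pascal hook
by calling &lt;tt&gt;StringChangeEx&lt;/tt&gt; twice, first CRLF to LF and then LF
to CRLF.&lt;/p&gt;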
</description>
    </item>
    <item>
      <pubDate>Wed, 21 Oct 2009 07:13:04 GMT</pubDate>
      <title>SSE2 Serpent on Atom N270: twice as fast as AES-128</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=18</link>
      <guid>http://www.randombit.net/bitbashing/programming/sse2_serpent_on_atom.html</guid>
      <description>
&lt;p&gt;On the Intel Atom N270 processor, OpenSSL 0.9.8g's implementation
of AES-128 runs at 25 MiB per second (CBC mode, using &lt;tt&gt;openssl
speed&lt;/tt&gt;). In contrast, the Serpent implementation using SSE2
&lt;a href="http://www.randombit.net/bitbashing/programming/serpent_in_simd.html" &gt;I described last
month&lt;/a&gt; runs at over 60 MiB per second in ECB mode (2.4x faster) and
48 MiB per second in CTR mode (1.9x faster).&lt;/p&gt;
</description>
    </item>
    <item>
      <pubDate>Thu, 8 Oct 2009 22:12:18 GMT</pubDate>
      <title>Programming trivia: 4x4 integer matrix transpose in SSE2</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=17</link>
      <guid>http://www.randombit.net/bitbashing/programming/integer_matrix_transpose_in_sse2.html</guid>
      <description>
&lt;p&gt;The Intel SSE2 intrinsics include a macro &lt;tt&gt;_MM_TRANSPOSE4_PS&lt;/tt&gt;
which performs a matrix transposition on a 4x4 array represented by
elements in 4 SSE registers. However, it doesn't work with integer
registers because Intel intrinsics make a distinction between integer
and floating point SSE registers. Theoretically one could cast and use
the floating point operations, but it seems quite plausible that this
will not round trip properly; for instance if one of your integer
values happens to have the same value as a 32-bit IEEE denormal.&lt;/p&gt;

&lt;p&gt;However, it is easy to do with the punpckldq, punpckhdq, punpcklqdq,
and punpckhqdq instructions; code and diagrams ahoy.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.randombit.net/bitbashing/programming/integer_matrix_transpose_in_sse2.html" &gt;continued &amp;raquo;&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <pubDate>Thu, 8 Oct 2009 20:09:57 GMT</pubDate>
      <title>The Case For Skein</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=16</link>
      <guid>http://www.randombit.net/bitbashing/security/the_case_for_skein.html</guid>
      <description>
&lt;p&gt;After the initial set of attacks on MD5 and SHA-1, NIST organized a
series of conferences on hash function design. I was lucky enough to
be able to attend the first one, and had a great time. This was
where a competition in the style of the AES
process to replace SHA-1 and SHA-2 was first proposed (to wide
approval). This has resulted in over 60 submissions to the &lt;a href="http://ehash.iaik.tugraz.at/wiki/The_SHA-3_Zoo" &gt;SHA-3&lt;/a&gt; contest, of
which 14 have been brought into the second round.&lt;/p&gt;

&lt;p&gt;Of the second round contenders, I think Skein is the best choice
for becoming SHA-3, and want to explain why I think so.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.randombit.net/bitbashing/security/the_case_for_skein.html" &gt;continued &amp;raquo;&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <pubDate>Wed, 9 Sep 2009 19:13:02 GMT</pubDate>
      <title>Speeding up Serpent: SIMD Edition</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=15</link>
      <guid>http://www.randombit.net/bitbashing/programming/serpent_in_simd.html</guid>
      <description>
&lt;p&gt;The &lt;a href="http://www.cl.cam.ac.uk/~rja14/serpent.html" &gt;Serpent&lt;/a&gt;
block cipher was one of the 5 finalists in the AES competition, and is
widely thought to be the most secure of them due to its conservative
design.  It was also considered the slowest candidate, which is one
major reason it did not win the AES contest. However, it turns out
that on modern machines one can use SIMD operations to implement
Serpent at speeds quite close to AES.
&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.randombit.net/bitbashing/programming/serpent_in_simd.html" &gt;continued &amp;raquo;&lt;/a&gt;&lt;/p&gt;</description>
    </item>
    <item>
      <pubDate>Tue, 21 Jul 2009 21:11:05 GMT</pubDate>
      <title>Inverting Mersenne Twister's final transform</title>
      <link>http://www.advogato.org/person/randombit/diary.html?start=14</link>
      <guid>http://www.randombit.net/bitbashing/programming/inverting_mt19937_tempering.html</guid>
      <description>
&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Mersenne_twister" &gt;Mersenne twister&lt;/a&gt;
RNG 'tempers' its output using an invertible transformation:&lt;/p&gt;

&lt;pre&gt;
unsigned int temper(unsigned int x)
   {
   x ^= (x &amp;gt;&amp;gt; 11);
   x ^= (x &amp;lt;&amp;lt; 7) &amp;amp; 0x9D2C5680;
   x ^= (x &amp;lt;&amp;lt; 15) &amp;amp; 0xEFC60000;
   x ^= (x &amp;gt;&amp;gt; 18);
   return x;
   }
&lt;/pre&gt;

&lt;p&gt;The inversion function is:&lt;/p&gt;

&lt;pre&gt;
unsigned int detemper(unsigned int x)
   {
   x ^= (x &amp;gt;&amp;gt; 18);
   x ^= (x &amp;lt;&amp;lt; 15) &amp;amp; 0xEFC60000;
   x ^= (x &amp;lt;&amp;lt; 7) &amp;amp; 0x1680;
   x ^= (x &amp;lt;&amp;lt; 7) &amp;amp; 0xC4000;
   x ^= (x &amp;lt;&amp;lt; 7) &amp;amp; 0xD200000;
   x ^= (x &amp;lt;&amp;lt; 7) &amp;amp; 0x90000000;
   x ^= (x &amp;gt;&amp;gt; 11) &amp;amp; 0xFFC00000;
   x ^= (x &amp;gt;&amp;gt; 11) &amp;amp; 0x3FF800;
   x ^= (x &amp;gt;&amp;gt; 11) &amp;amp; 0x7FF;

   return x;
   }
&lt;/pre&gt;

&lt;p&gt;This inversion has been confirmed correct with exhaustive
search.&lt;/p&gt;
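
&lt;p&gt;For a quick spot check short of the exhaustive search, here is a
Python port of both functions, written with the named functions from
the &lt;tt&gt;operator&lt;/tt&gt; module in place of the shift and mask symbols. It
only round-trips a sample of random 32-bit values; the sample size and
seed are arbitrary assumptions.&lt;/p&gt;

&lt;pre&gt;
```python
import random
from operator import and_, lshift, rshift

def temper(x):
    """Mersenne Twister output tempering on a 32-bit value."""
    x ^= rshift(x, 11)
    x ^= and_(lshift(x, 7), 0x9D2C5680)
    x ^= and_(lshift(x, 15), 0xEFC60000)
    x ^= rshift(x, 18)
    return x

def detemper(x):
    """Undo temper() by reversing its four steps in the opposite order."""
    x ^= rshift(x, 18)
    x ^= and_(lshift(x, 15), 0xEFC60000)
    # The shift-by-7 step is undone seven bits at a time.
    for mask in (0x1680, 0xC4000, 0xD200000, 0x90000000):
        x ^= and_(lshift(x, 7), mask)
    # The shift-by-11 step is undone eleven bits at a time, from the top.
    for mask in (0xFFC00000, 0x3FF800, 0x7FF):
        x ^= and_(rshift(x, 11), mask)
    return x

def round_trips(samples=10000, seed=1):
    """Check detemper(temper(x)) == x on a random sample of 32-bit values."""
    rng = random.Random(seed)
    return all(detemper(temper(x)) == x
               for x in (rng.getrandbits(32) for _ in range(samples)))

print(round_trips())
```
&lt;/pre&gt;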

&lt;p&gt;This is more a note to my future self than anything else; I'm
cleaning out my ~/projects directory, and I can either publish this
somewhere or check it into revision control (well, actually the
contents of this blog are also in revision control, but no matter).
&lt;/p&gt;
</description>
    </item>
  </channel>
</rss>
