Older blog entries for Omnifarious (starting at number 155)

9 Nov 2011 »

Working on a small library, what should I name it?

I'm working on a small library to express computations in terms of composable trees of dependencies. These dependencies can cross thread boundaries allowing one thread to depend on a result generated in another thread. This is sort of a riff on the whole promise and future concept, but the idea is that you have chains of these with a potential fanout in the chain greater than 1. Kind of like the venerable make utility in which you express what things need to be finished before starting on the particular thing you're talking about.

But I'm not sure what I should call it. Maybe Teleo because it encourages to express your program in terms of a teleology.

I'm writing this basically because I've encountered the same problem on at least two different projects now, and it occurs to me that it would be really good to have a well-defined standard way of launching things in other threads and waiting for the results that suggested an overall program architecture. The projects I worked on were all set to develop a huge mishmash of different techniques that wouldn't necessarily play well together or be easy to debug.

comments

Syndicated 2011-11-08 22:13:54 (Updated 2011-11-09 03:03:29) from Lover of ideas

20 Oct 2011 »

Architecture problem...

I used to have a really good idea of what the architecture of a system that had to respond to multiple different possible sources of input or other reasons to do things (such as some interval of time expiring). My idea was basically to make everything purely event-driven and have big event loops at the heart of the program that dispatched events and got things done.

This solves the vexing problem of how to deal with all these asynchronous occurrences without incurring excessively complex synchronization logic. Nothing gives up control to process another event until the data structures its working with are in a consistent state.

But there are two problems with this model. One is old, and one is relatively new.

The old problem is that such event-driven systems typically exhibit inversion of control, and that makes them confusing and hard to follow. There are ways to structure your program to give people a lot of hints as to what's supposed to happen next when you give up control in the middle of an important operation only to recapture it again at some later point in time in a completely different function. But it's still not the easiest thing in the world to follow.

The 'new' problem is that silicon-based CPUs have not been getting especially faster recently. They've instead been getting more numerous. This is a fairly predictable result. CPUs have a clock. This clock needs to stay synchronized across the entire CPU. Once clock speeds exceed a certain frequency, the clock signal takes longer to propagate across the entire chip than the amount of time before the next pulse is supposed to happen. This means that in order to have an effectively faster CPU on a single chip you need to break it up into independent units that do not need to be strictly synchronized with each other. It's a state horizon problem.

But most programs are not designed to take advantage of several CPUs. If you want a program that's a cohesive whole, but still gets faster as the hardware advances, you need to break it up into several threads.

It seems like maybe it would be simple to do this with a program that had multiple threads. You just have multiple event loops. But then you end up with several interesting problems. How do you decide what things happen in which event loop? What happens if you need to have data shared between things running on different event loops? You run the risk of re-introducing the synchronization issues you avoided when you added the event loops in the first place, all with the cost of inversion of control. It doesn't seem worth it.

Additionally, if you have inter-thread synchronization, what happens if it takes awhile for the other thread to free up the resource you need? How do you prevent deadlocks? Most event systems do allow you to treat the release of a mutex or a semaphore as an event, so you can't just fold waiting for the mutex back into the system as just another event without doing some trick like spawning a thread that waits for the mutex and writes into some sort of IPC mechanism once it's acquired.

And splitting up your program into multiple event threads is not trivial either. How do you detect and prevent the case of one thread being overworked? Also, there is 'state kiting' to consider. Preferably you would prefer one CPU to be handling the same modifiable state for long periods of time. You want to avoid situations where first one CPU cache, then the next have to load up the contents of a particular memory region. Typically, each core will have its own cache. If for no reason other than efficient use of space, it would be good if each core had a disjoint set of memory locations in cache. And to avoid the latency of main memory access, it would be good if that set was relatively static. This means that a single event loop should be working with a fairly small and unchanging set of memory locations.

So simply having several threads, each with its own event loop seems a solution fraught with peril, and it seems like you're throwing away a lot of the advantages you went to an event driven system (with the unpleasant inversion of control side-effect) for in the first place.

So the original idea needs modification, or perhaps a completely new idea is needed.

One modification is embodied in the language Erlang. Erlang still has an event loop and inversion of control. You waiting for messages that come in on a queue. Any other loop can add messages to any queue it knows about. These messages are roughly analogous to events. But the messages themselves convey only information that is immutable. Since it is immutable, shared or not, no synchronization is required since it cannot change.

Erlang also encourages the creation of many such event loops, each of which does a very small job. Hopefully, no individual loop is too overloaded. Modern operating systems are adept at scheduling many jobs, and so this offloads the scheduling of all of these small tasks onto the OS.

I do not think Erlang does overly much to solve the locality of reference problem.

Another approach is the approach taken by the E programming language. It makes extensive use of a concept called a 'future' or 'promise'. This is a promise to deliver the result of some operation at some future point in time. It allows these promises to be chained, so you can build up an elaborate structure of dependencies between promises. In a sense, the programming language handles the inversion of control for you. You specify the program as if control flow were normal, but the language environment automatically launches as many concurrent requests as possible and suspends execution until the results are available.

It is possible to build a set of library-level tools in C++11 to implement this kind of thing somewhat transparently in that language.

I am unsure if there are any major tradeoffs in this approach. Certainly in C++ there is a great deal of implementation complexity, and that complexity cannot be completely hidden from the user as it is in E. I wonder if that implementation complexity introduces unacceptable overhead.

I also suspect that it may be difficult to debug programs that use this sort of a model. They appear to execute sequentially, but in truth they do not. It is possible, for example, to have two outstanding promises for bytes from a file descriptor, but which order those promises will be fulfilled in will not be readily apparent from reading the code. And error conditions can crop up at strange times and propagate to non-obvious places in the control flow of your program.

I also suspect this model will not exhibit the best locality of reference semantics. There will be a tendency to frequently spawn and join threads to handle asynchronous requests. And it will not be immediately apparent to the OS CPU scheduler which threads need to work with which memory objects. And this may lead to active state kiting between CPUs.

Also, those calls to create and destroy threads have a cost, even if that cost is fairly small, it's still likely much more expensive than acquiring an unowned mutex, and probably even more expensive than the call to wait for a file descriptor readability event or waiting for a briefly held mutex to become available.

Of course, it may be possible to implement all of this without creating many threads given a sufficiently clever runtime environment that implements its own queue that folds IO state and semaphore/mutex state events into a single queue. Such an environment would still need a lot of help from the application programmer though to divide up the application to maximize locality of reference within a single thread.

This is a fairly long ramble, and I'm still not really sure what the best approach is. I think I may try to set up some kind of 'smart queue'. This queue will have a priority queue of runnable tasks, and a queue of tasks that could potentially execute given a set of conditions. When a condition is met, the queue will be informed, and if that conditions enables one or more tasks to be run, these tasks will be added to the priority queue.

I envision that the primary thing on which the priority queue will be prioritized is length of time since the task was added to the 'wait for condition' list.

I can then write a C++11 library that will allow you to automatically turn any function that returns a promise into a function that uses these conditions to split up its execution. At least, if you use sufficient care in writing the function.

The conditions (since fulfilling a promise will be a possible condition) will have data associated with them. If this data involves shared mutable state, that will require a great deal of extra care.

comments

Syndicated 2011-10-20 22:43:44 from Lover of ideas

24 Jun 2011 »

Digital signatures and documents

Documents and the digital signatures that apply to them are necessarily separate. Most current cryptographic systems either digitally sign things on the fly (TLS) or send a library of digital signatures with the document they sign (OpenPGP). Though, to be fair, in the OpenPGP case, each of those digital signatures signs a variant document.

In CAKE there are documents to be signed. Examples are documents that say "This public key exists, was created at time X, is valid for new sessions and signatures from times A through B, and is considered invalid at time E.", or "This public key is reachable at this URL from times A through B.", or "Public key I has agreed to store and forward messages for public key J from times A through B.", or "My name for public key J is N.".

For some of these documents there is only one key who's signature is relevant. For others, a specific small set of keys is relevant (the store and forward case, for example). And for others you care about all signatures, but especially signatures by other keys you trust.

Of course, you could consider the document signed to include the name of the signing entity, in which case, each signature would be for a different document.

I'm not completely sure how to handle this. In my system there will be some documents that cannot be considered valid until multiple signatures have been received. So the signature has to be totally detached from the document.

Syndicated 2011-06-24 05:43:36 from Lover of ideas

11 Jun 2011 »

Help! DynDNS has become prohibitively expensive!

They want to charge me $40/yr per domain for secondary DNS! $40/yr! This is completely ridiculous. With the volume of lookups I get, I could probably host all the domains on my own server on a DSL line if I wanted.

Is anybody out there willing to provide secondary DNS for a few domains for me? I'm willing to cough up the equivalent of $10/yr in bitcoins for the service if you really want.

Syndicated 2011-06-10 23:48:36 from Lover of ideas

30 May 2011 (updated 30 May 2011 at 23:08 UTC) »

Session properties

I've been puzzling over a minimal and orthogonal set of properties for a session. I at first thought there were 3:

Message boundaries preserved: Whether or not your messages are delivered in discrete units, or whether they are delivered as a stream of bytes in which the original sizes of the send calls bear no relevance to how the bytes are chunked together on the other end.
Ordered: Whether or not data arrives in the order you sent it
Reliable: Well, this has a tricky definition. For TCP it means that failure to deliver is considered a failure of the underlying connection. But after such a failure you can't really be sure about exactly which bytes were delivered and which weren't.

But, as is evidenced by my description of 'reliable', these properties are not as hard-edged as they might seem. I also thought about latency, for example a connection via email is relatively high latency, and a connection between memory and the CPU is generally pretty low latency. But I'm looking for hard-edged, yes/no type properties that are in some sense fundamental. Latency seems like a property that's rather fuzzy. It exists on a continuum, and isn't really a defining feature of a connection, something that would drastically alter how you wrote programs that used the connection. In an object model, it would be an object property, not something you'd make a different class for.

But I find TCP's notion of 'reliability' very curious. It isn't really, in any sense, particularly reliable. I've had ssh connections that died, but when I reconnect to my screen session, I discover that a whole bunch of the stuff I was typing made it through, it just wasn't echoed back.

It also interacts with 'ordered' in an odd way. It might make sense to have an unordered connection that was 'reliable', but what does that really mean then? If it's a TCP notion of reliability, you could just deliver the last message and have the connection drop. Also, what would it mean to have an unreliable, but ordered connection? Would that mean you could send a bunch of messages and have only the first and last ones delivered? And would it make any sense at all to have an unordered, unreliable connection in which message boundaries were not preserved?

So I've come up with a different division...

Message boundaries preserved: Whether or not your messages are delivered in discrete units, or whether they are delivered as a stream of bytes in which the original sizes of the send calls bear no relevance to how the bytes are chunked together on the other end.
Ordered: Whether or not data arrives in the order you sent it
Must not drop: This means that if a message does not make it through, the connection is considered to be in an unrecoverable error state, and no further messages may be sent. Though you may not know which message didn't make it through.
Delivery notification: Whether or not you can know that a message made it to the other side or not.

These are not fully orthogonal. For example, if message boundaries are not preserved, then, in order for a connection to be in the least sensible, it must also have the 'ordered' and 'must not drop' properties. Also, if you must not drop messages, I'm not sure that it would then be sensible to have out-of-order delivery.

One of the rules of the system I'm designing is that any property that is not required may be provided anyway. This makes non-orthogonality much easier to deal with. So the prior cases aren't really a problem.

Can any of you think of a better set of properties, or important properties that I left out?

Some good discussion also happens in this Google Buzz post that mirrors this entry.

Syndicated 2011-05-30 12:48:35 (Updated 2011-05-30 22:55:50) from Lover of ideas

28 Mar 2011 »

CAKE has reached a small milestone

CAKE reached a new milestone early this morning. It now successfully both generates and parses messages that use the new protocol. It also successfully detected a re-used session id. I also think the code that does this is also a lot better designed than the old code was. It's easier to see how to put it in the context of a larger system that implements a node that speaks the protocol

It's also much more extensively tested at a deeper level with tests that are designed to document the inner workings of the system.

Overall, it's in a much better state than I left it when I sort of stopped working on it much in 2004. And I'm going to handle the hard problems first, how to maintain the relationship between sessions and transports, and having two way realtime conversations between nodes. This rather than concentrating on the messages that will be traded back and forth at a higher level (which will be done using protobuf). That can come later, especially since I'm not likely to get it right the first time anyway.

I also need to think about getting nodes to participate in a DHT to share assertions (like how to reach a particular node) in a distributed way.

Lastly, the protocol has something of a problem with 'liveness' because I designed it with the idea of conversations being able to be initiated without any round trips. There are some mitigation for this problem in session ids, but that mitigation is somewhat problematic because it requires the recipient of a conversation initiation to keep track of some stuff for everybody who tries to talk to it.

I'm not really sure how to handle the 'liveness' problem though and still preserve the lack of round trips property. I could require that session ids contain an 'hour number' or something similar. Though that introduces a requirement for at least very coarse grain time synchronization for all nodes.

Syndicated 2011-03-28 16:10:59 (Updated 2011-03-28 16:11:22) from Lover of ideas

2 Feb 2011 »

Interesting design problem with serialization and deserialization

I have been working on a serialization framework I'm happy with for Python. I want to be able to describe CAKE protocol messages clearly and succinctly. This will make it easier to tweak the messages without having to rip apart difficult to understand code. It will also make it easier to understand if I drop the project again and then come back to it years later, or if (by some miracle) someone else decides to help me with it.

Here is what I've come up with as the interface, along with one implementation fo that interface for a simple type:

class Serializer(object):
    """This is class is an abstract base class.  Derived classes, when
    instantiated, create objects that can serialize other objects of a
    particular type to a sequence of bytes, or alternately deserialize
    a sequence of bytes into an object of a particular type."""

    __slots__ = ('__weakref__',)

    def __init__(self):
        super(Serializer, self).__init__()

    def serialize(self, val):
        """x.serialize(value) -> b'serialized value'

        This is implemented in terms of serialize_iter by default.

        It is suggested that derived classes only implement serialize
        or serialize_iter and implement one in terms of the other."""
        if self.__class__ is Serializer:
            raise NotImplentedError("This is an abstract class.")
        return b''.join(x for x in self.serialize_iter(val))

    def serialize_iter(self, val):
        """x.serialize_iter(value) -> an iterator over the bytes
        sequences making p the seralized version of value."""
        if self.__class__ is Serializer:
            raise NotImplentedError("This is an abstract class.")
        return iter((self.serialize(val),))

    def deserialize(self, data, memo=None):
        """x.deserialize(data, [memo]) ->
        (value of the appropriate type, memoryview(remaining_data))

        data must be of type 'bytes', or 'memoryview'.  The memo must
        be a value extracted from a previous NotEnoughDataError.

        It is undefined what happens if you use memo and do not pass
        the same data (plus some possible extra data on the end) into
        deserialize that you originally passed in when you got the
        NotEnoughDataError you extracted the memo from.

        May raise a ParseError if there is a problem with the data.
        If the failure was because the parser ran out of data before
        parsing was finished, this is required to be a
        NotEnoughDataError."""
        return self._deserialize(data if not isinstance(data, bytes) \
                                     else memoryview(data),
                                 memo)

    def _deserialize(self, memview, memo=None):
        """x._deserialize(memoryview) ->
        (value of the appropriate type, memoryview(remaining_data))

        Exactly like deserialize, except a memoryview object is
        required.  deserialize is implemented in terms of
        _deserialize.  Derived classes are expected to override
        _deserialize."""
        raise NotImplentedError("This is an abstract class.")



class SmallInt(Serializer):
    """This class is for integers that are 8, 16, 32, or 64 bits long.
    They may be signed or unsigned.  No other sizes are supported.

    >>> s = SmallInt(2, True)
    Traceback (most recent call last):
        ...
    ValueError: size is 2, must be 8, 16, 32 or 64
    >>> s = SmallInt(8, True)
    >>> b = list(s.serialize_iter(5))
    >>> b == [b'\\x05']
    True
    >>> o = s.deserialize(b''.join(b))
    >>> o = (o[0], o[1].tobytes())
    >>> o == (5, b'')
    True
    >>> o = s.deserialize(b''.join(b) + b'z')
    >>> o = (o[0], o[1].tobytes())
    >>> o == (5, b'z')
    True
    >>> s = SmallInt(8, True)
    >>> b = s.serialize(-5)
    >>> b == b'\\xfb'
    True
    >>> s = SmallInt(8, True)
    >>> s = s.serialize(128)
    Traceback (most recent call last):
        ...
    ValueError: 128 is out of range for an signed 8 bit integer
    >>> s = SmallInt(64, False)
    >>> b = s.serialize(2**64-1)
    >>> b == b'\\xff\\xff\\xff\\xff\\xff\\xff\\xff\\xff'
    True
    >>> s = SmallInt(64, True)
    >>> b = s.serialize(-2**63)
    >>> b == b'\\x80\\x00\\x00\\x00\\x00\\x00\\x00\\x00'
    True
    """

    _formats = dict((
            ((8, True),   '>b'),
            ((8, False),  '>B'),
            ((16, True),  '>h'),
            ((16, False), '>H'),
            ((32, True),  '>i'),
            ((32, False), '>I'),
            ((64, True),  '>q'),
            ((64, False), '>Q')
        ))

    __slots__ = ('_size', '_signed', '_low', '_high', '_format')

    def __init__(self, size, signed):
        if size not in (8, 16, 32, 64):
            raise ValueError("size is %d, must be 8, 16, 32 or 64" % (size,))
        self._size = size
        self._signed = bool(signed)
        self._format = self._formats[(size, signed)]

    def serialize(self, value):
        if not isinstance(value, (int, long)):
            raise TypeError("%r must be an int or long" % (value,))
        value = int(value)
        try:
            ret = _struct.pack(self._format, value)
        except _struct.error:
            raise ValueError("%d is out of range for an %ssigned %d bit "
                             "integer" % (value,
                                          ("un" if not self._signed else ""),
                                          self._size))
        return ret

    def _deserialize(self, memview, memo=None):
        numbytes = self._size // 8
        if len(memview) < numbytes:
            raise _NotEnoughDataError((self._size // 8) - len(memview))
        else:
            data = memview[0:numbytes].tobytes()
            remaining = memview[numbytes:]
            try:
                result = _struct.unpack(self._format, data)[0]
                return result, remaining
            except _struct.error as err:
                raise ParseErrror(err)

There is also a CompoundNumbered type for representing tuples. This allows you to represent structured messages with multiple fields. Here is example of how you might represent CAKE new session messages:

cake_newsess_v2 = _serial.CompoundNumbered(
    _serial.Count(), # Version
    _serial.Count(), # Type
    _serial.KeyName(), # Destination key
    _serial.KeyName(), # Source key
    _serial.SmallInt(64, False), # Session serial #
    _serial.CountDelimitedByteString(), # Encryption header
    _serial.CountDelimitedByteString(), # Signature.
    _serial.FixedLengthByteString(32) # Header HMAC
)

There is a problem though. The signature and header HMAC are supposed to be encrypted, but the deserializer can't know the key to use until it's decrypted the encryption header. This means that later parts of the deserialization process need to know about things from previous parts.

I have a way for the deserialization process to save state. This is used so that if deserialization throws a NotEnoughDataError because not enough data is available, the exception may have a memo field. This memo field can then be passed in again to resume close to where deserialization stopped. (Though now I'm sort of wondering if I shouldn't do something generator based instead...)

But this mechanism does not allow state to be passed forward from a previous deserializer to a new one. And this applies the other way around too. When serializing there is stuff that's not really a part of the data being serialized (like the current HMAC or encryption state) that needs to be known by serializer in order to serialize properly.

I'm thinking of adding an optional context parameter to the serialization and deserialization functions that's just an empty dictionary into which this sort of state can be stuffed. But this seems really messy. Can anybody think of any better ways to do this that are fairly general?

Syndicated 2011-02-02 22:46:39 (Updated 2011-02-02 23:03:41) from Lover of ideas

5 Dec 2010 »

Protocol buffers?

I have a problem for which protocol buffers seem like a good solution, but I'm reluctant to use them. First, protocol buffers include facilities for handling the addition of new fields in the future. This adds a small amount to a typical protocol buffer message, but it's a facility I do not need.

Also, I feel the variable sized number encoding is less efficient than it could be, though this is a very minor issue. I also feel like I have a number of special purpose data types that are not adequately represented.

I'm also not completely pleased with the C++ and/or Python APIs. I think they contain too many googlisms. I would like to see public APIs published that were free of adherence to Google coding standards like do-nothing constructors and no exceptions.

I think, maybe, I will be using protocol buffers for some messages that are sent by applications using CAKE as a transport/session layer. These include some of the sub-protocols that are required to be implemented by a conforming CAKE implementation.

On a different note, I think Google's C++ coding standards are lowering the overall quality of Open Source C++ code. This isn't a huge effect, but it's there.

It happens because Google's good name is associated with a set of published standards for C++ coding that include advice that while possibly good for Google internally is of dubious quality as general purpose advice. It also happens because when Google releases code for their internal tools to the Open Source community, these tools follow Google's standards. And some of these standards have the effect of making it hard to use code that doesn't comply with those standards in conjunction with code that does.

Syndicated 2010-12-04 23:26:39 (Updated 2010-12-04 23:28:28) from Lover of ideas

22 May 2010 »

Today's XKCD

Normally XKCD is amusing for very positive reasons. But I frequently feel a lot like the guy with the beard in this cartoon. It's really frustrating. So, today's XKCD is darkly amusing to me. Freedom is such a hard sell before people lose it. People choose convenience every time, frequently until it's almost too late to fix the problem all the while berating the people who were worried in the first place.

Syndicated 2010-05-22 00:05:25 from Lover of ideas

19 May 2010 (updated 19 May 2010 at 07:10 UTC) »

Eben Moglen Tech Talk at Google

Eben Moglen is one of the principle lawyers behind the GPL. He's also a tireless free software advocate, and significantly more photogenic and diplomatic than Richard Stallman.

He recently gave this interesting tech talk at Google about the perception of Google by entities outside it. It was really well done, and struck a strong chord with me.

I've noticed that people frequently are incapable of believing that some things Google does are for the reasons Google says they're doing them. For example (and I don't really have the time to find references just now) many people seem to think that Google Doodles, those fun, timely modifications to their main search page, are a marketing tool, when in fact they are largely done purely out of whimsy.

I suppose, in one sense there is marketing purpose. Google is projecting their image of themselves out into the world. It's brand building. But, on the other hand, there isn't. I doubt that Google Doodles started as an idea for brand building in some marketing department. I'm betting some random small group of people decided one day that it would be fun to do, and the idea sort of caught on and now it's a tradition.

But people seem to want to analyze doodles for the marketing message they contain, despite the fact there generally isn't one. The more enigmatic the doodle is, the more determined people seem to be to find the marketing message in it.

This means there is a disparity in perception between people outside Google and people inside Google. One that might serve Google very poorly in the future. It's very important that Google understand this and respond appropriately. Perception is reality and people and organizations live up to expectations. Google risks becoming what people perceive them to be unless they act to correct that perception.

Google also frequently doesn't realize how the fact that they are so large and powerful affects people's perceptions of them. Witness the brouhaha over Buzz. Google did do some somewhat wrongheaded things in introducing it, but Buzz was not anywhere near the privacy destroying aggregator that people thought it was. And the fact that people perceived Buzz in this way seemed to mystify people inside Google, even though it was predictable given Google's size and people's perceptions.

Again, this points to a need by Google to better manage people's perceptions of them, and to manage their product releases better in terms of how people perceive them.

Eben Moglen suggests, quite wisely, that one thing Google could do is to change their policy on contributing internal changes back to Open Source projects. I think this is a good idea, but I doubt it will really be enough.

I am a little worried that if Google takes this advice to heart that they will grow a PR arm that does what every other PR arm in the world does, which is to try to make sure that perception stays far more positive than reality instead of simply trying to make perception match reality. But Google should do something, since I think people think far more ill of them than they generally deserve.

Google is, in fact, the only company I know of that has a revenue stream greater than 1 billion dollars a year that I actually have a positive opinion of.

Syndicated 2010-05-18 23:32:06 (Updated 2010-05-19 06:25:53) from Lover of ideas

146 older entries...