Older blog entries for Omnifarious (starting at number 9)

22 Jul 2003 (updated 22 Jul 2003 at 21:08 UTC) »

Well, it's starting to come together a bit, and I need a name. I'm building a protocol in which all objects are named with self-verifying names that aren't human-readable. Messages are sent to a public key, and are always signed with the sender's private key so they can be checked against the sender's public key. Files are named by secure hashes of their contents. That kind of thing.
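
To make the "files are named by secure hashes of their contents" idea concrete, here is a minimal sketch. The post does not say which hash function or name encoding the protocol uses, so SHA-256 and hex encoding are assumptions, and `content_name` and `verify` are hypothetical helper names.

```python
import hashlib


def content_name(data: bytes) -> str:
    """Name a blob by a secure hash of its contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify(name: str, data: bytes) -> bool:
    """The name is self-verifying: recompute the hash and compare it to the name."""
    return content_name(data) == name


blob = b"hello, world\n"
name = content_name(blob)
print(name)                              # not human-readable, but checkable
print(verify(name, blob))                # True: the bytes match the name
print(verify(name, blob + b"tampered"))  # False: any change breaks the name
```

Anyone who receives the bytes can check them against the name without trusting whoever delivered them, which is the point of a self-verifying name.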

I have grand plans of using this protocol for email, instant messages, web browsing, remote filesystem and database access, and almost anything else you can imagine. I intend for the basics of the protocol to form a layer above TCP or UDP, though it should be possible to layer it inside almost anything. I intend to write layerings for SMTP/IMAP, and AIM/Yahoo/MSN/ICQ/Jabber (via a gaim plugin).

I have some of the basics working using a mixture of C++ and Python, but it's not quite ready for public consumption. One obstacle is a name. I made a post in my LiveJournal about naming it. I'd like input from people here, if they're interested. Please feel free to make posts (anonymous or otherwise) to my LiveJournal with opinions or suggestions.

I don't consider the non-human readability of the names to be an obstacle. After all, IP addresses aren't particularly human readable either.

Also, if you care to look at the source as it currently stands, it can be found at: https://svn.generalpresence.com:5131/repos/trunk/C++/pract_crypto/

Subversion is great, and MUCH better than CVS, even though it's still in alpha/beta.

Help, I need a job!

*sigh* I've been out of work since March 6th. Recently, a job I would describe as pretty close to being my ideal job managed to attract my attention. It's even in MN. :-)

http://www.atcorp.com/research/careers/mpls/index.html

If you have any other suggestions, feel free to email me. Here's a link to my resume.

I converted over completely to using Subversion for The StreamModule System. Now I can actually have anonymous read access and not worry hugely about security. :-)

For those who are curious, the Subversion URL for the repository is:
https://svn.generalpresence.com:5131/repos/trunk/C++/libNet

One of the fun things about Subversion repositories is that they are automatically web browseable.

I've been working hard recently on General Presence. I want to build a server I feel happy adding things to, one where doing so feels fairly light and effortless. We need more programmers who don't care about things like that that I have to clean up after later. Sadly, we have no money, so we can't afford to hire them.

Because the nature of the application is peer-to-peer (P2P), I've taken to calling 'servers' hubs. I envision user agents as generally being spokes; they can act like hubs as well, but most likely won't do so nearly as much. I envision hubs having 5000 or more active connections at a time. Some very busy hubs may have as many as 50000. I don't think user agents will ever have more than 5 or 10.

I want to be able to combine the efficiencies of centralization with the robustness and resistance to subjugation of a heavily distributed approach.

I'm adding a very simple, UTF-8 only XML parser to the StreamModule system.

The feature I've worked hardest on is having the lexer portion report the positions of tokens to the parser. There is a cascade of things this allows me:

  • It enables me to write a parser that can build up an internal structure representing the XML that references the original XML.
    • This allows me to pass XML through my StreamModule system without modifying it, or only modifying those exact sections I choose to.
      • Which is vital if portions of the XML are signed. Converting to and from a canonical format is a horrible thing to do if you need to preserve message integrity at the byte level, especially if you don't have control over all the implementations that may be creating or consuming messages.
  • It enables me to minimize copying.
  • It makes it easy to have the lexer and parser skip quickly over large sections of XML that the application doesn't care about.
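
A rough sketch of a position-reporting lexer, written in Python for brevity rather than in the C++ of the StreamModule system. The `Token` type, the `lex` function, and the single-regex tag matcher are all hypothetical simplifications (no comments, CDATA, or quoted `>` inside attributes); the only point illustrated is that tokens carry offsets into the original buffer instead of copies of it.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str   # "tag" or "text"
    start: int  # offset of the first byte in the original buffer
    end: int    # offset one past the last byte


_TAG = re.compile(rb"<[^>]*>")  # deliberately naive tag matcher


def lex(buf: bytes) -> Iterator[Token]:
    """Yield tokens as spans into buf; no token owns a copy of the bytes."""
    pos = 0
    for m in _TAG.finditer(buf):
        if m.start() > pos:
            yield Token("text", pos, m.start())
        yield Token("tag", m.start(), m.end())
        pos = m.end()
    if pos < len(buf):
        yield Token("text", pos, len(buf))


doc = b"<msg><to>bob</to><body>signed bytes</body></msg>"
tokens = list(lex(doc))
view = memoryview(doc)  # slicing a memoryview references doc instead of copying

for tok in tokens:
    print(tok.kind, tok.start, tok.end, bytes(view[tok.start:tok.end]))
```

Because every token is just a pair of offsets, gluing the spans back together reproduces the input byte for byte, which is what a signed section needs, and a parser can skip an uninteresting region by ignoring its spans.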

The parser will have some shortcomings. It doesn't allow non-ASCII tag names, and it doesn't allow non-ASCII whitespace to be treated as such. It also has no support for entities right now, though that is planned for the future.

I'm writing it as part of a system I'm designing to route XML messages in a P2P framework. Speed, lack of copying, and the ability to ignore message bodies were my primary needs.

Ugh! Terminal libraries are so misdesigned. GNU readline almost demands to talk to a real terminal. ncurses definitely does. They all use global variables so you can't talk to more than one at a time.

All these libraries do is parse incoming characters and send characters back out according to a set of rules about what a terminal is supposed to understand. Parsers like bison allow you multiple parser instances and allow you a fair amount of flexibility regarding input sources. Why can't readline or ncurses?!

*sigh*

I've released version 0.3.0 of The StreamModule System now. I now have a timer mechanism, and a signal handling mechanism. No signal-based timer mechanism though. :-)

I've put it up on Freshmeat in the hope that someone will grab it and start asking me questions and stuff about it. I've written a tutorial and things in the hopes of making it more accessible. It would be really nice to get other people's thoughts, perspectives and code in it. :-)

Well, I've released version 0.2.1 of my StreamModule system and posted up some jobs on SourceForge. I got so many responses on my offer to help people learn C++ if they'd work on my project that I've taken down that posting for a bit, hoping that at least a few of the responses work out.

Lots of ideas about what to do next for StreamModule. My next class will be RouterModule which will take an incoming packet and route it to an arbitrary set of destinations based on an external set of rules.
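
As a sketch of what rule-driven routing could look like (in Python, and not the actual RouterModule design, which the post does not describe): the rules are plain data handed to the router from outside, and a packet goes to every destination whose rule matches. The `Router` class, the rule format, and the destination names are all hypothetical.

```python
from typing import Callable, List, Tuple

Packet = bytes
Rule = Tuple[Callable[[Packet], bool], str]  # (predicate, destination name)


class Router:
    """Hypothetical stand-in for RouterModule: the rules live outside the class."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = rules

    def route(self, packet: Packet) -> List[str]:
        """Return every destination whose predicate matches the packet."""
        return [dest for matches, dest in self.rules if matches(packet)]


rules: List[Rule] = [
    (lambda p: p.startswith(b"MAIL"), "smtp-handler"),
    (lambda p: b"\n" in p, "line-logger"),
    (lambda p: True, "audit"),
]
router = Router(rules)

print(router.route(b"MAIL FROM:<a@example.org>\n"))
print(router.route(b"ping"))
```

Keeping the rules as data means the destination set can change without touching the routing code.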

Also needed are:

  • A timer mechanism
  • making the dispatcher thread safe
  • a signal handling mechanism
  • a signal based timer mechanism
  • a better telnet client that handles line mode
  • a telnet server that handles line mode (maybe the same as the client, as telnet is technically a P2P protocol)
  • an XML parser module that uses a SAX-based parser to get the real work done
  • Some handlers for other Internet protocols like SMTP, NNTP, and HTTP
  • Replacement of EHNet++ with a library based on a version of adns with my UNIEvent stuff put on top
  • An inter-thread queue module as well as making the RefCounting class threadsafe

Whee, the TelnetModule finally has an implementation I can stomach. :-) Time to go atestin'. I should write some kind of testing module to torture modules with data divided up in unexpected ways and such. I probably won't for these tests though.
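
One way such a torture test might look, sketched in Python: split the same input into randomly sized pieces many times over and check that the module's output never changes. `LineAssembler` is a hypothetical module under test, not part of StreamModule, and the seeded `random.Random` is only there to make failures reproducible.

```python
import random
from typing import Iterator, List


def random_chunks(data: bytes, rng: random.Random, max_len: int = 4) -> Iterator[bytes]:
    """Split data into consecutive chunks of random length 1..max_len."""
    pos = 0
    while pos < len(data):
        step = rng.randint(1, max_len)
        yield data[pos:pos + step]
        pos += step


class LineAssembler:
    """Hypothetical module under test: collects complete lines from a byte stream."""

    def __init__(self) -> None:
        self.pending = b""
        self.lines: List[bytes] = []

    def feed(self, chunk: bytes) -> None:
        self.pending += chunk
        # Everything before the last newline is complete; the rest stays pending.
        *complete, self.pending = self.pending.split(b"\n")
        self.lines.extend(complete)


def torture(data: bytes, seed: int) -> List[bytes]:
    """Feed data to a fresh module in randomly sized pieces and return its output."""
    module = LineAssembler()
    for chunk in random_chunks(data, random.Random(seed)):
        module.feed(chunk)
    return module.lines


data = b"one\ntwo\nthree\n"
for seed in range(100):
    assert torture(data, seed) == [b"one", b"two", b"three"], seed
print("all splits produced the same lines")
```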

25 Jan 2001 (updated 26 Jan 2001 at 02:07 UTC) »

I'm back to Journeyer now, woohoo!

Mysteriously, I have lost my Journeyman status. Perhaps someone bumped me back to Observer, I'm not sure. Oh, well.

I wanted to post a couple of points about C and C++ not having garbage collection:

  • It's the programmer's responsibility to use an appropriate resource control mechanism.
  • Garbage collection is only the 'magic' solution for memory. What about other resources? Is the runtime system smart enough to run the GC when you're out of filehandles? How about when you're out of foos?
  • Do you really want garbage collection to be an integral feature of a language you use to write an OS?

In my opinion, a language that gives you choice and flexibility in resource control techniques is much better and a big win over a language that doesn't.

In my StreamModule system, I use reference counting for a data structure (the structure representing a sequence of bytes) that can only be a DAG by the rules under which it is built. This is exactly the appropriate technique. The reference counting overhead is incurred in a very predictable and controllable manner.

I even have a small infrastructure (smart pointer classes and a mixin class that has the counter) built up to make it easy for a certain data structure to participate in reference counting where appropriate. In fact, the combining list tails problem that someone mentioned is a perfect candidate for using this infrastructure.
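
The shape of that infrastructure can be sketched roughly as follows. This is a Python transliteration for illustration only: the real code is C++, Python already reference-counts its own objects, and every name here (`RefCounted`, `Ref`, `Chunk`, `drop`) is hypothetical. The explicit `drop()` call stands in for what a C++ smart pointer's destructor would do automatically.

```python
from typing import Iterable, List, Optional


class RefCounted:
    """Mixin that holds the counter; subclasses override _destroy() to clean up."""

    def __init__(self) -> None:
        self._refs = 0

    def add_ref(self) -> None:
        self._refs += 1

    def release(self) -> None:
        self._refs -= 1
        if self._refs == 0:
            self._destroy()

    def _destroy(self) -> None:
        pass


class Ref:
    """Smart-pointer stand-in: owns exactly one reference until drop() is called."""

    def __init__(self, target: RefCounted) -> None:
        self.target: Optional[RefCounted] = target
        target.add_ref()

    def drop(self) -> None:
        if self.target is not None:
            target, self.target = self.target, None
            target.release()


class Chunk(RefCounted):
    """Node in a DAG of byte sequences; children are held through Ref handles."""

    def __init__(self, data: bytes = b"", children: Iterable["Chunk"] = ()) -> None:
        super().__init__()
        self.data = data
        self.children: List[Ref] = [Ref(c) for c in children]
        self.destroyed = False

    def _destroy(self) -> None:
        self.destroyed = True
        for child in self.children:
            child.drop()  # releasing a parent releases its share of each child


# Two lists sharing one tail: the tail survives until both heads are gone.
tail = Chunk(b"tail")
a = Ref(Chunk(b"a", [tail]))
b = Ref(Chunk(b"b", [tail]))
a.drop()
print(tail.destroyed)  # False: b's list still references the tail
b.drop()
print(tail.destroyed)  # True: the last reference is gone
```

Since the structure can only be a DAG, no cycle can keep a count above zero, and the cost of each release is paid at a predictable point.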

If ever I wanted cycle-resistant mark & sweep style garbage collection, I could find it or implement it and use it for exactly those data structures it was appropriate for. With a little discipline in how you use pointers and a little programmer-prodded compiler assistance, you can operate a non-conservative mark & sweep collector in an environment that doesn't enforce one from the top down.

In Python, Perl, and other such 'scripting' languages, garbage collection is the right answer. In Java, it's a decision I strongly question. In C, C++, Ada, FORTRAN, and other such 'systems' languages, it's the wrong answer.
