Older blog entries for elanthis (starting at number 363)

Creating Custom C++ Output Streams

In my younger, dumber days, I’d often write whole new classes in C++ when I had a need for output streaming, such as in a class handling a TELNET connection. Such classes required a ton of extra code just to reimplement formatting and buffering features the standard library already provides.

I'm here today to tell you how not to do that, but to instead write a stream class that will work with all of the standard C++ output stream features. You might even learn a thing or two about input streams, but output streams are going to be my focus.

Note that I’m only going to touch on the basics here. C++ streams have a lot of features, most of which you won’t need to customize in the majority of circumstances, so I’m ignoring those topics.

Introduction

Streaming output in C++ is accomplished by using the << operator on a std::ostream object, such as the standard cout stream.

One of the primary reasons to use a stream instead of directly writing bytes to a file is that streams allow for formatting and buffering. Formatting allows you to do something like the following:

cout << hex << 123 << endl;

That will write out 123 in hexadecimal, or 7b. Without streaming, you’d have to create a byte buffer, format 123 yourself into that buffer, and then call system facilities like write() to get your output on the screen. Kind of a pain.

C programmers will be familiar with the printf family of functions. These functions serve a very similar role to the C++ streaming facilities. The above line of code, in C, could be written:

printf("%x\n", 123);

The C++ streams offer several very distinct advantages over the printf family of functions, however. The first, and most widely known, is type safety. If the printf call had used the %s formatter instead of %x, then the program would likely have just crashed. The second advantage is that C++ streams have built-in support for user-controllable buffering. Buffering allows output to be stored up and sent to the OS facilities in larger chunks, which can both improve performance as well as allow for some special tricks which we’ll explore later. To provide user-controlled buffering in C, new functions which use printf functions internally must be created, and gracefully dealing with buffer overflows (that is, neither crashing nor losing output) can be a serious pain. A final advantage is that C++ streams maintain state, allowing you to more easily output a large number of identically formatted values without respecifying the format for each and every one.

The printf family of functions do have some advantages. They are, in most implementations, significantly faster than their C++ counterparts. Additionally, sometimes that “advantage” of C++ streams of maintaining state can actually be a problem, particularly if you set some state and never unset it. For example, the C++ example up above sets the number output format to hexadecimal, but never reverts it to decimal. All other output on cout will be formatted to hexadecimal until reset.
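To see that stickiness in action, here’s a quick sketch of the hex flag staying set until it is explicitly undone with the dec manipulator:

#include <iostream>

using namespace std;

int main () {
  cout << hex << 123 << endl;   // prints "7b"
  cout << 456 << endl;          // still hex: prints "1c8"
  cout << dec << 456 << endl;   // back to decimal: prints "456"
  return 0;
}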

Since this is an article about C++ streams and not C string formatting, we’re going to assume that you actually want to use C++ streams and not printf, so let’s get on to the meat of creating a custom output stream.

Ostream and Streambuf

The std::ostream class does all the meat of formatting your streams. It stores the output state (like whether numbers should be displayed as decimal or hexadecimal) and processes your values to convert them into the correctly formatted output. This class is really the true core of all output formatting.

There’s absolutely no need to derive a class from std::ostream, either. The ostream class handles the formatting, but it doesn’t itself actually do anything with the output. Sure, there are derived classes like std::ofstream and std::ostringstream in the standard library, but these classes don’t actually change the behavior of std::ostream in any way. They are merely convenience wrappers that make use of a derived std::streambuf class.

All the actual work of outputting formatted data is performed by std::streambuf. Every ostream has a streambuf object associated with it. When you stream data to an ostream object, it formats the data and passes the results on to its streambuf. The streambuf then does the actual interesting work of writing the result out to your screen, into a file, or into an internal buffer. When you want to change the behavior of an output stream, what you actually need to do is make a new streambuf child class.

The std::ofstream class, for instance, creates a new std::filebuf object and associates it with the opened file. The ofstream class also offers a few other convenience methods on top of the base std::ostream, but all of these methods actually interact with the filebuf object.

To associate a streambuf object with an ostream, you can call the ostream::rdbuf() method. If called with no arguments, it returns the current streambuf. If called with a pointer to a streambuf, it sets that as the current streambuf. You can also pass a pointer to a streambuf to the constructor for an ostream. For example, let’s mimic ofstream using just ostream and filebuf.

filebuf file;
file.open("myfile.txt", ios::out);
ostream os(&file);
os << "Writing to myfile.txt through a plain ostream" << endl;

That code behaves identically to code that uses ofstream. The only difference is that methods like open and close must be called on the filebuf object instead of the ostream object.

There is one catch to be wary of. The ostream class will not manage the memory for its streambuf object. That’s fine for the example above, but if you had created the filebuf object using the new operator, you would have to remember to delete the pointer yourself when you’re done.

Buffered and Unbuffered Output

There are two kinds of output you can perform using the std::streambuf class: buffered and unbuffered. Buffered output is when all data is stored temporarily in a buffer. The data is only sent to the actual output destination when the buffer fills up, or when the output stream is flushed. Unbuffered output sends all data to the output destination immediately.

When you create a new streambuf instance, it is by default unbuffered. If you wish to make it buffered, you must create a buffer for it, and then tell the streambuf about your buffer using the setp method, which is protected. So, to create a buffered streambuf object using a 100 character buffer:

class mybuf : public streambuf {
public:
  mybuf () {
    setp(buffer, buffer + sizeof(buffer));
  }

private:
  char_type buffer[100];
};

And voila! Your streambuf descendent is now buffered using your 100 character array. That’s all there is to it.

Note that buffer memory is not managed by the streambuf class. If you allocate a buffer with new, you are responsible for deleting it.

Custom Unbuffered Streams

Say you want to write a log stream facility that sends output both to cerr (standard error output) and to a file, mylog.txt. It’s easy enough to do either one - just stream your data to cerr or to an ofstream - but you’d rather not write each stream command twice. Thankfully, writing a very simple streambuf class that performs both for you is quite easy.

A virtual method called xsputn is called on a streambuf whenever there is data to write. You need only override that one function to create a custom unbuffered streambuf. The function takes a pointer to an array of characters and the length of the array, and is expected to return the number of characters it was able to write. Since we’re just passing this on to a couple of other streams, we just return the length of the buffer given to us.

// your log file, lazily declared as a global
ofstream logfile;

// logbuf forwards all output to cerr and logfile
class logbuf : public streambuf {
private:
  // write a string s of length n to standard
  // error and a log file
  streamsize xsputn (const char_type* s, streamsize n) {
    cerr.write(s, n);
    logfile.write(s, n);
    return n;
  }
};

int main () {
  // open our log file
  logfile.open("mylog.txt", ios::app);

  // create our log stream
  ostream log(new logbuf());

  // be friendly
  log << "Hello, world!" << endl;

  return 0;
}

That’s the gist of what you need, and nothing more. Pretty simple, eh? We could improve things a little further. For example, our logbuf object is leaked - we never delete it. That isn’t really vital for this example, since the memory is reclaimed when the program exits anyhow, but we should handle it properly anyway. More importantly for our little example, however, we don’t control buffering properly. Our logbuf is unbuffered, but both cerr and logfile are buffered. We would expect the endl at the end of our message to push everything out to the screen and the log file, but our logbuf has no idea what to do when the stream is flushed. The fix is to override the virtual sync method, which is called whenever the ostream is flushed.

class logbuf : public streambuf {
private:
  // flush both cerr and logfile; return 0 to
  // indicate there was no error, but we're
  // too lazy to check for errors ourselves
  int sync () {
    cerr.flush();
    logfile.flush();
    return 0;
  }

  // write a string s of length n to standard
  // error and a log file
  streamsize xsputn (const char_type* s, streamsize n) {
    cerr.write(s, n);
    logfile.write(s, n);
    return n;
  }
};

There, flushing is now supported!
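Since we admitted above that the logbuf was leaked, here is one possible way to write main() so it cleans up after itself; remember, the ostream won’t delete the buffer for us:

int main () {
  // open our log file
  logfile.open("mylog.txt", ios::app);

  // keep our own pointer to the buffer so we can free it later
  logbuf* buf = new logbuf();
  ostream log(buf);

  log << "Hello, world!" << endl;

  // the ostream does not manage the streambuf's memory
  delete buf;
  return 0;
}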

I’m going to take this moment to explain the char_type and streamsize types used above. streambuf::char_type is a typedef for the actual character type in use, which for streambuf would be char. The wstreambuf type is identical to streambuf, except it works with wchar_t (wide character support, for Unicode), and char_type is different for that class. The streamsize type is similar in purpose to size_t - it’s just a typedef for the particular type of integer your STL implementation chose, and using streamsize makes your code portable.

Custom Buffered Streams

Unbuffered streams are great, but they’re not always what you’re looking for. Say that you are writing a network stream. You don’t want to call send() over and over for performance reasons; you’d rather buffer up your output and send it all at once. Once we set the buffer with setp, the streambuf class will do all the work of putting characters into the buffer and protecting against overruns. We don’t need to implement our own xsputn at all, since the default implementation does exactly what we want. We can simply override the sync method to take our buffer contents, write them to the socket, and then clear the buffer.

class sockbuf : public streambuf {
public:
  // initialize our sockbuf with a socket
  // descriptor, and setup a new buffer
  sockbuf (int _sockfd) : sockfd(_sockfd) {
    char_type* buf = new char_type[1024];
    setp(buf, buf + 1024);
  }

  // free our buffer
  ~sockbuf () {
    delete[] pbase();
  }

private:
  // dump our buffer to the socket and clear
  // the buffer
  int sync () {
    // for brevity's sake, not doing proper error
    // handling; we return a non-zero value(error)
    // if we failed to send the full buffer contents
    int ret = send(sockfd, pbase(), pptr() - pbase(), 0);
    if (ret != pptr() - pbase())
      return 1;
    // reset the buffer
    setp(pbase(), epptr());
    return 0;
  }

  // our socket descriptor
  int sockfd;
};

We’ve got a few new functions there. The pbase() method returns a pointer to the beginning of the buffer. The pptr() method returns the current position of the stream in the buffer. So, the number of characters in the buffer is equal to pptr() minus pbase().

Unfortunately, this little class has a problem. Our buffer is only 1024 characters long, and the data is only written out when the flush method is called on the ostream using this streambuf. When that buffer fills up, any further data we try to stream is just lost. It would be ideal if we could instead grow the buffer or try to flush the data we already have in the buffer. Let’s try growing the buffer.

When our buffer fills up, the overflow() method is called. This method takes the character that didn’t fit into the buffer as a parameter, and returns either EOF to indicate failure or any other value to indicate success. We’re just going to grow our buffer by 1024 elements and then call the standard sputc() function to add the character into the buffer.

class sockbuf : public streambuf {
public:
  // initialize our sockbuf with a socket
  // descriptor, and setup a new buffer
  sockbuf (int _sockfd) :
      sockfd(_sockfd), buf(0), buflen(1024) {
    buf = new char_type[buflen];
    setp(buf, buf + buflen);
  }

  // free our buffer
  ~sockbuf () {
    delete[] buf;
  }

private:
  // dump our buffer to the socket and
  // clear the buffer
  int sync () {
    // for brevity's sake, not doing proper error
    // handling; we return a non-zero value(error)
    // if we failed to send the full buffer contents
    int ret = send(sockfd, pbase(), pptr() - pbase(), 0);
    if (ret != pptr() - pbase())
      return 1;
    // reset the buffer
    setp(pbase(), epptr());
    return 0;
  }

  // we ran out of space, so grow
  // the buffer
  int overflow (int c) {
    // remember how much of the buffer is in use
    streamsize used = pptr() - pbase();

    // allocate a larger buffer and copy our
    // current data into it, then swap it with
    // the old buffer
    char_type* newbuf = new char_type[buflen + 1024];
    memcpy(newbuf, buf, buflen);
    delete[] buf;
    buf = newbuf;
    buflen += 1024;

    // point the streambuf at the new, larger buffer
    // and restore the current write position
    setp(buf, buf + buflen);
    pbump((int)used);

    // now we need to stuff c into the buffer
    sputc(c);
    return 0;
  }

  // our socket descriptor
  int sockfd;

  // our buffer
  char_type* buf;
  unsigned long buflen;
};

And there we have it. Our sockbuf class can now buffer up data until flushed without losing any data.

Complex Example

Alright, let’s go ahead and totally abuse the system now. We want to use our log class from before, but we want all of our log lines to include a date and time at the start as well as a log priority, but we don’t want to have to stream the time to the log over and over. We’d like to be able to write code like this:

log << DEBUG << "connecting to the database" << endl;
log << ERROR << "could not connect to the database" << endl;
log << "giving up" << endl;

There are a few important things going on here. First, we are setting the priority by streaming out a special priority value (e.g., DEBUG, ERROR). Note that on the third line we didn’t stream a priority, but the ERROR priority from the prior line didn’t carry over. It’s a piece of state that we’ll reset when a flush (or endl) occurs on the stream.

We could do this as an unbuffered stream. We would just write an xsputn method that wrote the time and then the log message. However, think what would happen in this example:

log << "The user " << username << " logged in." << endl;

We’d actually end up with the time and priority printed four times in the single line: once before “The,” once before the user’s name, once before ” logged,” and then a final time just before the newline. We will need to buffer our output and then write the time and priority only once per flush.

We’re also going to assume that flush won’t be called on our log directly, but instead we’ll always use endl. That way we know that our stream will contain a newline and we don’t need to worry about one ourselves.

We’re also going to derive from ostream this time. We need to do that to get the priority feature to work; plus, it’s kind of a pain to have to create an ostream object and then call rdbuf() on it all the time, and a custom ostream allows us to hide that in the constructor.

First, the priorities. This will just be a simple enum.

enum LogPriority {
  INFO, // regular unimportant log messages
  DEBUG, // debugging fluff
  ERROR, // it's dead, jim
};

Because it’s an enum, we can create a special operator<< overload that takes a LogPriority value and, instead of printing anything, simply tells the stream which priority to attach to the current message. We’ll define that overload once our stream classes are in place.

Our logbuf derived from streambuf should be fairly old news by this time. In fact, it's identical to our sockbuf above, except when sync() is called we spit out the time and the log message to cerr and a logfile ofstream, which this time around we'll make a member. We also keep the current priority level, which defaults to INFO.

class logbuf : public streambuf {
public:
  // create a buffer and initialize our logfile
  logbuf (const char* logpath) :
      priority(INFO), buf(0), buflen(1024) {
    // create our buffer
    buf = new char_type[buflen];
    setp(buf, buf + buflen);

    // open the log file
    logfile.open(logpath, ios::app);
  }

  // free our buffer
  ~logbuf () {
    delete[] buf;
  }

  // set the priority to be used on the
  // next call to sync()
  void set_priority (LogPriority p) {
    priority = p;
  }

private:
  // spit out the time, priority, and the
  // log buffer to cerr and logfile
  int sync () {
    // nifty time formatting functions
    // from the C standard library
    time_t t = time(0);
    tm* tmp = localtime(&t);
    char timebuf[128];
    strftime(timebuf, sizeof(timebuf),
      "%Y-%m-%d %H:%M:%S", tmp);

    // now we stream the time, then the
    // priority, then the message
    static const char* names[] = { "INFO", "DEBUG", "ERROR" };
    cerr << timebuf << " [" << names[priority] << "] ";
    cerr.write(pbase(), pptr() - pbase());
    logfile << timebuf << " [" << names[priority] << "] ";
    logfile.write(pbase(), pptr() - pbase());

    // reset the priority and empty the buffer
    priority = INFO;
    setp(pbase(), epptr());
    return 0;
  }

  // priority to attach to the next message
  LogPriority priority;

  // our log file
  ofstream logfile;

  // our buffer
  char_type* buf;
  unsigned long buflen;
};

Well, there’s that. Now we need our customized ostream-derived class so that we can stream the LogPriority values and get the desired behavior. We’ll keep a logbuf object as a member variable, which we’ll set up as the streambuf for ostream in the constructor.

class logstream : public ostream {
public:
  // we initialize the ostream to use our logbuf
  logstream (const char* logpath) :
    ostream(&buf), buf(logpath) {}

  // set priority
  void set_priority (LogPriority pr) {
    buf.set_priority(pr);
  }

private:
  // our logbuf object
  logbuf buf;
};

// set the priority for a logstream/logbuf
// this must be a global function and not a
// member to work around C++'s type
// resolution of overloaded functions
logstream& operator<< (logstream& out, LogPriority pr) {
  out.set_priority(pr);
  return out;
}

And there you have it! You can now easily create and use logstreams. You can even have multiple such streams, giving each its own log file.

int main () {
  logstream log("logfile.txt");

  log << ERROR << "something has gone horribly wrong" << endl;
  log << "this message gets the default INFO priority" << endl;

  return 0;
}

Closing Thoughts

C++ streams are fairly simple to implement. We took a few liberties in the examples above with sockets and memory handling, but the core concepts of writing a custom output streambuf are there.

The complete source to the logstream example is available here.

Syndicated 2007-12-10 23:34:28 from Sean Middleditch

LLVM Development

So, I decided to go ahead and sink into LLVM and Clang. Feels good to be back in the wider development community, even if time spent on Clang eats into the paid time for my work projects.

The Clang code is huge, and the subject material is complicated, but the code is surprisingly clean and the comments are generally fairly useful. It’s completely awesome compared to hacking on “professional” PHP scripts where the original coders didn’t understand basic concepts or understand how to write useful comments or function names.

Granted, the few tiny patches I’ve sent in to Clang have so far been not quite right, but I’m still learning the guts of how a C compiler works. There’s a big gap between understanding the various effects on code generation between using a short and a long and understanding how the compiler actually generates the code. For example, the bug I’m currently working on has to do with padding between struct fields, which is something I knew about and something I’ve worked with before (reordering fields to reduce the total amount of padding), but making a compiler track that padding, calculate the correct amount based on type and architecture, and so on isn’t something I’ve ever needed to know before. Writing a generic interpreted scripting engine on a custom byte-code VM and writing a standards compliant and system ABI compatible C compiler are worlds apart.

Still, actually learning how Clang works is fairly easy, if time consuming. It’s huge, but it’s well written.

I look forward to submitting a patch for the struct padding issue I’m running into, and maybe even having that patch do everything correctly. Which might be hard, given I can only test on a small handful of architectures (x86, amd64, ppc32).

Syndicated 2007-12-10 17:11:00 from Sean Middleditch

How To Write a TELNET Server or Client

Introduction

TELNET is a protocol designed way back when dinosaurs still roamed the earth, chasing cavemen and operating large mainframe computers connected to remote line printers. The protocol doesn’t see much use in mainstream computing, although it’s still popular on various IBM mainframe installations and, of more interest to you, the intrepid reader of my humble blog, on text-based Multi-User Dungeons and similar online games.

Modern TELNET clients are vastly different from the line printers of yore. We have graphical terminals in which our screen can update instantly, and the drawing cursor can move about freely painting characters anywhere it wants in an assortment of colors and styles. Line printers, on the other hand, simply printed horizontally, occasionally chugging down a line and continuing onward. We generally don’t care about those anymore, though. Very few people writing a TELNET server today are expecting their client to be using a line printer. Most of those writing TELNET apps today are probably writing MUDs, or modifying MUDs, or writing a client for MUDs.

“So,” you ask, “how does one write a TELNET server or client, anyway?” The answer is thankfully quite simple. I am going to assume you are already familiar with the basic networking APIs; if not, a few minutes on Google should help.

Basic Concepts

For the most part, TELNET is nothing more than sending characters back and forth between the client and the server. Old line printers and networks were half-duplex, meaning that only one side could send data at a time, and the other side had to wait for permission to send. While the protocol still technically uses those rules, they are ignored by all MUD software, as well as most general TELNET software, so we won’t worry about those. A server which simply sends ASCII text to its client and receives ASCII text in response is a completely functional TELNET server, and a client likewise is the same in reverse.

There is more to TELNET, however. TELNET offers a variety of options, which range from enabling full-duplex mode (not really necessary these days) up to controlling the display of what a user types on his screen. TELNET does not control things like cursor positioning or text color, however. Those are handled by a separate protocol, which I’ll touch on briefly later in this article.

From the perspective of MUD developers, possibly the most interesting feature of TELNET is the ability to control the display of what a user types. Normally, when a user types a key in his TELNET client, it is immediately displayed on his screen, and then sent to the server. This makes typing nice and quick. However, sometimes the server wants more advanced control of the display of input, such as to synchronize it with its own rendition of the screen… or to suppress it entirely, such as when a user is entering his password.

TELNET makes use of a simple control code scheme. In a way, you can think of this as being analogous to the \ escape sequences found in almost all programming languages. For example, \n produces a newline in a string, while \\ must be used in order to create a single backslash in the string. TELNET does the same thing, except instead of a \, it uses a special value called IAC (Interpret As Command), which is equal to the number 255. (You might note that 255 is the largest integer that can be stored in a single 8-bit byte.)

When operating in half-duplex mode, for example, one end of the communication must send the GA (go ahead) signal to let the other end know that it can begin sending. This is done by sending two bytes over the network pipe, IAC GA (255 249). If the user pressed the interrupt key (control-C), the client might send the interrupt signal to the server by sending the two bytes IAC IP (255 244).

Normally TELNET is not 8-bit clean. That means that the plain text data sent between the two ends can only be 7-bit ASCII values. It is possible to put TELNET into binary mode, however, which allows the use of 8-bit values. In this case, it may be necessary to send the value 255, but not have it interpreted as a TELNET command. Just like the \ escape sequence, this is done by doubling up the special character. So, to send the value 255 and not have it processed specially, send the two bytes IAC IAC (255 255).
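As a rough sketch, the escaping boils down to doubling every 255 byte before raw data goes out on the wire:

#include <cstddef>
#include <string>

// the TELNET "Interpret As Command" byte
const unsigned char IAC = 255;

// double every IAC byte in raw application data so that a literal
// 255 is not mistaken for the start of a TELNET command
std::string escape_iac (const unsigned char* data, std::size_t len) {
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        out += (char)data[i];
        if (data[i] == IAC)
            out += (char)IAC;  // IAC IAC -> one literal 255 on the other end
    }
    return out;
}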

Option Negotiation

Being able to send 8-bit data over the TELNET connection is pretty handy. It lets you support non-ASCII character encodings, like UTF-8 or ISO-8859-1. You know how to properly send the character value 255 without confusing TELNET, but the client or server you’re talking to keeps doing funny things when you send it values over 127 (that is, any value that doesn’t fit in 7 bits). That’s because first you must negotiate the BINARY option with the remote end of the connection.

This is where TELNET option negotiation comes in. TELNET has four special codes for negotiating options: WILL, WONT, DO, and DONT. These commands are a little different compared to normal commands. They start with the special IAC value just like other TELNET commands, but they are three bytes long instead of two. The third byte sent is the option code for the option you are negotiating. The BINARY option is code 0.

So, what do those four negotiation commands mean, exactly? Each has two meanings, based on context. For example, the WILL negotation either means, “I am willing to use this option, if you are,” or it can also mean, “I am acknowledging your request to begin using this option.” Let’s say that you are writing a client that wishes to enable BINARY mode. You must ask the server if it would like to do so, by sending it the sequence IAC WILL BINARY. The server will then respond with one of two commands: either IAC DO BINARY (”I accept”) or IAC DONT BINARY (”I refuse”). If the server accepted, your client is now free to send 8-bit data to the server.

However, the server is not at this point permitted to send 8-bit data to the client. The server might request it in the same fashion as the client, but with the roles reversed. On the other hand, the client could request that the server start sending 8-bit data by telling the server to enable the BINARY option. This is done by using the DO command, by sending the bytes IAC DO BINARY. The server will then respond with either IAC WILL BINARY (”I accept”) or IAC WONT BINARY (”I refuse”).

All TELNET option negotiation works this way. One end either advertises that it is capable of using the option with WILL or requests the other end to use the option with DO, and the other end responds in the affirmative or negative. However, things can get a little more complicated. Let’s say we have a naive client talking to a naive server. The client wants to enable BINARY mode, so it sends IAC WILL BINARY. The server accepts, and responds with IAC DO BINARY. The client however, being naive and incomplete, doesn’t know if the server is acknowledging a prior request or initiating a new request. The client assumes it might be initiating a new request, and sends the appropriate response IAC WILL BINARY. The server, also naively written, believes the command to be a new request, and responds with IAC DO BINARY. The client and server are now sending back these two commands over and over, eating up bandwidth and not really accomplishing much.

For this reason, a complete TELNET implementation must track the state of each option for both the local end and the remote end. Each option has three states: enabled, disabled, or unknown. This can be implemented with two 256-element arrays containing an enum denoting the enabled/disabled/unknown state of the option. All elements in both arrays are initialized to unknown. An option with a value of unknown is effectively disabled, but there is more to it, and yes, the local option set also needs the unknown state. Say that you have a client talking to a buggy server that requests the BINARY option be enabled, but doesn’t actually support the option and gets into the infinite loop described above. The server sends the IAC DO BINARY sequence. Your improved, non-naive client looks at its local option array and sees that the BINARY option is set to unknown. The client now enables the option and responds with IAC WILL BINARY. The buggy server responds with IAC DO BINARY. The client sees, however, that the option is already enabled. Therefore, it does not need to send a response. This effectively breaks the loop caused by the buggy server. Additionally, the client can look at the array of server options and effectively knows not to send a request or response to a server option that is already set to enabled or disabled.
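Here is a minimal sketch of that bookkeeping for the DO direction; the send_bytes() helper is hypothetical, and for brevity BINARY is the only option this sketch agrees to enable locally. The WILL, WONT, and DONT handlers follow the same pattern.

#include <cstddef>

// TELNET command bytes
const unsigned char IAC = 255, WILL = 251, WONT = 252;

// possible states for each of the 256 TELNET options
enum telopt_state { UNKNOWN, ENABLED, DISABLED };

// option state for our side and for the remote side
telopt_state local_opts[256];
telopt_state remote_opts[256];

// hypothetical helper that writes raw bytes to the connection
void send_bytes (const unsigned char* bytes, std::size_t len);

// handle an incoming IAC DO <option> request; in this sketch,
// BINARY (option 0) is the only option we agree to enable locally
void handle_do (unsigned char option) {
    // already enabled: treat it as an acknowledgement (or a
    // duplicate request), say nothing, and break any reply loop
    if (local_opts[option] == ENABLED)
        return;

    bool supported = (option == 0 /* BINARY */);
    local_opts[option] = supported ? ENABLED : DISABLED;

    unsigned char reply[3] = { IAC, (unsigned char)(supported ? WILL : WONT), option };
    send_bytes(reply, 3);
}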

Now, in practice, it is usually not necessary to use all those arrays of flags. Most MUD servers and clients do not, and they work just fine. This is particularly because most of the options used in MUDs are “one way” options; that is, only one end of the connection ever requests them. A MUD client generally never sends an initial IAC DO ECHO (which would tell the server to echo everything the client sends back to the client), so when the client receives an IAC WILL ECHO it knows that the server is requesting to enable the option itself, and the command is not a response acknowledging that the client begin echoing data back to the server (which would pretty badly break things for the client). So long as the client never talks to a really broken server, a client could get by with just a handful of flags for the options it supports. The same goes for the server. Just be careful for any options that both the server and client use (like BINARY) to make sure you only respond when the other end is requesting the option, and not when the other end is acknowledging the option, and your application will work just fine so long as the other end isn’t totally broken.

There is a general rule that will help for implementations that don’t use the full 256-element arrays. When receiving a request to enable an option your application does not support, always refuse. When receiving a request to disable an option your application does not support, don’t respond at all.

Let’s take a brief look at that ECHO option. ECHO is option code 1. For MUDs, and in truth most TELNET applications, the server is the only end that ever performs echoing. If a client echoed back everything the server sent to it then it would probably result in another infinite loop. The server would say something, the client would send it back, the server would interpret that as a command and say something back (possibly just an “unknown command” error), and the client would echo that back to the server, which would interpret it as a command… bad stuff.

However, it’s generally pretty nice when the user types something in and whatever he typed shows up on his screen. A TELNET client will generally always prefer this, and by default it will print anything the user types on the user’s screen. A server will sometimes want to disable this, most commonly when it is requesting a password. However, TELNET has no option for “hide the user’s input.” Instead, we have to use a sneaky trick. If the server sends an IAC WILL ECHO, that means that it is willing to echo back everything the user types. Pretty much all clients will agree to this, and they will respond with an IAC DO ECHO. At this point, the client no longer prints the keys the user types in. The client is expecting the server to do this itself. However, nothing actually requires the server to do so. It could echo back the user’s input after transforming it (turning it into stars), echoing it verbatim, or just echoing nothing. When the server is finished retrieving the user’s password, it then tells the client that it no longer wants to echo by sending IAC WONT ECHO. The client then acknowledges this with IAC DONT ECHO, and will start displaying what the user types in again.
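As a rough sketch of the server side of that trick (again with a hypothetical send_bytes() helper), the whole thing boils down to two tiny sequences:

#include <cstddef>

// TELNET bytes used in the echo trick; ECHO is option code 1
const unsigned char IAC = 255, WILL = 251, WONT = 252, TELOPT_ECHO = 1;

// hypothetical helper that writes raw bytes to the connection
void send_bytes (const unsigned char* bytes, std::size_t len);

// before reading a password: "I will echo for you" - the client
// stops echoing locally (and we simply echo nothing back)
void hide_client_input () {
    unsigned char seq[3] = { IAC, WILL, TELOPT_ECHO };
    send_bytes(seq, 3);
}

// after the password is read: "I won't echo any more" - the
// client starts echoing the user's keystrokes again
void show_client_input () {
    unsigned char seq[3] = { IAC, WONT, TELOPT_ECHO };
    send_bytes(seq, 3);
}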

Note: the Windows TELNET client is notoriously broken with its handling of ECHO. The client will gladly accept when the server sends IAC WILL ECHO, but when the server sends IAC WONT ECHO, the Windows client will not start echoing local characters again. Also note that, unlike almost every other client, the Windows client only operates in character mode. That means that each character is sent to the server as it is typed, while most clients only send whole lines. There are ways to tell a client to go into character mode or into line mode, but the Windows client only supports character mode.

So, now you have option negotiation working, as well as 8-bit support with proper escaping. However, you’ve heard about this NAWS thing, which lets your client tell the server how big the display window is so that the server can do fancy layout. NAWS is option code 31. A server that wants window size information will send an IAC DO NAWS, and a client which supports it will respond with IAC WILL NAWS. But… now what?

Sub Options

Option negotiation is only capable of enabling or disabling an option. However, some options, like NAWS, control features which need to be able to send more complex data using the protocol. The NAWS feature needs a way for the client to tell the server the number of rows and columns in the client’s display.

For features like these, TELNET uses the SB command, which is called a “sub option.” SB is code 250. This command is rather special. It starts with three bytes: IAC, SB, and then the option code, such as NAWS. It is then followed by an arbitrary number of bytes, which we’ll call the payload. The end of the sub option is marked with the two byte sequence IAC SE. SE is code 240. So, what do those bytes between the initial three byte sequence and the ending two byte sequence mean? Well, it depends on the option.

NAWS sends two 16-bit integers as its sub option payload. Each integer is in network byte order. The first integer is the number of columns (width), and the second integer is the number of rows (height). So, a client with 80 columns and 24 rows would, after the NAWS option has been enabled with option negotiation, send the byte sequence IAC SB NAWS 0 80 0 24 IAC SE.
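Here is a sketch of what a client might send; send_bytes() is again a hypothetical helper that writes raw bytes to the connection:

#include <cstddef>
#include <string>

// TELNET bytes used for the NAWS sub option
const unsigned char IAC = 255, SB = 250, SE = 240, NAWS = 31;

// hypothetical helper that writes raw bytes to the connection
void send_bytes (const unsigned char* bytes, std::size_t len);

// report the client's window size after NAWS has been negotiated;
// each dimension is a 16-bit value in network byte order
void send_window_size (unsigned short cols, unsigned short rows) {
    std::string out;
    out += (char)IAC; out += (char)SB; out += (char)NAWS;

    unsigned char payload[4] = {
        (unsigned char)(cols >> 8), (unsigned char)(cols & 0xff),
        (unsigned char)(rows >> 8), (unsigned char)(rows & 0xff)
    };
    for (int i = 0; i < 4; ++i) {
        out += (char)payload[i];
        // any literal 255 in the payload must be doubled - more on why below
        if (payload[i] == IAC)
            out += (char)IAC;
    }

    out += (char)IAC; out += (char)SE;
    send_bytes((const unsigned char*)out.data(), out.size());
}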

One must be careful when writing code to handle sub options. A very large number of MUD servers and clients do not do this properly. Let us pretend, for a moment, that a user has some particularly large terminal… say, 255 columns and 61440 rows. The NAWS sub option byte sequence would be IAC SB NAWS 0 255 240 0 IAC SE. However, remember that IAC is 255 and SE is 240. That means that the bytes are equivalent to IAC SB NAWS 0 IAC SE 0 IAC SE. See the problem? Correctly implemented software will parse that as a sub option with a single byte in its payload, followed by a zero byte and then an IAC SE sequence, which is illegal. Plus, the NAWS sub option would be the wrong size, which is also illegal. The correct thing for the client to do is to escape the byte equal to 255 with a double IAC sequence, just like the \\ escape. So the correct thing for the client to send would be IAC SB NAWS 0 IAC IAC 240 0 IAC SE. While it looks like the payload is 5 bytes, the server would convert the IAC IAC into a single byte equal to 255 in the buffer it stores the sub option payload in. However, many incorrectly written MUD servers do not do this; after receiving IAC SB NAWS, they then look for exactly 4 bytes for the payload (ignoring the values of those bytes, even if they contain 255), and then immediately expect IAC SE (sometimes they don’t even check that they actually get IAC SE, they simply read in two bytes and call it done). It is thus impossible to write a client that will be able to handle this situation both with correct servers (which require that the IAC be escaped) and incorrectly written servers (which require that the IAC not be escaped).

Fortunately, the scenario is rather unlikely to occur. There is little benefit in a client that displays 61440 lines, even if your screen could somehow handle it. Furthermore, while the proper escaping of IAC bytes within a sub option payload is essential for some options, almost none of those are used in MUDs and thus they should never be sent to those poorly written MUD servers. However, if you’re writing new software, even for a MUD, it is a very good idea to correctly process all sub option commands. The correct way to handle NAWS is to use a buffer to read in the sub option payload (performing IAC escaping as you do so), and once the IAC SE is read, to then check that the payload buffer has exactly 4 bytes in it before processing the command.
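A minimal sketch of that buffering approach might look like this; in a real server the state would be per-connection, but globals keep the example short:

#include <vector>

const unsigned char IAC = 255, SE = 240;

// payload bytes collected between IAC SB <opt> and IAC SE, with
// any IAC IAC pair already collapsed back into a single 255
std::vector<unsigned char> payload;
bool last_was_iac = false;

// feed one byte received while inside a sub option; returns true
// once the terminating IAC SE has been seen
bool suboption_byte (unsigned char c) {
    if (last_was_iac) {
        last_was_iac = false;
        if (c == IAC) { payload.push_back(IAC); return false; } // escaped 255
        if (c == SE) return true;                               // end of sub option
        return false;                                           // protocol error; ignore
    }
    if (c == IAC) { last_was_iac = true; return false; }
    payload.push_back(c);
    return false;
}

// a complete NAWS payload must hold exactly four bytes
bool parse_naws (unsigned short& cols, unsigned short& rows) {
    if (payload.size() != 4)
        return false;  // malformed; ignore it
    cols = (unsigned short)((payload[0] << 8) | payload[1]);
    rows = (unsigned short)((payload[2] << 8) | payload[3]);
    return true;
}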

Alright, so now you have all the low-level TELNET machinery working, and you’re even supporting cool things like window size notification. Now there’s that tricky deal with actually displaying and sending text properly. See, while I wasn’t lying when I said TELNET just sends raw text back and forth for input and output, there are a few tricks to how that text is interpreted, especially if you want to support fancy colors and stuff.

Newlines

Welcome to the shortest section of this article! TELNET newlines are expected to be the two-byte sequence CR LF. That’s byte values 13 and 10, or \r and \n. Just sending \n by itself or just \r or sending \n \r may cause some funny things to happen.

When reading and writing text files, a newline is usually represented by just a plain LF, or \n. Even on systems that store a CR LF sequence in text files, like Windows, the standard file I/O facilities will automatically translate back and forth between \n and \r \n when reading and writing text files. However, when you are displaying text to a terminal, even on systems like UNIX, the terminal might be in a mode in which a solo LF (line feed) only does what it was originally meant to: cause the cursor (the print head in old line printers) to move down a line, but not return to the start of the line. The CR (carriage return) character tells the cursor (or print head) to return to the beginning of the current line. So, in order to move down to the beginning of the next line, you’d need to send \r \n (or \n \r).

TELNET, being a protocol designed specifically for driving those old line printers, works the same way. Even on modern systems, many clients will treat a solo LF as just a line feed, and many servers will not recognize a solo \n as being the end of a command. So, if you’re writing a server and your output only uses a \n for newlines, be sure to translate those into \r \n (CR LF) when you send the data to the client. If you’re writing a client, be sure to send \r \n whenever the user hits enter.
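A small sketch of the server-side translation, which leaves text that already uses CR LF alone:

#include <string>

// expand bare "\n" newlines into the "\r\n" pairs TELNET expects,
// without doubling up lines that already contain "\r\n"
std::string telnet_newlines (const std::string& text) {
    std::string out;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r'))
            out += '\r';
        out += text[i];
    }
    return out;
}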

Now, getting newlines to work properly is thrilling and all, but even more thrilling is that there’s not much more to say about TELNET itself. Sure, there are some extra commands to learn (the GA command used in half-duplex mode, which TELNET is in by default, can be handy to learn about, especially since MUDs use it for some fun tricks), and some other options that can be useful, but there’s no more actual protocol machinery to learn about. Now it’s on to colors and cursor control, which isn’t actually a part of TELNET at all.

ANSI Terminal Escapes

First, let’s look at terminal types. See, a long time ago (and, actually, right now, too) there were a bazillion different line printer and graphical terminal products on the market. Infuriatingly, pretty much every single one had its own proprietary protocol for controlling its special features. Even two terminals made by the same company would often have different (though usually similar) protocols for controlling color codes, cursor positioning, and other features. Even today, the text console on Linux uses a slightly different protocol than the text console on various other UNIX and UNIX-like operating systems, which themselves are all different from each other. Graphical terminal emulators, which is what most of us are using (and which includes your average MUD client), can have their own protocols, too. The modern xterm variation (xterm is a standard graphical terminal emulator for Linux/UNIX systems) is very slightly different than the popular terminal emulator I use on my Linux desktop, for example.

In order to properly handle all of these different terminals, a TELNET server would need to ask the client what kind of terminal they are using (yes, there is an option in TELNET for this), and then consult a library that maps common operations, like “clear the screen,” into the proper sequence of control codes for that particular terminal type. If you’re writing a real TELNET server, you’re going to need to get familiar with the termcap and/or terminfo libraries, as these provide those services.

MUD servers and clients, however, don’t need to care about such things. See, all modern terminal types, while they have slight differences, are based off of the ANSI terminal specification. This specification includes a number of common control codes, like setting terminal color, clearing the screen, or moving the drawing cursor to a specific position in the terminal window. A MUD server need only support these ANSI terminal codes, and can safely assume all clients support them as well. Most regular TELNET clients, running on modern terminal emulators, will do so. Most users of MUDs use a specialized MUD client which will interpret the control codes itself, and then translate those into whatever commands are appropriate for the display, so even a MUD client running on some non-standard terminal will still be compatible with a MUD server that only uses ANSI control codes. So, we’re just going to talk about ANSI control codes from now on.

Now that we’ve gone through three paragraphs of boring and mostly useless exposition, let’s get to the meat of things!

All control codes begin with the ESC character (\e, or 27). In general, the ESC character will be followed by a [ (left-hand square bracket), and then possibly by a command payload, followed by an ASCII letter denoting the actual escape command. For example, to clear the screen, you might use ESC [ 2 J ESC [ H. The J command does various screen-related actions, and the payload 2 is what tells the J command to clear the screen. That technically just clears the screen, though, leaving the cursor at whatever position on the screen it was already at. The H command tells the cursor to return to the upper left corner of the screen (Home).

Setting the color, and an assortment of other visual display settings, is done with the m command (Mode). The payload for the m command is one or more numeric values, separated by semicolons. The value 0 means “reset the display settings to the default.” The value 31 means “set the text color to red.” So, to display the phrase “Red Baron” with the word “Red” in the color red and the rest in the default color, the server would send ESC [ 31 m R e d ESC [ 0 m _ B a r o n. (The _ represents a space.)

Remember that you can include multiple values in your payload for the m command. If you want to display something in green (code 32) and wanted to make sure that all other display mode settings, like background colors, were disabled, you would send ESC [ 0 ; 3 2 m.

You can set the cursor position using that H command we saw before. Simply provide the row and column, separated by a semi-colon, in the payload. So, to move the cursor to the second row at column 20, send ESC [ 2 ; 2 0 H.
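To make those sequences concrete, here is a small sketch that just writes them out as ordinary strings (assuming the terminal on the other end honors ANSI codes):

#include <iostream>

int main () {
    // clear the screen and move the cursor home: ESC [ 2 J ESC [ H
    std::cout << "\x1b[2J\x1b[H";

    // "Red" in red, then back to the default color: ESC [ 31 m ... ESC [ 0 m
    std::cout << "\x1b[31m" << "Red" << "\x1b[0m" << " Baron\r\n";

    // reset attributes and switch to green in one command: ESC [ 0 ; 32 m
    std::cout << "\x1b[0;32m" << "green text" << "\x1b[0m" << "\r\n";

    // move the cursor to row 2, column 20: ESC [ 2 ; 20 H
    std::cout << "\x1b[2;20H" << "positioned text" << "\r\n";
    return 0;
}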

That’s pretty much the gist of ANSI control codes. You can find a fairly complete list of codes here.

Remember that not all commands have a [ after the ESC character. A decent strategy for parsing these control codes on the client end is to look at the first character after each ESC. If the character is a [ then keep buffering input until a letter character is received, then process the buffer. If the character after the ESC is not an [, then immediately process the command.

For MUD servers, it is a good idea to also include a basic ANSI control code parser, solely for the purpose of stripping such codes out of input sent by users. While you’re at it, be sure to strip out lone CR or LF characters not part of a newline, the BEL character (\a), and other character codes. Imagine a user who sends a command line to your server with “say ” followed by a couple dozen BEL characters in it - every other player in the room will be treated to a long series of annoying beeps (if their client supports it, which some do). You can just strip out every non-printable character, which is any code less than 32 (or just use the C isprint macro from ctype.h). On a similar note, remember to always escape IAC bytes in your output that aren’t meant to be a part of a TELNET command, otherwise malicious users might find interesting ways to break other users’ clients using commands that let them send text to other players.
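A rough sketch of such a sanitizer, using the buffer-until-a-letter strategy described above:

#include <string>
#include <cctype>

// strip ANSI escape sequences and other non-printable characters
// (BEL, stray CR/LF, and so on) from a line of user input
std::string sanitize_input (const std::string& in) {
    std::string out;
    std::string::size_type i = 0;
    while (i < in.size()) {
        unsigned char c = in[i];
        if (c == 27) {                       // ESC starts a control code
            ++i;
            if (i < in.size() && in[i] == '[') {
                // "ESC [" sequences run until a letter
                while (i < in.size() && !isalpha((unsigned char)in[i]))
                    ++i;
            }
            if (i < in.size())
                ++i;                         // skip the final command byte
            continue;
        }
        if (isprint(c))                      // drop BEL, CR, LF, etc.
            out += (char)c;
        ++i;
    }
    return out;
}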

And that’s all, folks!

Syndicated 2007-12-10 04:48:22 from Sean Middleditch

Squirrel Wants To Be Lua

Squirrel is another language I took a look at this morning. Squirrel is essentially an offshoot of Lua, being written by a games developer who was dissatisfied with some of Lua’s shortcomings in older (pre-5.1) Lua releases.

The biggest change one will notice between current Lua and Squirrel is that Squirrel has a built-in class mechanism. Unfortunately, the class system is single-inheritance only with no mixins or interface support, so developing larger applications would not be overly easy to do with Squirrel. This is probably just fine given that Squirrel seems geared more towards embedding than application authoring, just like its conceptual ancestor, Lua.

I think the language comparison page for Squirrel (which only compares against Lua) best explains Squirrel. I quote:

Lua has an established and growing set of 3rd party libraries. That’s the biggest problem with Lua spin-offs : you trade in compatibility with everything written for Lua (see http://lua-users.org/wiki/LibrariesAndBindings) for some syntactic sugar and a feature or two that will be implemented in some future Lua version anyway…

True enough, while Lua might not be intended for developing complete applications, it’s got enough addons and extensions to make it possible. Squirrel is severely lacking in such things. There isn’t any standard way to do networking, for example, which is a big requirement for this project. I could write a core C module that embeds Squirrel and adds the extra routines I need, but then I might as well just use a more well known language like JavaScript on SpiderMonkey, or just embed Lua.

One thing I do however very much like about Squirrel over both JavaScript and Lua is that variables aren’t automatically declared in the global namespace. With either JS or Lua, if you mistype a variable name in an assignment, you not only silently get a new variable, but it’s a global variable. Yuck! In Squirrel, assigning to an undeclared variable results in an error.

If you’re looking for a Lua-like runtime to embed with a syntax closer to JavaScript or C++, check Squirrel out; it’s just what you’re looking for. If that syntax isn’t important to you, though, just use Lua instead.

Syndicated 2007-12-05 22:26:25 from Sean Middleditch

Io Language

Next up on my language tour is Io, a tiny interpreted pure-OO language. Io is small, really small. Lua is a bit larger, actually.

Io has some admirable design goals. From the front page of the Io website:

Io is a small, prototype-based programming language. The ideas in Io are mostly inspired by Smalltalk (all values are objects), Self (prototype-based), NewtonScript (differential inheritance), Act1 (actors and futures for concurrency), LISP (code is a runtime inspectable/modifiable tree) and Lua (small, embeddable).

I can definitely feel the impact of those languages on the design of Io, Smalltalk especially. Everything is an object and all operations are simply messages (methods) sent to objects. The syntax is just a little foreign for a C weenie like me, but it’s not too far out there, and is something I could get used to quite quickly; it’s not anywhere as foreign feeling as Objective-C, which I feel is a disgusting monstrosity of language design gone wrong. (Seriously, use Smalltalk, or use C. Don’t even get me started on Objective-C++. That language is God’s punishment for the Sins of Mankind.)

I downloaded and built Io, and started working at getting a sample project up and running. Io is a minimal language, so addons are necessary for a lot of things, such as networking. That’s where I hit the snag - the Sockets addon appears to be wholly undocumented on the Io website, and looking at the list of available methods is leaving some questions. Searching around on the net for examples isn’t bringing much up. Io is not really in use for any large production apps yet, so it’s still got a lot of rough edges in the documentation and examples areas.

Like I said with Pike, life is too short to deal with that sort of thing. I’d love to give Io a spin, but not for the project I’m on now. I’ve bookmarked it and plan on taking another look at it in 6-12 months on my next project. Maybe it’ll be ready for some serious use then; the development is active and the community seems fairly healthy, so I expect it’ll grow up pretty quickly.

Syndicated 2007-12-05 21:04:01 from Sean Middleditch

JewelScript Is No Jewel In The Rough

I decided to look up some non-mainstream languages before moving on the list of Big Popular languages I wanted to try for this new project. I took a look at a few that were outright unsuited, and then found and spent some time looking at JewelScript. Like Pike, JewelScript is an interpreted OO language with a syntax very reminiscent of C/C++.

JewelScript has a lot of nice features. The syntax is familiar, but the addition of coroutines and a ‘var’ type in addition to the static typing makes certain classes of application a lot easier to write than they would be in C++. Unfortunately, JewelScript also seems to be so heavily based on C++ that some of the painful parts of C++ programming are firmly a part of JewelScript programming.

The biggest turn-off here is the reference system. In JewelScript, all variables are copy-by-value, just like C++. If you want two variables to refer to the same object, you must declare one of the variables as an explicit reference to the other. That’s not so bad, really, until you get to function arguments. Just like in C++, you end up having to declare many function arguments as references solely for performance reasons, and not because the argument actually needs reference semantics. Also just like C++, that can result in programming errors, so JewelScript has a const reference type, which is a slightly different set of semantics but at least allows you to get decent performance without opening yourself up to programming mistakes.

Really, though, if I felt like declaring 90% of my function parameters with a logically unnecessary const and an equally unnecessary & just to work around the performance problems of the language design, I’d have stuck with C++. JewelScript could potentially fix this behavior with the simple addition of copy-on-write behavior for objects and other “fat” datatypes. That gives the programmer the full performance benefits of a const reference without the overhead of manually declaring const references when copy-by-value semantics were what they wanted in the first place.

JewelScript also lacks a comprehensive standard library. That’s fine in many respects, but that, coupled with a too-C++-like language design, makes it a poor choice for my project. However, for anyone looking for a language to embed in C++ that offers a very familiar syntax, JewelScript might be just what you’re looking for.

Syndicated 2007-12-05 18:18:08 from Sean Middleditch

Passing on Pike

I’m starting a new project, and I decided to give the Pike language a try. It looks like a nice language for an old C/C++ holdout like me. Statically typed but still pretty flexible, very C++-ish in syntax, has a decently sized standard library, and not too slow for an interpreted language. Bonus points for having implementations of a ton of application network protocols in the standard library, including the ones I needed.

Sadly, it just isn’t meant for me. The language debugging facilities are atrocious. If you thought C++ template instantiation errors were hell, you’ll not be too pleased with the average Pike backtrace or compilation error. It gives way too much information about things that don’t matter and nothing useful on the actual error itself. For example, if you pass the wrong argument type to a function, you’d expect something like “Argument 2 (client) expects string, got int.” Instead, you get a huge line detailing the entire signature of the function, and then a second line detailing the entire signature of the function call, leaving you to scan through and find the differences.

That wasn’t going to sour the deal for me, though. I’m used to C++, so huge and nearly useless error messages are something I can deal with. Forging on, I found some oddities in the standard library that are just not working out well for me. For example, the String type includes a trim_all_whites function. Why isn’t this trim? Extra typing is half the reason I wanted to avoid using C++ itself. The HTTP implementation forces a ton of extra string copies all over the place. The TELNET implementation is one of the most awkward protocol handler classes I’ve ever seen, plus it seems to be rather buggy. These are all relatively minor things. Silly function names I can learn to live with, and it’s not like I’m not up to writing an HTTP or TELNET protocol handler that more closely meets my needs.

The real kicker, however, is the total lack of certain features… or possibly just the lack of documentation on using those features. The official Pike documentation is almost entirely lacking in examples, many functions and classes are undocumented (some of which have a nice Fixme comment in the docs, while others are just blank), and I simply can’t figure out how to do some things that I’d really expect out of a language like Pike. I’m fairly sure Pike can do them, I just can’t figure out how.

Life is too short to spend a ton of time trying to figure out undocumented features of a language, so I’m passing on Pike for now. It’s a shame, because I like the Pike language itself; I’m just not willing to deal with an idiosyncratic and partially undocumented standard library if I don’t have to.

There are some other languages that are on my list of Things To Try, so I’ll report back on those when I get the chance to play with them a bit.

Syndicated 2007-12-05 07:50:50 from Sean Middleditch

Irritating Java Environment

Debian/Ubuntu has what I think is a pretty dumb Java environment.

Basically, jar files are not automatically found in /usr/lib/java/ and JNI libraries are not automatically found in /usr/lib/jni, requiring you to create a goofy little shell script for every Java app you write that sets these things if you need them. Any Java app that uses external JAR files or JNI files (e.g., SWT) is instantly made non-portable by the fact that you have to set weird system-specific path settings instead of just being able to run java -jar myapp.jar.

The justification for this seems to be, “well, users might have multiple JVMs, and /usr/bin/java alternative might not be set to the most complete/featureful one, and since we only support software packaged officially for Debian**, we just recommend that packagers include scripts that set the specific JVM and classpath and so on they need, and never ever use the essentially useless /usr/bin/java command.”

Here’s an idea: make /usr/bin/java a system wrapper around the chosen alternative that automatically sets things up so the required JAR files located in the manifest of apps are found without the Debian-specific paths and so that the library search path is set so the Debian-specific /usr/lib/jni path is used for loading JNI shared objects. Then shit will actually work. For the users who set their java alternative to point to some incomplete or non-functional JVM, tell them to kiss your ass and install a JVM that will actually work.

** And this, folks, is still the #1 usability killer in Linux. If it isn’t part of the pre-selected set of almost certainly out of date software packages shipped by the specific version of the specific distribution you’re running, the software is a complete and total bitch to install and use, even when that software happens to be something designed from the ground-up to be portable between distros (or even OSes) in binary format. Packaging systems, for all their benefits, are to many non-technical users just one gigantic artificial barrier to ease of use. The Microsoft software installation model, for all its flaws, actually freaking works when it comes time to install something released after the OS install CD you have was shipped. Linux is the easiest OS in the world to use, so long as you only use it for the things the distro package set says you can.

Syndicated 2007-12-03 03:50:07 from Sean Middleditch

Security Hole of the Day

So a major games site many of us geeks might frequent has a fun security hole. I couldn’t remember the login for my account, but whenever I failed entering the right email and password combo, I noticed it set a ?login=false query parameter in the resulting URL. Sure enough, changing the false to true results in my being logged in, with a user that has a blank name (”Welcome, !”) and no email.

The worst part is, I have no way to log in other than using said hole, since there is no “forgot password” link or any other way that I can possibly figure out to get into my account, so I had to make a new one (which is free, it just validates your email). I suppose it’s not that serious of a problem, since user accounts really don’t do anything critical other than provide marketing details to the company and allow forum posting, but I’m still pretty sure that they don’t want people bypassing their “subscriber only content” restrictions.

The site isn’t even written in PHP, I think. The URLs don’t give any indication of the language, but I vaguely recall seeing ASP-ish traceback errors a few months ago when something else was broken on the site. The Good Samaritan part of me wants to kindly inform the site operators of their blunder, but given the lawsuit-happy and technologically ignorant business types running things in many companies, I’d probably just get a felony charge for “hacking” for my effort. :/ So instead I just hope they realize it on their own, fix it, and maybe add in that “forgot password” link in the process - I liked my old username a lot better than my new one.

It could be worse, I suppose. One of the sites I got to clean up last year (aside from its bazillion other horrendously broken design points, and with its code comments all written in broken Portuguese) used the classic ?admin=true authentication check. At least it wasn’t a JavaScript routine with the username and password stored in the HTML.
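For anyone who hasn’t seen this anti-pattern in the wild, here’s a minimal sketch of it. This is a hypothetical servlet, not the actual site’s code (which, as noted, probably isn’t even Java); the class and parameter names are invented for illustration.

// Hypothetical sketch of the broken check described above: auth state
// decided entirely by a client-controlled query parameter.
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class BrokenLoginServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Anyone can edit the URL to say ?login=true (or ?admin=true),
        // so this "check" logs in every visitor who tries it.
        if ("true".equals(req.getParameter("login"))) {
            resp.getWriter().println("Welcome, " + req.getParameter("name") + "!");
        } else {
            resp.getWriter().println("Login failed.");
        }
        // The fix: verify credentials server-side and keep the logged-in
        // state in the session, never in anything the client can rewrite.
    }
}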

Syndicated 2007-12-01 05:29:17 from Sean Middleditch

Apartment Found, FLOSS Work, Language Design, Rambling

Apartment hunting is over already.

Looks like I’ll be living in Aspen Chase, off Golfside between Clark and Washtenaw, right next to WCC and US-23 and I-94, plus right near all the cool shopping places and restaurants and Ann Arbor.

I move in at the end of the month. Thinking about throwing an apartment warming / alcohol cabinet stocking party sometime in early-mid January.

So, now I need to ratchet up how much I work (or where I work). I’m also really interested in getting back into FLOSS work. I’m digging through some Ghostscript bounties (two birds, one stone) and seeing if there’s anything a newcomer to the project without much 2D compositing experience can tackle. A few look applicable.

After that, I’m unsure. My three biggest favorite things to work on are games, low-level infrastructure and language tools, and usability. FLOSS games don’t interest me much; I’m not sure why, but for some reason Open Source just doesn’t seem to work as well for game projects as it does for everything else. Possibly because artists/designers/musicians aren’t as into the Give It Away For Free thing as programmers are.

So, for low-level stuff, I was at first thinking X and drivers. Then again, that tends to be a pain without a second set of hardware, plus there’s the chance of breaking hardware (hopefully rare, but still a possibility, as I’ve heard), and I don’t really have the cash for spare hardware, graphics cards, etc., and I need this one machine to continue working perfectly for my regular job. So maybe that’ll be an option down the road when I have more spare cash.

That pretty much leaves general desktop app work, or work on lower-level desktop code like HAL or D-BUS. Now, while I’m a fan of GNOME’s desktop design, I actually really dislike their underlying frameworks. I mean, OO in C certainly works, but… damn is it ugly. Writing desktop software is a very high-level thing to do, and really would be better with a high-level language. Sadly, C# is dead for political reasons, C++ isn’t really all that great (but it certainly blows C out of the water - compare the pleasure that is the Qt API to the glib/gtk API), Java might very well become a good choice soon what with it being Free and IcedTea coming along, but then I’m not a huge fan of Java (C# is Java “done right,” but see the afore-mentioned political issues), and so on. Vala looks like a fun project (language design, low-level framework… my favorite areas), so I might look into that very soon. I’m especially not fond of how it just translates to C (there are several very good reasons why C++ no longer does that), so maybe giving it an LLVM backend would be spiff, plus I’ve really been wanting to play with LLVM anyway. Actually, working on the clang frontend for LLVM is another option.

There is then always the part of me that just wants to do something new and exciting, but that’s… difficult. Not so much in writing it, but in finding something new and exciting that’s actually worthwhile. I mean, doing all the web work I do, I’d love to have a language dedicated solely to doing web work. PHP, Java, C#, Ruby, Perl… all of these are extremely general-purpose languages that have libraries for working on the web, but they still make things more complicated than you really need. (Ruby on Rails does purportedly make things very easy, but then, you’re not so much coding Ruby as you are coding in a specialized dialect built on top of Ruby - plus, having hacked on the Ruby interpreter in years past, I’m not a fan of the underlying technology, unless Matz and co have done some serious work on it in the last few years… maybe I should take a look.) Really, 90% of what a web app does is spit out HTML and run SQL queries. Those two things should be SUPER easy, and the easiest way to do them should also be both the most efficient and the most secure way to do them. Just makes sense. I have ideas on how to do this, so it’s tempting to write mod_languagethatdoesnotsucklikephp… but that gets back to whether the project would really get used much and really be worthwhile, or just be yet another niche language used by three people in tiny projects nobody’s ever heard of.

I’m equally tempted to do a more low-level language. D is a neat language, but the design is a little… fluid. Plus its standard library sucks, and of the two competing projects to write a new one, both feature new ways of sucking as well as little chance of ever actually being “standard” (not that anything in D is standard, since it’s just a dump of whatever features the lead developers think are cool at the moment). I like C, I really do, and I really hate the way that C# and Java force OOP down your throat even for things that aren’t best modeled by OOP, or for things where their object model is not quite the best fit. It would be nice to do C with an enhanced type system that makes OOP possible and easy, but makes other styles of programming easy too, as well as providing much better high-level data structures than the way C++ does things. Basically, I’d like a language that has high-level features but also allows low-level programming, unlike Java or C#, which put everything on their custom managed runtime. To be completely honest, neither Java nor C# really helps all that much with being portable except for trivial programs, and the security benefits of managed runtimes aren’t nearly as useful as advertised except for applet-like situations (seriously, it’s not really that much harder to write secure code in C than in any managed language, from Python to C# - buffer overflows and other memory-address-based attacks are less likely, but that’s hardly the sole kind of security hole around). But still, there are a bazillion “a better C/C++” projects out there, and even if I do make The Best(tm), how useful is that really going to be in the grand scheme of things?

It thus seems best to focus on something that people will actually use, instead of yet another quasi-academic intellectual-masturbation sort of project. GNOME and LLVM are my top choices. LLVM is a little more up my alley, but GNOME work can be fun. It’s been years since my last patch to GNOME, too. Maybe it’s time to rectify that sad fact.

Not seeing any of that likely until January, though - need to earn some raw cash now and get ready to move into said new apartment in a month. My rent will be going up by $325/month, plus I won’t be splitting utilities or Internet anymore. Yay fun.

Syndicated 2007-11-30 22:23:53 from Sean Middleditch
