Older blog entries for Omnifarious (starting at number 132)

No more autoconf for endianness detection

autoconf is annoying to work with, and I think that programs that rely excessively on it for cross platform compatibility have issues of their own. Sometimes you really have no choice though.

Fortunately I recently discovered one place where I now do have a choice where I didn't before. This little code snippet can be optimized by gcc at compile time into a constant expression. That means that gcc realizes there is only one possible result and it uses that result in place of actually running the code in the function. Here is the code snippet:

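#include <tr1/cstdint>  // for ::std::tr1::uint32_t

// Returns true when the least significant byte is stored first in memory.
inline bool is_little_endian()
{
   const union {
      ::std::tr1::uint32_t tval;
      unsigned char tchar[4];
   } testunion = { 0x11223344ul };
   return testunion.tchar[0] == 0x44u;
}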

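This is guaranteed to work on C99 systems. (Strictly speaking, C++ does not define the result of reading a union member other than the one last written, but gcc documents this kind of type punning as supported.) And, as I said, gcc is capable of recognizing it as a constant expression. This also means that if you have code like this: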

   if (is_little_endian()) {
      do_something();
   } else {
      do_something_else();
   }

gcc will be smart enough to see that only one branch of that if will ever be taken and optimize the other completely out of your code.

Normally you'd want to use autoconf for this so you will have a preprocessor macro that will elide the code for you. The fact gcc can optimize this well means you don't have to do that to get efficient code.
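As a rough sketch of where a check like this might get used, here is a byte-order conversion that branches on it. Everything here is an assumption for illustration: the names is_little_endian_memcpy and to_big_endian are hypothetical, it needs C++11 for <cstdint>, and it swaps the union for std::memcpy, which is a different technique from the snippet above that optimizing compilers can usually fold just as well.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the post): detects byte order by copying
// the bytes of a known value rather than reading them through a union.
inline bool is_little_endian_memcpy()
{
   const std::uint32_t probe = 0x11223344u;
   unsigned char bytes[sizeof probe];
   std::memcpy(bytes, &probe, sizeof probe);
   return bytes[0] == 0x44u;
}

// Hypothetical example use: convert a host-order value to big-endian
// (network) order, swapping bytes only on a little-endian host.
inline std::uint32_t to_big_endian(std::uint32_t v)
{
   if (is_little_endian_memcpy()) {
      return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
             ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
   }
   return v;  // already big-endian
}
```

With optimization turned on, the expectation is that only one of the two return paths survives in the generated code, with no preprocessor macro involved.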

Syndicated 2009-09-10 15:27:27 from Lover of Ideas

8 Sep 2009 (updated 9 Sep 2009 at 00:11 UTC)

IPv6 addressing oddity

I've noticed an interesting oddity in IPv6 addressing...

::ffff:n.n.n.n refers to IPv4-only hosts, so that a program written for IPv6 only and running on a dual-stack machine can address them. There is another class of address that is similar, but not quite the same, and I don't actually understand when it would ever be used. These are called "IPv4 compatible IPv6 addresses", and they are of the form ::n.n.n.n.

Interestingly the IPv6 IN6ADDR_ANY address is ::, which is equivalent to ::0.0.0.0. Fortunately, the IPv4 INADDR_ANY address is 0.0.0.0 (also known as 'this host on this network' in the 'Special Addresses' section of RFC 1700) so there doesn't seem to be any real problem.

And finally, the real problem. The IPv6 equivalent of localhost or IPv4's 127.0.0.1 is ::1, and this is equivalent to ::0.0.0.1 which makes it an 'IPv4 compatible IPv6 address'. But the IPv4 address it maps to is, according to RFC 1700, some kind of local identifier for a host on a network. That seems like an odd conflict and inconsistency to me, and I'm not really sure what it means.

Of course, I've never seen any addresses in the 0.0.0.0/8 block be used at all aside from 0.0.0.0 itself, so it's likely not a real problem. But I'm still curious.

Edit 16:36: I have the answer. According to RFC 4291 section 2.5.5.1, the meaning of ::n.n.n.n addresses as 'IPv4 compatible IPv6 addresses' is deprecated, so there is no longer any overlap in meaning between the special IPv6 addresses ::1 and :: and any IPv4 address.

Well, that was the right decision. The distinction between ::ffff:n.n.n.n addresses and ::n.n.n.n addresses was confusing and unclear anyway.
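To make the address forms concrete, here is a small sketch that sorts an IPv6 address string into the categories discussed above by looking at its 16 raw bytes. The function name classify_ipv6 and its labels are hypothetical; it assumes a POSIX system for inet_pton, and it treats :: and ::1 as their own categories rather than as ::n.n.n.n addresses.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <cstring>
#include <string>

// Hypothetical helper: names the special form an IPv6 address falls into.
std::string classify_ipv6(const char *text)
{
   in6_addr addr;
   if (inet_pton(AF_INET6, text, &addr) != 1) {
      return "invalid";
   }
   const unsigned char *b = addr.s6_addr;  // 16 bytes, network byte order
   static const unsigned char zeros[12] = {0};

   // ::ffff:n.n.n.n is 80 zero bits, then 16 one bits, then the IPv4 address.
   if (std::memcmp(b, zeros, 10) == 0 && b[10] == 0xff && b[11] == 0xff) {
      return "IPv4-mapped";
   }
   // ::n.n.n.n is 96 zero bits, then the IPv4 address.
   if (std::memcmp(b, zeros, 12) == 0) {
      const bool upper_zero = (b[12] | b[13] | b[14]) == 0;
      if (upper_zero && b[15] == 0) return "unspecified";  // ::
      if (upper_zero && b[15] == 1) return "loopback";     // ::1
      return "IPv4-compatible (deprecated)";
   }
   return "other";
}
```

Note that ::0.0.0.1 parses to exactly the same bytes as ::1, which is the overlap that prompted the question in the first place.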

Syndicated 2009-09-08 08:11:38 (Updated 2009-09-08 23:46:30) from Lover of Ideas

Ugly C++ technique that I'm still proud of thinking of

I'm quite pleased with myself. :-) I've been participating on stackoverflow.com recently. HazyBlueDot had an interesting question in which (s)he was trying to use ::boost::function to get around a broken library interface.

In particular, the library interface allowed you to register a callback function, but it did not provide you a way of giving it a void * or something to pass back to you so you could put its call of your function back in context. HazyBlueDot was trying to use boost::function in combination with boost::bind to add in the pointer and then call his own function. The only problem is that the resulting boost::function object then couldn't produce an ordinary function pointer to pass to the callback.

This, of course, cannot be done in a static language like C++. It requires on the fly generation of first-class functions. C++ simply can't do that. But, there are various interesting tricks you can pull to generate functions at compile time with templates in ways that can help with this problem, even if they can't fully solve it.

I'm particularly pleased with my solution, which looked something like this:

#include <boost/function.hpp>

using ::boost::function;

typedef int (*callback_t)(const char *, int);

typedef function<int(const char *, int)> MyFTWFunction;

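// Each distinct MyFTWFunction object used as the template argument produces
// its own callback_binder class, and so its own plain static function.
template <MyFTWFunction *callback>
class callback_binder {
 public:
   // An ordinary function pointer can refer to this; it forwards to *callback.
   static int callbackThunk(const char *s, int i) {
      return (*callback)(s, i);
   }
};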

extern void register_callback(callback_t f);

int random_func(const char *s, int i)
{
   if (s && *s) {
      return i;
   } else {
      return -1;
   }
}

MyFTWFunction myfunc;

int main(int argc, const char *argv[])
{
   myfunc = random_func;
   register_callback(&callback_binder<&myfunc>::callbackThunk);
   return 0;
}

This basically allows you to automatically generate a 'thunk' function, a normal non-member function that can be passed to the callback, that then calls another function and adds the contents of a global variable you specify as a template parameter. It doesn't fully solve the problem, but it partially solves it. And I think in this case it will do something pretty close to what HazyBlueDot wants.
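Here is a self-contained variation on the same idea, showing that two different globals give two different thunks. It is only a sketch under some assumptions: std::function (C++11) stands in for boost::function, and thunk_for, add_offset, count_chars and invoke_registered are hypothetical names, with invoke_registered playing the part of the library that only accepts a plain function pointer.

```cpp
#include <functional>
#include <string>

typedef int (*callback_t)(const char *, int);
typedef std::function<int(const char *, int)> BoundFunction;

// One plain static function is generated per global BoundFunction object.
template <BoundFunction *target>
struct thunk_for {
   static int call(const char *s, int i) { return (*target)(s, i); }
};

// Globals with external linkage, so their addresses can be template arguments.
BoundFunction add_offset;
BoundFunction count_chars;

// Hypothetical stand-in for a library that calls a registered callback.
int invoke_registered(callback_t f, const char *s, int i)
{
   return f(s, i);
}
```

A use of it might look like this:

```cpp
int offset = 100;
add_offset = [offset](const char *, int i) { return i + offset; };
count_chars = [](const char *s, int i) {
   return static_cast<int>(std::string(s).size()) + i;
};
callback_t a = &thunk_for<&add_offset>::call;   // carries 'offset' along
callback_t b = &thunk_for<&count_chars>::call;  // a separate function
```

The limitation is still there: the number of thunks is fixed at compile time, one per global.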

Syndicated 2009-09-05 06:21:45 from Lover of Ideas

Empathy and OTR

Empathy has been starting to make it into Linux distributions as the default IM client. I think this is a mistake at this juncture, and this bug about Empathy not supporting OTR is one of the larger reasons why.

Another reason why is that Empathy seems to be connected with several different libraries and there is no clear sense as to what functionality lives where. It appears to be something of a spaghetti mess of libraries. I mostly figured this out because of repeated calls to 'code it or shut up' in response to the bug I posted.

One of my responses was good enough that someone else felt the need to cross-post a link to it in the Launchpad bug about lack of OTR support in Empathy.

I will cross-post it here:

(In reply to comment #15)

> You seem strangely interested in security... provided by (by your own words) a broken security layer? Do you really think that providing broken security, and lulling people into false sense of security is better than providing no "security" at all?

OTR's brokenness is due to the fact that it is a hacky kludge on top of existing IM protocols, not because it has any security flaws. It's inelegant and ugly, but it works.

I'm all for an elegant solution. But I don't think it should take a backseat to interoperability. I know that the various IM protocols are also mostly a bunch of ugly kludges as well. But that doesn't stop them from being implemented.

> And to others. I am not a Telepathy developer... but seriously guys, flaming developers while not being ready to get yourselves on the line? If you find it useful and especially if you find it critical, do it yourself. Otherwise, feel free to keep using Pidgin until you get this critical feature, which Thilo considers broken by design.

> I think there's room for other improvements before encryption, because I, and many other home users, find it unnecessary. Encryption is not important for majority of people on this world.

I am worried because Empathy appears to be getting a huge userbase and being used as the default IM client for a number of distributions without having a feature I think is incredibly important and should've been built in at the start, almost especially because most users don't really care about it.

Most people will not care about encryption. Most people also do not care about ACID database semantics. But anybody who made a database lacking the latter feature (i.e. Microsoft Access) would be roundly and justly flamed. Especially if they managed to somehow get that database into general use.

There are a whole host of features that users do not care about but are critical pieces of infrastructure. One of the things that most pleases me about Adium is that the developers understood this, and so many of my friends who have no clue or desire for encryption end up using it anyway because they use Adium.

> If other clients provide you security, use those. Or use email+GPG for even more security. Filing a request is fine. Posting a comment supporting the request is fine. Attacking people like some of you did is not fine.

Email encryption is nearly a lost cause. But with Adium and a couple of other popular IM clients supporting OTR, widespread IM encryption was beginning to happen. I don't think activists in Iran should have to worry about which IM client their friends are using in order to avoid being snooped on. I don't think their choice of IM client should be able to be used to single them out for special treatment by their government. All new IM clients should just do the right thing out of the box.

Widespread support for good encryption is not something I care about because I am especially paranoid about my own IM conversations. It's because I care about the pernicious effects of all IM conversations being potentially public knowledge.

Someone else goes on later to suggest that Empathy support some horrible idea like TLS over XMPP, which, in addition to being an awful idea for any number of reasons, also fails to address the issue of support for any protocol aside from XMPP.

In order for encryption to be useful in a communications system, everybody has to be able to use it whether they want it or not. It should be a first-class feature designed in from the very beginning, not tacked on as an afterthought (something that OTR in Pidgin fails at) and certainly not treated as unimportant because only a few really want it.

Syndicated 2009-08-31 21:23:40 (Updated 2009-08-31 21:25:34) from Lover of Ideas

30 Aug 2009 (updated 30 Aug 2009 at 23:09 UTC)

Is this really faster?

This:

unsigned int clipdigit(unsigned int * const v)
{
   unsigned int digit = (*v) % 10;
   (*v) /= 10;
   return digit;
}

is turned into this:

.globl clipdigit
	.type	clipdigit, @function
clipdigit:
.LFB11:
	.cfi_startproc
	movl	(%rdi), %ecx
	movl	$-858993459, %edx
	movl	%ecx, %eax
	mull	%edx
	shrl	$3, %edx
	leal	0(,%rdx,8), %eax
	movl	%edx, (%rdi)
	leal	(%rax,%rdx,2), %edx
	movl	%ecx, %eax
	subl	%edx, %eax
	ret
	.cfi_endproc
.LFE11:
	.size	clipdigit, .-clipdigit

As a small hint/bit of explanation, 2^32 - 858993459 = 3435973837 = (2^35 + 2) / 10.

Is mull really that much faster than divl on x86_64 machines?

I was expecting to get code more like this rather straightforward bit:

.globl clipdigit
	.type	clipdigit, @function
clipdigit:
.LFB11:
	.cfi_startproc
	movl	(%rdi), %eax
	movl	$10, %ecx
	xorl	%edx, %edx
	divl	%ecx
	movl	%eax, (%rdi)
	movl	%edx, %eax
	ret
	.cfi_endproc
.LFE11:
	.size	clipdigit, .-clipdigit


It turns out in testing that the second clip of code is much, much slower than the first clip. The strange mull method is about 5 times faster than the straightforward divl method. Wow, divl seems really broken if it's that slow.
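For anyone puzzling over the first listing, here is roughly what it computes, written back out as C++. This is my own sketch rather than anything gcc emits, the function names are hypothetical, and it assumes C++11 for <cstdint>. The constant 3435973837 is (2^35 + 2) / 10, so multiplying by it and shifting right by 35 bits gives the quotient for any 32-bit unsigned value.

```cpp
#include <cstdint>

// Divide by 10 using a multiply and a shift instead of a divide instruction.
inline std::uint32_t div10_by_multiply(std::uint32_t v)
{
   const std::uint64_t magic = 3435973837u;  // (2^35 + 2) / 10
   // mull leaves the high 32 bits of the 64-bit product in %edx, and
   // shrl $3 shifts those three more, for a total shift of 35 bits.
   return static_cast<std::uint32_t>((v * magic) >> 35);
}

// Same contract as clipdigit: returns the last decimal digit of *v and
// replaces *v with the quotient.
inline std::uint32_t clipdigit_by_multiply(std::uint32_t *v)
{
   const std::uint32_t q = div10_by_multiply(*v);
   // The two leal instructions build q * 10 as q * 8 + q * 2, and the
   // final subl takes that away from the original value.
   const std::uint32_t digit = *v - q * 10u;
   *v = q;
   return digit;
}
```

So the whole thing is one multiply, a shift, and some cheap address arithmetic, with no divide instruction anywhere.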

Syndicated 2009-08-30 20:38:53 (Updated 2009-08-30 22:39:02) from Lover of Ideas

C++ on the iPhone

While I do not recommend that anybody develop anything for the iPhone, I was recently investigating something about it.

I would like to repeat that I do not recommend that anybody develop anything because Apple's policies make it questionable as to whether or not your app will ever make it to the app store at all. They have no compunctions about refusing apps for the most bizarre of reasons, and even worse, refusing apps because either Apple or AT&T perceives them as somehow competing.

If you want to develop for a mobile phone platform, go for the Android. That is a clear and open market.

That being said, I did have reason to investigate C++ on the iPhone recently, and I came across these 3 well-written articles by someone who's tried to develop C++ on several different mobile phone environments. I would like to point to them here because other people looking for this information should be able to find it easily, and so I can find it easily.

  • C++ on iPhone: Part 1 - in which it's discovered that yes, constructors for global variables are indeed run before main starts and their destructors are called after main ends.
  • C++ on iPhone: Part 1a - Yeah, yeah, the last one didn't seem at all iPhone specific, but really I did run it on an iPhone and it worked just like it ought to.
  • C++ on iPhone: Part 2, Exceptions - Why yes, exceptions do indeed work, and there's even some hints that there's been an attempt at integration between Objective-(C/C++) 2.0 exceptions and C++ ones.
  • C++ on iPhone: Part 3, Run Time Type Identification - Yep, this works too.
  • Of BOOL and YES - BOOL is an evil Objective-C type that's really a typedef and a holdover from the days when C didn't have a bool type. Don't forget that there are many values that mean YES besides YES, so compare against NO if you have to compare. (IMHO, you should just always assign to a real bool type as the iPhone C compiler is C99 compliant and that does have bool).

So yes, it appears the iPhone does C++ just fine. That guy was promising to write more on how Objective-(C/C++) and C++ mixed. But I don't think he ever got around to it.

Syndicated 2009-08-22 12:10:58 (Updated 2009-08-22 12:20:38) from Lover of Ideas

21 Jul 2009 (updated 22 Jul 2009 at 01:11 UTC)

Amazon randomly burns people's copies of 1984

The publisher of several books by George Orwell decided that they didn't like the fact that they'd published them electronically. Many people had bought these books for their Kindle. Mysteriously, these books completely disappeared from people's Kindle book readers.

In my humble opinion, people who bought a Kindle deserve exactly what they got, and I hope Amazon does it again. If you buy into DRM in any way you are asking for stuff like this to happen to you. The reasonable response is not to complain bitterly about how unfair it is, but to not buy DRM enabled products.

People seem in a terrible rush to trade away essential rights for the sake of convenience. They spare little thought for what they're doing and then act surprised at the ultimate result.

At the recent Convergence I was on a panel about copyright. People there persisted in calling copyright a 'property right' and referred to the vast network of weird and wonderful rights that are patents, trademarks and copyrights as 'intellectual property'. I object strongly to the conflation of trademark, patent and copyright into 'intellectual property'. The rules around each are very non-property-like and very different from each other.

And Techdirt comes to the rescue again with an article about how in many ways copyright is very much not a property right.

Syndicated 2009-07-21 22:25:18 (Updated 2009-07-21 23:37:23) from Lover of Ideas

20 Jul 2009 (updated 21 Jul 2009 at 06:09 UTC)

Another great Linux game

I haven't played it yet, but I would like to note that Frictional Games has released Penumbra for Linux.

Blog of Helios wrote a nice blog entry detailing why this is such a great game. Unlike World of Goo, I'm not sure how well it will work on older hardware.

It's really nice to start seeing publishers of really good games start supporting Linux. The smaller publishers in the games industry tend to make the most interesting games, and it's the smaller publishers that have been doing Linux ports. Even though the larger publishers make less interesting games, their games tend to be more popular. I hope that the success smaller publishers have with porting leads larger publishers to start doing the same thing.

Games are an exception to my rule about using all Open Source if I can help it.

This game is a bit tricky to install on Linux, mostly because of library dependencies, especially on a 64-bit system. Frictional could use some install advice from the nearly trivial to install World of Goo game.

Syndicated 2009-07-20 19:12:53 (Updated 2009-07-21 05:14:28) from Lover of Ideas

Amazingly fun commercial game for Linux!

It's World of Goo. It's like a mad cross between Fantastic Contraption and Lemmings.

The best part (in my world anyway :-) is that the game is available for Linux. I wish more game companies would start doing this. It isn't that hard, and there is a market for it. Nearly as much of a market as for Mac games.

The game has gotten some fantastic reviews. It is light-hearted and bizarre, and the physics simulation based puzzles are highly entertaining. Other comparisons that come to mind are the work of Tim Burton and the video game Worms, more for artistic style than anything about how gameplay actually works.

Syndicated 2009-07-17 22:51:57 (Updated 2009-07-17 23:00:19) from Lover of Ideas

27 Jun 2009 (updated 27 Jun 2009 at 18:08 UTC)

Programmer's block

I've been working on coming up with a nice C++ (or, actually, C++0x) interface to the Skein hash function.

Skein has an interesting tree mode in which it's possible to parallelize the hash function calculation to a significant degree. I wanted to write a general interface for this so I could make a command line utility that used it to test it against the sha256sum command.

Applying a tree hash to an existing file is a no-brainer. But I wanted to be able to handle much more general cases in which the leaf data may not be available on a random-access basis. In particular if the file is coming in on stdin or something similar.

I was having difficulty coming up with a general extensible interface for this. Partly the interface for the system for handling leaf data needed a way to allocate chunks of leaf data to work on, and then release them. This would allow for a sliding window type approach to fetching leaf data.

My biggest and most recent breakthrough was realizing that the leaf data objects were like ::std::auto_ptr objects. I didn't want to force heap use, so I needed the data about a leaf to be copyable. But I didn't want to have to have any kind of silly reference counts or anything like that. So that meant I needed it to be moveable, not copyable. Just like auto_ptr. But auto_ptr is a kludge in C++. In C++0x there is a very nice concept called rvalue references that lets you implement move semantics very cleanly.

It took me a while to realize I wanted move semantics. I kept on beating on the interface and coming up with usage scenarios that were just awkward and broken. Once I figured it out, things went a lot easier.

Here is a link to what I finally came up with: skeintreepp.hpp.
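As a rough illustration of what moveable-but-not-copyable means here, and not the contents of skeintreepp.hpp, this is a minimal move-only handle to a chunk of leaf data. The class name leaf_chunk and its members are hypothetical, the buffer is assumed to be owned by someone else (so no heap use is forced), and it needs a compiler with rvalue reference support.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical move-only handle to a chunk of leaf data in a buffer that
// the caller owns. Exactly one handle refers to a given chunk at a time.
class leaf_chunk {
 public:
   leaf_chunk() : data_(nullptr), size_(0) {}
   leaf_chunk(const unsigned char *data, std::size_t size)
      : data_(data), size_(size) {}

   // Copying is forbidden, so no reference counts are ever needed.
   leaf_chunk(const leaf_chunk &) = delete;
   leaf_chunk &operator=(const leaf_chunk &) = delete;

   // Moving hands the chunk over and leaves the source empty.
   leaf_chunk(leaf_chunk &&other) noexcept
      : data_(other.data_), size_(other.size_)
   {
      other.data_ = nullptr;
      other.size_ = 0;
   }
   leaf_chunk &operator=(leaf_chunk &&other) noexcept
   {
      if (this != &other) {
         data_ = other.data_;
         size_ = other.size_;
         other.data_ = nullptr;
         other.size_ = 0;
      }
      return *this;
   }

   bool empty() const { return data_ == nullptr; }
   std::size_t size() const { return size_; }
   const unsigned char *data() const { return data_; }

 private:
   const unsigned char *data_;
   std::size_t size_;
};
```

Unlike auto_ptr, the transfer has to be spelled out with std::move, so an accidental copy is a compile error rather than a silent theft of the data.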

Syndicated 2009-06-27 07:22:29 (Updated 2009-06-27 17:42:23) from Lover of Ideas
