Older blog entries for aleix (starting at number 36)

Be careful with packed structures!

If you use the C language, you have probably wanted to pack your structures so that the compiler does not insert any alignment padding. This can be useful, for example, to build network packets. You can find the basics of packed structures using GCC here or here.

If everything seems clear, why am I writing this? Today, a coworker found a bug related to packed structures. The issue was with internal structures (i.e. substructures). A year or so ago, I learned that a substructure is not packed even if its enclosing structure is, but it seems I forgot about it, so I decided it was worth writing down here so that I do not forget it again (I’m sure I will).

I will take the example found in the GCC documentation (it is only there since version 3.4.0). Suppose you have the following code:

struct my_unpacked_struct
{
  char c;
  int i;
};

struct my_packed_struct
{
  char c;
  int  i;
  struct my_unpacked_struct s;
} __attribute__ ((__packed__));

struct my_packed_struct my = {
  .c = 10,
  .i = 20,
  .s.c = 30,
  .s.i = 40
};

If we generate the assembly for this (I have omitted some things not needed for the example), we will get:

        .globl _my
        .data
_my:
        .byte   10  <--- c
        .long   20  <--- i
        .byte   30  <--- s.c
        .space 3    <--- 3 bytes of alignment
        .long   40  <--- s.i

As you can see, the compiler has packed the enclosing structure but not the internal one: there are still 3 bytes of padding between s.c and s.i. So, if you want the internal structure to be packed as well, you need to pack my_unpacked_struct itself:

struct my_unpacked_struct
{
  char c;
  int i;
} __attribute__ ((__packed__));

Now, we get what we initially expected:

        .globl _my
        .data
_my:
        .byte   10  <--- c
        .long   20  <--- i
        .byte   30  <--- s.c
        .long   40  <--- s.i
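
The two layouts above can also be checked from C itself, without reading the assembly. This is only a minimal sketch, assuming GCC (or a compiler that accepts the same attribute); the structure and function names are made up for the illustration and are not part of the GCC example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The inner structure, once with natural alignment and once packed. */
struct inner_unpacked { char c; int i; };
struct inner_packed   { char c; int i; } __attribute__ ((__packed__));

/* Two packed outer structures; only the type of the member s differs. */
struct outer_with_unpacked_inner
{
  char c;
  int  i;
  struct inner_unpacked s;
} __attribute__ ((__packed__));

struct outer_with_packed_inner
{
  char c;
  int  i;
  struct inner_packed s;
} __attribute__ ((__packed__));

/* Print sizes and offsets so the padding inside s becomes visible. */
void
print_layouts (void)
{
  printf ("unpacked inner: sizeof = %zu, s.c at %zu, s.i at %zu\n",
          sizeof (struct outer_with_unpacked_inner),
          offsetof (struct outer_with_unpacked_inner, s),
          offsetof (struct outer_with_unpacked_inner, s)
            + offsetof (struct inner_unpacked, i));
  printf ("packed inner:   sizeof = %zu, s.c at %zu, s.i at %zu\n",
          sizeof (struct outer_with_packed_inner),
          offsetof (struct outer_with_packed_inner, s),
          offsetof (struct outer_with_packed_inner, s)
            + offsetof (struct inner_packed, i));
}

/* Sanity checks on the relationship between the two layouts. */
void
check_layouts (void)
{
  /* The outer members are packed in both cases: i directly follows c. */
  assert (offsetof (struct outer_with_unpacked_inner, i) == sizeof (char));
  assert (offsetof (struct outer_with_packed_inner, i) == sizeof (char));

  /* The unpacked inner structure keeps its own internal padding... */
  assert (sizeof (struct outer_with_unpacked_inner)
          == sizeof (char) + sizeof (int) + sizeof (struct inner_unpacked));

  /* ...while the packed inner structure has none. */
  assert (sizeof (struct outer_with_packed_inner)
          == 2 * (sizeof (char) + sizeof (int)));
}
```

Calling print_layouts() and check_layouts() from main is enough to try it. On a platform with a 4-byte, 4-byte-aligned int the sizes should come out as 13 and 10 bytes, which matches the two assembly listings.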

Packing the whole structure my_unpacked_struct is fine if you do not use it anywhere else, but it would be great to use variable attributes instead (we have used type attributes so far), so that we could pack only the internal substructure member, like this (it does not work):

struct my_packed_struct
{
  char c;
  int  i;
  struct my_unpacked_struct s __attribute__ ((__packed__));
} __attribute__ ((__packed__));

Update 2007/07/25: read the first comment to understand why the variable attribute is not working in this case.

By the way, in the example I have initialized the structure my using designated initializers.

And remember, be careful with packed structures!

Syndicated 2007-07-24 18:07:25 from axelio

Erlang R11B-5 in Fink

My friend jao has asked me to update the Erlang package in Fink to the latest version. It is still under validation, so until it is committed you can use this file. To install it, just type:

$ tar zxvf erlang-otp-R11B-5-fink.tar.gz
$ sudo cp erlang-otp-R11B-5-fink/* \
          /sw/fink/dists/unstable/main/finkinfo/languages
$ fink index; fink rebuild erlang-otp; fink install erlang-otp

It is better to back up your old Erlang Fink files first, just in case.

Update 2007/07/06: committed to Fink unstable.
Update 2007/06/28: added unixodbc2 and ncurses5 dependencies.

Syndicated 2007-06-19 22:30:15 from axelio

ECL in Fink

For anyone interested in ECL (Embeddable Common-Lisp), I have created the package for Fink. It is still under validation, so until it is committed you can use this file. To install it, just type:

$ tar zxvf ecl-0.9i-fink.tar.gz
$ sudo cp ecl-0.9i-fink/* /sw/fink/dists/local/main/finkinfo
$ fink index; fink install ecl

Update 2007/07/06: committed to Fink unstable.

Syndicated 2007-06-19 08:26:50 from axelio

Packing and unpacking bit structures in Python

Last week, I released the first version of the BitPacket Python module, which allows you to pack and unpack data like the struct and array modules do, but in an object-oriented way. At work I needed an easy way to create network packets and, at that time, I did not know about the struct and array modules, so I googled a bit and found the BitVector class, a memory-efficient packed representation of bit arrays, which I decided to use for my purpose.

I implemented three classes: BitField, BitStructure and BitVariableStructure (the latter two are derived from BitField). A network packet is represented by the BitStructure class, which does not contain any field at creation; the idea is that any BitField subclass can be added to it.

I will show you the most basic example. Suppose you need a simple network packet like the one below:

+---------------+--------------------+
|  id (1 byte)  |  address (4 bytes) |
+---------------+--------------------+

You could easily create a network packet using BitStructure, like this:

>>> bs = BitStructure('mypacket')
>>> bs.append(BitField('id', BYTE_SIZE, 0x54))
>>> bs.append(BitField('address', INTEGER_SIZE, 0x10203040))

and print its contents:

>>> print bs
(mypacket =
  (id = 0x54)
  (address = 0x10203040))

To unpack an incoming packet, we could use the variable created above or a new one without default values:

>>> bs = BitStructure('mypacket')
>>> bs.append(BitField('id', BYTE_SIZE))
>>> bs.append(BitField('address', INTEGER_SIZE))

Then we feed the incoming array of bytes to the structure:

>>> import array
>>> data = array.array('B', [0x38, 0x87, 0x34, 0x21, 0x40])
>>> bs.set_stream(data)

We can then access the packet fields by name:

>>> print '0x%X' % bs['id']
0x38
>>> print '0x%X' % bs['address']
0x87342140
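
For comparison, here is roughly what the same unpacking looks like when done by hand in C. This is only a sketch and not how BitPacket works internally: the my_packet structure and the unpack_packet function are hypothetical names, and the big-endian byte order is an assumption taken from the output above (the bytes 0x87 0x34 0x21 0x40 give 0x87342140).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* In-memory view of the packet: 1-byte id, 4-byte address. */
struct my_packet
{
  uint8_t  id;
  uint32_t address;
};

/* Unpack a 5-byte buffer into *out, reading the address as big-endian.
   Returns 0 on success, or -1 if the buffer is too short. */
int
unpack_packet (const uint8_t *data, size_t len, struct my_packet *out)
{
  if (len < 5)
    return -1;

  out->id = data[0];
  out->address = ((uint32_t) data[1] << 24)
               | ((uint32_t) data[2] << 16)
               | ((uint32_t) data[3] << 8)
               |  (uint32_t) data[4];
  return 0;
}

/* Unpack the same bytes used in the Python session and print the fields. */
void
unpack_demo (void)
{
  const uint8_t data[] = { 0x38, 0x87, 0x34, 0x21, 0x40 };
  struct my_packet p;
  int rc = unpack_packet (data, sizeof data, &p);

  assert (rc == 0);
  printf ("id = 0x%X, address = 0x%X\n",
          (unsigned) p.id, (unsigned) p.address);
}
```

Reading the bytes one by one like this avoids depending on both structure packing and the host byte order.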

There are a lot more possibilities to pack and unpack bit field structures by using BitStructure and BitVariableStructure. You can see all of them in the module’s online documentation.

Syndicated 2007-06-16 13:52:32 from axelio

Honeymoon and code poetry

During my honeymoon in Egypt (yes, I got married three weeks ago!) I had the chance to visit the new Bibliotheca Alexandrina. Before the guided tour, we spent some time walking around the different exhibitions inside the library building. In one of them, I found a set of punch cards, and the great thing is what I could read on them:

“It gives a feeling of completeness”

“A great code is probably as rare as diamonds”

“Like a good poem, there are levels of insight to be gained at each reading”

“Beautiful software reflects a profound understanding of the world in some way”

“Like a good lecture, it leaves me with a desire to use what I’ve just learned”

I don’t know if this was common practice in the punch card days or whether people were used to it (if anyone knows, I would be really interested to hear about it). I found it really refreshing: it seems that in those days people loved what they were doing; they loved to write software.

Syndicated 2007-05-26 13:36:43 from axelio

Easy acronyms generation

At work, we needed a way to easily generate a list of the acronyms used in our documents, so I wrote a script (tex-acronyms.py) that parses the acronyms in a specified LaTeX file and tries to find a description for each one (specified as a nomencl definition) in the files (*.tex) located in a directory. The script creates two files: one with the acronyms and their descriptions, and another with the conflicts, that is, acronyms with no description or with duplicate descriptions.

The following example is a sample of an acronyms list file. Note that the LaTeX document must use the nomencl package, since the acronyms are defined using the nomencl syntax.

\nomenclature{ADDR}{Address}
\nomenclature{ANSI}{American National Standards Institute}
\nomenclature{API}{Application Programming Interface}
...

It is also possible to provide a list of words to be excluded in the exclude_acronyms.txt file, which is looked up in the data directory by default (a different file can be specified using -x). For example:

CHAPTERS
CLOSED
DUPLICATE
FIRST
FIXED
BOTH
BUS
...

It is easy to find words to be excluded because they will be treated as errors (and written to the errors file) as no definition is available for them.

Finally, it is possible to parse the input files recursively by passing the -r option. This will also parse all the files included with \input.

A possible program call could be:

tex-acronyms.py -r -d ~/acronyms -i article.tex -o acronyms.tex \
                -e acronyms.errors

where the acronyms found in the article.tex file (and its dependencies) will be matched against the acronyms found in ~/acronyms/*.tex. The matched acronyms will be written to acronyms.tex and the errors to acronyms.errors. Now, you just need to include acronyms.tex in your document.

Update 2007/02/23: GlossTeX does the same (and much more), but the TeX way.

Syndicated 2007-02-22 15:01:50 from axelio

Designated initializers

Last year, I discovered, thanks to the book “C: A Reference Manual”, a great C99 feature: designated initializers. Designated initializers allow you to initialize components of an aggregate (structure, union or array) by specifying their names within an initializer list.

Array initialization

What most people normally use to initialize an array is the following idiom:

int v[4] = { 4, 2, 1, -5 };

in which you need to initialize each component of the array sequentially. Designated initializers allow you to specify which component of the array you want to initialize. Thus, we could write the line above as:

int v[4] = { [1] = 2, [2] = 1, [0] = 4, [3] = -5 };

Note that specifying the component indexes has allowed us to initialize the array in whatever order we want. If we do not initialize all the components, those left out are set to 0. We can also mix both methods, so the line below is also correct:

int v[4] = { [1] = 2, 1, [3] = -5 };

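in which a value without a designator initializes the component right after the previously designated one (here, 1 goes to v[2]), and v[0], which is never mentioned, is set to 0.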
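
The rules for the mixed form are easy to get wrong, so here is a small sketch that checks them. The array and function names are invented for the illustration, and the second array is an extra case that is not in the text above:

```c
#include <assert.h>
#include <stdio.h>

/* The mixed initializer from the text: [1] is designated, the plain
   value 1 lands in [2], [3] is designated, and [0] is left out. */
static const int v_mixed[4] = { [1] = 2, 1, [3] = -5 };

/* Only one component is named; all the others are set to 0. */
static const int v_sparse[4] = { [2] = 7 };

/* Verify where each value ended up and print both arrays. */
void
check_arrays (void)
{
  int k;

  assert (v_mixed[0] == 0);    /* never mentioned: zero        */
  assert (v_mixed[1] == 2);    /* designated                   */
  assert (v_mixed[2] == 1);    /* follows the designated [1]   */
  assert (v_mixed[3] == -5);   /* designated                   */

  assert (v_sparse[0] == 0 && v_sparse[1] == 0 && v_sparse[3] == 0);
  assert (v_sparse[2] == 7);

  for (k = 0; k < 4; k++)
    printf ("v_mixed[%d] = %d, v_sparse[%d] = %d\n",
            k, v_mixed[k], k, v_sparse[k]);
}
```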

A possible use of this kind of initializations would be a mapping between a list of identifiers and a list of strings.

// The public interface

typedef enum {
  id_one,
  id_two,
  id_three
} id_t;

extern char const* string_by_id (id_t id);

// The private implementation

static char const* strings[] =
{
  [id_one] = "identifier one",
  [id_two] = "identifier two",
  [id_three] = "identifier three"
};

char const*
string_by_id (id_t id)
{
  return strings[id];
}

Structure and union initialization

Designated initializers are also useful to initialize components of structures and unions by their name. In this case, the component to be initialized takes the form .c, where c is the name of the component. So, suppose we have the following structure:

struct point { float x; float y; float z; };

we could initialize each component of a struct point variable like this:

struct point my_point =
{
  .x = 0.34,
  .y = 0.98,
  .z = 1.56
};

With unions, we will use the same method, so having the following union:

union integer
{
  unsigned char int_8;
  unsigned short int int_16;
  unsigned long int_32;
};

we can initialize it by any of its components:

union integer value = { .int_16 = 24000 };

Finally, we can merge both cases, so we can have arrays of structures or unions that can be initialized using designated initializers:

struct point pointvector[3] =
{
  [0].x = 0.34, [0].y = 1.78, [0].z = 3.18,
  [1] = { .x = 3.5, .y = 6.89 },
  [2] = { .y = 2.8, 1.23 }
};
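
In the last initializer, some components are never named, and one value has no designator at all. The sketch below checks what happens to them. It uses the same shape as pointvector, but with made-up values that are exactly representable as float, so they can be compared with ==; the names pts and check_points are hypothetical:

```c
#include <assert.h>
#include <stdio.h>

struct point { float x; float y; float z; };

/* Same structure of designators as pointvector in the text. */
static const struct point pts[3] =
{
  [0].x = 0.5f, [0].y = 1.75f, [0].z = 3.25f,
  [1] = { .x = 3.5f, .y = 6.75f },   /* .z is omitted: set to 0        */
  [2] = { .y = 2.5f, 1.25f }         /* .x is 0; 1.25f follows .y => .z */
};

/* Verify the components that were not explicitly named. */
void
check_points (void)
{
  int k;

  assert (pts[1].z == 0.0f);
  assert (pts[2].x == 0.0f);
  assert (pts[2].y == 2.5f);
  assert (pts[2].z == 1.25f);

  for (k = 0; k < 3; k++)
    printf ("pts[%d] = (%g, %g, %g)\n",
            k, pts[k].x, pts[k].y, pts[k].z);
}
```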

Syndicated 2007-02-19 09:02:46 from axelio

Split directives into multiple files

Yesterday, I found out that I needed an application to split the virtual hosts of an Apache 1.3 configuration into separate files, so I could use them with Debian’s sites-available/sites-enabled layout for Apache 2. I googled a bit and did not find anything, so I wrote my own (vhost-split.py).

Just pass it the configuration file and the script will generate a bunch of files named with the ServerName variable found in each virtual host. Note that repeated entries will generate separate files (www.mydomain.com, www.mydomain.com-1, …). The script will also report commented entries.

Syndicated 2007-02-15 18:57:09 from axelio

Unit Testing tools

Most developers know how great and useful unit testing is. What is even greater is the bunch of libraries you can find out there for almost any language around (C, C++, Objective-C, Java, Python, Perl…).

All these tools are great (I have used some of the ones I have mentioned), but what happens when you start working on a big commercial project and your bosses have decided to buy a fantastic tool (I’m not going to name the one I will talk about) which costs a lot of money and which they have never tried? What happens is that you, the developer, start working with this fantastic tool and start feeling sort of depressed.

We started our project a year ago and, by now, I have sent more e-mails to the unit testing software company than I can remember. The tool looked promising at first: unit testing, static analysis, user interfaces to speed up the process, cross-platform support (we use a cross compiler) and many other things. I started to use the GUI without much success; I don’t even remember what it looked like. After a couple of days we decided (we already had it in mind) to create the tests by hand (with the help of some elisp code).

After a month or so, we had our first problem: the parser for static analysis did not support some of the C99 syntax. First e-mail, and first example to show the problem. It took two months or more to get the problem fixed, which means that we had to code without some of the syntax we wanted to use, and of course we had to remember that each time.

After some months of tranquility, the second problem: a header file provided by the unit testing software had some errors, so a second e-mail, with a suggested fix. Around this time (or even before) I started thinking: “Why do I have to use this tool? I’m only using the basic macros all the free libraries already give me, and no one in the company is looking at the metrics generated by the software.” Nevertheless, I kept on using the tool.

A month ago (and after 6000 € paid for support), the third problem: I got linking errors when using the static analysis again. I prepared another test and sent it to the company. They started by sending me an update that did not fix the C99 issue I mentioned above, and of course didn’t fix the current one. Then, I admit, I made a mistake and reported a new issue where there really was none, although the current problem was still not fixed. This morning, I received a new version of the executables which was supposed to fix the problem. Again, no luck: the problem is not resolved. Maybe I made a mistake again, but no one has complained so far. Is anyone trying the fixes before they send them? Am I too demanding?

My conclusion is: whenever you can, avoid the tools you don’t know and use the ones you do. Commercial tools may not be worth the money.

A co-worker told me that one can write one’s own code to perform unit testing, and he is totally right. If the problem is that the free libraries are not certified or whatever, write your own code and it will be certified just like the rest of the code.
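
To give an idea of how little code the basic macros need, here is a minimal sketch of a home-made check macro in C. It is not taken from any particular library; CHECK, run_tests and the add function under test are all invented for the example:

```c
#include <stdio.h>

/* Counters updated by every CHECK. */
static int checks_run = 0;
static int checks_failed = 0;

/* Evaluate a condition, count it, and report where it failed. */
#define CHECK(cond)                                          \
  do {                                                       \
    ++checks_run;                                            \
    if (!(cond)) {                                           \
      ++checks_failed;                                       \
      fprintf (stderr, "%s:%d: check failed: %s\n",          \
               __FILE__, __LINE__, #cond);                   \
    }                                                        \
  } while (0)

/* A trivial function under test. */
static int
add (int a, int b)
{
  return a + b;
}

/* Run all the checks and return the number of failures. */
int
run_tests (void)
{
  CHECK (add (2, 2) == 4);
  CHECK (add (-1, 1) == 0);

  printf ("%d checks run, %d failed\n", checks_run, checks_failed);
  return checks_failed;
}
```

A main function can simply return run_tests() != 0, so that a build script sees a non-zero exit status when something fails.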

Happy testing!

Syndicated 2006-11-09 18:26:52 from axelio

28 Oct 2003 (updated 11 Nov 2003 at 21:25 UTC) »

I'm glad to announce a new (but very small) free software project. It is gXMMS, a simple GNOME2 panel applet to control XMMS.

To those who like WePS: you'll be glad to know that I'm almost finished with the integration of the new user system and some other features. I think you'll really like it.

My plans now are to finish this version of WePS and release a new version of SCEW. And, of course, to fix the bugs in gXMMS. And then to focus all my attention on another thing... I'll probably explain it in another entry.
