Older blog entries for hereticmessiah (starting at number 10)

Well kids, I've finished my fourth year project. Managed to lose all the documentation and had to write it all out again last night. Got it written anyway. The diff (one-way) algorithm was an important part of it used in a versioning database the app included.

Now it's on to distributed systems. I have to write a client-server application with CORBA. It's up to me what I write, and the only requirement is that I have to use C++ and Java, one for the client and the other for the server. Yet again, it's up to me what I use. Oh, and did I mention that one has to run on Windows and the other on *nix?

It won't be so bad. I'm thinking of writing a version of Othello. Danka gave me the idea. It's been a long time since I've played the game and I did a search for an implementation. I thought it'd be a nice project to write an Othello server over UDP. I've never really done anything with UDP, and I thought it'd be nice to try.

I mentioned this to her, and she said that I should write it using CORBA. Sure, it's not exactly the best use of CORBA, but it'd get my DS assignment out of the way.

Thanks to Bram for his comments on the diffing algorithm. Much appreciated.

28 Feb 2004 (updated 1 Mar 2004 at 18:25 UTC) »

Am I an idiot or what! There were a few stupid errors in that diffing algorithm I threw up (give me a break: they don’t have python installed here in college). Here’s a corrected version. Again, any comments are welcome.

def GetDiff(oldLines, newLines):
    """Constructs a diff to get from newlines to oldlines."""
 
    # Construct our special "hash" table. Using primes for the table length
    # would be better, I know, but odd numbers are simpler.
    tblSize = len(newLines) * 2 + 1
    tbl     = tuple([] * tblSize)
    for iNew in range(len(newLines)):
        iBucket = hash(newLines[iNew]) % tblSize
        tbl[iBucket].append(iNew)
 
    # Holds our diff information.
    result = []
 
    # Find the common subsequences and differences.
    iOld = 0
    iMax = 0
    while iOld < len(oldLines):
        iBucket = hash(oldLines[iOld]) % tblSize
 
        # Start with no match
        nMax = 0
        for iNew in tbl[iBucket]:
            # if nMax > 0 and iNew < iMax + nMax:
            #     continue
            nCur = 0
            try:
                while newLines[iNew + nCur] == oldLines[iOld + nCur]:
                    nCur += 1
            if nCur > nMax:
                nMax = nCur
                iMax = iNew
                iEnd = iMax + nMax
                if iEnd == len(oldLines) or iEnd == len(newLines):
                    break
 
        # Have we found a matching subsequence?
        if nMax == 0:
            # Nope.
            result.append(oldLines[iOld])
            iOld += 1
        else:
            # Woohoo!
            result.append((iMax, nMax))
            iOld += nMax
 
     return result

There old version is in my last entry.

Part of my final year project is a version control system. It’s not the most important part, but it is important.

I searched about for delta compression algorithms, but all of them were either inappropriate or would take too long for me to implement (not to mention overkill for a college project!). So I came up with my own line-oriented one. Last night I wrote a quick implementation of the algorithm out on paper in python. I haven’t tested it out yet, so I’m not sure about its performance. The end product will be written in C++.

def GetDiff(oldLines, newLines):
    """Constructs a diff to get from newlines to oldlines."""

# Construct our special "hash" table. tblSize = len(newLines) * 2 + 1 tbl = tuple([[] for i in range(tblSize)]) for i in range(len(newLines)): pos = hash(newLines[i]) mod tblSize tbl[pos].append(i)

# Holds our diff information. result = []

# Find the common subsequences and differences. iOld = 0 iMax = 0 while iOld < len(oldLines): pos = hash(oldLines[iOld]) mod tblSize

# Start with no match nMax = 0 for iNew in tbl[pos]: nCur = 0 while nCur + iNew < len(newLines) and \ newLines[nCur + iNew] == oldLines[nCur + iOld]: nCur = nCur + 1 if nCur > nMax: nMax = nCur iMax = iNew

# Have we found a matching subsequence? if nMax == 0: # Nope. result.append(oldLines[iOld]) iOld = iOld + 1 else: # Woohoo! result.append((iMax, nMax)) iOld = iOld + nMax

return result

If anybody has any thoughts on it, I’d love to know.

Reading two books on Swarm Intelligence: Swarm Intelligence and Swarm Intelligence: From Natural to Artificial Systems. Zzzz...

9 Jul 2003 (updated 9 Jul 2003 at 15:03 UTC) »

Got a H1 overall in my third-year finals (which is about as good as you can do), but I have to repeat Computer Science ’cause I screwed that exam up completely.

Feeling rather depressed right now, but for other reasons besides my exams. All personal, all chemical (I’m unipolar, y’see). :-(

On the other hand, talideon.com is up and running again, which is something deserving of celebration. I must cheer up...

Building a new network stack for the WIMU basestation and nodes. Is it just me, or do electronic engineers have some sort of mental block against coding properly?

16 Jun 2003 (updated 16 Jun 2003 at 10:14 UTC) »

This week I’m working on getting one of the WIMU units here to talk to a base station via an RF link. It’s a chance to try out the radio network stack I’m working on at last! Also developing a GUI for it, so it’s time to brush up on my Tcl!

I took a look at Eric Kidd’s XML-RPC for C and C++ library, but I think I’m just going to code the diary/.plan file synchroniser with sockets myself. It’s too awkward to set up the Expat and the XML-RPC libraries here with the administrative restrictions on the Solaris boxen here at work. I hate reinventing the wheel, but seeing as all I need is a little cog and not F1 slicks, it’s probably for the best.

Ok, here’s an idea that’s after popping into my head.

I keep a .plan file here in my Unix account at work. Mostly, it’s just to keep track of what I’m doing development-wise.

Now, I also have my advogato log, but it’d be more convenient for me to update my .plan file and use the XML-RPC Interface to update my Advogato log with it.

Hmmm... sounds like a good idea. Now all I have to do is find the time to do it!

Was bored earlier and converted my signature generator for mutt to C and added a couple of features. It’s available here.

1 older entry...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!