Older blog entries for hereticmessiah (starting at number 16)

Did I mention that I started working again about a month and a half ago? Probably not. Well, I have.

I. Have. A. Gmail. Account!

As part of Job Search 2004, I’m resubscribing to the Open lists (you’ll know them if you’re an Irish developer) to see what jobs are about. Here goes nothing!

7 Jul 2004 (updated 7 Jul 2004 at 21:14 UTC) »

Woohoo! I’ve got the lads at Digital Crew to set up a datasource for me, so now I can work on getting my linklog up and running. About time too!

I’ve decided I’m going to write that Advogato Poster myself. I found XMLRPC-C back last in June 2003, and I’ve been hacking with wxWidgets recently. The basic app shouldn’t be all that difficult to knock together, I think, but I just need a box to build it all on. Seeing as my project box in college hasn’t been wiped by the admins yet, I think I might do it there.

And meanwhile, having finished my degree, jobsearching in Cork...

Saw Shrek II: Excellent. Wait till after the the credits, seriously. :-)

Until I finish my new blogging engine (file-based: we don’ need no steeking RDBMSs) and throws it up on talideon.com, I’m thinking of posting up my blog here for a while.

Question though: is there anything Win32 frontends out there for posting via the XML-RPC interface?

Well kids, I've finished my fourth year project. Managed to lose all the documentation and had to write it all out again last night. Got it written anyway. The diff (one-way) algorithm was an important part of it used in a versioning database the app included.

Now it's on to distributed systems. I have to write a client-server application with CORBA. It's up to me what I write, and the only requirement is that I have to use C++ and Java, one for the client and the other for the server. Yet again, it's up to me what I use. Oh, and did I mention that one has to run on Windows and the other on *nix?

It won't be so bad. I'm thinking of writing a version of Othello. Danka gave me the idea. It's been a long time since I've played the game and I did a search for an implementation. I thought it'd be a nice project to write an Othello server over UDP. I've never really done anything with UDP, and I thought it'd be nice to try.

I mentioned this to her, and she said that I should write it using CORBA. Sure, it's not exactly the best use of CORBA, but it'd get my DS assignment out of the way.

Thanks to Bram for his comments on the diffing algorithm. Much appreciated.

28 Feb 2004 (updated 1 Mar 2004 at 18:25 UTC) »

Am I an idiot or what! There were a few stupid errors in that diffing algorithm I threw up (give me a break: they don’t have python installed here in college). Here’s a corrected version. Again, any comments are welcome.

def GetDiff(oldLines, newLines):
    """Constructs a diff to get from newlines to oldlines."""
 
    # Construct our special "hash" table. Using primes for the table length
    # would be better, I know, but odd numbers are simpler.
    tblSize = len(newLines) * 2 + 1
    tbl     = tuple([] * tblSize)
    for iNew in range(len(newLines)):
        iBucket = hash(newLines[iNew]) % tblSize
        tbl[iBucket].append(iNew)
 
    # Holds our diff information.
    result = []
 
    # Find the common subsequences and differences.
    iOld = 0
    iMax = 0
    while iOld < len(oldLines):
        iBucket = hash(oldLines[iOld]) % tblSize
 
        # Start with no match
        nMax = 0
        for iNew in tbl[iBucket]:
            # if nMax > 0 and iNew < iMax + nMax:
            #     continue
            nCur = 0
            try:
                while newLines[iNew + nCur] == oldLines[iOld + nCur]:
                    nCur += 1
            if nCur > nMax:
                nMax = nCur
                iMax = iNew
                iEnd = iMax + nMax
                if iEnd == len(oldLines) or iEnd == len(newLines):
                    break
 
        # Have we found a matching subsequence?
        if nMax == 0:
            # Nope.
            result.append(oldLines[iOld])
            iOld += 1
        else:
            # Woohoo!
            result.append((iMax, nMax))
            iOld += nMax
 
     return result

There old version is in my last entry.

Part of my final year project is a version control system. It’s not the most important part, but it is important.

I searched about for delta compression algorithms, but all of them were either inappropriate or would take too long for me to implement (not to mention overkill for a college project!). So I came up with my own line-oriented one. Last night I wrote a quick implementation of the algorithm out on paper in python. I haven’t tested it out yet, so I’m not sure about its performance. The end product will be written in C++.

def GetDiff(oldLines, newLines):
    """Constructs a diff to get from newlines to oldlines."""

# Construct our special "hash" table. tblSize = len(newLines) * 2 + 1 tbl = tuple([[] for i in range(tblSize)]) for i in range(len(newLines)): pos = hash(newLines[i]) mod tblSize tbl[pos].append(i)

# Holds our diff information. result = []

# Find the common subsequences and differences. iOld = 0 iMax = 0 while iOld < len(oldLines): pos = hash(oldLines[iOld]) mod tblSize

# Start with no match nMax = 0 for iNew in tbl[pos]: nCur = 0 while nCur + iNew < len(newLines) and \ newLines[nCur + iNew] == oldLines[nCur + iOld]: nCur = nCur + 1 if nCur > nMax: nMax = nCur iMax = iNew

# Have we found a matching subsequence? if nMax == 0: # Nope. result.append(oldLines[iOld]) iOld = iOld + 1 else: # Woohoo! result.append((iMax, nMax)) iOld = iOld + nMax

return result

If anybody has any thoughts on it, I’d love to know.

7 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!