21 Aug 2006 (updated 22 Aug 2006 at 17:21 UTC)
»
I don't really ever get into hyper angst about code that works - I mean I get annoyed and rewrite stuff, but usually after a few months of having to work with it and work around issues that have come due to shifting goal posts. But this thing I wrote for Kaz, well, its really got up my nose, and here is why:
Classes should have a single job
The main class was supposed to be a line on the graph (I will have to rant about issues of drawing graphs at a later point in time), but that's what it is, so it is a collection of coordinates and associated line data (such as title, colour and origin information). On top of this there are associated class functions such as loading, saving, normalization and peak detection (plus any number of extra analysis tools). Around this there is a wrapper class to hold a collection of these objects, which also contains functional elements to be able to call all of the sub elements in turn, such as finding peaks on all the lines. On top of this there may also be the interface to the application.
My mistake was to make all of the above into a single class. What this means is that firstly, from the outside there isn't a single public function (there should be). There isn't because the object works in either one of two modes - as the collection of lines object or as the single line object, not as both at once (so its not recursive). Because of this, the single line object has to produce public methods to the collection object so that it can do its job. But because the collections object is actually the same one that means its public all the way to the outside. That's not right.
Data Structures should not be misused
There are some sub classes, including a coordinate object. The coordinate object is most basically two double values - something like
class coordinate {
public double X;
public double Y;
}
So, there's nothing wrong with that (there are also associated methods with this object which is why it's an object not a struct). However, in the bit of the code which has to return a pair of double's (to be precise, the part which looks for maximum and minimum values of X) it is not appropriate to return a coordinate where c.x is the minimum and c.y is the maximum. Either a new structure should have been made, or a hash table, or another associative array
Well Named Variables decrease risk of Bug
My last point is one that I'm still working on but doesn't actually affect the Chemistry App for Kaz (see point about me being good below) - its one where good choice of variable name's can decrease the probability of error.
It happened to our junior programmer at work this week - or rather he wrote it and I fell for the trap. He was doing some image manipulation and had some sub images that he was positioning. To do so he was dealing with the height of the sub image. However, the local variable height was not the height of the sub image - it was some random number that he had assigned further up the script and was still in scope. What got me was that a) the variable height was in scope and when I came to write the next line, it was the most obvious match from the intellisense, and b) the actual height variable was much more complicated in name. What it resulted in was a bug in code which (if we weren't looking that carefully at each line of code in turn, but writing many lines before test) could have ended in a bug which may have been near impossible to locate.
So I've been making a special effort this past week to use proper variables in code. Isn't that good of me?
Weirdness to investigate this week
C#: Threads continue execution after main() exits, and those threads can spawn new threads, but any events associated from main will not execute after main() has died (or so it seems...)
Good news
I just passed the RSGB Foundation test. Woo.