Read the Tabs vs. Spaces article that ariya posted. He
is right about point 1 being the core of the argument, but
the problem is that if you use spaces instead of tabs, then
it becomes hard for others to read your code. I personally
use 8 space tabs because that is what the FreeBSD style(9)
guidelines say. It may sound strange to adopt a
project's style guide for your own code, but if we could all
agree on a single style, then everyone would have fewer
issues with this argument. Or you could always switch to
Python, which forces style on you.
I personally agree that 8 space tab stops are good. If
you ever get so deeply nested that you can't fit your code
on one or two lines, then you need to create more functions
for that piece of code. The general rule that if you tab in
more than three or four times from your base function then
you need to rethink the function is a good one. If you
write with 2 space tab stops, then it's easy to write
functions that have about 20 nested loops in them (that only
puts you half way across the screen) without even thinking
about it. If you use 8 space tab stops, you'll have issues
going beyond 6 nested loops.
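To put rough numbers on that, here is a quick sketch of how many columns the indentation alone eats at a given nesting depth. The 80 column screen width and the function name are assumptions made for illustration, not something taken from style(9).

```python
SCREEN_WIDTH = 80  # assumed terminal width, for illustration only


def indent_columns(depth, tab_stop):
    """Columns consumed by indentation alone at the given nesting depth."""
    return depth * tab_stop


for tab_stop, depth in [(2, 20), (8, 6)]:
    used = indent_columns(depth, tab_stop)
    left = SCREEN_WIDTH - used
    print(f"{tab_stop} space stops, {depth} levels deep: "
          f"{used} columns of indent, {left} left for code")
```

With 2 space stops, 20 levels only cost 40 columns, so nothing forces you to stop. With 8 space stops, 6 levels already cost 48, and the lack of room becomes its own warning sign.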
I wrote a MARC binary to ASCII conversion program last
night, but I won't release it till I split it into
functions, because the one big function goes a whole four
indents in from the base of the function. For me this is
too much, and writing more, smaller functions makes the code
easier to read.
Oh well, just ranting a bit about coding style.
Hmmm, should I rant about the whole binary vs. XML for
machine exchange debate? The reason systems are getting so
bloody slow is because people decided to trade a format that
is faster for machines to read for one that is easier for
humans to parse. If programmers continue to go for solutions
like these, we will continue to need faster computers, but it
doesn't have to be that way.
I was impressed with how easy the MARC format was to parse,
without wasting extra space and without having to deal with
endianness. To sidestep endianness, they simply encoded
the numbers in base 10 ASCII. Of course, with Python it was
almost too easy to parse the "binary" MARC format into a
list of dictionaries.
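For the curious, a minimal sketch of that kind of parser might look like the following. It is not the program mentioned above. It assumes the standard ISO 2709 / MARC 21 layout (a 24 byte leader, 12 byte directory entries, 0x1E field terminators, 0x1D record terminators), and the names parse_record, parse_file and build_record are made up for illustration.

```python
FIELD_END = b"\x1e"   # field terminator
RECORD_END = b"\x1d"  # record terminator


def parse_record(raw):
    """Parse one MARC record (bytes) into a dict of tag -> list of field data."""
    base = int(raw[12:17])        # base address of data: 5 ASCII digits
    directory = raw[24:base - 1]  # the directory ends with a field terminator
    rec = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])   # field length, terminator included
        start = int(entry[7:12])   # offset relative to the base address
        data = raw[base + start:base + start + length - 1]  # drop terminator
        rec.setdefault(tag, []).append(data)
    return rec


def parse_file(blob):
    """Split a blob of concatenated records into a list of dictionaries."""
    records, pos = [], 0
    while pos < len(blob):
        length = int(blob[pos:pos + 5])  # record length: 5 ASCII digits
        records.append(parse_record(blob[pos:pos + length]))
        pos += length
    return records


def build_record(fields):
    """Demo helper only: build a tiny record from (tag, data) pairs."""
    directory, body = b"", b""
    for tag, data in fields:
        chunk = data + FIELD_END
        directory += b"%3s%04d%05d" % (tag.encode("ascii"), len(chunk), len(body))
        body += chunk
    base = 24 + len(directory) + 1
    total = base + len(body) + 1
    leader = b"%05d" % total + b"nam  22" + b"%05d" % base + b"   4500"
    return leader + directory + FIELD_END + body + RECORD_END


sample = build_record([("001", b"12345"),
                       ("650", b"  \x1faCats"),
                       ("650", b"  \x1faDogs")])
print(parse_file(sample + sample)[0])
```

Every number in the record is plain ASCII digits, so int() on a slice is all the decoding there is, and byte order never comes into it.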
Now for a bit about Python. I always forget to use
try/except instead of if statements when it's more
appropriate. One example is when you are adding a data
element to a dictionary and you may have duplicate tags.
There are a couple of ways to deal with this. You can convert
the element to a list once you get more than one, or simply
start out using lists for all your data elements (which is
probably what I should do). An example of the first is:
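    try:
        rec[tag].append(data)
    except KeyError:
        # first time we see this tag: store the bare value
        rec[tag] = data
    except AttributeError:
        # the stored value is not a list yet (no append), so make it one
        rec[tag] = [rec[tag], data]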
The second one would be like:
    try:
        rec[tag].append(data)
    except KeyError:
        rec[tag] = [data]
Now the latter one in some ways makes more sense, as then
you don't have to find out whether a value is a list or not
and handle the two cases differently, but it also means a
bit of extra work in the case that duplicate tags are the
exception rather than the rule.
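Here are both approaches as small runnable functions, so the difference in the resulting dictionary is visible. The function names and the sample tags are made up for illustration.

```python
def add_convert_on_demand(rec, tag, data):
    """Store a single value bare; switch to a list on the first duplicate."""
    try:
        rec[tag].append(data)
    except KeyError:
        rec[tag] = data
    except AttributeError:
        # assumes the bare values (strings here) have no append() method
        rec[tag] = [rec[tag], data]


def add_always_list(rec, tag, data):
    """Store every value in a list, even when there is only one."""
    try:
        rec[tag].append(data)
    except KeyError:
        rec[tag] = [data]


pairs = [("245", "Title"), ("650", "Cats"), ("650", "Dogs")]
mixed, lists = {}, {}
for tag, data in pairs:
    add_convert_on_demand(mixed, tag, data)
    add_always_list(lists, tag, data)

print(mixed)
print(lists)
```

Anything reading the first dictionary has to check whether each value is a list before looping over it, while the second can always be looped over directly. For the always-a-list style, `rec.setdefault(tag, []).append(data)` does the same job in one line without any try/except.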
Oh well, enough musing. Now hopefully the 45 GB IBM
75GXP drive I ordered will be waiting for me today when I
get home. I was also lucky to get a couple 128MB PC133
DIMMs for only about $40 each. They were generic stock, but
were CAS2 timing. What luck! Of course, I only happen to
be using them in PC100 capable hardware, but I'm debating
about ordering a couple more.