Older blog entries for dan (starting at number 132)

Some pre-Christmas cheer - shameless plugs

At around the end of October we were going to have a whole series of posts about all the new stuff we're doing at $WORK, including

  • how to build an nfsroot web server farm based on Debian and using the fine services of Elastichosts, and
  • the great people at Loadstorm and how they make web app load testing cheap and easy (TBH, not much of an article in this one even, it really /is/ easy)

but due to a marginally missed deadline and then the knock-on effect of coping with all the new stuff they put in at Sagepay, I've not had time to write much. In the meantime, suffice to say that the services linked from this post have amazed me not only by working properly (ok, in itself not that amazing) but by their swift, helpful and clueful email tech support in answer to my queries. I'd like to add Bytemark to that list, and let it be generally known that I feel bad about the stuff there that we'll be decommissioning shortly - but hopefully we can put some business their way again in future.

Anyway, the new systems are now all up (although not actually running all the new code as yet) - maybe in the New Year I can describe in more detail the installation process. It needs documenting somewhere, anyway...

Syndicated 2010-12-21 16:01:36 from diary at telent netowrks

Streaming media with Sinatra for lightweights

I started looking at all the UPNP/DLNA stuff once for a "copious spare time" project, but I couldn't help thinking that for most common uses it was surely way over-engineered. What is actually needed, as far as I can see, is

  1. a way to find hosts that may contain music
  2. a way to search the collections of music therein for stuff you might want to listen to
  3. a way to get the bits that encode that music across the network
  4. a way to decode them, and push them to a DAC and some speakers

And it seems to me that DNS Service Discovery ought to cover the first requirement quite adequately, HTTP is perfectly suited to pushing bits across a network, and once you've got the bits to the client then everything else is trivial. So this only leaves "search the collection" as an unsolved problem, and it surely can't be too hard to do this by e.g. sending an XQuery search to the collection server and having it return an XSPF playlist of matching files.

Probably the only reasons I haven't done this yet are that I don't know the first thing about XQuery, and I can't see a RESTful way to send XQuery to a server without misusing POST, because from the examples I have seen it looks too big to fit in a GET query string. So I'm letting it all mull in my mind in the hope of coming across a truly succinct search syntax that does like being URL-encoded. In the meantime, though, because I still want to play my music in the living room even though I don't need to discover it, here's my one hour hack:

Syndicated 2010-12-06 22:01:00 from diary at telent netowrks

Let me (ac)count the ways: Sagepay Admin API vs Ruby Enumerable

At $WORK we accept credit card payments through Sagepay Server - a semi-hosted service that enables us to take cards on a service that looks like our web site but without actually having to handle the card numbers. Which is nice, because the auditing and procedure requirements (google PCIDSS for the details) for people who do take card numbers are requirementful and we have better things to do.

Anyway, for reasons too grisly to go into, I found myself yesterday writing some code, in Ruby, that would talk to the snappily named "Reporting and Admin API". (It used to be called "Access", but, just like Mastercard once upon a time, apparently got renamed). It's not particularly difficult, just a bit random. You create a bunch of XML elements (note, no root node) indicating the information you want plus the vendorname/username/password triple that you'd use to sign in to their admin interface, then you concatenate them being sure not to introduce inter-element whitespace, then you take an md5 hash of what's left, then you delete everything inside the <password> tags and substitute <signature>md5 hash goes here</signature>. Then you surround it all with <vspaccess> and </vspaccess>.

If that sounds like doing XML by string munging, that's pretty much exactly what it is, but you don't want to do it using an XML library like anyone sane would do, because that might introduce whitespace or newlines or something which will upset the MD5 hash. Why didn't they use something standard like HTTP Digest authentication (or even Basic, since it's going out over HTTPS anyway)? No, I don't know either. At the least they could have specified that the hash goes somewhere other than in the body of the message it's supposed to be hashing.
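A minimal sketch of that signing procedure, using nothing but string concatenation and Ruby's standard Digest::MD5. Only <password>, <signature> and <vspaccess> come from the description above: the other element names, their order, the placement of the password element last, and the helper names elements_to_s and signed_request are assumptions made for illustration, so check them against the Sagepay documentation before relying on any of it.

```ruby
require 'digest/md5'

# Render [name, value] pairs as adjacent elements with no whitespace between
# them. Values are not escaped here; real input would need that.
def elements_to_s(pairs)
  pairs.map { |name, value| "<#{name}>#{value}</#{name}>" }.join
end

# Hash the elements with the password included, then send the signature in
# its place, all wrapped in <vspaccess>.
# ASSUMPTION: the password element is hashed as the last element.
def signed_request(pairs, password)
  with_password = elements_to_s(pairs + [["password", password]])
  signature = Digest::MD5.hexdigest(with_password)
  body = elements_to_s(pairs + [["signature", signature]])
  "<vspaccess>#{body}</vspaccess>"
end

# Element names and values below are placeholders, not real credentials.
request = signed_request(
  [["command", "getTransactionList"],
   ["vendor", "examplevendor"],
   ["user", "exampleuser"]],
  "not-a-real-password"
)
```

The password never appears in the string that goes over the wire; only its contribution to the hash does.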

Anyway, some Ruby content. The Sagepay R&A call for getTransactionList takes optional startrow and endrow arguments but doesn't say what the defaults are if they're not supplied: inspection says that unless you ask for something else you'll get the first fifty, and it's not completely unreasonable to suppose this is because you'll get timeouts or ballooning memory or some other ugliness if you ask for 15000 transactions all at once. So, we probably want to stick with fifty or whatever they've decided is a good number and do further queries as necessary when we've dealt with each block. But if we have to handle this in the client it's going to be kind of ugly.

Fortunately (did I say we were getting to some Ruby content? here it is) we don't have to, because of the lovely new support in 1.9 for external Enumerators. An Enumerator is an object which is a proxy for a sequence of some kind. You create it with a block as argument, and every time some code somewhere wants an element from the sequence it executes the block a bit more until it knows what value to give you next. This sounds trivial, but it makes control flow so much simpler it's actually pretty gorgeous, because the control flow in the block is whatever you need it to be and the interpreter just jumps in and out as it needs to. Just call yielder.yield value whenever there's another element ready for consumption and what you do between those calls is up to you.

This is kinda pseudocodey ...

Enumerator.new do |yielder| # this arg name is convention
  offset = 0 # kept inside the block so each fresh enumeration starts at the top
  loop do
    doc = get_fifty_requests_starting_at(offset)
    doc.elements.each do |element|
      yielder.yield element # control goes back to the caller here
    end
    if doc.length > 0 then  # there are probably more elements to get
      offset += 50
    else
      break # end of the results
    end
  end
end
and this is kinda too long to illustrate the point quite as effectively, but does have the benefit of actually doing something useful: https://gist.github.com/662821
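For something runnable without a Sagepay account, here is a small sketch of the same paging idea against an in-memory array. RECORDS, FETCH_LOG, fetch_page and each_record are all made-up names standing in for the remote call; the only real API used is Enumerator itself.

```ruby
PAGE_SIZE = 50
RECORDS = (1..120).to_a   # stand-in for the remote transaction list
FETCH_LOG = []            # records the offset of every simulated request

# Hypothetical stand-in for one API round trip returning up to PAGE_SIZE rows.
def fetch_page(offset)
  FETCH_LOG << offset
  RECORDS[offset, PAGE_SIZE] || []
end

# Wrap the paging in an Enumerator so callers just see one long sequence.
def each_record
  Enumerator.new do |yielder|
    offset = 0
    loop do
      page = fetch_page(offset)
      break if page.empty?               # end of the results
      page.each { |record| yielder << record }
      offset += page.length
    end
  end
end

# Asking for three elements only costs one simulated request.
first_three = each_record.first(3)
```

The caller never sees the page boundaries, and pages beyond the ones actually consumed are never requested.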

If you find it useful, I am making it available under the terms of the two-clause BSD licence. If you want to extend it, send patches. If I need more of the API methods I'll be extending it too. If either of the two preceding things happen and cause it to grow up I'll move it into a proper github project and make it play nice with gem/bundler/all that goodness

Syndicated 2010-11-04 20:51:27 from diary at telent netowrks

Anhedonic Android

I'm having another go at Android development: this happens every so often when the memories of Java verbosity are sufficiently dulled by distance that I start thinking "it can't have been that bad, can it?". And of course, it turns out every time that yes, it can.

Anyway, should you be doing Android programming and getting a runtime error of the form Your content must have a TabHost whose id attribute is 'android.R.id.tabhost' when your latest changes shouldn't have involved a tabhost anyway, my advice is not to spend too long looking for the cause until after you have run ant clean and reinstalled on the target device or emulator. Because my experience is that the error then usually goes away with no code changes required.

Another tip to lessen the monkey-clicking: use the am command to launch your app automatically after a successful build/install. I don't know ant nor do I want to learn it right now (include XML rant by reference here) so I've added to the creaking edifice with a small Makefile

PATH:=/usr/local/lib/android/tools:$(PATH)

bin/onelouder-debug.apk: $(shell find src -name \*.java) $(shell find res -name \*.xml )
	ant debug

go: bin/onelouder-debug.apk
	adb -e install -r bin/onelouder-debug.apk
	adb -e shell "am start -a android.intent.action.MAIN -n fm.onelouder.player/.Onelouder"

clean:
	rm -rf bin gen
	ant clean

Syndicated 2010-09-16 10:39:36 from diary at telent netowrks

RESTless spirits

From stackoverflow.com

I have read up many articles on Rest, and coded up several rails apps that makes use of Restful resources. However, I never really felt like I fully understood what it is, and what is the difference between Restful and not-restful. I also have a hard time explaining to people why/when they should use it.

If there is someone who have found a very clear explanation for REST and circumstances on when/why/where to use it, (and when not to) it would benefit the world if you could put it up, thanks! =)

Content-Type: text/x-flamebait

I've been asking the same question lately, and my supposition is that half the problem with explaining why full-on REST is a good thing when defining an interface for machine-consumed data is that much of the time it isn't. OK, you'd need a really good reason to ignore the commonsense bits (URLs define resources, HTTP verbs define actions, etc etc) - I'm in no way suggesting we go back to the abomination that was SOAP. But doing HATEOAS in a way that is both Fielding-approved (no non-standard media types) and machine-friendly seems to offer diminishing returns: it's all very well using a standard media type to describe the valid transitions (if such a media type exists) but where the application is at all complicated your consumer's agent still needs to know which are the right transitions to make to achieve the desired goal (a ticket purchase, or whatever), and it can't do that unless your consumer (a human) tells it. And if he's required to build into his program the out-of-band knowledge that the path with linkrels create_order => add_line => add_payment_info => confirm is the correct one, and reset_order is not the right path, then I don't see that it's so much more grievous a sin to make him teach his XML parser what to do with application/x.vnd.yourname.order.

I mean, obviously yes it's less work all round if there's a suitable standard format with libraries and whatnot that can be reused, but in the (probably more common) case that there isn't, your options according to Fielding-REST are (a) create a standard, or (b) to augment the client by downloading code to it. If you're merely looking to get the job done and not to change the world, option (c) "just make something up" probably looks quite tempting and I for one wouldn't blame you for taking it.

Syndicated 2010-08-23 20:04:19 from diary at telent netowrks

Do not meddle in the affairs of Wizards

From github

A less resource-heavy way to do realistic regression tests (and eventually load tests) than controlling an actual web browser a la watir.

  • Interact with your web site using Firefox.
  • Capture the requests sent with Tamper Data, and export as XML
  • Replay them from the command line
    • with realistic timing
    • with SSL support
    • with a 'rewrite' step that lets you programmatically change the request data before sending it (e.g. to switch hostnames from production to test, or vice versa)
    • using the single-threaded low-overhead goodness of EventMachine
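As an illustration of what a 'rewrite' step of that kind might look like, here is a sketch that switches a captured request from one hostname to another. It is not taken from the project: the Request struct and rewrite_host are hypothetical names, and only Ruby's standard URI library is used.

```ruby
require 'uri'

# Hypothetical shape for one captured request.
Request = Struct.new(:verb, :url, :headers, :body)

# Return a copy of the request aimed at a different host, leaving requests
# for any other host untouched.
def rewrite_host(request, from:, to:)
  uri = URI.parse(request.url)
  return request unless uri.host == from
  uri.host = to
  Request.new(request.verb, uri.to_s, request.headers.merge("Host" => to), request.body)
end

captured = Request.new("GET", "https://www.example.com/basket?id=1",
                       { "Host" => "www.example.com" }, nil)
replayed = rewrite_host(captured, from: "www.example.com", to: "test.example.com")
```

Returning a new struct rather than mutating the captured one keeps the original recording reusable for the next replay.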

For an event this autumn that I'm probably not allowed to tell you about, $WORK needs a web site that deals with 50x as many transactions as the current box. Current plan is to move it into the cloud and add memcached for everything that might conceivably benefit, but step one in performance tuning is, of course, to get a baseline.

And it's an excuse to learn EventMachine

Syndicated 2010-08-07 16:50:20 from diary at telent netowrks

Using a public routed network on a Vigor 2700

If you have a tech-friendly ISP (like mine) your DSL service might not only have a static IP address (one that doesn't change each time you reconnect) but several of them. In my case, 8 (five usable).

If you have a Draytek Vigor router, you can configure it to know about these using the 2nd subnet support - this is what I did before I moved, with the 2600 I had at the time

If you have the specific Draytek Vigor 2700 model that I have (and I don't know how wide a problem this is) you may attempt to follow these instructions but find that the configuration options for second subnet are missing. The option for DHCP relay agent is missing too. I tried a bunch of stuff including factory reset, firmware upgrade, and "phone a friend" to resolve this before eventually grasping the nettle and fiddling with firebug and HTML "view source"

The situation seems to be that the router is entirely capable of doing both these things (if you're reading this, it must be) but JavaScript variables govern whether the HTML configuration interface actually lets you, and for no reason I can think of these variables (called HIDE_LAN_GEN_2NDSUBNET and HIDE_LAN_GEN_DHCPRELAY) are, on my router, set to true. So, log in to the router, pull up the firebug console, enter

parent.HIDE_LAN_GEN_2NDSUBNET=false
parent.HIDE_LAN_GEN_DHCPRELAY=false
and then choose "This frame", "Reload" from the rightclick menu in the main frame, and you should find they magically reappear and you can configure them appropriately. This will almost certainly take you less time to do than it did me to work out.

Syndicated 2010-05-05 14:57:31 from diary at telent netowrks

Broke again, now back again

Extended outage, yes. I moved house. That wasn't itself the problem: the problem was that the machine which runs this blog decided after the move that it didn't want to run for more than a few hours at a time without panicking randomly.

After replacing the mouse (no, really, the USB-serial interface it was plugged into seemed to be sporadically disappearing from view), hoovering out the inside of the case, reseating all the cards, and various other forms of voodoo, it seems to have been made stable by a couple of BIOS tweaks I found on overclocking forums: upping the northbridge voltage and dram voltage by 0.1V. Apparently this is a common problem on the Asus P5Q-EM when it has all four ram slots full. And no, I'm not overclocking it.

Anyway, I have a stable (so far) system again and blogging service will resume soon. First up: it must be late last year that I started playing with Ruby, so it must be about time for the 6-month review.

Musical Interludes

Part II of the financial thingy is on its way, along with a subjective evaluation of the merits of rspec (I quite like it), but in the meantime have a look at http://github.com/telent/pacioli which is where the work in "progress" is going. I say "progress" because that's a daytime job and, well, something else came up at work.

In the meantime, please feel free to wander across to http://telent.posterous.com/ which is where I'm describing my attempts to put an embedded "Mini2440" Linux system in the Firebrox pedalled sound system

Syndicated 2010-02-24 16:35:12 from diary at telent netowrks

The Programmer's Guide to Financial Book-keeping, Part I

Once upon a time I knew enough about bookkeeping to implement a rudimentary accounting system for the consulting business I was running at the time. Then I got a real job, and after that I forgot most of it. Recently I've had to relearn it all, and as the accountancy/bookkeeping web pages that I've found on the Internet are decidedly mixed (an honourable mention here for the Gnucash manual, which is actually quite good), this time I'm writing it down.

The intended audience for this is chiefly me and people like me: computer programmer types who have to make their systems talk to accounts departments and accountants. If you are looking for more information on bookkeeping or accountancy from a professional perspective, it is less likely to be useful.

It should not be necessary - though it probably is - to state that I hold no professional qualifications and have had no training in the field, and if you want proper advice you'll have to pay for it from someone entitled to give it. This information is offered as-is, and no warranties as to its correctness, usefulness or completeness are offered.

Feedback welcome - see the page footer for details.

Definition

Let us define bookkeeping as: the collection and processing of financial records for an entity, with the object that interested parties can learn (1) as of a specified time, how much money (and other valuable stuff) it owns, against how much it owes to other entities; (2) over a specified period of time, how much has come in and how much has gone out. Bookkeeping deals not just with money but with all kinds of valuable stuff: cash, shares, financial instruments, land, saleable equipment, stock in trade, etc etc - in the rest of this post I'll be lumping it all together as "value".

End results

In the UK, the end products of bookkeeping/accountancy for a company or other trading entity are usually produced annually -

  • the Balance Sheet - a document of type (1), which lists the assets (stuff we've got) and liabilities (stuff we owe) broken down by category, at the end of the trading year. We start with assets, listed in order from most liquid (e.g. money at hand or in the bank) to least liquid (things we own that would be complicated to sell), then we subtract liabilities (usually ordered from short-term to long-term), then the bottom line is what we're worth. This is often referred to as the Accounting Equation:
    Assets - Liabilities = Equity
    although other people will say that Equity is really what the company owes its owners (e.g. the shareholders) so the equity will appear as a liability account and the equation is "Assets = Liabilities". Mathematically it makes no difference.

  • the Profit and Loss account, or P&L - also known in the US as the Income Statement. This is a document of type (2) which lists what's come in and what's gone out over the course of the year.

We probably also want quarterly reports for VAT (that's "Sales Tax" in other countries), and ad-hoc reports for credit control (we need to know who owes us money so we can chase them) and management accounting.

Derivation

Obviously, if your trading entity is you and you alone and there's no regulatory requirement on you to show anyone else the figures, you can choose any categories you like. But for most of us, there are accepted rules about the breakdown that people want to see, what you're allowed to assign to which categories, and what you'd actually want to assign to which categories (which might be a question with different answers depending on whether you're trying e.g. to maximise profit for the investors or minimise it for the taxman). This kind of decision is what you have an accountant for: keeping the numbers is what you have a bookkeeper for. So, look on the difference between those two roles as a policy/mechanism distinction (and a big difference in hourly rate: don't pay an accountant to do a bookkeeper's job)

Accounts and transactions

So, with the aid of an accountant we can establish how we need to categorise our assets and liabilities for the reports we need to produce. Each category (or sub-category, or sub-sub-category) is an account: each transfer of value from one account to another is a transaction. A transaction is usually associated with a source document (for example, a purchase order, or an invoice, or a receipt) - the so-called paper trail is not necessarily kept on actual carbon laminate these days, but it's still important. In essence, what we do is record the transactions.

Credit and Debit

We record each financial transaction as a flow of value from one (or several) accounts into another (or several others). Historically, bookkeepers don't get on with the concept of negative numbers - this is possibly because it can be confusing to have your "Income" account get steadily more negative as the year goes on (we'll come back to why this happens), or maybe just because the principles of double-entry bookkeeping were invented in a time and place (Renaissance Italy) that hadn't really yet heard of negative numbers. Whatever. But the upshot is that they made their own words up instead: the account that loses value is said to be "credited"; the account which gains value is "debited".

This is, of course, completely bass-ackwards from the perspective of normal people, though it has been claimed that the problem is that we're backwards. When the bank send you your statement of account it's printed from their point of view, not yours. So, if you deposit £50 in the Royal NatMid, in their eyes that creates a liability to you (after all, it's money they have but you own). The more money you give them the more they can transfer (debit) to Assets/BranchSafe or Assets/Vault or Assets/SubPrimeMortgages, but they have to credit that transaction to Liabilities/AP/YourNameHere. So the effect is that we perceive being in credit with the bank as a good thing: they see it as a bad thing. It's just a matter of perspective.

(I am slightly suspicious of this explanation. "Credit" and "Debit" are both from Latin roots: /creditum/: "a loan, thing entrusted to another", and /debitum/: "thing owed," neut pp. of /debere/ "to owe". In the end they're just words, but it's still confusing enough to be just more fuel for my scepticism towards the claim that negative numbers are avoided because they cause confusion. Maybe that's just me.)

Double-entry

The principle of double-entry accounting is that the value credited in a transaction must equal the value debited - value cannot be created or destroyed. The name comes from paper-based systems: if we have two accounts affected by a transaction, we must enter the transaction details into both. Using a computer, of course, we can enter it once and it will appear in both, but that's not the point. We are interested in the principle of "conservation of value", not so much in the mechanism of how we achieved that in the old days.

A simple example: our shop runs out of float in the till, so on Monday we must visit the bank and get some more cash. We record this as a transfer from the bank account (which is credited) to the till account (which is debited).
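That till top-up can be sketched in a few lines of Ruby. The Posting and Ledger names are invented for this illustration, the £100 figure is an assumption, and the sign convention (debits positive, credits negative, amounts in integer pence) is one choice among several; the point is only that a transaction is refused unless its debits and credits cancel out.

```ruby
# Amounts are integer pence; debits are positive, credits negative.
Posting = Struct.new(:account, :amount)

class Ledger
  attr_reader :balances

  def initialize
    @balances = Hash.new(0)
  end

  # Record one transaction; refuse it unless debits and credits cancel out.
  def post(description, postings)
    total = postings.sum(&:amount)
    raise ArgumentError, "#{description}: out of balance by #{total}" unless total.zero?
    postings.each { |posting| @balances[posting.account] += posting.amount }
  end
end

ledger = Ledger.new
ledger.post("Float for the till", [
  Posting.new("Assets/Bank", -10_000),  # credited: loses value
  Posting.new("Assets/Till",  10_000),  # debited: gains value
])
```

Because every accepted transaction sums to zero, so does the whole ledger, which is "conservation of value" expressed as an invariant.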

Here we're moving value between two asset accounts: they both represent monies that we own - just in different places. So it's pretty easy to see that "conservation of value" holds true. But the principle of double-entry bookkeeping is that the debits and credits in any transaction must always balance, so the astute reader will now be wondering how we do that for a transaction that actually makes the company money. For example, if we do some work and get paid, then the value in the transaction is clearly going into the bank account, but where is it coming from?

Income and expenses

The answer is that we create "Income" accounts which serve as a proxy for the outside world as it affects our company. So, if we get $200 for configuring Joe's web server, our bank account is debited $200 and the outside world, as represented by our Income (or Income/Sales, or whatever subcategorisation we want to use) account, is credited $200. Expenses accounts serve a similar but opposite role: we pay for stuff (like stationery, utilities, salaries) that makes us poorer (our assets are credited) and the outside world richer (our expenses are debited).

Income and Expense accounts are key to the P&L statement that we will produce at the end of the year, because they act as summaries of our interactions with the world - which is what P&L is all about. The USAnian name for it, "Income Statement", hints as much.

Accruals

Another key concept is accruals. In most businesses there is a delay between when we provide something of value (e.g. do some work) and when we actually get paid: there is also often a delay between when we receive something of value and when we have to pay for it. In a cash accounting system there's nothing we can do about that, but in an accruals system we can create "accounts receivable" and "accounts payable" for these sums which are "in the post". This allows our accounts to say that we are worth $4000 because we have that amount expected to come in from Michael next week, even though we haven't got it in the bank yet. So, this makes payment a two-stage process: first we send an invoice and transfer $4000 from Income/Sales to Assets/AR/Michael, then when he pays it four weeks later (or perhaps four months later if he's a public sector body) we transfer $4000 from Assets/AR/Michael to Assets/Bank. We haven't actually made any new money in that second transaction, but at least it now exists in the bank and not just on paper.

Accounts Payable is similar but opposite. We order office furniture on account, it gets sent with an invoice, and we log that transaction as a transfer from Liabilities/AP/IKEA to Assets/Furniture. When the invoice is due (or three weeks later if you have really good credit control) we send them a cheque and we do another transaction from (crediting) Assets/Bank to (debiting) Liabilities/AP/IKEA, which hopefully reduces the balance of the latter account to zero.
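Both two-stage flows can be traced with a tiny sketch. The transfer helper is a made-up name, balances are signed whole dollars (debit positive, credit negative), and the $300 furniture invoice is an assumed figure since the text does not give one.

```ruby
# Credit the account value comes from, debit the account it goes to.
def transfer(balances, from:, to:, amount:)
  balances[from] -= amount
  balances[to]   += amount
  balances
end

balances = Hash.new(0)

# Accounts receivable: invoice Michael, then he pays.
transfer(balances, from: "Income/Sales", to: "Assets/AR/Michael", amount: 4000)
after_invoice = balances.dup
transfer(balances, from: "Assets/AR/Michael", to: "Assets/Bank", amount: 4000)

# Accounts payable: furniture arrives with an invoice, then we pay it.
# ASSUMPTION: the invoice is for $300.
transfer(balances, from: "Liabilities/AP/IKEA", to: "Assets/Furniture", amount: 300)
after_delivery = balances.dup
transfer(balances, from: "Assets/Bank", to: "Liabilities/AP/IKEA", amount: 300)
```

After the invoice goes out the income is already recognised even though the bank balance has not moved; the later payment only shifts value between asset accounts.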

Most of the examples later in this post ignore accruals in much the same way and for the same reason as Kernighan and Ritchie ignore error checking: it slightly obscures the pedagogical point, but that doesn't mean you won't do it for real.

Sale of goods

If you're selling services, the transaction is Income/Sales->Assets/Bank. That's simple. If you're selling goods, though, (1) you have to buy them first

£6 cr. Assets/Bank = dr. Assets/Inventory/Widgets

and then (2) when you sell them you are selling at a different price.

£10 cr. Income/Sales = dr. Assets/Bank
£6 cr. Assets/Inventory/Widgets = dr. Expenses/Cost of sales

The net effect is to increase Income by a tenner and Expenses by an unwell cephalopod (that's "sick quid" to you. Sorry). Thus both effects of the transaction will be represented on the appropriate P&L rows.
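The three postings above can be replayed to check that arithmetic. This is only a sketch: JOURNAL and total are illustrative names, amounts are whole pounds, and a debit adds to a balance while a credit subtracts from it.

```ruby
# Each entry: [amount in pounds, credited account, debited account]
JOURNAL = [
  [6,  "Assets/Bank",              "Assets/Inventory/Widgets"], # buy the widget
  [10, "Income/Sales",             "Assets/Bank"],              # sell it
  [6,  "Assets/Inventory/Widgets", "Expenses/Cost of sales"],   # stock leaves
].freeze

balances = Hash.new(0)
JOURNAL.each do |amount, credited, debited|
  balances[credited] -= amount
  balances[debited]  += amount
end

# Sum the balances of every account under a category prefix.
def total(balances, prefix)
  balances.select { |name, _| name.start_with?(prefix) }.values.sum
end

income   = -total(balances, "Income/")   # income accounts carry credit balances
expenses = total(balances, "Expenses/")
profit   = income - expenses
```

Here income comes to 10, expenses to 6, and the profit of 4 matches the net change in the bank account, while the inventory account is back to zero.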

VAT / Sales Tax

VAT in the UK is not really ever money we have earnt, it's just money we are collecting on behalf of the nice people at HMRC. So, if we are registered for VAT we must collect it on each sale into a holding account which we send them later, but it's not "ours" and doesn't show in Sales.

£20 cr. Income/Sales + 3.50 cr. Liabilities/VAT = 23.50 dr. Assets/Bank

Watch out for the credit/debits in that transaction. We should end up with cash in the bank (a debit), some of which is owed to the VAT man (credit). If they don't sum to zero, you've done something wrong.

Similarly we can also claim back VAT on purchases from our VAT-registered suppliers

11.75 cr. Assets/Bank = 10.00 dr. Assets/Inventory + 1.75 dr. Assets/Input_VAT

At the end of the quarter, we pay HMRC what we owe them, less what they owe us

1.75 cr. Assets/Input_VAT + 1.75 cr. Assets/Bank = 3.50 dr. Liabilities/VAT

Note that this is not reflected in any Expense account - it shouldn't be, because it wasn't in an Income account to start with
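A quick way to check the three VAT transactions is to replay them with signed amounts and confirm that each one sums to zero. The figures are the ones used above, converted to pence; TRANSACTIONS is an illustrative name and the debit-positive sign convention is an assumption of this sketch.

```ruby
# Signed amounts in pence: debit positive, credit negative.
TRANSACTIONS = {
  "sale" => {
    "Income/Sales" => -2000, "Liabilities/VAT" => -350, "Assets/Bank" => 2350
  },
  "purchase" => {
    "Assets/Bank" => -1175, "Assets/Inventory" => 1000, "Assets/Input_VAT" => 175
  },
  "quarter end" => {
    "Assets/Input_VAT" => -175, "Assets/Bank" => -175, "Liabilities/VAT" => 350
  },
}.freeze

balances = Hash.new(0)
TRANSACTIONS.each do |name, legs|
  # Every transaction must balance on its own before it is applied.
  raise "#{name} does not balance" unless legs.values.sum.zero?
  legs.each { |account, pence| balances[account] += pence }
end
```

Once the quarter-end payment is posted, both VAT accounts are back to zero and Income/Sales still shows only the 20.00 that was ours to begin with.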

Year end

We've already talked about producing the Balance Sheet and P&L. The other action we take at end of year is to close the accounts: in the case of Income and Expenses, we will want to start the following year with a clean sheet. How to do this: after producing the end-of-year reports, move the entire contents of Income and Expenses accounts into a summary "Retained Earnings" account, debiting and crediting as appropriate.
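Closing the year might be sketched like this. The close_year helper and the "Equity/Retained Earnings" account path are assumptions for illustration, and balances are signed (debit positive, credit negative), so a profitable year leaves Retained Earnings with a credit, i.e. negative, balance.

```ruby
# Return a new set of balances with Income and Expenses zeroed and their
# combined balance moved into Retained Earnings.
def close_year(balances)
  closed = balances.dup
  retained = closed.fetch("Equity/Retained Earnings", 0)
  closed.keys.each do |account|
    next unless account.start_with?("Income/", "Expenses/")
    retained += closed[account]
    closed[account] = 0
  end
  closed["Equity/Retained Earnings"] = retained
  closed
end

year_end = {
  "Assets/Bank"            => 4,
  "Income/Sales"           => -10,
  "Expenses/Cost of sales" => 6,
}
next_year = close_year(year_end)
```

The asset balances carry over untouched, and the books still sum to zero after closing.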

Contingent concepts

We have not talked about: journals, day books, cash books, general ledgers, T accounts, and trial balances. Most of these are historical practices that are necessary in manual systems either because the latency of entering everything directly in double-entry form is high (so transactions are initially recorded elsewhere instead), or because there is no automatic checking that the accounts are in balance, or because obtaining summaries of groups of accounts (answering queries like "what's the total AP for all suppliers") isn't a trivial bit of SQL.

Where next?

This is Part I of a two-part series. In the second part I'm going to write about my experience implementing all this in Ruby, but that will have to wait until I've done the actual implementation.

Hopefully though, this post should provide you with a view of the principles such that you can google for anything else you see and you have a framework to hang it on.

Syndicated 2010-02-09 14:58:49 from diary at telent netowrks

