chump is currently a single-line assembler and disassembler. This is fine for now, while it is used to enter instructions on the fly into KMD. The next milestone is, hopefully, the ability to assemble whole programs. This requires some additions to the system.
Firstly, the system will have to cope with labels rather than strict numbers. Numbers are easy to process because they can be found and scanned directly, but labels, which might not have been defined yet, could be a little more tricky.
Forward labels (i.e. branches, loads, etc. pointing to labels later in the code) are even more difficult. Many instruction sets have several branch types for different distances, and these consume different amounts of space. The first pass cannot know how far away the target is. Taking the worst-case strategy and reassembling individual lines in later passes is probably the best way to do this, but it does not get over the issue of instruction sets like ARM, where what matters is not the distance but the number of significant bits. This could lead to infinite loops, because shrinking one instruction can turn another operand into a value that no longer fits its short form, forcing that instruction to grow again.
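To illustrate the ARM problem, here is a minimal sketch of the encodability test for an ARM data-processing immediate, which must be an 8-bit value rotated right by an even amount. The function names are hypothetical and are not part of chump. It shows that a smaller offset is not necessarily easier to encode.

```python
def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    value &= 0xFFFFFFFF
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value):
    """True if `value` is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    # Undo each possible even rotation and see whether 8 bits remain.
    return any(rotate_left32(value, rot) < 0x100 for rot in range(0, 32, 2))


# An offset of 0x1000 fits in one instruction, but the smaller 0xFFC does not.
print(is_arm_immediate(0x1000), is_arm_immediate(0xFFC))
```

So if an earlier optimisation moves a label four bytes closer, an instruction that was already short may have to be lengthened, which is where the risk of looping comes from.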
The first pass should recognise all labels and read all instructions into a list.
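A rough sketch of such a first pass follows. The source syntax is an assumption made for illustration only ("name:" defines a label, ";" starts a comment), and first_pass is a hypothetical name. Because instruction sizes are not known yet, each label records the index of the instruction it marks rather than an address.

```python
def first_pass(source):
    """Collect instructions in order and map each label to an instruction index."""
    instructions = []   # instruction text, in program order
    labels = {}         # label name -> index of the instruction it marks
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # assumed: ';' starts a comment
        if not line:
            continue
        if ":" in line:                        # assumed: 'name:' defines a label
            name, line = line.split(":", 1)
            name = name.strip()
            if name in labels:
                raise ValueError("label defined twice: " + name)
            labels[name] = len(instructions)
            line = line.strip()
        if line:
            instructions.append(line)
    return instructions, labels


program = "start: mov r0, #1\n  b end\n  ; nothing here\nend:\n"
print(first_pass(program))
```

A label at the very end of the program simply gets an index one past the last instruction.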
Pass two takes the instructions and assembles them. If a relative instruction needs the address of a forward label, the worst-case size is taken. All relative instructions are marked as "to_be_reassembled".
Pass three reassembles each "to_be_reassembled" instruction and, when it finds a smaller version, replaces it (thus changing the addresses of later instructions and labels). After each optimisation, all relative instructions that look over the optimised instruction (forwards only) will be reassembled again.
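Passes two and three can be sketched on a toy instruction set. Everything here is an assumption for illustration: a branch has a 2-byte short form reaching 127 bytes and a 4-byte long form, offsets are measured from the start of the branch, and the names relax and layout are hypothetical. For simplicity the sketch re-checks every marked instruction on each sweep instead of only those that look over the optimised one. It only ever shrinks instructions, so it always terminates for distance-based encodings; it does not solve the ARM significant-bits problem.

```python
SHORT, LONG = 2, 4      # assumed sizes (bytes) of the two branch forms
SHORT_RANGE = 127       # assumed reach of the short form, in bytes


def layout(sizes):
    """Return the address of every item, followed by the end address."""
    addresses, here = [], 0
    for size in sizes:
        addresses.append(here)
        here += size
    addresses.append(here)
    return addresses


def relax(items):
    """items holds ("op", size) or ("branch", target_index) tuples.
    Returns the final size of every item."""
    # Pass two: take the worst case for every relative instruction.
    sizes = [LONG if kind == "branch" else arg for kind, arg in items]
    to_be_reassembled = [i for i, (kind, _) in enumerate(items) if kind == "branch"]
    # Pass three: shrink where possible and repeat until nothing changes.
    changed = True
    while changed:
        changed = False
        addresses = layout(sizes)
        for i in to_be_reassembled:
            if sizes[i] != LONG:
                continue
            offset = addresses[items[i][1]] - addresses[i]
            if abs(offset) <= SHORT_RANGE:
                sizes[i] = SHORT
                changed = True
                addresses = layout(sizes)   # later addresses have moved
    return sizes


# Shrinking the second branch brings the first branch's target into range.
print(relax([("branch", 3), ("op", 120), ("branch", 3), ("op", 2)]))
```

In the example the first branch is 128 bytes from its target at worst case and only becomes short after the second branch has been shrunk, which is why later sweeps are needed.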
I haven't started thinking about macros or preassembler operations. There are obvious places (between passes 1 and 2) where they fit in.