Today was mostly PCI cleanup day, as the vast majority of the day was spent cleaning up most of the PCI mess for sh on 2.6. Managed to hunt down a rather obscure bug with the PCI auto code (cloned from MIPS, whhich cloned it from PPC, etc.), which was causing the dreamcast BBA to not respond properly. This ended up being a problem when poking the mem BAR to read its size and then writing out a new auto configured value for the BAR. As a workaround, we have to write back the pre-configured address that it comes up with after being powered on. It's not exactly obvious why this is the case, especially given the fact that the BAR value isn't even the same as the address range that we use for accessing the board later on. Presumably a good chunk of this still needs further debugging. The odd thing is that this doesn't end up being a problem for the I/O BAR, but even when using PIO for device programming, the MAC still comes back garbled. Given the fact that some similar issues are showing up on 7751R, I suspect there's still something else that needs fixing. Most annoying.
On another note, it looks like the rest of my sh updates made it in in time for 2.4.22-rc1, which means that -rc1 should now be in pretty good shape for both sh _and_ sh64. This is definitely good, since the sh stuff was way out of sync for way too long. This should at least cut down on how much time I'll have to spend maintaining the 2.4 stuff. Now it's just 2.6 that needs more attention.. particularly for sh64, which I still need to port *grumble*.
Yesterday was mostly spent working on a new API for the SH-4 store queues. This ended up going pretty well, except now there's still some address translation issues to work out. Namely, we have to have an implicit mapping from the store queue virtual address to the associated physical address. Doing this by hand works fine, but then we lose the mapping when the TLB is flushed. Alternatives here are either wiring the TLB entry (which also entails moving the rest of the SH stuff over to array access of the UTLB), or putting together a pte and shoving it in the page tables by way of update_mmu_cache() or something similar. Wiring the entry is the best way to go if we end up using the queues quite heavily at a given point in time, since we can keep the translation around and survive a context switch which may flush the TLB on an ASID counter wrap around. However, if we aren't using the queues, it makes no sense to keep the TLB entry locked down. As such, I'll probably have to go with both methods, or some sort of hybrid between them. Will have to look into this more later.. back to profiling the TLB flushing overhead for now, and then off to start implementing array access and porting over my sh64 tlb interface for doing mostly the same thing.