The server I was running the computations on hard-locked sometime during the winter break. Apparently it ran out of disk space while another user was running simulations on it. I wasn’t able to access the machine until I returned to Miami.
Since I had no access to a machine with a large amount of memory, I spent some time trying to figure out what was wrong with the training software. I still wasn’t able to find the problem; I must be missing something simple.
Upon returning to Miami, I did the following:
- Fixed the server; it had apparently run out of disk space because of log files created by another user’s run.
- Researched building a database for taxonomies.
- Built a database using the BioSQL schema after discovering that GenBank files track phylogeny through recursive ranks.
- Wrote a Python script to fetch the GenBank file for each of the 625 FASTA-format genomes and load it into the BioSQL database.
- Began revising the taxonomic classifier; ~80% done.
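
For reference, here is a minimal sketch of how a lineage can be recovered from recursive ranks, where each taxon row points at its parent. It is not the actual script or database: it uses an in-memory SQLite database with a cut-down version of the BioSQL `taxon` and `taxon_name` tables (only the `taxon_id`, `parent_taxon_id`, `node_rank`, `name`, and `name_class` columns), and the helper names `build_demo_db` and `lineage`, the made-up taxon IDs, and the single example lineage are all assumptions for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified subset of the BioSQL taxon tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE taxon (
            taxon_id INTEGER PRIMARY KEY,
            parent_taxon_id INTEGER,
            node_rank TEXT
        );
        CREATE TABLE taxon_name (
            taxon_id INTEGER,
            name TEXT,
            name_class TEXT
        );
    """)
    # Illustrative rows only; the taxon_id values are invented for this demo.
    rows = [
        (1, None, "superkingdom", "Bacteria"),
        (2, 1, "phylum", "Proteobacteria"),
        (3, 2, "class", "Gammaproteobacteria"),
        (4, 3, "order", "Enterobacterales"),
        (5, 4, "family", "Enterobacteriaceae"),
        (6, 5, "genus", "Escherichia"),
        (7, 6, "species", "Escherichia coli"),
    ]
    for taxon_id, parent_id, rank, name in rows:
        conn.execute("INSERT INTO taxon VALUES (?, ?, ?)", (taxon_id, parent_id, rank))
        conn.execute(
            "INSERT INTO taxon_name VALUES (?, ?, 'scientific name')",
            (taxon_id, name),
        )
    return conn


def lineage(conn, taxon_id):
    """Follow parent_taxon_id links upward and return (rank, name) pairs, root first."""
    path = []
    seen = set()  # guards against a root row that lists itself as its own parent
    current = taxon_id
    while current is not None and current not in seen:
        seen.add(current)
        row = conn.execute(
            "SELECT t.parent_taxon_id, t.node_rank, n.name "
            "FROM taxon t JOIN taxon_name n ON n.taxon_id = t.taxon_id "
            "WHERE t.taxon_id = ? AND n.name_class = 'scientific name'",
            (current,),
        ).fetchone()
        if row is None:
            break  # unknown taxon: stop with whatever has been collected
        parent_id, rank, name = row
        path.append((rank, name))
        current = parent_id
    path.reverse()
    return path


if __name__ == "__main__":
    conn = build_demo_db()
    for rank, name in lineage(conn, 7):
        print(f"{rank}: {name}")
```

Walking the parent links one query at a time keeps the sketch easy to follow; for the full set of genomes, a recursive SQL query or the nested-set columns in the real BioSQL schema would probably be faster.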
Next things to do:
- Run the taxonomic classifier.
- While waiting for the taxonomic classifier results, tear apart the training classifier and figure out what’s wrong.