Many thanks to bellin for mentoring my py-bsddb learning. Today I'm going to wrap up the extra POS (that's Part-Of-Speech) extraction code and start work on the word-to-POS mapping code.
It's official: I'm insane!
Maybe in a week, I'll have the bsddb wrapped up for the grammar checker and I can start writing the "Actual" grammar checking code!
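For the curious, here is roughly the shape of the word-to-POS mapping, as a minimal sketch rather than the real code. It swaps in the standard library's dbm.dumb for py-bsddb so that it runs on a stock Python 3 install. The function names (store_tags, lookup_tags), the comma-separated value format, and the tag labels are all assumptions made up for illustration.

```python
import dbm.dumb
import os
import tempfile


def store_tags(db, word, tags):
    """Store the set of POS tags for a word (hypothetical helper).

    Keys are lowercased words; values are the sorted tags joined by commas.
    """
    key = word.lower().encode("utf-8")
    db[key] = ",".join(sorted(tags)).encode("utf-8")


def lookup_tags(db, word):
    """Return the list of POS tags for a word, or [] if it is unknown."""
    key = word.lower().encode("utf-8")
    if key not in db:
        return []
    raw = db[key].decode("utf-8")
    # An empty stored value means "no tags", not one empty tag.
    return [tag for tag in raw.split(",") if tag]


if __name__ == "__main__":
    # Assumption: a throwaway database in a temp directory, not the real one.
    path = os.path.join(tempfile.mkdtemp(), "word_pos")
    db = dbm.dumb.open(path, "c")
    try:
        store_tags(db, "run", {"VB", "NN"})   # "run" can be a verb or a noun
        store_tags(db, "quickly", {"RB"})
        print(lookup_tags(db, "Run"))
        print(lookup_tags(db, "flibbertigibbet"))
    finally:
        db.close()
```

The point of a key/value store here is that a word with several possible parts of speech comes back in a single lookup, which is what a rule-based checker needs before it can try each reading.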
I found a couple of good resources on grammar checking, so I guess this is a good time to drop some references...
- Natural Language Understanding, parsing techniques
- Speech and Language Processing..., parsing and unification techniques
- The New Webster's Grammar Guide, my English grammar rule resource
I'm probably going to go insane writing this project, but I think my resources are good and, of course, I'm prototyping all of this in Python first and doing a C binding later. I want to make sure my ideas about tackling this problem are at least close to feasible.
I opted for a non-statistical solution because I don't have a Beowulf cluster lying around to compute n-gram/HMM values for words.
Yeah, I'm going to go completely insane.
I guess I'm just glad to finally be working on something that has completely captured my attention. Some of my other projects are still waiting for work (Lymric, Scout, etc.), but this problem is really academic, and I can only hope that I'll actually follow through and complete this effort.
Oh, and JEdit is the sweet nectar.