5 Feb 2013 sness   » (Journeyer)

Naïve Bayes Classification

Naïve Bayes Classification: "The underlying idea is to use individual words present in the text as indications for what category it is most likely to belong to, using Bayes Theorem, named after the cheerful-looking Reverend Bayes.

Imagine that you received an email containing the words “Nigeria”, “Prince”, “Diamonds” and “Money”. It is very likely that if you look into your spam folder, you’ll find quite a few emails containing these words, whereas, unless you are in the business of importing diamonds from Nigeria and have some aristocratic family, your “normal” emails would rarely contain these words. They have a much higher frequency within the category “Spam” than within the Ham, which makes them a potential flag for undesired business ventures.


'via Blog this'

Syndicated 2013-02-05 17:24:00 from sness

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!