
03 October 2008

Bayes Filtering in JavaScript

I use Google Reader for my RSS feeds. One feature I'm surprised they haven't added yet is "predictive labels" that attempt to classify the posts in your feeds based on how you've trained the classifier.

Arguably, this would only be useful if you have a lot of feeds and wish to weed out noisy posts (like when a technical blogger starts making political posts--ugh!).

Thinking it would be a fun hack, I set out to build a naive Bayes classifier in JavaScript. It turned out to be easier than I thought.

Some observations:
1. My math is rusty.
2. Fancy mathematical diagrams sometimes don't translate so easily to code.
3. Edge cases still suck.
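
For readers who haven't seen one, here is a minimal sketch of what a naive Bayes text classifier can look like in JavaScript. It is not the code from the demo page; the names (tokenize, NaiveBayes, train, classify) are made up for illustration, and it assumes add-one (Laplace) smoothing and log-probabilities, which are two common ways of taming the edge cases mentioned above (unseen words and underflow).

```javascript
// Illustrative sketch only; all names here are hypothetical.

// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

class NaiveBayes {
  constructor() {
    // Object.create(null) avoids collisions with inherited keys
    // such as "constructor" when words are used as property names.
    this.totalDocs = 0;
    this.docCounts = Object.create(null);  // label -> number of posts
    this.wordCounts = Object.create(null); // label -> { word -> count }
    this.wordTotals = Object.create(null); // label -> total words seen
    this.vocabulary = Object.create(null); // word -> true
  }

  train(text, label) {
    if (!(label in this.docCounts)) {
      this.docCounts[label] = 0;
      this.wordCounts[label] = Object.create(null);
      this.wordTotals[label] = 0;
    }
    this.docCounts[label] += 1;
    this.totalDocs += 1;
    for (const word of tokenize(text)) {
      this.wordCounts[label][word] = (this.wordCounts[label][word] || 0) + 1;
      this.wordTotals[label] += 1;
      this.vocabulary[word] = true;
    }
  }

  // Returns the label with the highest log-probability, or null if untrained.
  classify(text) {
    const vocabSize = Object.keys(this.vocabulary).length;
    let best = null;
    let bestScore = -Infinity;
    for (const label in this.docCounts) {
      // Start from the log prior: log P(label).
      let score = Math.log(this.docCounts[label] / this.totalDocs);
      for (const word of tokenize(text)) {
        // Assumption: words never seen in training are ignored.
        if (!(word in this.vocabulary)) continue;
        const count = this.wordCounts[label][word] || 0;
        // Add-one smoothing keeps a zero count from zeroing the product.
        score += Math.log((count + 1) / (this.wordTotals[label] + vocabSize));
      }
      if (best === null || score > bestScore) {
        best = label;
        bestScore = score;
      }
    }
    return best;
  }
}

const classifier = new NaiveBayes();
classifier.train("javascript closures and prototype chains", "tech");
classifier.train("writing a parser in javascript", "tech");
classifier.train("election debate and the senate vote", "politics");
classifier.train("the senate passed a new tax bill", "politics");

console.log(classifier.classify("a javascript parser"));         // tech
console.log(classifier.classify("senate vote on the tax bill")); // politics
```

Summing logs instead of multiplying raw probabilities is a design choice: with long posts the product of many small numbers can underflow to zero, while the sum of their logs stays comparable.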

The fruits of my efforts are included in an iframe below. If you prefer, you can visit the actual page.

The next step would be to inject this into Google Reader somehow to predictively classify a post. I'm not sure of the best way to do this. Greasemonkey could handle capturing when a post is tagged, but persistent storage would be needed to keep the trained probability data between sessions. That puts me squarely in the realm of traditional Firefox extensions. I'll have to do some more investigating.
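
Whatever the storage ends up being, the trained data is just counts, so it can be flattened to a JSON string and read back later. The sketch below assumes a hypothetical store object with getItem/setItem methods (the same shape as the Web Storage API) and a made-up layout for the counts; an in-memory stand-in is used so the example runs anywhere.

```javascript
// Illustrative sketch only; function names and the counts layout are hypothetical.

// In-memory stand-in exposing getItem/setItem like the Web Storage API.
function createMemoryStore() {
  const data = new Map();
  return {
    getItem(key) { return data.has(key) ? data.get(key) : null; },
    setItem(key, value) { data.set(key, String(value)); },
  };
}

// Serialize the counts to JSON and write them under the given key.
function saveCounts(store, key, counts) {
  store.setItem(key, JSON.stringify(counts));
}

// Read the counts back, or start with an empty state if nothing is stored.
function loadCounts(store, key) {
  const raw = store.getItem(key);
  if (raw === null) {
    return { totalDocs: 0, docCounts: {}, wordCounts: {} };
  }
  return JSON.parse(raw);
}

const store = createMemoryStore();
const counts = {
  totalDocs: 3,
  docCounts: { tech: 2, politics: 1 },
  wordCounts: {
    tech: { javascript: 2, parser: 1 },
    politics: { senate: 1 },
  },
};

saveCounts(store, "bayes-counts", counts);
const restored = loadCounts(store, "bayes-counts");
console.log(restored.wordCounts.tech.javascript); // 2
```

Storing raw counts rather than computed probabilities means new training can simply increment the numbers after loading.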

Fun learning project though.



MostThingsWeb said...

Very well done! Being that this is public domain, I will be slightly modifying it. Just wanted to let you know:

Again, great work.