Online Variance Calculation in Python

Since I couldn’t easily find this code anywhere, I figured I’d post it here for quick reference:

"""
2012.1.25 CKS
Incremental calculation of both the mean and variance.
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
"""
import unittest
 
## Dumb slow mean/variance formulas.
 
def mean(seq):
    return sum(seq)/float(len(seq))

def variance(seq):
    m = mean(seq)
    return sum((v-m)**2 for v in seq)/float(len(seq))
 
## Incremental mean/variance formulas.
 
class [...]
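
The listing above is cut off at the incremental class. For quick reference, here is a minimal sketch of how such a class can be written using Welford's method, as described on the Wikipedia page linked in the docstring. The names (`OnlineStats`, `add`, `m2`) are my own illustrative assumptions and are not necessarily those used in the full post.

```python
class OnlineStats(object):
    """Hypothetical sketch: running mean and population variance (Welford's method)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def add(self, x):
        """Fold one new value into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the updated mean, which keeps the update numerically stable.
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Population variance, matching the divide-by-len(seq) formula above.
        return self.m2 / self.n if self.n else 0.0


stats = OnlineStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)
print(stats.mean, stats.variance)
```

For the eight sample values the mean is 5 and the population variance is 4, which should agree with the slow `mean` and `variance` functions up to floating-point rounding. Each value is processed once and nothing is stored, so the sequence never has to be held in memory.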

Persisting Django’s Test Database

After running Django unittests, it can sometimes be useful to manually inspect the test database. I was working on a complicated network model, trying to resolve an elusive bug in a unittest, and in this specific instance, I thought running a manually written SQL query would be a bit more helpful [...]

Restricting Write Access to Django Admin

Django has a great auto-generated administration interface. It’s saved me countless hours of development time and makes basic data maintenance a breeze. Although it’s designed to be extensible, it’s not without its quirks and hurdles. Fortunately, most of these can be overcome with a little code inspection. One of these quirks is the superficial inability [...]

How to Extract a Webpage’s Main Article Content: The Unicode Edition

When I originally wrote html2text.py, my focus was only on extracting English text from webpages, so I didn’t give much thought to handling Unicode. Ignoring anything but ASCII would suffice. However, I was recently commissioned to extend the script’s functionality to use Unicode, so it could extract text in nearly any language found on the [...]

Google App Engine Patch Accepted

It’s a trivial fix, and it took them over a month to get around to it (hey, I’m sure they’re busy), but Google’s App Engine team finally accepted my patch. With that, the development datastore can be specified with a relative path, making App Engine-related scripts easier to share with others who might not have [...]

How to Extract a Webpage’s Main Article Content

The Idea
I had an idea to make a personalized news feed reader. Basically, I’d register a bunch of feeds with the application, and rate a few stories as either “good” or “bad”. The application would then use my ratings and the article text to generate a statistical model, apply that model to future articles, and [...]

A Simple Pylons Wordnet Name Generator

To familiarize myself with the Pylons framework, I wrote a simple app that uses Wordnet to randomly generate very obscure names in the format “adjective noun”. Try it out below by clicking “Generate!”.