

Tuesday, March 19, 2013

Pragmatic Chaos

I've known my husband now for over 7 years and without fail, he can tell me exactly what type of movie I'm going to be interested in watching (and likewise avoiding), and I'd have to say that I'm about as good for him.

And we've been doing that for years.

What I don't understand is what we're seeing in each other's viewing habits that Netflix can't learn with its Pragmatic Chaos prediction programming. It's certainly not a lack of viewing history: we've entered ratings for every movie we've ever seen.

Netflix, in turn, continues to amaze me by making recommendations that are either 1 or 1.5 stars on our expected pleasure scale (out of 5, where 5 is the best). It just doesn't seem to have the programming to learn that if we're not interested and ignore a recommendation altogether, the prediction program failed.

Or, the one that really makes me shake my head: the recommendations in the "New Recommendations for you!" category, which are all things that we've already seen AND RATED! Though I have to say, at least the only things they recommend back to us in the "new" category are ones we marked as liked.

This seems more like hindsight than prediction to me.

And I have a hypothesis about why their algorithm doesn't work: they're only asking about one dimension of a film, or only letting you rate an entire show as a whole. For example, Murder, She Wrote ran for 12 seasons and more than 260 episodes. Can you make a good prediction about what I'm going to like based on one flat star rating for all of them? You can't honestly expect that people like each episode the same amount. And one of the first rules in statistics is that a mean (the sum divided by the number of parts) is a poor way to summarize anything except a very tight bell curve.
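To show what I mean about averages hiding things, here's a small sketch in Python. The per-episode ratings are numbers I made up for illustration (not Netflix data), and `summarize` is just a helper name I picked. Both shows come out with the same mean, but one is steady and the other swings between loved and hated.

```python
from statistics import mean, pstdev

# Hypothetical per-episode star ratings (made-up numbers, not real data).
consistent_show = [3, 3, 3, 3, 3, 3]
hit_or_miss_show = [5, 1, 5, 1, 5, 1]

def summarize(ratings):
    """Return (mean, population standard deviation) for a list of ratings."""
    return mean(ratings), pstdev(ratings)

for name, ratings in [("consistent", consistent_show),
                      ("hit or miss", hit_or_miss_show)]:
    avg, spread = summarize(ratings)
    print(f"{name}: mean={avg}, spread={spread}")
```

A single star rating for the whole show would treat these two as identical, even though the spread says they are very different viewing experiences.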

What they really need to do is allow the user to rate different aspects of the shows, like "Costumes 1-5" or "Music 1-5" or "Plot 1-5" or "Special Effects 1-5". I'd expect this to dramatically improve the accuracy of what users report, and consequently the accuracy of the predictions moving forward.
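Here's a rough sketch of what per-aspect ratings might look like, again in Python. Everything in it is my own assumption for illustration: the `AspectRating` class, the `aspect_gap` helper, the aspect names, and the two placeholder films. It says nothing about how Netflix actually stores or uses ratings.

```python
from dataclasses import dataclass

# Hypothetical aspects, taken from the examples in this post.
ASPECTS = ("costumes", "music", "plot", "special_effects")

@dataclass
class AspectRating:
    title: str
    scores: dict  # maps each aspect name to a 1-5 rating

    def overall(self):
        """Collapse the aspect scores into one flat star rating."""
        return sum(self.scores[a] for a in ASPECTS) / len(ASPECTS)

def aspect_gap(a, b):
    """Sum of absolute per-aspect differences between two ratings."""
    return sum(abs(a.scores[x] - b.scores[x]) for x in ASPECTS)

# Two placeholder films with made-up scores.
film_a = AspectRating("Film A", {"costumes": 5, "music": 5,
                                 "plot": 1, "special_effects": 1})
film_b = AspectRating("Film B", {"costumes": 1, "music": 1,
                                 "plot": 5, "special_effects": 5})

print(film_a.overall(), film_b.overall())  # same flat rating for both
print(aspect_gap(film_a, film_b))          # but very different profiles
```

Both films collapse to the same overall number, yet the aspect-by-aspect gap shows they were liked for completely opposite reasons, which is exactly the information a single star rating throws away.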

I know that this means a lot more space on their servers to store the extra ratings, but honestly, their current method is about as accurate as asking someone, "On a scale from 1-5, how would you rate your entire childhood?"
