Monday, January 11, 2010

Visualization of Netflix Data

The New York Times posted some visualizations of the top Netflix rentals for 12 cities, by zip code.  Unfortunately, the raw data is not available.  Netflix has a history of releasing rental data for contests in which participants try to improve its algorithms, so hopefully it will release more data on consumer preferences for general use.
The city with available data that is closest to where I live is Dallas, TX.  Before looking at the Netflix rentals, let's take a brief look at some demographic characteristics of the same region, by zip code, for reference (this is from the ERsys 2000 Census data, which is outdated).

Here's household income in Dallas:

(Darker zip codes = higher mean household income)

Here's level of education in Dallas:

(Darker blue = higher mean education level)

Here's an age breakdown:

(Darker green = higher mean age; light green / tan = lowest category (under 25))

Finally, here's the map for race/ethnicity:

There is clearly some correlation among these maps.  Without reading too much into what might be driving correlations between them and the Netflix rental numbers below, pay attention to the area near the Hwy 75 marker in the center of the map (near University Park) and to the northwest quadrant, where education and income levels are higher, versus the southeast quadrant (east of 35 and south of 30), where education and income levels are lower:

Mad Men, Season 1:

Frost/Nixon:

Tyler Perry's The Family That Preys:

Milk:

Hancock:

Religulous:

W:

There's a lot more going on here than a simple comparison of the census demographic maps with the Netflix maps suggests, but if we had the raw data, these patterns would be fun to investigate further.
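
If the raw data were ever released, one simple first step would be a rank correlation between a demographic measure and a title's rental rank across zip codes. The sketch below is only an illustration of that idea: the zip code labels, the income figures, and the rental ranks are made-up placeholders (not real Dallas or Netflix data), and the function names are my own.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Placeholder values only, NOT real data. A lower rental rank means
# the title is rented more often in that zip code.
income = {"zip_a": 40000, "zip_b": 55000, "zip_c": 90000, "zip_d": 120000}
rental_rank = {"zip_a": 30, "zip_b": 18, "zip_c": 5, "zip_d": 2}

zips = sorted(income)
rho = spearman([income[z] for z in zips], [rental_rank[z] for z in zips])
print(f"Spearman rho between income and rental rank: {rho:.2f}")
```

A rank correlation is used here because rental ranks are ordinal rather than measured on a linear scale. Even a strong value would only describe the pattern across zip codes; it would say nothing about what drives it.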
