Understanding Netflix and Amazon Under the Hood

How does code decide which recommendations to put in front of potential customers? How does Netflix know whether to suggest a movie to you? How does Amazon recommend similar products based on users similar to you? These questions are the focus of the first reading, from Programming Collective Intelligence by Toby Segaran.

Before I even reached this part of the lesson, I had to install Homebrew, pip and a few Python libraries, which turned out to be harder than expected. In addition to Python tutorials, I would love to find a great all-inclusive video about how to use Python from the Terminal on a Mac. Googling each of my errors and problems helped a lot, but I would love to see an overview of the whole. I fear I am only following along to a limited extent, and I would be a much stronger coder if I had a better feel for the big picture.

My next step was reading the second chapter of Programming Collective Intelligence slowly, without coding along at the same time. Machine learning and coding at this level are concepts I want to really understand, not simply memorize as code, so I preferred to take my time and ingest the information.

As I read the material, I slowly began to understand why vectors and matrices are needed for machine learning, as well as for other kinds of code like these recommendation systems. Last week, when I started to look at the math on its own, it was very hard to grasp why this type of math would be needed. I can't claim to fully understand linear algebra yet, but it sank in a little more this week.
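To make the vector idea concrete, here is a minimal sketch with made-up names and ratings (not the book's data or code). Each person's ratings form a vector, stacking those vectors gives a matrix, and the Euclidean distance between two vectors can be turned into a similarity score. The 1 / (1 + distance) formula is one common choice and is an assumption of this example.

```python
import numpy as np

# Hypothetical ratings: one row per person, one column per movie (1-5 scale).
people = ["Ana", "Ben", "Cy"]
ratings = np.array([
    [5.0, 3.0, 4.0],   # Ana
    [4.0, 3.0, 5.0],   # Ben
    [1.0, 5.0, 2.0],   # Cy
])

def similarity(a, b):
    """Distance-based score in (0, 1]; 1.0 means identical ratings."""
    distance = np.linalg.norm(a - b)   # Euclidean distance between two vectors
    return 1.0 / (1.0 + distance)

# Unpack the matrix into one vector per person.
ana, ben, cy = ratings

print("Ana vs Ben:", similarity(ana, ben))
print("Ana vs Cy: ", similarity(ana, cy))

# Because the ratings are a matrix, every distance from Ana can be
# computed in one step instead of looping over people.
distances_from_ana = np.linalg.norm(ratings - ana, axis=1)
print(dict(zip(people, distances_from_ana)))
```

Ana and Ben rate the movies almost the same way, so their score comes out higher than the score for Ana and Cy. That ordering is the raw material a recommender needs: find the people closest to you, then look at what they liked.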

My next step was re-reading the chapter and slowly duplicating the book's code to understand how it worked. I did get it to run after fighting with typos and, even more infuriating, indentation errors! Here’s the code:

I’m still at the early stages of using Python, so instead of starting from scratch again, I decided to try replacing parts of their code with NumPy and see if I could duplicate the results. Here is the code below. You can see where I commented out the original code and replaced it with my version.

I worked for many hours on this, and although it runs, I know the code is incorrect because its numbers do not match the output of the original code. Part of my problem is fully understanding Python; it is only my second week coding in the language. I decided to go to office hours, and Patrick helped show me why I was getting errors like “read only” when I tried to get their list into an array. It turns out that what I was referring to as a list is really a dictionary, and what I thought of as an array is really a list! Don’t even get me started on the tuple! Patrick helped me understand all of the above, and Python, a little better with the code below:
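As a minimal sketch of the dictionary, list and tuple distinction (an illustrative example with made-up names and ratings, not the code from office hours):

```python
# Hypothetical ratings, shaped as a nested dictionary: person -> {movie: score}.
ratings = {
    "Ana": {"Movie A": 5.0, "Movie B": 3.0},
    "Ben": {"Movie A": 4.0, "Movie B": 3.5},
}

# A dictionary is looked up by key, not by position.
ana_scores = ratings["Ana"]              # this is itself a dictionary
movie_a = ana_scores["Movie A"]

# A list is an ordered sequence looked up by position, and it can be changed.
score_list = list(ana_scores.values())   # copies the scores into a new list
score_list[0] = 4.5                      # allowed; the dictionary is untouched

# A tuple is ordered like a list but cannot be changed after it is created.
pair = ("Movie A", 5.0)
try:
    pair[1] = 4.5
except TypeError as err:
    tuple_error = str(err)
    print("Tuples cannot be modified:", tuple_error)

# Looping over a dictionary's items yields (key, value) tuples.
pairs = list(ana_scores.items())
print(movie_a, score_list, pairs)
```

Checking `type(x)` on a value is a quick way to find out which of these it actually is before trying to index or modify it.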

I also spent quite a few hours ingesting Frank Rosenblatt’s 1958 paper on the perceptron, The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain. I spent many hours on all of the above and know I’m not fully comprehending everything yet, but I feel optimistic that it’s slowly sinking in.

Jamie's Mad Scientist Lab
