Use Algorithms.io to Provide Automated Movie Recommendations


<a id="Introduction">Introduction</a>

In this use case we’ll show how to use the item-based collaborative filtering algorithm in our system to automatically recommend movies using the popular "Movie Lens" data set (a data set widely used in machine learning examples). We'll select a movie (Die Hard 2) that we know a specific person (in this case, our nephew) likes. We can then compare that to a list of movies preferred by kids his age who also like Die Hard 2 to find out which other movies to recommend.

By applying this technique, we can use data to make a better decision on what new movie to recommend. This technique allows us to tailor our recommendation to the specific likes of a person based on the behavior of other people with similar preferences.

The use case described below can apply to ANY product recommendation. Product recommendations are one of the more popular applications of machine learning because of their success in increasing sales for ecommerce companies.

<p>In this example we'll walk through the following steps:

  1. Selecting Sample Data (Input)
  2. Choosing and Running An Algorithm (Execution)
  3. Interpreting the Results (Output)

<a id="Problem">Problem</a>

Your teenage nephew, Peter, is coming to stay with you for the weekend.[^1] Not being entirely sure how to entertain him this entire time, you'd like to be able to rent several movies that he might be interested in.

Because Peter is a teenager, you know his attention span is short, so you want to select movies that will keep him entertained. Peter has talked non-stop about the movie Die Hard 2, so you decide to look for movies similar to it.

You go online and find a data set of movies titled "Movie Lens". This data set includes a list of 15,000 people who like the movie Die Hard 2. For the purposes of this example we'll assume that all of the people in this data set are kids around Peter's age. The survey shows movies other than Die Hard 2 that these kids prefer. The most popular movies from this list have the highest likelihood of being enjoyed by Peter.

With this information you are ready to use Algorithms.io to find out what movies to rent.

<a id="Input">Input</a>

To get started, download the Movie Lens data set here: Movie Lens Dataset.

As mentioned above, this data set includes records for approximately 15,000 customers. Each of these customers is known to like the movie Die Hard 2.

The two key variables to evaluate are customer similarity and preference ranking.

Determining Customer Similarity

In this case, we are going to assume that all of the customers in the Movie Lens data set are similar to Peter based on a filter applied to the data before downloading. When implementing this technique for your own customers, it is important to consider what criteria you will use to determine which customers are similar to others. If you choose to use information for customers that are not similar, then your recommendations have a much higher likelihood of being irrelevant. Irrelevant recommendations will not drive the desired customer behavior (often a decision to buy).

Determining Customer Preferences

How might the customer preference data have been collected? Customers indicate their "preferences" in many different ways. Arguably the most definitive method is making a purchase online: from purchase history you can infer the products a user prefers. Other methods include analyzing page view data, clickstream data, and customer surveys.
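As a rough illustration of inferring preferences from purchase history, the sketch below counts purchases per customer and item and writes them as rows of user, item, and preference. Using the purchase count as the preference value is an assumption made for this example, the sample purchases are invented, and the column names simply reuse the example values (user, item, pref) shown for the file preparation step.

```python
import csv
import io
from collections import Counter


def purchases_to_preferences(purchases):
    """Turn (user, item) purchase events into (user, item, pref) rows.

    Assumption: the number of purchases stands in for the preference value.
    """
    counts = Counter(purchases)
    return [(user, item, count) for (user, item), count in sorted(counts.items())]


def to_csv(rows):
    """Serialize preference rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["user", "item", "pref"])
    writer.writerows(rows)
    return buffer.getvalue()


# Invented purchase events, for illustration only.
purchases = [
    ("u1", "Die Hard 2"),
    ("u1", "Young Guns"),
    ("u2", "Die Hard 2"),
    ("u1", "Die Hard 2"),
]
rows = purchases_to_preferences(purchases)
csv_text = to_csv(rows)
```

Page views or survey answers could feed the same kind of table; only the rule that turns raw events into a preference value would change.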

Upload Data to Algorithms.io

The post command instructions are here: Upload Data

Convert Movie Lens Data File to Proper Format

The collaborative filtering algorithm that we'll use to make the movie recommendation takes a specific type of data input. Before it can process the Movie Lens file, we'll need to transform that file into the proper format.

The file preparation algorithm is included as Step 1 of the Recommendation Engine.

For more detailed instructions on how to use the data file preparation algorithm go here: Collaborative Filtering File Preparation.

<a id="Execution">Execution</a>

Running Recommendation

Running the recommendation is done by selecting the algorithm you wish to try from Step 2 of the Recommendation Engine. We provide several collaborative filtering algorithms for you to try.

If you are working with large data sets, or need faster results, we recommend working with item-based collaborative filtering. This method creates an item-item relation matrix that can be persisted and queried to provide recommendations without having to retrain your model every time.
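To make the idea of a persisted item-item matrix concrete, here is a minimal sketch that builds such a table once and then answers lookups from it. It uses cosine similarity on toy ratings as a stand-in; it is not the service's implementation, and the function names and data are hypothetical.

```python
import math
from collections import defaultdict


def build_item_similarity(ratings):
    """Build an item-item cosine similarity table from {user: {item: rating}}."""
    # Regroup the ratings so each item maps to the users who rated it.
    by_item = defaultdict(dict)
    for user, prefs in ratings.items():
        for item, value in prefs.items():
            by_item[item][user] = value

    def cosine(a, b):
        shared = set(a) & set(b)
        dot = sum(a[u] * b[u] for u in shared)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    items = sorted(by_item)
    return {
        i: {j: cosine(by_item[i], by_item[j]) for j in items if j != i}
        for i in items
    }


def recommend(similarity, item, top_n=3):
    """Return the top_n most similar items using only the stored table."""
    ranked = sorted(similarity[item].items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


# Toy ratings, invented for illustration.
ratings = {
    "ann": {"Die Hard 2": 5, "Young Guns": 4},
    "bob": {"Die Hard 2": 4, "Young Guns": 5, "Dangerous Minds": 2},
    "cat": {"Dangerous Minds": 5},
}
similarity = build_item_similarity(ratings)
top = recommend(similarity, "Die Hard 2", top_n=1)
```

The expensive step is `build_item_similarity`; once its result is stored, each call to `recommend` is a sort over one row, which is why this approach suits large data sets.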

<a id="Output">Output</a>

The results of the recommender will show each movie's title, the year it was made, and a similarity score. The similarity score is based on how kids in the Movie Lens data recommended that specific movie. Movies that are recommended more often have a higher score.

The results will be returned as a JSON structure, as shown below:

    [
        {"output": {"api": {"Authentication": "Success"}, "data": {"recommendation":
            [{"id": "Dangerous Minds 1995", "value": "0.95810586"},
             {"id": "Star Trek V: The Final Frontier 1989", "value": "0.9529066"},
             {"id": "Young Guns 1988", "value": "0.95277506"},
             {"id": "Die Hard: With a Vengeance 1995", "value": "0.95277506"}]

We can see that Dangerous Minds, Star Trek V: The Final Frontier, and Young Guns are the top three results. You go ahead and rent

Luckily, Peter's tastes match those of other kids his age, so the recommended movies are a hit. He enjoys both movies and you become his favorite relative.
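For readers who want to process the response programmatically, the following sketch parses a response shaped like the output listed in this section and ranks the titles by score. The four entries are copied from that output; the closing brackets of the sample are an assumption, because the listing is cut off before the structure ends.

```python
import json

# Entries copied from the sample output; the closing brackets are assumed.
response_text = """
[{"output": {"api": {"Authentication": "Success"},
             "data": {"recommendation": [
    {"id": "Dangerous Minds 1995", "value": "0.95810586"},
    {"id": "Star Trek V: The Final Frontier 1989", "value": "0.9529066"},
    {"id": "Young Guns 1988", "value": "0.95277506"},
    {"id": "Die Hard: With a Vengeance 1995", "value": "0.95277506"}
]}}}]
"""


def top_recommendations(text, limit=3):
    """Return (title, score) pairs sorted by descending score."""
    payload = json.loads(text)
    entries = payload[0]["output"]["data"]["recommendation"]
    # Scores arrive as strings, so convert them before sorting.
    scored = [(entry["id"], float(entry["value"])) for entry in entries]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


top_three = top_recommendations(response_text)
```

Note that the third and fourth entries share the same score, so which of them counts as "third" depends only on their order in the response.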

<a id="Application to Other Use Cases">Application to Other Use Cases</a>

This technique can be used to recommend other products or services. As mentioned at the beginning of this use case, ecommerce relies heavily on automated recommendations to show potential customers products they are more likely to purchase. Personalizing product recommendations while a visitor is on the site has been shown to significantly increase purchases.

Recommendations are often made in combination with clustering algorithms. Clustering can be used to break a large customer set down into small groups with similar characteristics. Using similar clusters of individuals as the basis for recommendations increases the likelihood that those recommendations will be relevant to new customers.

If you'd like to learn more about how you can implement a product or service recommendation for your business please contact us at support@algorithms.io and use the title "New Recommendation Inquiry".

Sources

[^1]: Example based on Mahout in Action, by Sean Owen, Robin Anil, Ted Dunning, and Ellen Friedman. October 14, 2011.


Simple & Straightforward Pricing

Pay as you go. No long-term contracts.

  • Freemium: $0 (additional fees may apply). Algo Credits: 750 / day, $0.1000 per extra. Up to 1 GB of storage.
  • Basic: $9.00 per month. Algo Credits: 25,000 / mo., $0.0500 per extra.

Dataset

Delete

DELETE /dataset/id/{id} (returns HTTP 200, string)

Delete a dataset.

Parameters:

  • id (string, required): id of the dataset to delete. Example: 1111

Get One

GET /dataset/id/{id} (returns HTTP 200, string)

Get the content of a dataset by its id.

Parameters:

  • id (string, required): the id of the dataset you want to retrieve. Example: 1111

List All

GET /dataset (returns HTTP 200, string)

List all of your datasets and their attributes, such as the id.
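The list, get, and delete calls differ only in HTTP method and path, so they can be sketched with one small helper. The code below builds the requests without sending them. `BASE_URL` is a hypothetical placeholder because this page does not state the API host, and any authentication headers are left to the caller since their names are not documented here.

```python
from urllib.request import Request

# Hypothetical placeholder: the API host is not given in this document.
BASE_URL = "https://api.example.invalid"


def dataset_request(action, dataset_id=None, headers=None):
    """Build (but do not send) a request for one of the dataset endpoints."""
    routes = {
        "list": ("GET", "/dataset"),
        "get": ("GET", "/dataset/id/{id}"),
        "delete": ("DELETE", "/dataset/id/{id}"),
    }
    method, path = routes[action]
    if "{id}" in path:
        if dataset_id is None:
            raise ValueError(f"{action!r} needs a dataset id")
        path = path.replace("{id}", str(dataset_id))
    return Request(BASE_URL + path, method=method, headers=headers or {})


list_req = dataset_request("list")
get_req = dataset_request("get", 1111)
delete_req = dataset_request("delete", 1111)
```

With a real host and valid credentials, each request could be sent with `urllib.request.urlopen(request)`.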


Upload

POST /dataset (returns HTTP 200, string)

Upload a dataset.

Parameters:

  • theFile (binary, required): the file you want to upload.
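Since the upload takes a binary parameter named `theFile`, one plausible way to send it is as a multipart/form-data POST. That encoding is an assumption, as is the placeholder `BASE_URL`; neither is confirmed by this page. The sketch builds the request body by hand with the standard library and does not send anything.

```python
import uuid
from urllib.request import Request

# Hypothetical placeholder: the API host is not given in this document.
BASE_URL = "https://api.example.invalid"


def build_upload_request(filename, content, headers=None):
    """Build (but do not send) a multipart/form-data POST to /dataset."""
    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="theFile"; filename="{filename}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    all_headers = dict(headers or {})
    all_headers["Content-Type"] = f"multipart/form-data; boundary={boundary}"
    return Request(BASE_URL + "/dataset", data=body, headers=all_headers, method="POST")


# Tiny invented CSV payload, for illustration only.
upload_req = build_upload_request("movielens.csv", b"user,item,pref\n1,Die Hard 2,5\n")
```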

Step 1 - Prepare Data File

Data File Preparation

POST /jobs/run/29 (returns HTTP 200, string)

This function formats a data file to be compatible with the recommendation algorithms. Currently you will need to use a cURL command to execute this function, as it does not work with the GUI. cURL details are here: http://catalog.algorithms.io/catalog/algo/id/29?category=/Recommenders.

Parameters:

  • outputType (string, required): json is the only valid option. Example: json
  • method (string, required): sync is the only valid option. Example: sync
  • datasources (string, required): the CSV file to prepare a recommendation for. Example: [1111]
  • field_user_id (string, required): the column name in the CSV file for the user field. Example: user
  • field_item_id (string, required): the column name in the CSV file for the item field. Example: item
  • field_preference (string, required): the column name in the CSV file for the preference field. Example: pref
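The parameters for the file preparation step can be assembled as shown in the sketch below, which builds the POST request without sending it. Sending the parameters as a URL-encoded form body is an assumption, and `BASE_URL` is a hypothetical placeholder; the parameter names and example values come from the parameter descriptions for this endpoint.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: the API host is not given in this document.
BASE_URL = "https://api.example.invalid"


def build_prepare_request(dataset_id, user_col="user", item_col="item",
                          pref_col="pref", headers=None):
    """Build (but do not send) the Step 1 file preparation request."""
    params = {
        "outputType": "json",  # json is the only valid option
        "method": "sync",      # sync is the only valid option
        "datasources": f"[{dataset_id}]",
        "field_user_id": user_col,
        "field_item_id": item_col,
        "field_preference": pref_col,
    }
    # Assumption: parameters travel as a URL-encoded form body.
    body = urlencode(params).encode("ascii")
    return Request(BASE_URL + "/jobs/run/29", data=body,
                   headers=headers or {}, method="POST")


prepare_req = build_prepare_request(1111)
```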

Step 2 - Run Recommendation

Item Based - Log Likelihood

POST /jobs/run/14 (returns HTTP 200, string)

Item-based recommenders make recommendations based on the similarity of items. If item X is similar to item Y, and person A likes item X, then person A may also like item Y. They use a defined statistical method (in this case Log Likelihood) to determine the similarity of items based on a set of characteristics or "features".

Parameters:

  • outputType (string, required): json is the only valid option. Example: json
  • method (string, required): sync is the only valid option. Example: sync
  • datasources (string, required): Dataset ID #. Data set must be in .CSV format. Example: [2092]
  • type (string, required): item is the only valid option. Example: item
  • item (string, required): name of the item to get a recommendation for. Name must match column header in data file. Example: The Office
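To show what a log-likelihood similarity measures, here is a minimal sketch of the log-likelihood ratio computed from a 2x2 table of user counts, following the formulation commonly used in Apache Mahout. That this service computes its scores exactly this way is an assumption; the counts in the example are invented.

```python
import math


def x_log_x(x):
    """Return x * ln(x), with the convention that 0 * ln(0) is 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 table of user counts.

    k11: users who like both items   k12: users who like only item A
    k21: users who like only item B  k22: users who like neither
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    matrix = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point round-off.
    return max(0.0, 2.0 * (row + col - matrix))


def llr_similarity(k11, k12, k21, k22):
    """Map the ratio into the range [0, 1)."""
    return 1.0 - 1.0 / (1.0 + log_likelihood_ratio(k11, k12, k21, k22))


independent = llr_similarity(10, 10, 10, 10)  # liking A says nothing about B
associated = llr_similarity(40, 5, 5, 950)    # A and B are liked together
```

When the two items are liked independently the score is close to 0, and it approaches 1 as the overlap becomes harder to explain by chance.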

Item Based - Pearson Correlation

POST /jobs/run/19 (returns HTTP 200, string)

Item-based recommenders make recommendations based on the similarity of items. If item X is similar to item Y, and person A likes item X, then person A may also like item Y. They use a defined statistical method (in this case Pearson Correlation) to determine the similarity of items based on a set of characteristics or "features".

Parameters:

  • outputType (string, required): json is the only valid option. Example: json
  • method (string, required): sync is the only valid option. Example: sync
  • datasources (string, required): Dataset ID #. Data set must be in .CSV format. Example: [2902]
  • type (string, required): item is the only valid option. Example: item
  • item (string, required): name of the item to get a recommendation for. Name must match column header in data file. Example: The Office
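As an illustration of Pearson correlation between two items, the sketch below compares their ratings over the users who rated both. Restricting the calculation to co-rating users and returning 0.0 when the correlation is undefined are choices made for this example, not documented behavior of the service; the ratings are invented.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation between two items over users who rated both."""
    shared = sorted(set(ratings_a) & set(ratings_b))
    n = len(shared)
    if n < 2:
        return 0.0  # not enough overlap to measure a trend
    xs = [ratings_a[u] for u in shared]
    ys = [ratings_b[u] for u in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant ratings
    return cov / math.sqrt(var_x * var_y)


# Invented ratings keyed by user, for illustration only.
die_hard_2 = {"ann": 5, "bob": 3, "cat": 4, "dan": 1}
young_guns = {"ann": 4, "bob": 2, "cat": 3}
dangerous_minds = {"ann": 1, "bob": 3, "cat": 2, "dan": 5}

same_trend = pearson(die_hard_2, young_guns)
opposite_trend = pearson(die_hard_2, dangerous_minds)
```

Unlike the log-likelihood approach, Pearson correlation uses the rating values themselves, so it can also report that two items are rated in opposite directions.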

User Based - Log Likelihood

POST /jobs/run/18 (returns HTTP 200, string)

User-based recommenders make recommendations based on the similarity of users. If person A is similar to person B, and person B likes item X, then person A may also like item X. They use a defined statistical method (in this case Log Likelihood) to determine the similarity of a new user to previous users based on a set of characteristics or "features".

Parameters:

  • outputType (string, required): json is the only valid option. Example: json
  • method (string, required): sync is the only valid option. Example: sync
  • datasources (string, required): Dataset ID #. Data set must be in .CSV format. Example: [2902]
  • type (string, required): user is the only valid option. Example: user
  • item (string, required): name of the item to get a recommendation for. Name must match column header in data file. Example: The Office
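The user-based idea (if A resembles B and B likes X, suggest X to A) can be sketched in a few lines. For brevity this example measures user similarity with Jaccard overlap of liked items rather than log likelihood, so it illustrates the general approach only; the users and their likes are invented.

```python
def jaccard(items_a, items_b):
    """Share of liked items two users have in common (0.0 to 1.0)."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


def recommend_for_user(likes, target, top_n=3):
    """Score items the target has not seen by the similarity of their fans."""
    seen = likes[target]
    scores = {}
    for other, other_items in likes.items():
        if other == target:
            continue
        sim = jaccard(seen, other_items)
        if sim == 0:
            continue  # dissimilar users contribute nothing
        for item in other_items - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Invented data, for illustration only.
likes = {
    "peter": {"Die Hard 2"},
    "ann": {"Die Hard 2", "Young Guns"},
    "bob": {"Die Hard 2", "Young Guns", "Dangerous Minds"},
    "cat": {"Dangerous Minds"},
}
suggestions = recommend_for_user(likes, "peter")
```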

User Based - Pearson Correlation

POST /jobs/run/40 (returns HTTP 200, string)

User-based recommenders make recommendations based on the similarity of users. If person A is similar to person B, and person B likes item X, then person A may also like item X. They use a defined statistical method (in this case Pearson Correlation) to determine the similarity of a new user to previous users based on a set of characteristics or "features".

Parameters:

  • outputType (string, required): json is the only valid option. Example: json
  • method (string, required): sync is the only valid option. Example: sync
  • datasources (string, required): Dataset ID #. Data set must be in .CSV format. Example: [2902]
  • type (string, required): user is the only valid option. Example: user
  • item (string, required): name of the item to get a recommendation for. Name must match column header in data file.
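The four Step 2 endpoints take the same parameters and differ only in the job number and the `type` value, so one helper can cover them all. The sketch below builds the request without sending it. The job numbers and parameter names come from the endpoint listings for Step 2; the URL-encoded form body, the `BASE_URL` placeholder, and the lookup keys are assumptions made for this example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: the API host is not given in this document.
BASE_URL = "https://api.example.invalid"

# Job numbers as listed for the Step 2 endpoints; the keys are illustrative.
ALGORITHMS = {
    ("item", "log_likelihood"): 14,
    ("item", "pearson"): 19,
    ("user", "log_likelihood"): 18,
    ("user", "pearson"): 40,
}


def build_recommend_request(kind, similarity, dataset_id, item, headers=None):
    """Build (but do not send) a Step 2 recommendation request."""
    job_id = ALGORITHMS[(kind, similarity)]
    params = {
        "outputType": "json",  # json is the only valid option
        "method": "sync",      # sync is the only valid option
        "datasources": f"[{dataset_id}]",
        "type": kind,
        "item": item,
    }
    # Assumption: parameters travel as a URL-encoded form body.
    body = urlencode(params).encode("ascii")
    return Request(f"{BASE_URL}/jobs/run/{job_id}", data=body,
                   headers=headers or {}, method="POST")


recommend_req = build_recommend_request("item", "log_likelihood", 2902, "Die Hard 2")
```

Switching from one algorithm to another then only means changing the first two arguments, which makes it easy to compare their results on the same data set.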
