
Bechdel Project
Northwestern University, EECS349 Machine Learning Final Project
Katherine Steiner, Maggie Kidd
Abstract
For our machine learning final project, we're interested in understanding what combinations of movie attributes make movies more or less likely to pass the Bechdel Test. A film passes the Bechdel Test if it:
- has at least two named female characters,
- who speak to each other,
- about something other than a man.
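The three criteria are cumulative, so the test can be read as a simple conjunction. Below is a minimal sketch of that reading; the `BechdelCriteria` class and its field names are hypothetical and are not taken from either API used in this project.

```python
from dataclasses import dataclass


@dataclass
class BechdelCriteria:
    # Hypothetical field names, one per criterion in the list above.
    two_named_women: bool
    women_talk_to_each_other: bool
    talk_about_non_man_topic: bool


def passes_bechdel(c: BechdelCriteria) -> bool:
    """Return True only if all three criteria hold."""
    return (
        c.two_named_women
        and c.women_talk_to_each_other
        and c.talk_about_non_man_topic
    )
```

A film that meets the first two criteria but not the third fails the test under this definition.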
Using the Bechdel Test Movie List's API, the Open Movie Database's (OMDb) API, and some preprocessing (more details in our report), we built a list of movies labeled as passing or failing the Bechdel Test. For each movie we also collected its title, year of release, country of origin, short plot description, production company, content rating (e.g. PG-13), genre, writers, directors, and IMDb rating (e.g. 8.1/10).
Generally speaking, we expect that a movie that is "similar" to a movie which passes the Bechdel Test will also pass the test, and that a movie similar to one which fails will also fail. We partitioned our data set, 70% for training and 30% for testing, and used the Weka ML package to explore different algorithms. The algorithm with the best accuracy was KStar, which correctly classified movies as passing or failing the Bechdel Test 75% of the time. One particularly interesting attribute was the gender of the director. In the training set, 88% of films with a female director passed the Bechdel Test, while only about 50% of films with a male director passed.

In the graph above, red represents movies that pass the test and blue represents movies that fail. More graphs can be found in our full report.
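The 70/30 partition, the accuracy measure, and the per-attribute pass rates described in the abstract can be sketched in plain Python. This is an illustrative sketch only, not the Weka workflow we used: the function names, the row layout (a dict with a boolean `"passes"` key), and the fixed random seed are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def pass_rate_by(rows, attribute):
    """Fraction of rows with passes == True, grouped by one attribute."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        key = row[attribute]
        total[key] += 1
        if row["passes"]:
            passed[key] += 1
    return {key: passed[key] / total[key] for key in total}


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

With rows shaped like `{"director_gender": "female", "passes": True}` (a hypothetical layout), calling `pass_rate_by(train, "director_gender")` on the training split would give the kind of per-gender pass rate quoted above.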
Links
Contact Katherine or Maggie by email.
Download our full report.