Bechdel Project

Northwestern University, EECS349 Machine Learning Final Project

Katherine Steiner, Maggie Kidd

Abstract

For our machine learning final project, we're interested in understanding what combinations of movie attributes make movies more or less likely to pass the Bechdel Test. A film passes the Bechdel Test if it:

has at least two named female characters,
who speak to each other,
about something other than a man.
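The three criteria can be read as a single predicate. The following is only a minimal sketch to make the definition concrete; it is not how our labels were produced (those came from the Bechdel Test Movie List). The `Conversation` type and the `passes_bechdel` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    """Hypothetical record of one conversation in a film."""
    speakers: frozenset   # names of the characters taking part
    about_a_man: bool     # True if the conversation is about a man


def passes_bechdel(named_women, conversations):
    """Return True if the film meets all three Bechdel criteria."""
    women = set(named_women)
    # Criterion 1: at least two named female characters.
    if len(women) < 2:
        return False
    for conv in conversations:
        # Criterion 2: at least two of them speak to each other.
        women_speaking = women & set(conv.speakers)
        # Criterion 3: the conversation is about something other than a man.
        if len(women_speaking) >= 2 and not conv.about_a_man:
            return True
    return False
```

A film with many named women can still fail if every conversation between them is about a man, which is why the third criterion is checked per conversation.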

The test is important because it highlights trends in the movie industry and brings attention to the fact that not every movie depicts women as multifaceted people. As women in a male-dominated industry, we appreciate visibility and representation in movies, and we wanted to look at how our chosen attributes contribute to the likelihood that a movie will pass the test.

Using the Bechdel Test Movie List API and the Open Movie Database (OMDb) API, plus some preprocessing (described in our report), we assembled a list of movies labeled as passing or failing the Bechdel Test. For each movie we also collected the title, year of release, country of origin, short plot description, production company, content rating (e.g. PG-13), genre, writers, directors, and IMDb rating (e.g. 8.1/10).

Generally speaking, we expect that a movie that is "similar" to a movie which passes the Bechdel Test will also pass the test, and vice versa. We partitioned our data set into 70% for training and 30% for testing, and used the Weka machine learning package to explore different algorithms. The algorithm with the highest accuracy was KStar, which correctly classified movies as passing or failing the Bechdel Test 75% of the time. One particularly interesting attribute was the gender of the director. In the training set, 88% of films with a female director passed the Bechdel Test, while only about 50% of films with a male director passed.
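The evaluation steps described above can be sketched in a few lines of Python. This is an illustrative sketch only: the project itself used Weka, and nothing here reimplements KStar. The function names and the field names `director_gender` and `passes` are assumptions made for the example, not the column names in our datasets.

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle the rows and split them 70% training / 30% testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def pass_rate_by(rows, key):
    """Fraction of rows that pass the test, grouped by the given field."""
    totals, passes = {}, {}
    for row in rows:
        group = row[key]
        totals[group] = totals.get(group, 0) + 1
        passes[group] = passes.get(group, 0) + (1 if row["passes"] else 0)
    return {group: passes[group] / totals[group] for group in totals}


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)
```

Calling `pass_rate_by(train, "director_gender")` on the training split gives the kind of per-group pass rate quoted above, and `accuracy` is the measure used to compare classifiers on the held-out 30%.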

In the graph above, red represents instances that pass the test and blue represents instances that do not. More graphs can be found in our full report.

Links

Contact Katherine or Maggie by email.

Download our full report.

Download our training and testing datasets.