Change in the IMDb Top 250
If you're used to this data and just want to look at the current year's grab, here's 2020.
When I moved to the US in 2004, I wasn't allowed to work until the government gave me permission. That took six months to grant and I took advantage of the free time to delve into classic film courtesy of the newfound wonder (to me) of cable television and, in particular, Turner Classic Movies. While I watched as much as I could generally, I also tried to have some focus to ensure I was finding an appropriate grounding.
There are many lists of "the greatest films of all time". I maintain archived copies of a bunch of Top 100 Lists here at Dawtrina.com, for example, and there are plenty of others out there to play with. These are generally static lists, created by a single person or a focused group of people, and that's fine. However, there's another list that's been around for a long while that is constantly updated and it's voted for by the largest audience of film fans there is: people who frequent the Internet Movie Database.
The IMDb Top 250 is a fascinating, albeit flawed, creature and I grabbed a static copy sometime in mid-2004 to work through. I've tried to keep up by watching new films that enter the list, though I've never managed to watch everything. For a while, my ratings since 2004 covered between 200 and 210 of the 250 films, though I'm a little lower nowadays, in the 170s.
IMDb do attempt to ensure a strong list by applying rules and algorithms to get weighted ratings. They don't disclose all the details of how they do this, but the core formula is below (source here):
The following formula is used to calculate the Top Rated 250 titles. This formula provides a true 'Bayesian estimate', which takes into account the number of votes each title has received, minimum votes required to be on the list, and the mean vote for all titles:
weighted rating (WR) = (v ÷ (v+m)) × R + (m ÷ (v+m)) × C
Please be aware that the Top Rated Movies Chart only includes theatrical features: shorts, TV movies, miniseries and documentaries are not included in the Top Rated Movies Chart. The Top Rated TV Shows Chart includes TV Series, but not TV episodes or Movies.
Put simply, they filter down to feature films that have received a certain number of votes (which used to be 25,000 and looks like it still is), then reject what appears to be bad data. They do a pretty good job.
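As a rough sketch of how that formula behaves, here it is in Python. The quoted text doesn't define its symbols, so the readings here are assumptions taken from the sentence before the formula: v is the number of votes a title has received, m is the minimum number of votes required to be on the list, C is the mean vote for all titles and R is presumably the title's own mean rating. The function name and the sample numbers are made up for illustration; this is not IMDb's actual code, which also applies undisclosed filtering.

```python
def weighted_rating(R: float, v: int, m: int, C: float) -> float:
    """WR = (v / (v + m)) * R + (m / (v + m)) * C

    Assumed meanings (not spelled out in the quoted text):
      R: the title's own mean rating
      v: number of votes the title has received
      m: minimum votes required to be on the list
      C: mean vote across all titles
    """
    return (v / (v + m)) * R + (m / (v + m)) * C


# Hypothetical numbers: a title sitting exactly at the vote threshold
# lands halfway between its own rating and the overall mean.
print(weighted_rating(R=9.0, v=25_000, m=25_000, C=7.0))  # 8.0
```

The more votes a title has relative to m, the closer its weighted rating gets to its own rating R; titles with few votes are pulled towards the overall mean C.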
Some flaws are still obvious, of course. This is based on popular voting, so it's open to the tyranny of the majority. It's not too surprising to find a strong bias towards recent pictures, especially big Hollywood blockbusters which leap into the list on release and then slowly (or quickly) drop back out again.
What I found over time, though, is that it holds up pretty well, with my average rating for the IMDb Top 250 higher than that for the AFI's 100 Years... 100 Movies list. From an entirely personal and an 80%-ish complete standpoint, the IMDb list is 'better' than the AFI's (and, to a varying degree of completion, the other few dozen lists I'm tracking). That still seems odd to me, but data doesn't lie.
So, in order to keep an eye on this data, I started grabbing a fresh copy of the IMDb Top 250 every New Year's Day, starting in 2013. That allows me to see how that data changes annually. I'm sharing that data on pages here for wider reference:
Here's a summary table:
Here's what these data elements mean.
The mean, median and mode are ways of calculating averages.
The mean is what most people would call the average. It's calculated by adding up all 250 values and dividing by 250.
When all 250 values are sorted in order, the median is the value in the middle. In other words, there are as many films in the list from 1994 moving forward in time as there are from 1994 moving backwards.
The mode is the most frequently represented value. In other words, according to this list, 1995 is currently cinema's golden year.
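For anyone who wants to try these three averages on their own copy of the list, here's a minimal sketch using Python's standard statistics module. The function name and the five sample years are hypothetical stand-ins for the 250 release years, chosen so that the three averages come out different.

```python
from statistics import mean, median, mode


def summarise_years(years):
    """Return the three averages for a list of release years."""
    return {
        "mean": mean(years),      # sum of all values divided by how many there are
        "median": median(years),  # middle value once the list is sorted
        "mode": mode(years),      # most frequently represented value
    }


# Hypothetical sample, not real Top 250 data.
sample = [1954, 1957, 1994, 1995, 1995]
print(summarise_years(sample))
```

With this sample the mean is 1979, the median is 1994 and the mode is 1995, which shows how a few old films pull the mean back while leaving the median and mode alone.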
The Hal and Dee numbers represent how many of the 250 films my better half and I have rated (which means we've seen them since 2004) and the mean of our ratings. My rating system ranges from 1 (lowest) to 7 (highest).
The rest of the elements reflect change since the previous year:
Same is the number of films which stayed in the same spot as the previous year. Up is the number that moved up. Down is the number that moved down. New is the number of films in this year's list that weren't in the previous year's; however, some of them may have been in the list prior to that.
Variation marks how much the list has changed overall over the previous year. It counts how many places in the list each film moved (either up or down) and calculates the mean of that.
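The year-on-year elements above can be sketched in Python as follows. This is only an illustration under stated assumptions: each list is taken to be a sequence of titles in rank order, and Variation is averaged over the films present in both years, because the description doesn't say how new entries are treated. The function name and the four-film lists are hypothetical.

```python
from statistics import mean


def compare_lists(previous, current):
    """Count Same/Up/Down/New and the mean number of places moved."""
    # Map each title to its 1-based position in the previous year's list.
    prev_pos = {title: pos for pos, title in enumerate(previous, start=1)}
    same = up = down = new = 0
    moves = []
    for pos, title in enumerate(current, start=1):
        if title not in prev_pos:
            new += 1  # not in last year's list
            continue
        delta = prev_pos[title] - pos  # positive means it climbed the list
        moves.append(abs(delta))
        if delta > 0:
            up += 1
        elif delta < 0:
            down += 1
        else:
            same += 1
    # Assumption: new entries are left out of the variation average.
    variation = mean(moves) if moves else 0
    return {"same": same, "up": up, "down": down, "new": new,
            "variation": variation}


# Hypothetical four-film lists standing in for two years of the Top 250.
last_year = ["A", "B", "C", "D"]
this_year = ["A", "C", "E", "B"]
print(compare_lists(last_year, this_year))
```

Here A stays put, C climbs one place, B drops two and E is new, so the variation is the mean of 0, 1 and 2, which is 1.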
The change in mean tells us that the films represented in the IMDb Top 250 get newer each year. That's not surprising as new movies are released all the time. The mean is now 1987. The median ought to get newer too, and it is doing that over time, but it seems to get stuck a lot. It spent three years at 1988 and three more at 1993, but has now risen to 1994.
The mode is interesting. Cinema's golden year, according to this list, is 1995 and has been for the last seven years. It's represented by eight films, which means that it's only just holding on from 2014 and 2019 with seven each. 1957 and 2014 drop from seven to six to sit alongside 2000, 2001 and 2009. 1957 is often seen as world cinema's greatest year, while Hollywood's golden year of 1939 is now only represented by two films.
Our ratings suggest that my wife and I both prefer the oldest list that I grabbed in 2004 and it's gradually become a little less valuable to us since then. However, the drop in the last year is notable: after a 6.67 in 2004, my average ratings dropped about a tenth of a point and stayed there from 2013 to 2018, varying just a little, but they dropped to a new low of 6.49 in 2019. That's a big drop and I believe that robs it of the crown of most valuable list (to me) that it's held ever since I started tracking it.
It's perhaps also worth mentioning that my better half generally rates films higher than I do, but my ratings of IMDb Top 250 films have always been higher than hers. I've wondered about that, but, looking wider, it seems that I rate both higher and lower than her, praising or panning, while her ratings clump a little more in the middle.
Ups and Downs
Unsurprisingly, the up and down numbers suggest that a lot more films drop every year than rise. This is surely because, while some films do move up the list, it's much more common for them to be pushed down by new entries, which often then move down themselves, even faster.
Large changes are more represented by drops than climbs, with six films dropping at least 35 places. By comparison, nothing rose that much, the highest climb being the 29 places for Come and See. And that's just films still in the list. Three others dropped out of the list entirely from high places, including Bohemian Rhapsody, which entered at 134 last year and is now gone entirely.
The variation is the most interesting number for me at the moment, as this was a big year of change, underlined by 22 of the bottom 28 films from last year leaving the Top 250. That's a major anomaly, given that the list has been settling over the last few years.
Given that each new year brings new great films, we might expect previous decades to be represented less and less over time and that's generally true. There were 22 new entries this year, which is more than usual. For the last list of the decade, almost 40% of the entire thing (98 films) were released this millennium, 49 each from the 2000s and the 2010s. No decade except the 2010s increased its numbers this year.
It's especially rare for decades prior to the nostalgia point (currently the 1980s) to increase their representation in the list. Since 2013, the 1920s have increased three times and the 1950s, 1960s and 1970s once each. The 1930s and 1940s haven't increased once over that timeframe.
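A decade breakdown like this is easy to reproduce from a list of release years. The sketch below uses Python's collections.Counter; the function names and the sample years are hypothetical, and only the 98-out-of-250 figure comes from the text above.

```python
from collections import Counter


def decade_counts(years):
    """Tally films per decade, keyed by the decade's first year (e.g. 1990)."""
    return Counter((year // 10) * 10 for year in years)


def share_percent(count, total=250):
    """Percentage of the list that a given count of films represents."""
    return round(count / total * 100, 1)


# Hypothetical sample years, not real Top 250 data.
sample = [1957, 1994, 1995, 2008, 2014, 2019]
print(sorted(decade_counts(sample).items()))

# The 98 films from this millennium, as quoted above.
print(share_percent(98))  # 39.2
```

Running the same tally on each year's grab and comparing the counters would show which decades gained or lost ground.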
Back in 2004, 152 directors (or directing partnerships) were represented in the Top 250. The most frequent name was Alfred Hitchcock, with nine films in the list, and Stanley Kubrick was nipping at his heels with eight. Most of the frequent directors were from the classic era, including Billy Wilder with six; Akira Kurosawa and Howard Hawks with five; and Ingmar Bergman, Frank Capra, Charles Chaplin and John Ford all with four. Of more recent names, Steven Spielberg had six while Francis Ford Coppola and Quentin Tarantino had four each.
That breakdown is just as varied in 2020 but it's shifted notably newer. There are now 156 separate names, but the top ones are mostly new. Martin Scorsese is now the most represented with eight films, followed by Stanley Kubrick and Christopher Nolan with seven. Hitch is down to six films, the same number as Hayao Miyazaki and Steven Spielberg. Classic directors are more represented at the three to five level.
The Top Ten
One interesting note is that today's top ten is not only unchanged from 2019, it's almost unchanged from 2013: it comprises the same ten films in a slightly different order, with The Dark Knight gradually shifting upwards and Pulp Fiction gradually shifting downwards.
However, it's notably different from 2004. For a start, the top spot is different, The Shawshank Redemption taking over from The Godfather somewhere in between 2004 and 2013. What's more, only half of the films in the last decade's list were there in 2004. The Dark Knight, of course, has an excuse, not being released until 2008, but Pulp Fiction was in 16th place, The Good, the Bad and the Ugly 20th, 12 Angry Men 21st and Fight Club way down in 41st.
Last update: 1st January, 2020