Because people with job titles like “data journalist” have far too much time on their hands, I give you this guy’s thesis: he used a computer program to analyze 100 classic works of literature, and ended up with a list of words that can point fairly reliably to the author’s gender.

Ben Blatt used established computer programs and methods to analyze classic literature and compile a list of common words. For each word, he then calculated a ratio of how often female authors used it compared to male authors, and voila: a list. Or rather, two lists, each containing the words with the largest imbalance in use between the genders.
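For the curious, here is roughly what a ratio like that could look like in Python. This is a minimal sketch and not Blatt's actual code: the function names (`word_counts`, `usage_ratios`), the simple regex tokenizer, and the add-one smoothing (which keeps a word used by only one group from causing a division by zero) are all my own assumptions.

```python
import re
from collections import Counter


def word_counts(texts):
    """Count lowercase word occurrences across a list of texts."""
    counts = Counter()
    for text in texts:
        # Hypothetical tokenizer: runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


def usage_ratios(texts_a, texts_b):
    """Return {word: rate in group A / rate in group B}.

    Rates use add-one smoothing (an assumption, not from the article).
    A ratio above 1 means group A uses the word relatively more often.
    """
    a = word_counts(texts_a)
    b = word_counts(texts_b)
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + len(vocab)
    total_b = sum(b.values()) + len(vocab)
    return {
        word: ((a[word] + 1) / total_a) / ((b[word] + 1) / total_b)
        for word in vocab
    }


def most_imbalanced(ratios, n=10):
    """Return the n words with the highest ratio, then the n with the lowest."""
    ranked = sorted(ratios, key=ratios.get, reverse=True)
    return ranked[:n], ranked[-n:]
```

Feed `usage_ratios` one list of texts per group, then pass the result to `most_imbalanced` to get the two lists: words skewing toward the first group and words skewing toward the second.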

Blatt went a step further and analyzed the books for the pronouns “he” and “she” and found pretty much what you'd expect at this point. Which is to say, there are several male-centric and -authored books (The Hobbit, The Old Man and the Sea, Lord of the Flies) in which the word “she” makes up less than 1% of pronouns analyzed (The Hobbit only uses it once), while the lowest percentage of the word “he” in a book written by a woman is 29% (The Joy Luck Club).
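The pronoun check is even simpler to approximate. Again, this is only a sketch under my own assumptions: it counts just the exact words “he” and “she” (not “him”, “her”, or “his”), and `pronoun_share` is a made-up name, not anything from Blatt's work.

```python
import re


def pronoun_share(text):
    """Return the share of "he" vs. "she" among those two pronouns.

    Returns None when the text contains neither word.
    """
    words = re.findall(r"[a-z]+", text.lower())
    he = words.count("he")
    she = words.count("she")
    total = he + she
    if total == 0:
        return None
    return {"he": he / total, "she": she / total}
```

Run over the full text of a book, a `"she"` value below 0.01 would correspond to the "less than 1%" cases mentioned above.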

TL;DR – Our society accepts that male stories deserve to be heard with or without female involvement, while the opposite is absolutely not true.

If you find this interesting and would like to fall further down the rabbit hole of data journalism, Ben Blatt has a book from Simon & Schuster entitled Nabokov’s Favorite Word is Mauve. So check it out!

