New Article on Creative Writing at the Atlantic Monthly

We’re very excited that our new cultural analytics essay on creativity and MFA programs, as well as issues of gender and racial diversity in publishing, has been published at the Atlantic Monthly. We use text mining techniques to reveal some troubling patterns in creative writing today, and tell a new story about institutions and creativity. In the next week or so, we’ll be publishing more details and findings about our experiment.


The .txtLAB Guide to Writing a Bestseller

Here is a humble 1-page guideline that we produced after studying a sample of 10 years’ worth of bestselling novels from the New York Times bestseller list. It was used as part of the Devoir Challenge, in which some local Montreal writers were asked to try to write stories “like an American bestseller.”

One of the most interesting things we found when we sampled this past year’s bestsellers was that nothing much seems to have changed. In fact, the only really strong difference we detected was more emphasis on technology (more texting, phones, email, laptops, photographs, screens, and video). At the same time, there was less bitterness, genuineness, learning, and faith, and sadly more murders, police, lawyers and detection.

One question we were left with is just how stable this vocabulary is over time. Do bestsellers really reflect their times, and if so, what is the relevant time-frame (a year, a decade, a generation)? Or maybe they just consist of a relatively consistent set of tropes (action, police procedures, etc.) recycled into a variety of insignificant sub-plots. More work to be done there.

How to write like a Bestseller

***Things to focus on:

Try to use many more characters than normal (about 30% more per novel).

Try to use more dialogue, about 50% more than you would normally.

Try to focus more on people, pronouns and actions:

  • More than 50% of the unique grammatical patterns in Bestsellers involve proper names
  • This is another popular formulation: gerund – to – verb, as in “going to run”

Try to focus on the following themes:

  • police and law (investigate, gun, kill, shot, file, lawyer, evidence)
  • technology (phone, photo, cell, text, program, scan, camera, screen, tape, button, but not “telephone”, that is more indicative of serious books)
  • conflict oriented words (problem, challenge)
  • facial expressions (nod, frown, sigh, grin, blink)
  • simple actions (grab, rip, gasp, ring, shook, crash, pull, get)
  • greater certainty (absolutely, totally, especially)
  • oddities: pretty, coffee, showers, porches


***Things to avoid:

Try to keep your average sentence length to 11 words or fewer.

Try to avoid over-emphasizing nouns instead of proper names. In other words, think people, not things (or, even worse, abstractions).

Also try to avoid using nouns around conjunctions.

Some of the more popular grammatical patterns of serious literature involve nouns, adjectives, prepositions and determiners (such as every, few, this).

Try to avoid the following themes:

  • complex emotions (shame, weeping, pity, abandon)
  • nostalgia (children, childhood, mothers, fathers)
  • nature (sea, winter, trees, desert, branches, mountains, spring, clouds)
  • imagination (pretend, imagine, dream)
  • the act of writing (write, wrote, language, books)
  • tentativeness (sometimes, perhaps)
  • oddities: tea, coughing, meat, soap, socks
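
For readers who want to check a draft against a few of these numbers, here is a minimal Python sketch. It is not the code behind the guide. The naive sentence splitter, the exact-match word counting (no stemming, so "nodded" does not count as "nod"), the grouping into FAVOR and AVOID, and the function names are all assumptions made for illustration. Only the word lists come from the guide.

```python
import re
from collections import Counter

# Word lists copied from the guide; splitting them into FAVOR and AVOID
# dictionaries is an assumption made for this sketch.
FAVOR = {
    "police and law": {"investigate", "gun", "kill", "shot", "file", "lawyer", "evidence"},
    "technology": {"phone", "photo", "cell", "text", "program", "scan",
                   "camera", "screen", "tape", "button"},
    "facial expressions": {"nod", "frown", "sigh", "grin", "blink"},
}
AVOID = {
    "nature": {"sea", "winter", "trees", "desert", "branches",
               "mountains", "spring", "clouds"},
    "imagination": {"pretend", "imagine", "dream"},
    "tentativeness": {"sometimes", "perhaps"},
}

def average_sentence_length(text):
    """Mean words per sentence, using a naive split on '.', '!' and '?'."""
    # This splitter miscounts abbreviations such as "U.S."; it is only a rough guide.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)

def theme_rates(text, themes):
    """Occurrences of each theme's words per 1,000 tokens (exact matches only)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {name: 1000 * sum(counts[w] for w in words) / total
            for name, words in themes.items()}

# A made-up three-sentence draft, used only to show the calls.
draft = "She grabbed the phone. Nothing. Perhaps the gun was in the desert."
print(average_sentence_length(draft))
print(theme_rates(draft, FAVOR))
print(theme_rates(draft, AVOID))
```

A draft that follows the guide should show an average sentence length at or below 11, higher rates for the FAVOR themes, and lower rates for the AVOID themes. Rates on a text this short are of course not meaningful; a chapter or more is a better unit.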


Send us your rejected manuscripts!

Donate your ideas to science, not Facebook! Every day we post valuable information to social media sites. These platforms use that information to learn all sorts of things about us. But they never share that learning back with us. We want to start a new culture of idea donation in which the knowledge gained is given back to you, the participants.

Have you ever written a novel or a short story and had it rejected? We want to hear from you!

In this project, we’re interested in understanding what makes a work of creative writing pass through the gatekeepers of editors, publishers, and marketing departments. What are they selecting for? Does anyone on the outside really know?

We have lots of examples of things that have been successfully published. New novels, new short stories appear every day. But what we can’t see are the piles and piles of rejected manuscripts from which the successful ones are chosen.

Knowing how these choices are made can help aspiring writers of the future navigate these filters better (assuming it isn’t random, but we might learn that too!). But it can also give us a critical lens through which to view the often narrow ways we think about creativity today.

If you want to donate your manuscript to our project, please send it as a Word document along with a copy of your rejection letter/email (if you received one) to:

Of course, if your material becomes a part of our published study, we will not include any information about your name or identity – everything will be kept anonymous.

And we will definitely share our findings with you.


Richard Jean So

Andrew Piper


The Devoir Challenge

When the books editor of Le Devoir, Catherine Lalonde, called to ask if my lab would supply a data-driven guide on how to write like a bestseller, I enthusiastically said yes. But I expected everyone else would say no. Surely writers will be allergic to data. And surely Quebecois and Canadian writers won’t want to write like an American bestseller! But this turns out not to be the case. The volunteers lined up, including this year’s Giller Prize winner.

I love this experiment because it challenges our assumptions about data, creativity and culture. Understanding the tropes and tricks of bestselling writing offered a way for these writers to play with words and conventions. Writing a story, in this view, doesn’t start with the imaginary blank page (the way creative writing is often depicted in movies). Instead, it starts with explicit knowledge about how words always precede us before we begin to create something new. Data can be an instigation.

The same could be said of the cultural mash-up of asking francophone writers to write like American bestsellers. It’s an exercise in mental travel, something we do physically here all the time, since so many of us live so close to the U.S. border. These kinds of cultural border crossings are important. They are about trying to think our way into the conventions of other people. The world would be better off if all of us did this more often. For our part in the lab, we’re going to look at more than just U.S. bestsellers next time. What are the different popular cultures of reading that exist around the globe? This is something we want to know more about.

The results of the experiment have been delightful to read: funny, clever, urgent. They take some of the bestseller’s love of emergency and give it a thought-provoking spin. One is about a writer trying to break through the constraints of writing by talking to herself. One is about a girl storming her home after a terrible day like it’s Star Wars. Another reads like a classic mystery in miniature, wealthy manor and all. One is about a man shifting his gender towards being a woman. Finally, the most recent is a complex allegory about sheep and an obsession with coffee and lost property (“sheepish” is wryly translated by André Alexis literally into French as “moutonnière,” once again showing us his brilliant thinking through animals). Each story, in its own way, boils down to a sense of identity in peril, something out of kilter or uncertain. You can still hear the pulse of Quebec beneath the thrum of l’américaine.

But did they succeed, you might be asking yourself. For the curious, we went ahead and asked the computer to predict which of the five stories sounded most like an “American bestseller”. As you will see, three of the five stories succeeded, with “Annie courait” by Daniel Grenier the most likely to be a bestseller. This doesn’t mean the others aren’t excellent in their own way. It just means that M. Grenier was able to mimic the conventions in incredibly droll ways. Then again, this could be one test where failing is a good thing!

If we take a quick look at Grenier’s story, we see how he does all the right things. He focuses on body parts like heads and faces; he conveys a sense of urgency through phrases like “La porte allait se refermer d’une seconde à l’autre” or “Soudain, elle fut stoppée net dans sa fuite.” He uses a lot of dialogue and has short, choppy sentences (“Rien. Silence Radio.”). And of course, there is a gun.

But he also plays with these mundane rules. The dialogue is actually Annie talking to herself. And her obsession is with breaking through a door, the door of “8,000 signs,” which we gradually learn is the story she can’t finish. This is a story about constraint: the constraint of a newspaper imposing strict word limits, of being handed a list of do’s and don’ts that were generated by a computer, of all those little voices in our head telling us what we should do in life. “You are going to do more with less, Annie,” she says to herself, pointing the gun at the table of multiple columns of the 8,000 signs.

This is the breakthrough we are all hoping for: the discovery of something new and exciting, more from less.

The Devoir Challenge

Story                                                 Score
“Annie courait” by Daniel Grenier                     83%
“On ne rit pas” by Monique Proulx                     65%
“Millionnaire fauché” by Stéphane Dompierre           33%
“Les sécrétions magnifiques” by Marie Hélène Poitras  71%
“Au Mouton Grincheux” by André Alexis                 46%

* Each score is the probability, as estimated by the computer, that the story is a bestseller. Results are based on a sample of 44,270 passages from bestselling and random novels.


Quantifying the Weepy Bestseller

We have a new piece appearing in The New Republic today. In a number of recent book reviews, literary critics and novelists arrive at the consensus that to be a great writer, one must avoid being “sentimental.” One famous novelist describes it as a “cardinal sin” of writing. But is it actually true? Using a computer science method called “sentiment analysis,” we tested this claim on a large corpus of novels from the early twentieth century to the present, and found the opposite. Writers who win book prizes and get reviewed in the New York Times are not any less sentimental than novelists who write popular fiction, such as romances or bestsellers. The only group for which this was not true was the 50 most canonical novels written since about 1950. Our analysis tells us that if you want to write one of the most important books of the next half century, then you should tone down the sentiment. But if you want to be reviewed in a major newspaper, sell books, or win prizes, go ahead and emote away.

But the larger point for us is the way our cultural taste-makers are often wrong or extremely biased in their assumptions about what matters. We found that a computer, ironically, can paint a more nuanced picture of what makes great literature.
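
For readers unfamiliar with the term, one simple form of sentiment analysis counts how many words in a text appear in a dictionary of emotionally charged words. The Python sketch below illustrates that general idea only. It is not the method used in the piece, and the tiny POSITIVE and NEGATIVE word lists and the function name `sentiment_share` are invented for the example.

```python
import re

# Toy lexicons, invented for this sketch. Real studies rely on much larger
# published sentiment dictionaries.
POSITIVE = {"love", "joy", "happy", "wonderful", "tender"}
NEGATIVE = {"hate", "grief", "sad", "terrible", "weep"}

def sentiment_share(text):
    """Fraction of tokens that appear in either sentiment lexicon."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    emotional = sum(1 for t in tokens if t in POSITIVE or t in NEGATIVE)
    return emotional / len(tokens)

# A made-up sentence, used only to show the call.
print(sentiment_share("She felt joy and grief at once."))
```

Comparing this share across groups of novels (prizewinners, bestsellers, romances) is the kind of question such a measure makes it possible to ask, though any real comparison would need a validated lexicon and far more text.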

Here is an excerpt:

If you want to be a great writer, should you withhold your sentimental tendencies? The answer for most critics and writers seems to be yes. Sentimentality is often seen as a useful way of distinguishing between serious literature and the not-so-serious, probably best-selling kind. “Sentimentality,” James Baldwin wrote, is “the ostentatious parading of excessive and spurious emotion…the mark of dishonesty, the inability to feel.” While sentimentality is false, grandiose, manipulative, and over-boiled, high literature is subtle, nuanced, cool, and true. As Roland Barthes, the dean of high cultural criticism, once remarked: “It is no longer the sexual which is indecent, it is the sentimental.” This sentiment (yes sentiment) has been around since at least the early twentieth century and is still a subject of debate in the review pages of numerous media outlets today. But is it true? Whether you are for subtlety or against sentimentality, is this a good way to think about writing your next novel?

Read more here.