I recently did a podcast with the BookNet group in Canada, which focuses on the intersection of technology and books. They were interested in our research on prizewinning and bestselling novels. My main emphasis in the discussion was the way computers can be useful for different kinds of audiences: for publishers to better understand the books they are selecting and marketing; for readers to better understand the books they want to enjoy but also engage with more critically and/or analytically; and for writers who want to use data to create new works that are aligned with existing markets in fresh and novel ways.
This past weekend I participated in an interview with Jeanette Kelly on the CBC to discuss our new work on using computers to predict bestsellers and prizewinning novels. In it I discuss the Devoir Challenge, in which local Quebec writers try to impersonate a bestseller using our data, and our successful attempt at predicting this year’s Giller Prize winner, which was foiled by my misjudgment of committee behaviour.
Here is a link to the interview.
Here is a humble 1-page guideline that we produced after studying a sample of 10 years’ worth of the bestselling novels according to the NY Times Bestseller list. It was used as part of the Devoir Challenge, in which some local Montreal writers were asked to try to write stories “like an American bestseller.”
One of the most interesting things we found when we sampled this past year’s bestsellers was that nothing much seems to have changed. In fact, the only really strong difference we detected was more emphasis on technology (more texting, phones, email, laptops, photographs, screens, and video). At the same time, there was less bitterness, genuineness, learning, and faith, and sadly more murders, police, lawyers and detection.
One question we were left with is just how stable this vocabulary is over time. Do bestsellers really reflect their times, and if so, what is the relevant time-frame (a year, a decade, a generation)? Or maybe they just consist of a relatively consistent set of tropes (action, police procedures, etc.) recycled into a variety of insignificant sub-plots. More work to be done there.
How to write like a Bestseller
***Things to focus on:
Try to use many more characters than normal (about 30% more per novel).
Try to use more dialogue, about 50% more than you would normally.
Try to focus more on people, pronouns and actions:
- More than 50% of the unique grammatical patterns in Bestsellers involve proper names
- This is another popular formulation: gerund – to – verb, as in “going to run”
Try to focus on the following themes:
- police and law (investigate, gun, kill, shot, file, lawyer, evidence)
- technology (phone, photo, cell, text, program, scan, camera, screen, tape, button, but not “telephone”, that is more indicative of serious books)
- conflict oriented words (problem, challenge)
- facial expressions (nod, frown, sigh, grin, blink)
- simple actions (grab, rip, gasp, ring, shook, crash, pull, get)
- greater certainty (absolutely, totally, especially)
- oddities: pretty, coffee, showers, porches
***Things to avoid:
Try to keep your average sentence length to 11 words or fewer.
Try to avoid over-emphasizing nouns instead of proper names; in other words, think people, not things (or, even worse, abstractions).
Also try to avoid using nouns around conjunctions.
Some of the more popular grammatical patterns of serious literature involve nouns, adjectives, prepositions and determiners (such as every, few, this).
Try to avoid the following themes:
- complex emotions (shame, weeping, pity, abandon)
- nostalgia (children, childhood, mothers, fathers)
- nature (sea, winter, trees, desert, branches, mountains, spring, clouds)
- imagination (pretend, imagine, dream)
- the act of writing (write, wrote, language, books)
- tentativeness (sometimes, perhaps)
- oddities: tea, coughing, meat, soap, socks
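For anyone who wants to test a draft against a couple of these guidelines, here is a minimal sketch in Python. It is not the method our lab used to produce the guide. The function names, the naive sentence splitter and the shortened word lists (`FAVOUR`, `AVOID`) are assumptions made purely for illustration, and matching is exact, so inflected forms such as “lawyers” are not counted.

```python
import re

# Small samples of the themes listed above (illustrative, not the full lists).
FAVOUR = {"investigate", "gun", "kill", "shot", "file", "lawyer", "evidence",
          "phone", "photo", "cell", "text", "camera", "screen"}
AVOID = {"shame", "weeping", "pity", "abandon", "pretend", "imagine",
         "dream", "sometimes", "perhaps"}


def words(text):
    """Lower-case the text and return its word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def average_sentence_length(text):
    """Mean number of words per sentence, splitting naively on . ! ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if words(s)]
    if not sentences:
        return 0.0
    return sum(len(words(s)) for s in sentences) / len(sentences)


def count_hits(text, vocabulary):
    """Number of tokens in the text that appear in the given word list."""
    return sum(1 for w in words(text) if w in vocabulary)


def check_draft(text):
    """Summarise a draft against three of the guidelines."""
    return {
        "average_sentence_length": average_sentence_length(text),
        "favoured_theme_words": count_hits(text, FAVOUR),
        "avoided_theme_words": count_hits(text, AVOID),
    }


draft = "She grabbed the phone. The lawyer had a gun. Perhaps not."
print(check_draft(draft))
```

A draft that averages well above 11 words per sentence, or that leans heavily on the avoided themes, is drifting away from the bestseller profile described above. Counts like these are only a rough signal; they say nothing about characters, dialogue or grammatical patterns.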
When the books editor of Le Devoir, Catherine Lalonde, called to ask if my lab would supply a data-driven guide on how to write like a bestseller, I enthusiastically said yes. But I expected everyone else would say no. Surely writers will be allergic to data. And surely Quebecois and Canadian writers won’t want to write like an American bestseller! But this turns out not to be the case. The volunteers lined up, including this year’s Giller Prize winner.
The reason I love this experiment is that it challenges our assumptions about data, creativity and culture. Understanding the tropes and tricks of bestselling writing offered a way for these writers to play with words and conventions. Writing a story, in this view, doesn’t start with the imaginary blank page (the way creative writing is often depicted in movies). Instead, it starts with explicit knowledge about how words always precede us before we begin to create something new. Data can be an instigation.
The same could be said of the cultural mash-up of asking francophone writers to write like American bestsellers. It’s an exercise in mental travel, something we do physically here all the time, since so many of us live so close to the U.S. border. These kinds of cultural border crossings are important. They are about trying to think our way into the conventions of other people. The world would be better off if all of us did this more often. For our part in the lab, we’re going to look at more than just U.S. bestsellers next time. What are the different popular cultures of reading that exist around the globe? This is something we want to know more about.
The results of the experiment have been delightful to read — funny, clever, urgent. They take some of the bestseller’s love of emergency and give it a thought-provoking spin. One is about a writer trying to break through the constraints of writing by talking to herself. One is about a girl storming her home after a terrible day like it’s Star Wars. Another reads like a classic mystery in miniature, wealthy manor and all. One is about a man shifting his gender towards being a woman, and finally, the most recent is a complex allegory about sheep and an obsession with coffee and lost property (“sheepish” is wryly translated by André Alexis literally into French as “moutonnière,” once again showing us his brilliant thinking through animals). Each story, in its own way, boils down to a sense of identity in peril, something out of kilter or uncertain. You can still hear the pulse of Quebec beneath the thrum of l’américaine.
But did they succeed, you might be asking yourself. For the curious, we went ahead and asked the computer to predict which of the five stories sounded most like an “American bestseller”. As you will see, three of the five stories succeeded, with “Annie courait” by Daniel Grenier the most likely to be a bestseller. This doesn’t mean the others aren’t excellent in their own way. It just means that M. Grenier was able to mimic the conventions in incredibly droll ways. Then again, this could be one test where failing is a good thing!
If we take a quick look at Grenier’s story, we see how he does all the right things. He focuses on body parts like heads and faces; he conveys a sense of urgency through phrases like “La porte allait se refermer d’une seconde à l’autre” or “Soudain, elle fut stoppée net dans sa fuite.” He uses a lot of dialogue and has short, choppy sentences (“Rien. Silence Radio.”). And of course, there is a gun.
But he also plays with these mundane rules. The dialogue is actually Annie talking to herself. And her obsession is with breaking through a door, the door of “8,000 signs,” which we gradually learn is the story she can’t finish. This is a story about constraint: the constraint of a newspaper imposing strict word limits, of being handed a list of do’s and don’ts that were generated by a computer, of all those little voices in our head telling us what we should do in life. “You are going to do more with less, Annie,” she says to herself, pointing the gun at the table of multiple columns of the 8,000 signs.
This is the breakthrough we are all hoping for: the discovery of something new and exciting, more from less.
The Devoir Challenge
|Story|Author|Bestseller probability|
|---|---|---|
|Annie courait|Daniel Grenier|83%|
|On ne rit pas|Monique Proulx|65%|
|Millionnaire fauché|Stéphane Dompierre|33%|
|Les sécrétions magnifiques|Marie Hélène Poitras|71%|
|Au Mouton Grincheux|André Alexis|46%|
* Each score is the probability the computer assigned to the story being a bestseller. Results are based on a sample of 44,270 passages of bestselling and random novels.
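To make the “three of the five” tally explicit, here is a small sketch that ranks the published scores and counts the stories above an even chance. The 50% cut-off is an assumption on my part for illustration; the figures themselves come straight from the table.

```python
# Probabilities from the Devoir Challenge table, written as fractions.
scores = {
    "Annie courait": 0.83,
    "On ne rit pas": 0.65,
    "Millionnaire fauché": 0.33,
    "Les sécrétions magnifiques": 0.71,
    "Au Mouton Grincheux": 0.46,
}

THRESHOLD = 0.5  # assumed cut-off: better than even odds of "bestseller"

# Rank stories from most to least bestseller-like.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Keep only the stories the computer considered more likely than not.
passed = [title for title, probability in ranked if probability > THRESHOLD]

for title, probability in ranked:
    verdict = "bestseller-like" if probability > THRESHOLD else "not bestseller-like"
    print(f"{title}: {probability:.0%} ({verdict})")
```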