Protesters march against sexual assault and harassment in November 2017 in Hollywood.

AP Photo/Damian Dovarganes

Challenge of archiving the #MeToo movement

Most Schlesinger Library collections involve papers, not hashtags and tweets

For decades, Radcliffe’s Schlesinger Library has been the nation’s leading repository for a range of primary-source materials documenting the lives and legacies of women in America. Its collections are crammed with letters and posters, journals and photographs — the physical records of an individual, a family, a social action, a political campaign.

Today, newer collections often arrive with hard disks and thumb drives; “papers” now include emails. But until recently, social media had not figured prominently. Then came a cultural moment that shook the nation and helped transform the way the library collects and curates material in a communications age when hashtags can muster millions and tweets are commentary, conversation, and official declaration.

In October 2017, sexual-abuse accusations against film mogul Harvey Weinstein triggered a tsunami of harassment and abuse allegations leveled at men from every corner of American society. The moment also sparked the wide use of #MeToo by survivors who shared their stories of abuse online and demanded change.

“The discussions of how we would be accountable for collecting this movement began almost immediately because of our long history of collecting materials that document gender and labor organizing,” said Jane Kamensky, the Schlesinger’s Pforzheimer Foundation Director and Jonathan Trumbull Professor of American History.

“There was no clear individual to whom we could reach out to acquire the #MeToo collection, so we knew … that we were really going to have to create the collection ourselves,” said Jane Kamensky, the Schlesinger’s faculty director.

File photo by Kris Snibbe/Harvard staff

But the librarians were confronted with the novel question of how to gather and preserve material related to a movement that was born and partly still exists in a virtual world. Their answer: Gather websites, tweets, online articles, and other electronic material related to the topic in a publicly accessible digital archive.

“There was no clear individual to whom we could reach out to acquire the #MeToo collection, so we knew we would have to work differently to document the movement, and that we were really going to have to create the collection ourselves,” said Kamensky.

The Schlesinger’s digital services team had some experience creating an archive focused solely on virtual material. In 2007, they launched the 10-year project “Capturing Women’s Voices” to collect blogs and websites detailing American women’s lives, philosophies, and engagement with politics. They also had access to tools built by Documenting the Now, a community archiving project launched after the police killing of Michael Brown in August 2014 in Ferguson, Mo.

Above all, they had a guiding mission to include everything related to #MeToo they could find. Philadelphia activist Tarana Burke is credited with creating the original movement in 2006 as a way to support survivors of sexual violence. But a decade later, the social media hashtag became a rallying cry and the spark that ignited a wave of political, social, and legal battles, and backlash.

Boston Women’s March.

Rose Lincoln/Harvard Staff Photographer

“We had a social-media driven revolution whose pushback was almost simultaneous, as opposed to the way that we often think of revolution and counterrevolution,” said Kamensky, “and we realized we had a chance to collect the whole political spectrum around a hot-button issue of gender and sexuality.”

The commitment to that collecting ethos is reflected in the library’s #metoo Digital Media Collection, which opened to researchers on July 1. The online archive contains more than 32 million tweets, 1,100 webpages, and thousands of articles reflecting a range of perspectives.

Among the collected websites is a link to a piece by the editors of the Boston Review, who defend their decision to maintain ties with the magazine’s fiction editor, prize-winning author Junot Díaz, after sexual misconduct allegations emerged against him. An article posted on Time.com describes an open letter signed by more than 200 women who work on national security for the U.S., stating that they had survived sexual harassment and assault or knew someone who had.

The wide-ranging tweets capture both the support for and opposition to the movement.

In November 2017, the Women’s Funding Network posted a tweet encouraging people to remember how pervasive sexual harassment is in society.

In 2018, one year after she encouraged people to share their stories of abuse using the #MeToo hashtag, actress Alyssa Milano honored the day with a repost of her original tweet.

In September 2018, user Mark Alan Chestnut reposted a tweet from conservative commentator Candace Owens:

The same month, FOX News contributor Lisa Boothe tweeted:

Earlier this year, someone with the Twitter handle “Josh the Leftist” posted:


The project’s 71 hashtags include everything from #BelieveChristine, #believewomen, #timesup, and #ustoo, to #himtoo, #confirmkavanaughnow, #MeTooLiars, and #metoohucksters.

“The community of American women is our community, and this was a key moment in their history,” said Jennifer Weintraub, the Schlesinger’s head of digital collections and services, who worked on the project. “That doesn’t mean that we are pro #MeToo or against #MeToo, it just means that we document it.”

A steering committee made up of historians, lawyers, and data experts from across Harvard helped Kamensky and the Schlesinger staff think through the challenges associated with capturing the movement’s digital footprint. The library’s digital team, aided by an S.T. Lee grant from the Harvard Library and funds from Harvard Business School, identified a range of relevant hashtags and created a largely automated system to capture the tweets that use them. Twitter’s terms of service dictate that the Schlesinger can provide users only with tweet IDs, the numbers that uniquely identify each tweet, but researchers can load those numbers into an online app that will restore or “rehydrate” each tweet’s original content.
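
For readers curious about what rehydration involves, here is a minimal Python sketch of the general idea, not the library's or any app's actual code. It cleans a list of tweet IDs, splits it into batches, and hands each batch to a lookup function that returns whatever text is still available. The function names, the sample IDs and text, and the batch size of 100 are illustrative assumptions; `fake_lookup` is a hypothetical stand-in for a real lookup service.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def clean_ids(lines: Iterable[str]) -> List[str]:
    """Keep numeric tweet IDs, dropping blanks and duplicates, in original order."""
    seen = set()
    ids = []
    for line in lines:
        tweet_id = line.strip()
        if tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def batches(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs (100 is an assumed limit)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rehydrate(
    ids: List[str],
    lookup: Callable[[List[str]], Dict[str, str]],
    size: int = 100,
) -> Dict[str, str]:
    """Call `lookup` once per batch and merge the results into one dictionary."""
    restored: Dict[str, str] = {}
    for batch in batches(ids, size):
        restored.update(lookup(batch))
    return restored


def fake_lookup(batch: List[str]) -> Dict[str, str]:
    """Hypothetical stand-in for a lookup service; ID 1002 is treated as deleted."""
    available = {"1001": "first example tweet", "1003": "third example tweet"}
    return {tweet_id: available[tweet_id] for tweet_id in batch if tweet_id in available}


ids = clean_ids(["1001", "1002", "", "1001", "1003"])
tweets = rehydrate(ids, fake_lookup, size=2)
missing = [tweet_id for tweet_id in ids if tweet_id not in tweets]
print(len(tweets), "restored;", len(missing), "unavailable")
```

The `missing` list reflects a practical consequence of sharing only IDs: a tweet that has since been deleted or made private cannot be restored, so a rehydrated data set may be smaller than the list of IDs it started from.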

Moving forward, Kamensky said she is eager to investigate teaching and research opportunities related to the archive through a partnership with the Harvard Data Science Initiative, which will provide grant funding to explore how the digital data will “engage in conversation” with the library’s more traditional holdings, and to see what kinds of scholars it can attract. (The collection has already been put to use, supporting the argument in Susan Faludi’s New York Times opinion piece that the “believe all women” hashtag is an invention of the right and a corruption of #believewomen.)

Kamensky said she could envision researchers using the data to explore everything from social movement organizing through social media, to whether people posting in the U.S. are paying attention to news from other parts of the world, to the ways the #MeToo movement has been driven by original content versus retweets. “People are going to ask questions of it and produce answers from it that will be far-reaching in scope and scale,” said Kamensky. “A sophisticated analyst of big data will be able to see the movement through this corpus in ways that I can’t even fathom.”

But amassing such a big trove of digital material raises its own questions around privacy and who has the right to access information posted online. Kamensky worked with the steering committee to create an ethics statement for the archive that includes their recommendations on the principled use of the data and the acknowledgement that the library is abiding by “social media providers’ terms of service in distribution of any data that is collected.”

For people worried about what researchers will do with their tweets, Kamensky has a simple suggestion: Read your user agreement.

“To post something on Twitter may feel like a conversation with your intimates at a virtual table in a bar,” she said, “but in fact, it’s a form of publication.”