The outbreak of Coronavirus in 2020 disrupted billions of people’s lives around the world in almost every way possible.
For those of us lucky enough to work in digital marketing, our work lives continued to show some semblance of normalcy, thankfully. That’s not to say that our industries haven’t also been affected by this pandemic.
Most of our clients come to us to create newsworthy, compelling content that earns online news coverage and backlinks through digital PR outreach. Anecdotally, our team has noticed changes over the last few months in how coverage is earned for our clients.
But, being the data-driven agency that we are, we wanted to know: How has the online news publisher landscape changed since COVID-19?
In a groundbreaking study, we analyzed over 16 million online news articles to determine just how much coronavirus-related stories are dominating the news cycle across various verticals and writers.
Read on to find out which industries have been affected by the COVID-19 pandemic and which publishers are still publishing non-COVID-19-related content.
What Percent of Authors Are Writing About COVID-19?
First, we wanted to look at the percent of writers across the board talking about COVID-19 during this timeframe. When did coronavirus content really start to show up, and how much is COVID-19 content dominating the news?
We found a huge spike in the overall percentage of authors writing about coronavirus during the week of March 1st. So what happened around March 1st this year? As it turns out, a Washington state man became the first American reported to die from coronavirus, making the illness seem like more of a threat than previously thought.
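To make the "percent of authors per week" figure concrete, here is a minimal sketch of how such a number could be computed. It is not the code used for this study: the record layout (author, publication date, COVID flag), the function names, and the choice of Sunday as the first day of the week are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d):
    """Return the Sunday that begins the week containing d (assumed convention)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)


def weekly_author_share(records):
    """records: iterable of (author, published_date, is_covid) tuples (hypothetical layout).

    Returns {week_start: percent of that week's distinct authors
    who wrote at least one article flagged as mentioning COVID-19}.
    """
    all_authors = defaultdict(set)
    covid_authors = defaultdict(set)
    for author, published, is_covid in records:
        wk = week_start(published)
        all_authors[wk].add(author)
        if is_covid:
            covid_authors[wk].add(author)
    return {
        wk: 100.0 * len(covid_authors[wk]) / len(all_authors[wk])
        for wk in sorted(all_authors)
    }
```

Counting distinct authors rather than articles keeps one prolific writer from dominating a week's percentage.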
Which Verticals Publish COVID-19-Related Articles the Most?
Coronavirus-related stories continued to rise through March and then began to settle down toward the end of April.
Just as some industries like travel and sports have seen sharp declines offline, coronavirus has similarly affected the content of certain niches.
While verticals such as music, fashion, and books have seemingly been mostly unaffected by the pandemic, you can see the real damage to the health, finance, and travel industries in the sheer percentage of stories being published online about coronavirus in those niches.
Based on this data, you might have better opportunities to pitch content in the home & garden, food & drink, and sports verticals than you would competing with breaking coronavirus news in the finance, health, and politics spaces.
Percent of COVID-19 Related Coverage, by Online Publisher
For this analysis, we wanted to look at which online publishers are covering the coronavirus almost exclusively at the expense of other content, and which have had more room in their editorial calendars for non-COVID-19-related articles.
In the above graphic, we selected 19 common publishers to demonstrate the sheer difference in the percentage of news stories related to coronavirus across different publishers. To explore this more in depth, you can see the data on more than 400 unique online publishers and determine which ones might have the most pitching opportunities for your content.
Most Common Keywords Appearing in COVID-19-Related Articles, by Vertical
What words come up most frequently in content during a global pandemic?
We analyzed each article across several verticals to determine the most common keywords in content. For example, in the Religion And Spirituality sphere, you might normally expect “church,” “service,” and “prayer” to show up as popular keywords. But, because coronavirus has impacted the nature of content in this vertical, you also see “government” and “virus” show up, indicating a link to the pandemic coverage.
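As a rough illustration of keyword counting for one vertical, the sketch below tallies word frequencies across a list of article texts. The tokenization rule, the tiny stopword list, and the function name `top_keywords` are assumptions for this example, not the method used in the study.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real analysis would use a much fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "at", "as"}


def top_keywords(texts, n=5):
    """Return the n most frequent non-stopword tokens across the given texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep alphabetic words (allowing inner hyphens and apostrophes).
        tokens = re.findall(r"[a-z]+(?:[-'][a-z]+)*", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(n)]
```

Running this separately on the articles assigned to each vertical would give one keyword list per vertical, which is how words like “government” and “virus” could surface next to “church” and “prayer.”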
Articles for this report were sourced from The GDELT Project, a monitor of broadcast, print, and web news. We examined the titles and body text of 16,222,217 news articles written in English and published in the United States between January 1 and April 22, 2020. An article was counted as having covered the coronavirus to some extent if it included any mention of “coronavirus” or “COVID-19.” We did not differentiate between articles explicitly about the coronavirus and those that mentioned it in the context of another primary topic.
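The counting rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: matching is case-insensitive here (the report does not say), and the function names and the (title, body) pair layout are hypothetical.

```python
import re

# The two terms named in the methodology; case-insensitivity is an assumption.
COVID_PATTERN = re.compile(r"coronavirus|covid-19", re.IGNORECASE)


def mentions_covid(title, body):
    """True if either the title or the body text mentions one of the terms."""
    return bool(COVID_PATTERN.search(title) or COVID_PATTERN.search(body))


def covid_share(articles):
    """articles: iterable of (title, body) pairs. Returns the percent that mention COVID-19."""
    articles = list(articles)
    if not articles:
        return 0.0
    hits = sum(mentions_covid(title, body) for title, body in articles)
    return 100.0 * hits / len(articles)
```

Because any mention counts, an article about a postponed concert that refers to the coronavirus once is counted the same as a dedicated outbreak report, which matches the caveat in the paragraph above.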
All data were preprocessed and aggregated via Google BigQuery, and web URLs were sourced from BigQuery’s GDELT tables. These were subsequently analyzed using Python 3 standard libraries, Newspaper3k, and other specialized libraries for text extraction and analysis. Topic verticals are based upon the IAB Content Taxonomy 2.0 and were determined algorithmically on a per-article basis by two separately developed multilabel text classifiers. One of these was based on the support vector machine implementation in Scikit-learn, and the other was a multinomial logistic regression classifier with document vectors implemented in fastText. Both models were tuned to output a probability distribution of classifications for each article, and an article was included in a given vertical here if its probability was greater than 0.5.
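The 0.5 cutoff can be illustrated with a short sketch. It assumes a classifier's output has already been collected into a dictionary of per-vertical probabilities; the function name, the dictionary layout, and the example scores are hypothetical, and the report does not state how the two classifiers' outputs were combined, so this handles a single set of scores only.

```python
THRESHOLD = 0.5  # cutoff stated in the methodology


def assign_verticals(probabilities, threshold=THRESHOLD):
    """probabilities: {vertical_name: probability} for one article (assumed layout).

    Returns the verticals whose probability is strictly greater than the threshold.
    An article can land in several verticals, or in none.
    """
    return sorted(label for label, p in probabilities.items() if p > threshold)


# Made-up scores for a single article, for illustration only.
example_scores = {"Travel": 0.81, "Business and Finance": 0.55, "Sports": 0.12}
example_verticals = assign_verticals(example_scores)
```

With these made-up scores, the article would be counted under both Travel and Business and Finance, which is the practical meaning of "multilabel" here.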
All percentages used in the report represent a proportion of the articles indexed within GDELT and are therefore not based on an exhaustive count of every article published on a given domain within the studied time period, although GDELT likely approaches indexing all meaningful news content on a majority of high-traffic news websites in the United States.
Fair Use Statement
If you choose to share the above project in part or in whole, we kindly ask that you link back to this page so that the creators of this study can be recognized for their work and your audience can view the methodology in full.