Leveraging Data in the Fight Against COVID-19 Disinformation

Posted on May 20, 2020 by Guy Mor

It’s often been said that the coronavirus is not only a pandemic but also an infodemic. From claims that the virus was deliberately manufactured by the CIA to folk “cures” like drinking bleach, there’s no shortage of disinformation being spread about COVID-19.

As the Director General of the World Health Organization, Dr Tedros Adhanom Ghebreyesus, said in a recent speech to the public: 

“Fake news spreads faster and more easily than this virus, and is just as dangerous. That’s why we’re also working with search and media companies like Facebook, Google, Pinterest, Tencent, Twitter, TikTok, YouTube and others to counter the spread of rumours and misinformation. We call on all governments, companies and news organizations to work with us to sound the appropriate level of alarm, without fanning the flames of hysteria.”

How do these fake news stories successfully reach mainstream media and capture the hearts and minds of the public? And what role should data play in helping the media and social media platforms meet their responsibility to filter out this type of disinformation?

So, Let’s Start With What is “Fake News”?

You’ve probably heard the term fake news before, and we’ve covered how Webhose datasets can be used to develop algorithmic models using natural language processing (NLP) to detect fake news.

But the term “fake news” should more accurately be broken into two major categories:

  • Misinformation – Information that is mistakenly false or inaccurate. One example is the teenagers in Macedonia who share fake news articles on Facebook with the intention of making a profit.

  • Disinformation – Information that is false and deliberately created to harm a person, social group, organisation or country. Its goal is to change the perception of reality, and it is perhaps the most pernicious of all types of fake news. One of the most successful disinformation campaigns was Operation Infection, which claimed that the AIDS virus had been engineered in a US laboratory and tested on homosexual prisoners as part of a biological warfare campaign.

How Data Powers Web and Media Monitoring Services

    With the rise of the internet, today’s disinformation is more complex than ever. A news story that could once reach an impressively wide circulation of 100,000 can now reach millions. Media monitoring now plays an important role in fighting the spread of disinformation by making it easier to distinguish fact from fiction. At that scale, data collection has to be automated, and a combination of AI, NLP, and human analysis must be employed to identify misinformation.

    A few organizations have risen to the occasion in the midst of this crisis:

    • FirstDraft, a nonprofit aimed at fighting misinformation, developed its own news engine using a database of reliable news sources related to the coronavirus, and provides education for reporters and the public.
    • SocialTruth, an EU-based project, focuses on aggregating large volumes of datasets enriched with metadata to validate their reliability. Later, the project will use the data to develop algorithms that identify fake news.
    • CoronaCheck, a collaboration between Cornell University and Eurecom, an engineering school in France, provides a search engine where users can check claims about the spread of COVID-19 against a database of articles in English, Italian and French.

    Here at Webhose, we’ve dedicated ourselves to providing high-quality, structured data to media and monitoring organizations. Our data, along with other open source datasets, is being used to power FakeNewsCorpus, a database of over 9 million news articles that researchers all over the world can use as a basis for machine learning and deep learning algorithms that automatically detect fake news.
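
    To make the idea of combining automated models with human analysis more concrete, here is a minimal Python sketch. It is not how FakeNewsCorpus or any Webhose product works. It is a toy naive Bayes text classifier, and the training sentences, labels, thresholds, and function names (train, fake_probability, triage) are all assumptions made up for illustration. A real system would train on a large labeled corpus.

```python
import math
import re
from collections import Counter

# Made-up toy examples for illustration only; not taken from any real corpus.
TRAINING = [
    ("miracle cure doctors hate drinking bleach kills the virus", "fake"),
    ("secret lab created the virus shocking truth they hide", "fake"),
    ("health ministry reports new cases and updated testing guidance", "reliable"),
    ("researchers publish peer reviewed study on vaccine trial results", "reliable"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label to build a multinomial naive Bayes model."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts.setdefault(label, Counter()).update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def fake_probability(text, model):
    """Return the model's estimated probability that the text is 'fake'."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to normalized probabilities in a stable way.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights["fake"] / sum(weights.values())


def triage(text, model, low=0.3, high=0.7):
    """Route confident cases automatically; send uncertain ones to a person."""
    p = fake_probability(text, model)
    if p >= high:
        return "flag"
    if p <= low:
        return "pass"
    return "human_review"


MODEL = train(TRAINING)
print(triage("bleach cure kills the virus", MODEL))     # flag
print(triage("ministry publish study results", MODEL))  # pass
print(triage("hello world", MODEL))                     # human_review
```

    The middle band between the two thresholds is where human analysis comes in: anything the model cannot score confidently is handed to a reviewer instead of being decided automatically.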

Winning the Battle Against Disinformation

    Disinformation campaigns have been used as a weapon against enemy regimes for many decades and show no signs of disappearing any time soon. And since the internet has made it easier than ever to create and share these stories, disinformation campaigns have now spread to more than 70 different countries. With social media, lies today spread faster than the truth and seem more real every time they are repeated. But one shining ray of hope is the technology we have to automatically gather web data. With this data, researchers all over the world are developing AI, machine learning and natural language processing models to identify fake news. And this is what will significantly contribute to winning the battle against disinformation.

    Want to learn more about gaining access to high-quality data for your news or media monitoring service? Schedule a call with our data experts today!