The Rising Popularity of News APIs
Since the debut of Google’s News API almost two decades ago, web and media monitoring companies have relied on a few major news APIs, such as Google, Bing and Webhose, to keep up with the massive volume of news articles continually being published online. Alongside them were many affordable and even free web scrapers that could be used for smaller, simpler data projects.
Traditionally, it was mainly web and media organizations that relied on these news APIs for aggregating the latest news data for competitor analysis, understanding the latest trends in the industry, and brand monitoring. With the rise of artificial intelligence and its integration into almost every industry, however, the need for news data has expanded dramatically.
Once Upon a Time, in a Digital Galaxy Far Away
These major players had a number of pros and cons. The Google and Bing News APIs delivered the search results of major search engines that had already stood the test of time, and those results were available in over 35 languages. But because they delivered biased search results rather than comprehensive news data, they didn’t necessarily include niche or new sites. They were also beholden to each engine’s relevancy algorithm, meaning that the results could change at any given time. These news APIs had another major con: they weren’t affordable at scale.
Fortunately, along came Webhose, a news API dedicated to crawling news data from the web. That meant it could deliver high-quality, structured, relevant news data regardless of how many relevancy algorithm updates had recently taken place (Google made 3,200 in 2018 alone). Not only did it crawl news data in over 80 languages, it also let you layer advanced filters on top of the basic keyword search, such as category, location, organization, individual, sentiment, or a specific language, and it returned output in a machine-readable, structured JSON or XML format. Admittedly, not all organizations saw the need to go to a third party for their news data, especially if they were looking for specific, smaller news datasets.
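To make the idea of filters on top of a keyword search concrete, here is a minimal Python sketch of how such a request might be assembled. It is an illustration under stated assumptions, not Webhose’s actual API: the endpoint URL, the parameter names (`q`, `format`) and the `name:value` filter syntax are all hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical endpoint used only for illustration; not a real Webhose URL.
BASE_URL = "https://news-api.example.com/search"


def build_news_query(keyword, language=None, category=None, location=None,
                     organization=None, person=None, sentiment=None,
                     fmt="json"):
    """Combine a keyword with optional filters into a request URL.

    The filter names mirror the ones listed in the text; the
    "name:value" syntax is an assumption made for this sketch.
    """
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")

    # Start with the quoted keyword, then append each filter that was set.
    terms = [f'"{keyword}"']
    filters = {
        "language": language,
        "category": category,
        "location": location,
        "organization": organization,
        "person": person,
        "sentiment": sentiment,
    }
    for name, value in filters.items():
        if value is not None:
            terms.append(f"{name}:{value}")

    # urlencode percent-encodes the query so the URL is safe to send.
    params = {"q": " ".join(terms), "format": fmt}
    return f"{BASE_URL}?{urlencode(params)}"


url = build_news_query("electric vehicles", language="english",
                       sentiment="positive")
print(url)
```

Keeping the filters as optional keyword arguments means a plain keyword search and a heavily filtered one go through the same function, and the request itself (for example with an HTTP client of your choice) stays separate from query construction.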
The Evolving Market for News APIs
As markets mature, the products and services within them tend to specialize to suit different customers’ needs. This is what is happening today with the explosive growth of APIs in general: there are application programming interfaces (APIs) for almost every industry, whether finance, healthcare, or news. It’s what we suspect is happening to the news API market in particular, driven by growth in the AI, ML and NLP market, which Forrester estimates will be worth $37 billion by 2025. The use cases for news APIs, which were once limited to web and media monitoring and research, have now expanded to include financial analysis, AI and ML for organizations of all sizes.
It’s not only that these organizations need the latest news data. They are also increasingly relying on news APIs to deliver accessible, structured data at scale – that sometimes includes historical data going back over a decade. We’ve discussed the need for comprehensive, high-quality, large datasets and how they are the foundation of accurate ML and AI models in a previous post.
Let’s take a look at the following specific examples:
- DataRobot, an end-to-end AI platform, used Webhose’s news data to change its clickbait algorithm to one that predicts virality using correlations between specific keywords. The new algorithm enabled journalists and publishers to focus their efforts on creating content that is well-researched and informative, rather than “clickable.”
- BlueWhiteRed is an organization that uses Webhose’s news data to fight the filter bubbles that current news platforms have created. Every hour, it adds news stories with more than 10,000 likes on Facebook to its platform to be sorted into political classifications. The organization’s mission is to give readers the bigger picture of what’s going on in the news with unbiased data, so readers can make up their own minds about the important political issues of the day.
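The hourly selection step described for BlueWhiteRed can be sketched in a few lines of Python. This is only an illustration of the thresholding idea, not the organization’s actual pipeline: the article records and the `facebook_likes` field name are hypothetical.

```python
def select_viral_stories(articles, min_likes=10_000):
    """Return the articles with strictly more than min_likes Facebook likes.

    Articles missing the (hypothetical) "facebook_likes" field are
    treated as having zero likes and are therefore skipped.
    """
    return [a for a in articles if a.get("facebook_likes", 0) > min_likes]


# Made-up sample records standing in for one hour of structured API output.
sample_batch = [
    {"title": "Story A", "facebook_likes": 25_400},
    {"title": "Story B", "facebook_likes": 10_000},
    {"title": "Story C", "facebook_likes": 310},
    {"title": "Story D"},
]

viral = select_viral_stories(sample_batch)
print([story["title"] for story in viral])
```

A scheduler would run this once an hour on the latest batch, and the selected stories would then go on to the political classification step.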
The Leading News APIs on the Market Today
As the market evolves, a variety of news APIs have been developed to meet a wider range of needs.
The Bloomberg News API and New York Times API, for example, are both premium publisher news APIs offering high-quality news data from a reliable source. Both offer historical data as well; the New York Times API goes back as far as 1851. The catch is that each is limited to its own source. Zyte (formerly Scrapinghub), by contrast, offers managed and self-managed services that can be ideal for small datasets or specific data. On the other hand, it can be problematic once an organization wants to scale.
Let’s not forget APIs that deliver results from the major search engines, like SerpWOW (which includes coverage of Google, Bing, Yahoo, etc.). Any organization looking to cover this type of data from the web should include them. As we’ve mentioned in this post and in previous blog posts, however, these search engines still rely on their own search algorithms rather than providing the most relevant news data according to an organization’s specific needs. The news data they retrieve is ordered by each search engine’s algorithm, which decides which results come up highest, and so can overlook relevant news data from newer or niche sites.
Watching and Waiting as the News API Market Unfolds
The news API industry has come a long way since the early days of free or very affordable news scrapers. Although the market and its offerings have since matured, many news APIs are still limited in the data they crawl. From its earliest days, Webhose has continued to fulfill its mission of transforming raw data straight from the web and delivering the highest-quality structured data to organizations that need it. This includes 25 TB of historical data that can easily be accessed to develop more accurate and unbiased AI and machine learning models.
If you want to learn more about Webhose’s capabilities, check out our advanced News APIs below.