What Is a News API?
A news application programming interface, or News API, is a way for an online news source and an application to communicate with one another.
Many commercial News APIs exist, such as those from Bloomberg, the Financial Times, and the Google News API (replaced by the Custom Search API). These APIs allow organizations and individuals to automatically extract data from news sites for later use in a wide range of applications.
The data received can be particularly important for organizations that rely on receiving data connected to live updates from sports events, social streams, stock reports, weather updates and transportation. For organizations that deliver this type of data, it is essential that their API is able to collect, extract and deliver data continuously from the web.
News data APIs are also essential to many organizations for brand monitoring, gaining deeper insight into the voice of the customer, product and market analysis, and keeping abreast of the competition. In the retail world, brands need to understand the customer’s changing needs and keep track of the options available to consumers in order to maintain their competitive edge. Media and web monitoring companies rely on accurate news to track mentions of their customers and alert them to these mentions in real time. For these types of organizations, a tool that monitors mentions of hundreds of different customers in real time is an important advantage.
Another important and fairly new use case for News APIs comes from hedge funds and investment management firms, which need up-to-the-minute news related to stock performance to discover trends and feed predictive models. This type of alternative data, when combined with historical stock performance and earnings reports, can offer untapped potential for investors. Global crises, company scandals, presidential elections, and the firing or retirement of a CEO are all events that can quickly cause a stir in the stock market, so an automated method of collecting news is critical for financial companies managing thousands of portfolios of hundreds of stocks.
In addition, enterprise organizations and researchers often need large datasets to build machine learning or natural language processing (NLP) models and algorithms. Many cannot rely on the free public solutions that are currently on the market, as the sample size is too small or the dataset is not specific enough. Large and specific datasets increase the accuracy of the algorithms and models. For example, a team at Yale University needed large datasets of both right-leaning and left-leaning news sources to design their own personalization engine.
Another obstacle many organizations face when selecting a News Data API is affordability, especially once they start to scale. For example, Google Custom Search is expensive, at $5 per 1,000 queries with a limit of 10,000 queries a day. For organizations that need massive amounts of data for NLP or ML models, such pricing plans are not feasible. In contrast, Webhose’s Firehose service is aimed at enterprises that need full access to open web data in all languages, including news, blogs, discussions and reviews. To give an idea of the volume involved, posts from the Firehose are delivered in XML format, packaged in 10-megabyte Zip files that are uploaded every other minute to a dedicated FTP site.
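To make these figures concrete, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes queries are billed pro rata at the quoted rate with no free tier, reads "every other minute" as one file every two minutes, and uses a hypothetical workload of one million queries.

```python
import math

PRICE_PER_1000_QUERIES_USD = 5.0   # rate quoted in the text
DAILY_QUERY_LIMIT = 10_000         # daily cap quoted in the text

FIREHOSE_FILE_MB = 10              # Zip file size quoted in the text
FIREHOSE_INTERVAL_MINUTES = 2      # assumption: "every other minute"


def query_cost_usd(queries: int) -> float:
    """Cost of a number of queries, assuming pro rata billing."""
    return queries / 1000 * PRICE_PER_1000_QUERIES_USD


def days_needed(queries: int) -> int:
    """Minimum number of days needed to stay under the daily cap."""
    return math.ceil(queries / DAILY_QUERY_LIMIT)


def firehose_mb_per_day() -> int:
    """Compressed volume delivered per day at the quoted file size."""
    files_per_day = (24 * 60) // FIREHOSE_INTERVAL_MINUTES
    return files_per_day * FIREHOSE_FILE_MB


workload = 1_000_000  # hypothetical dataset-building workload
print(query_cost_usd(workload))   # 5000.0
print(days_needed(workload))      # 100
print(firehose_mb_per_day())      # 7200
```

Under these assumptions, a million queries would cost $5,000 and take at least 100 days to issue, while the Firehose would deliver roughly 7,200 megabytes of compressed data per day.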
The most advanced News APIs are able to deliver full-text results in a machine-readable format. Many APIs offered by individual news sources, by contrast, do not include the full text of the articles that appear in their search results. Advanced APIs return the full text, title, and a range of other details from each news article.
Because the full text of each article is provided, including comments and headlines, the results can lay the groundwork for later sentiment and text analysis.
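As a rough illustration of why full-text, machine-readable results matter, the sketch below parses a JSON response and flattens each article into a single text blob that a later analysis step could consume. The payload and its field names ("articles", "title", "text", "comments") are invented for this example and do not describe any particular provider's schema.

```python
import json

# Invented sample payload; a real API's field names will differ.
SAMPLE_RESPONSE = """
{
  "articles": [
    {
      "title": "Acme Corp names new CEO",
      "text": "Acme Corp announced a new chief executive on Monday.",
      "comments": ["Great choice", "Long overdue"]
    }
  ]
}
"""


def extract_documents(raw: str) -> list[dict]:
    """Flatten each article into one text blob ready for later analysis."""
    payload = json.loads(raw)
    documents = []
    for article in payload.get("articles", []):
        # Headline, body and comments are joined into a single string.
        parts = [article.get("title", ""), article.get("text", "")]
        parts.extend(article.get("comments", []))
        documents.append({
            "title": article.get("title", ""),
            "full_text": "\n".join(p for p in parts if p),
        })
    return documents


docs = extract_documents(SAMPLE_RESPONSE)
print(docs[0]["title"])
print(len(docs[0]["full_text"].split()), "words available for analysis")
```

An API that returned only a headline and a link would leave the "text" and "comments" parts empty, which is the limitation described above.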
Advanced APIs also offer granular search filtering capabilities. In other words, users can query for news data according to specific brand or product mentions, sentiment or review ratings, or a particular source, language, author or publication date. Queries can also be run on specific individuals, locations, keywords, and organizations, which suits organizations that provide brand monitoring, media monitoring and financial news analysis services to their customers. Another useful capability for these organizations is filtering results by performance score, which reflects how widely an article was shared or liked, or whether it went viral, on social media. This is particularly valuable for organizations that want to deliver sentiment analysis and other insights to their customers.
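A filtered query of this kind usually comes down to a set of request parameters. The following is a minimal sketch of how such a request URL might be assembled; the endpoint and every parameter name in it are hypothetical and chosen only to mirror the filters listed above, so a real provider's documentation should be consulted for the actual names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.com/news/search"


def build_search_url(keyword, language=None, source=None,
                     published_after=None, min_performance_score=None):
    """Build a search URL, skipping any filter that was left unset."""
    params = {
        "q": keyword,
        "language": language,
        "source": source,
        "published_after": published_after,
        "min_performance_score": min_performance_score,
    }
    active = {name: value for name, value in params.items() if value is not None}
    return f"{BASE_URL}?{urlencode(active)}"


url = build_search_url("Acme Corp", language="english", min_performance_score=5)
print(url)
# https://api.example.com/news/search?q=Acme+Corp&language=english&min_performance_score=5
```

Leaving unset filters out of the query string keeps the request short and lets the same helper serve both broad and narrow searches.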
Advanced News APIs like Webhose deliver accurate, relevant, up-to-date information from over 10 million online news sources in over 250 languages. Its web crawlers collect data from both major news sources and niche ones several times a day, and deliver it to organizations in a structured, machine-readable format (JSON, XML, RSS or an Excel file) that is ready for analysis.
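To show what "ready for analysis" can mean in practice, here is a small sketch that reads an RSS feed with Python's standard library and turns each item into a plain dictionary. The feed content is an invented sample; only the element names (item, title, link, pubDate) come from the RSS 2.0 format itself.

```python
import xml.etree.ElementTree as ET

# Invented sample feed using standard RSS 2.0 element names.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>Markets rally after earnings</title>
      <link>https://news.example.com/markets-rally</link>
      <pubDate>Mon, 06 Jan 2020 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_items(rss_text):
    """Return one dictionary per <item> element in the feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


for entry in parse_items(SAMPLE_RSS):
    print(entry["published"], "|", entry["title"])
```

Because the data arrives already structured, no HTML scraping or layout parsing is needed before the records can be loaded into an analysis pipeline.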