How to Create a Custom RSS Feed for Content Monitoring

Posted on March 3, 2016 by webhose

Imagine that you had the ability to track what’s being said, felt and published about a given topic, industry or brand. Whether you’re in marketing, sales, search engine optimization, management or just a curious person, there are some major benefits to staying on top of the latest discussions, trends, issues and developments happening in your...

Posted in API

The Top 10 Data & Analytics Articles of 2015

Posted on January 12, 2016 by webhose

The online world of data and analytics is fast approaching epic proportions. It’s easy to get overwhelmed. Why? Because not only has big data been big business in 2015 … but posts, articles, podcasts, webinars, and resources abound. Some are worth your time. Some … are not. To help you dig through the very best...

Posted in Big Data

To crawl or not to crawl, that is the question

Posted on August 24, 2015 by Ran Geva

To write an efficient crawler, you must be smart about the content you download. When your crawler downloads an HTML page, it uses bandwidth, memory and CPU: not only its own, but also those of the server the resource resides on. Knowing when not to download a resource is more important than downloading one,...
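
The excerpt above does not say how the decision is made, so the following is only a minimal sketch of one common approach, not the author's method: skip URLs the crawler has already seen and URLs whose file extension suggests a non-HTML resource. The names `SKIP_EXTENSIONS`, `normalize` and `should_download` are hypothetical and chosen for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Hypothetical skip list: extensions that rarely hold HTML text.
SKIP_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".pdf",
                   ".zip", ".mp4", ".css", ".js"}


def normalize(url: str) -> str:
    """Lowercase scheme and host and drop the fragment,
    so trivially different spellings of one URL compare equal."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


def should_download(url: str, seen: set) -> bool:
    """Return True only for a new URL that plausibly points to an HTML page.

    Assumption: the file extension is a good enough hint of the content
    type. A real crawler would likely also check response headers.
    """
    key = normalize(url)
    if key in seen:
        return False  # already fetched: no bandwidth spent on either side
    extension = posixpath.splitext(urlsplit(key).path)[1].lower()
    if extension in SKIP_EXTENSIONS:
        return False  # probably an image, archive or asset, not a page
    seen.add(key)
    return True


if __name__ == "__main__":
    seen_urls = set()
    for candidate in ("http://example.com/page.html",
                      "http://EXAMPLE.com/page.html#top",
                      "http://example.com/logo.PNG"):
        print(candidate, should_download(candidate, seen_urls))
```

Both checks run before any request is made, so a rejected URL costs neither the crawler nor the remote server anything.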

Posted in Technology

Vertical aggregation & Pattern matching crawlers

Posted on November 27, 2014 by Ran Geva

After bashing various crawling techniques, I would like to describe the technique we use here at webhose.io, a technology that was developed over the past 8 years. Our crawlers were developed with the following demands in mind: efficient on server resources, i.e., CPU and bandwidth; fast in fetching and extracting content; easily add new sites...

Posted in API