The Race to Achieve 100% Coverage of the Web

Posted on September 19, 2016 by ohadf

In our new report, we deconstruct the all-too-familiar race to achieve 100% coverage of the web. Data acquisition efforts usually rely on one of three approaches: building an internal web crawling capability, relying on data providers, or combining both. The goal is to tap into as much structured web data as...

Posted in Big Data

Guide to Structured Web Data Consumption: How to get instant access to news, blogs, and online discussions

Posted on September 1, 2016 by ohadf

Hundreds of entrepreneurs, researchers, and data scientists contact us daily with questions about accessing structured web data. We put together our answers in our new guide to Structured Web Data Consumption. The consumerization of web data: It’s easy to fall into the trap of building a proprietary crawling and data structuring solution tailored to a particular...

Posted in API

100% coverage of the Web

Posted on March 9, 2016 by webhose

Well, that’s the holy grail. Being able to tap into the World Wide Web as a whole is something anyone dealing with data would like to have, but everyone is far, FAR from achieving it (except maybe the NSA; we don’t know). The idea behind Webhose.io is that when you need data from the web,...

Posted in API