How Alternative Data is Reshaping Finance

Posted on May 24, 2018 by webhose

According to a report recently featured in the Financial Times (PDF), hedge funds are expected to spend upwards of $600m on digital datasets this year, and up to $1bn by 2020. What’s going on? Why are investment firms hoarding all this data, and what types of data are piquing their interest in particular? Read on...

Posted in Big Data

Why (and How) to Monitor RSS Feeds in 2018

Posted on March 27, 2018 by Guy Mor

Rich Site Summary (RSS), as a web technology, has been around since the turn of the century. But is it still relevant in 2018, is it going to stay around for much longer, and how can it still be useful in today’s online landscape? Our answers to these questions are yes, yes, and read...

Posted in API

3 Predictions for Web Data in 2018

Posted on December 12, 2017 by eranl

2017 was a turbulent year: With Donald Trump shaking up the American political system, cryptocurrencies causing riptides throughout financial markets, and advancements in artificial intelligence sparking both anticipation and anxiety in the scientific world, the passing year seems to have been dominated by a sense of uncertainty and a sea change waiting to happen at...

Posted in Big Data

Web Data Extraction Guide: 11 Questions to Ask

Posted on August 31, 2017 by eranl

The following is an excerpt from our new Web Data Extraction Playbook. We’ll be publishing the second part next week, or you can grab the full guide here. The internet has become an undeniable force in our lives over the past few decades, changing everything from the way we do our shopping to the way...

Posted in Big Data

5 Great Reasons to Meet Us at Strata

Posted on August 24, 2017 by webhose

If you’re visiting this year’s Strata Data Conference in New York, you can find us at Booth #P17, and absolutely should. Here are 5 reasons why our (modest) booth is probably going to rock this year’s Strata Conference and be the biggest thing since distributed databases. 1. Because We’re Giving Away this Awesome Drone We’re...

Posted in News

Web Data Visualization of The Hillary Clinton Top 100 Network Graph

Posted on October 20, 2016 by ohadf

The web data business can get pretty tricky, especially when your job is to extract the broadest possible dataset from the planet’s biggest database. Last week, Webhose CEO Ran Geva ran a fun experiment to visualize Hillary Clinton’s web network. More precisely, who are the top 100 people most frequently mentioned in news articles and blog...

Posted in Machine Learning

Should you buy crawled web data or build your own solution?

Posted on October 10, 2016 by ohadf

In a technologically driven environment, the temptation to develop a proprietary web crawling solution is virtually irresistible. Our latest report examines the true cost of computing and software development resources required to deliver a data crawling and structuring solution at scale. Development & Maintenance: Development could mean coding a proprietary solution from scratch, or modifying an existing crawling...

Posted in Technology

The Race to Achieve 100% Coverage of the Web

Posted on September 19, 2016 by ohadf

In our new report, we deconstruct the all-too-familiar race to achieve 100% coverage of the web. Data acquisition efforts usually rely on one of three approaches: building an internal web crawling capability, relying on data providers, or combining the two. The goal is to tap into as much structured web data as...

Posted in Big Data