data extraction

June 28, 2018

Survey Results: What Matters to Web Data Collection Buyers

While structured web data presents exciting possibilities in many fields – including finance, cybersecurity, artificial intelligence and more – the market for data extraction platforms is still fairly young. Only […]

October 10, 2017

Quick Guide to News APIs

All you need to know about News APIs.

March 21, 2016

Why Extracting Content From The Open Web Is Better than Surveys for Research

What’s the best way to find out how people feel about a given topic? Simply ask them, right? Well, at least that’s what we’ve been led to believe. Standard polling practice tells […]

December 13, 2015

Article’s publication date extractor – an overview

A few days ago I released an open source Python module that provides a simple way to extract and normalize the publication date of any online blog or news post. […]

August 16, 2015

Dead simple {for devs} python crawler (script) for extracting structured data from any website into CSV

In my previous post I wrote about a very basic web crawler that can randomly scour the web and mirror/download websites. Today I want to share with you a very simple […]

November 25, 2014

Crawling Horrors – Browser Scraping

In my previous blog post, I wrote about RSS crawlers, and why they don’t really work. In this post I want to discuss the technique of using a headless browser to parse […]