Why Dark Web Search Engines are Not Enough

Posted on May 26, 2020 by Liran Sorani

Since the dawn of the internet, organizations and businesses have recognized the importance of continuously monitoring both cybercriminal activity and their own brands. Law enforcement agencies (LEA) need to keep track of the latest data breaches and illicit sales, while organizations and brands need far more than dark web search engines to strengthen their digital risk protection. Regardless of the type of monitoring, it all has to be done quickly and accurately, before cybercriminals carry out their crimes and brand reputations are irreparably damaged.

And with new marketplaces popping up all the time, content on the dark web is growing exponentially, leaving ever more data to sort through. That leaves no choice but to find automated technology that can do this as cost-effectively as possible.

The Challenge of Comprehensive and Relevant Coverage 

In our last blog post, we listed the pros and cons of the top 5 dark web search engines. These search engines, like most dark web search engines, cover only the Tor network and the open web. Other networks, such as ZeroNet and I2P, are left out. Nor do they cover the seemingly endless stream of new marketplaces that keep appearing to evade LEA tracking, or the communities that are constantly migrating to chat applications like Telegram, Discord, and IRC.

Today it is still a challenge to find a dark web search engine that covers a wide variety of forums, marketplaces, and chat applications. (Of the 5 dark web search engines covered in our last post, only Ahmia included both marketplace and forum content.) Only a dark web search engine with a sophisticated discovery mechanism could enable the level of robust dark web monitoring that LEA, organizations, and brands need.

And even if dark web search engines offered robust dark web monitoring today, or do in the future, that doesn't mean they can surface relevant and accurate data. For instance, a crawler might be able to deliver data from Telegram, but what if you only want to search for fentanyl-related discussions? Or you'd like to identify breached data from your organization, such as emails, wallet IDs, credit card numbers, or bank information? You'll need advanced dark web monitoring with granular filtering capabilities that can sift through the massive amount of data on the dark web and reveal relevant risks to organizations or LEA, such as major bank breaches or drug deals, long before they are carried out.
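The kind of granular filtering described above can be thought of as classifying each crawled post against patterns for the data types you care about. The following is a minimal illustrative sketch, not Webhose's actual pipeline: the regular expressions and keyword list are simplified assumptions, chosen only to show the idea.

```python
import re

# Hypothetical patterns for the kinds of breached data mentioned above
# (emails, card-like numbers) plus topic keywords -- illustrative only.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")
KEYWORDS = {"fentanyl", "fullz", "dump"}

def classify_post(text: str) -> set:
    """Return the risk labels that match a crawled post."""
    labels = set()
    if EMAIL_RE.search(text):
        labels.add("email")
    if CARD_RE.search(text):
        labels.add("card")
    if KEYWORDS & set(text.lower().split()):
        labels.add("keyword")
    return labels
```

A production system would use far richer patterns (wallet ID formats, bank identifiers, fuzzy keyword matching), but the principle is the same: filter the raw crawl down to posts that represent a concrete risk.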

This is all part of the natural evolution of dark web search engines, which are becoming more sophisticated with time. When the dark web first started gaining traction among cybercriminals, dark web search engines were developed with the simple goal of identifying where illicit goods and services were bought and sold. If they could identify one or two marketplaces where a good or service was sold, that was enough. But LEA, brands, and organizations using the dark web today need far more robust coverage to monitor all illicit activity across the dark web's complex web of forums, marketplaces, and networks.

Overcoming Technological Obstacles

Dark web search engines are also becoming more sophisticated in how they overcome various technological challenges. Many rely on crawlers that don't revisit sources frequently enough to deliver relevant data. For instance, a dark web search engine crawler might only crawl a marketplace or forum once a week, whereas advanced dark web monitoring technology can crawl the same marketplace or forum every day, or even every hour.
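The difference between a weekly and an hourly recrawl can be expressed as a per-source revisit interval. The sketch below shows one simple way to schedule crawls by interval using a priority queue; the source names and intervals are hypothetical, and a real crawler would also adapt intervals to how often each source changes.

```python
import heapq

def crawl_order(sources, horizon):
    """Yield (hour, source) crawl slots up to `horizon` hours,
    revisiting each source at its own interval."""
    # Heap of (next_due_hour, source_name, interval_hours).
    queue = [(0, name, interval) for name, interval in sources.items()]
    heapq.heapify(queue)
    while queue:
        due, name, interval = heapq.heappop(queue)
        if due > horizon:
            break
        yield due, name
        # Reschedule this source for its next visit.
        heapq.heappush(queue, (due + interval, name, interval))

# A forum recrawled hourly vs. a marketplace recrawled weekly (168 h):
slots = list(crawl_order({"forum": 1, "market": 168}, horizon=3))
```

Over a 3-hour window, the hourly forum is visited four times while the weekly marketplace is visited once, which is exactly the coverage gap described above.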

Dark web search engines face additional technological obstacles in gathering dark web data. For instance, a lot of content is blocked to prevent automatic crawling by bots. Webhose bypassed this obstacle by developing a user-mimicking flow based on a headless browser, which behaves like a human being browsing the web.

That includes access to content behind a paywall or content that requires a user to log in with a username and password. (None of the dark web search engines we covered in the last post can access paywalled content, and only Kilos can access content that requires a username and password.)
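Webhose's actual flow isn't public, but one ingredient of any user-mimicking approach is timing: real users don't act at machine speed. The sketch below shows only that timing side, generating randomized per-keystroke delays; the function name and parameters are assumptions for illustration, and a real system would feed such delays into a headless-browser driver (for example Playwright or Puppeteer) when filling in a login form.

```python
import random

def keystroke_delays(text, base=0.08, jitter=0.12, seed=None):
    """Generate a human-like delay (in seconds) before each keystroke.

    Real users don't type at a fixed rate, so each delay is a base time
    plus random jitter; a headless-browser driver would sleep for the
    returned delay before sending each character to the page.
    """
    rng = random.Random(seed)
    return [(ch, base + rng.random() * jitter) for ch in text]

# Plan the keystrokes for typing a (hypothetical) username field value.
plan = keystroke_delays("login", seed=42)
```

Combined with realistic mouse movement, scrolling, and session handling, this kind of variability is what makes an automated flow look like a person rather than a bot.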

The Dark Web Data Feeds of the Future

Most LEA, organizations, and brands don’t have the time or resources to build their own dark web search engine, especially one that delivers comprehensive dark data feeds from new darknets with millions of sites, files, marketplaces, and chat platforms crawled daily. Webhose’s Dark Web API offers simplified data extraction that includes access to both structured and unstructured data — all from a single endpoint, saving you time and resources with advanced dark data feeds rather than building your own crawlers or web scrapers. 

What’s more, our advanced dark web monitoring lets you meet these needs at scale and on demand, so you can constantly monitor threats across the darknets. As the risks of cyber threats rise across industries, including healthcare, finance, intelligence, and law enforcement, the ability to scale data collection and extraction from the darknets will be the most effective way to monitor and mitigate damage to your organization and brand, and to fight crime.

Want to learn more about Webhose’s robust and accurate dark web data coverage? Get in touch with our dark web data experts today!