Why do my dark web results contain false positives?

Users often ask why SpiderFoot reports a dark web URL when the .onion site itself, once visited, makes no mention of the name or other entity that SpiderFoot claims was found.

This comes down to the dark web search engines matching search terms quite loosely. For instance, searching a dark web search engine for Frank Smith may return any indexed dark web site mentioning just Frank or just Smith, and a search for someone@example.org may return every page mentioning example.org.
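To make the difference concrete, here is a minimal Python sketch of how a loose match differs from a match on the full term. It is an illustration only and not how SpiderFoot or any particular search engine works; the function name match_kind and the tokenizing rule (split on whitespace and "@") are assumptions made for this example.

```python
import re


def match_kind(term: str, page_text: str) -> str:
    """Classify a hit as 'exact', 'partial', or 'none' (hypothetical helper)."""
    # Lowercase and collapse whitespace so line breaks do not hide a match.
    text = " ".join(page_text.lower().split())
    needle = " ".join(term.lower().split())

    # The whole term appears on the page: a genuine match.
    if needle in text:
        return "exact"

    # Assumed tokenizing rule: split on whitespace and '@', so that
    # "someone@example.org" yields "someone" and "example.org".
    tokens = [t for t in re.split(r"[\s@]+", needle) if t]

    # Only a fragment of the term appears: the kind of hit a loose
    # search engine may still return.
    if any(t in text for t in tokens):
        return "partial"
    return "none"
```

A result of "partial" corresponds to the false positives described above: the search engine returned the page, but the full name or e-mail address is not on it.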

A second issue is that names are inherently hard to match. Two people can share the same name, and the same name can be written in different ways.

Fortunately there are a few options for limiting the noise in such cases:

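  1. For dark web modules, ensure that the option to fetch the dark web pages via Tor is set to True, so that the page content itself can be retrieved and checked for your target rather than relying on the search engine result alone. This option is available in both the Open Source and HX versions of SpiderFoot.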

  2. In SpiderFoot HX, you can disable human name iteration (disabled by default) if you find that performing any operation on a human name is just too noisy.

  3. As shown in the image from the first point, you can also disable handling human names at the per-module level in both the Open Source and HX versions of SpiderFoot. Note that this option has no impact in HX if you have disabled human name iteration at the scan level, as shown in the second point.

Note: It is also possible that the dark web page used to contain the searched term but no longer does, or that the term appears in the page's HTML source but is not visible on the rendered page.
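If you have saved the HTML of a reported page, the following Python sketch shows one way to tell these cases apart using only the standard library. The class VisibleText and the function where_is_term are hypothetical names for this example, not part of SpiderFoot. The check is approximate: it treats text inside script, style, title, noscript, and template elements as not visible, and it does not detect content hidden with CSS.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that is outside elements a browser does not render."""

    HIDDEN = {"script", "style", "title", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside a non-rendered element
        self.chunks = []  # text fragments considered visible

    def handle_starttag(self, tag, attrs):
        if tag in self.HIDDEN:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.HIDDEN and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.chunks.append(data)


def where_is_term(term: str, html: str) -> str:
    """Return 'visible', 'source-only', or 'absent' for a term in an HTML page."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()

    # Join fragments and collapse whitespace so "Frank <b>Smith</b>" still matches.
    visible = " ".join(" ".join(parser.chunks).lower().split())
    needle = " ".join(term.lower().split())

    if needle in visible:
        return "visible"
    # Present only in attributes, comments, scripts, and similar places.
    if needle in html.lower():
        return "source-only"
    return "absent"
```

A result of "source-only" matches the second case in the note: the term is in the HTML, for example in a meta tag or a comment, but a visitor would not see it on the page.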