Public Web Data Infrastructure

The how & why behind our data.

ScanVora is Ifetchly's internal web crawling system. We analyze millions of publicly accessible web pages to source and organize business data while fully respecting data protection standards.

10M+ web pages crawled every day
200M+ public sources of data
50M+ professional contacts indexed

About ScanVora

An internal system built for responsible data collection.

ScanVora is a web crawler designed to scan publicly accessible pages across the internet. Like a search engine, it maintains an index of web pages and uses it to source structured business data that powers Ifetchly's enrichment tools.

The system is part of Ifetchly's internal data infrastructure. It is not a commercial product and is not available for third-party use. This page exists to provide complete transparency about how ScanVora operates.

Synchronized with the public web.

Every day, ScanVora visits millions of web pages to find and organize business information.

We maintain an index of publicly accessible web content and use it to source verified business data. We believe data should be processed transparently and in a way every stakeholder can accept. To that end, we follow four principles.

Disclosed public sources

All data collected by ScanVora comes from publicly accessible web pages. We maintain records of where and when information was found.

Stale data is removed

Information that no longer has a public source is automatically removed from our systems. We re-verify data regularly to keep it accurate.

Data subjects have control

Individuals can request removal of their information at any time by contacting our team. Removal requests are processed promptly.

Website owners have control

You can identify and control ScanVoraBot using the standard robots.txt protocol. We respect all disallow and crawl-delay directives.

Responsible crawling standards.

ScanVora is built with ethical data collection at its core.

Respects robots.txt: Every robots.txt file is read and followed. Pages it disallows are never accessed.
Public pages only: Only content freely available to any internet visitor is accessed. Login-protected or restricted content is never accessed.
No authentication bypass: Pages behind a login, CAPTCHA, or other access control are never accessed, and those controls are never circumvented.
No private systems: ScanVora operates exclusively on the public web. Private databases, internal networks, and restricted infrastructure are never accessed.
Controlled crawl rate: A limit of one request every two seconds per domain keeps crawling activity from slowing your site down for other visitors.
Clear identification: ScanVoraBot identifies itself in every request through its user-agent string, so website owners always know who is visiting.
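For readers who want to picture how the robots.txt and crawl-rate rules above fit together, here is a minimal Python sketch. It is an illustration under stated assumptions, not ScanVora's actual implementation: the `PoliteGate` class and its method names are hypothetical, and only the crawler name and the two-second interval come from this page.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "ScanVoraBot"  # crawler name as documented on this page
MIN_INTERVAL = 2.0          # seconds between requests to one domain


class PoliteGate:
    """Hypothetical helper: decides whether and when a URL may be fetched."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._rules = {}     # domain -> parsed robots.txt rules
        self._last_hit = {}  # domain -> time of the last request

    def load_robots(self, domain, robots_txt):
        """Parse the text of a domain's robots.txt and remember it."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self._rules[domain] = parser

    def allowed(self, url):
        """True only if robots.txt was read and permits this URL."""
        parser = self._rules.get(urlsplit(url).netloc)
        # Assumption: refuse until the domain's robots.txt has been loaded.
        return parser is not None and parser.can_fetch(USER_AGENT, url)

    def wait_time(self, url):
        """Seconds to wait before the next request to this URL's domain."""
        last = self._last_hit.get(urlsplit(url).netloc)
        if last is None:
            return 0.0
        return max(0.0, MIN_INTERVAL - (self._clock() - last))

    def record_request(self, url):
        """Note that a request to this URL's domain was just made."""
        self._last_hit[urlsplit(url).netloc] = self._clock()
```

A crawler loop built on this sketch would call `allowed(url)` first, sleep for `wait_time(url)`, fetch the page, and then call `record_request(url)`. The clock is injectable only so the timing logic can be checked without real delays.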

ScanVoraBot

Our crawler identifies itself on every request.

Crawler name: ScanVoraBot
Version: 1.0
User-Agent string: Mozilla/5.0 (compatible; ScanVoraBot/1.0; +https://scanvora.com)
Obeys robots.txt: Yes
Crawl rate limit: 1 request / 2 seconds per domain
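Website owners who want to spot these requests in their own access logs could match on the product token in the user-agent string above. The following Python sketch is one possible approach; the function name `scanvora_version` is hypothetical, and because any client can send an arbitrary User-Agent header, a match should be read as a hint rather than proof of origin.

```python
import re

# Matches the "ScanVoraBot/<version>" token from the documented user-agent string.
_BOT_TOKEN = re.compile(r"\bScanVoraBot/(\d+(?:\.\d+)*)")


def scanvora_version(user_agent):
    """Return the ScanVoraBot version found in a User-Agent header, or None."""
    match = _BOT_TOKEN.search(user_agent or "")
    return match.group(1) if match else None
```

Called with the string documented above, the function returns the version text; for a header without the token, or a missing header, it returns `None`.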

What is ScanVoraBot doing on your website?

ScanVoraBot analyzes public web pages and navigates at a controlled pace. It requests at most one page every two seconds from any single website, so that your other visitors are not slowed down by crawling activity. The crawler only reads publicly visible content and does not submit forms, create accounts, or interact with your site in any other way.

How to block ScanVoraBot

ScanVoraBot strictly respects robots.txt rules. If you wish to block it from accessing your website, add the following lines to your robots.txt file:

# Block ScanVoraBot from crawling your site
User-agent: ScanVoraBot
Disallow: /

Changes to robots.txt are respected on the next crawl cycle. If you need immediate action, contact us directly and we will process your request within one business day.
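Before publishing a robots.txt change, it can help to confirm that the rules say what you intend. The sketch below uses Python's standard `urllib.robotparser` to evaluate a rule set for the ScanVoraBot user agent. The sample rules (a partial block on `/internal/` and a `Crawl-delay` of 10) and the `check` helper are illustrative assumptions, and the result reflects how Python's parser reads the file, which may not match every detail of how ScanVoraBot interprets it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one directory and ask for a slower crawl.
ROBOTS_TXT = """\
User-agent: ScanVoraBot
Disallow: /internal/
Crawl-delay: 10
"""


def check(robots_txt, path, agent="ScanVoraBot"):
    """Return (is_allowed, crawl_delay) for a path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path), parser.crawl_delay(agent)


if __name__ == "__main__":
    for path in ("/pricing", "/internal/report"):
        print(path, check(ROBOTS_TXT, path))
```

Running the same helper against the full-site block shown above (`Disallow: /`) should report every path as not allowed.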

About Ifetchly

Ifetchly develops web intelligence tools and internal data processing infrastructure. The company builds systems that help organize and make sense of publicly available business information on the internet.

ScanVora is one component of Ifetchly's internal technology stack. It is not offered as a standalone product or service. For more information, visit ifetchly.com.

Contact

If you are a website owner with questions about ScanVoraBot's crawling activity, or if you would like to request data removal or site exclusion, our team is here to help.

Crawling inquiries & removal requests: [email protected]
General contact: [email protected]