What Is Email Harvesting? Legal vs Illegal Methods in 2026

Ziwa · 8 min read

Defining the Spectrum

Email harvesting describes the collection of email addresses for use in marketing or outreach — but that definition covers a spectrum from entirely legitimate to clearly illegal, with a large gray zone in between. Understanding where specific methods fall on that spectrum is essential before you build any outreach workflow.

At the legitimate end: manually copying publicly listed business emails from company websites. At the illegal end: deploying bots to systematically extract email addresses from websites without permission. In between: data broker APIs, social media profile enrichment, web scraping with manual review, and purchased lists from third parties whose compliance practices you can't fully audit.

The law doesn't neatly separate these categories — it evaluates method, consent, jurisdiction, and use. Let's go through each dimension.

What the Laws Actually Say

CAN-SPAM (US). The Controlling the Assault of Non-Solicited Pornography And Marketing Act applies to commercial email sent to recipients in the US or sent from the US. It does not prohibit collecting publicly available email addresses. It does require accurate sender identification, a physical mailing address in every email, a functioning opt-out mechanism, and honoring opt-out requests within 10 business days. Each email sent in violation can carry a penalty of up to $51,744, a cap the FTC adjusts for inflation, so check the current figure before relying on it.
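The 10-business-day opt-out window is easy to miscount by hand. The sketch below shows one way to compute the deadline. The function name `opt_out_deadline` is hypothetical, and it assumes only weekends are non-business days. Because public holidays are ignored, the result is never later than a holiday-aware calculation would give.

```python
from datetime import date, timedelta


def opt_out_deadline(received: date, business_days: int = 10) -> date:
    """Return the date `business_days` working days after `received`.

    Assumption: only Saturdays and Sundays are skipped. Public holidays
    are ignored, so the result is never later than a holiday-aware count.
    """
    current = received
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


# Usage: an opt-out request received on Monday 5 January 2026
print(opt_out_deadline(date(2026, 1, 5)))  # 2026-01-19
```

In practice, processing opt-outs as soon as they arrive is simpler than tracking deadlines. The calculation is mainly useful for auditing a backlog.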

GDPR (EU). The General Data Protection Regulation is significantly stricter. An email address is personal data. To process it, you need a legal basis. For B2B marketing, the most commonly relied-upon basis is "legitimate interest", but this requires a balancing test that considers whether the recipient would reasonably expect to receive your emails, whether you've given them a way to opt out, and whether your interest outweighs their privacy rights. Penalties can reach €20 million or 4% of global annual revenue, whichever is higher.
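One way to keep the balancing test from staying informal is to record the answers in a structured form. The sketch below is a record-keeping aid only. The class `LegitimateInterestRecord` and its field names are hypothetical, and it stores your own judgement rather than making a legal determination.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LegitimateInterestRecord:
    """Hypothetical record of one balancing-test assessment."""
    audience: str                     # category of recipient being contacted
    purpose: str                      # why you want to contact them
    reasonably_expected: bool         # would recipients expect this email?
    opt_out_offered: bool             # is a working opt-out provided?
    interest_outweighs_privacy: bool  # your documented judgement on the balance
    assessed_on: date = field(default_factory=date.today)

    def unresolved(self) -> list[str]:
        """Return the names of the factors currently answered 'no'."""
        factors = {
            "reasonably_expected": self.reasonably_expected,
            "opt_out_offered": self.opt_out_offered,
            "interest_outweighs_privacy": self.interest_outweighs_privacy,
        }
        return [name for name, ok in factors.items() if not ok]


record = LegitimateInterestRecord(
    audience="CTOs at UK SaaS companies",
    purpose="Introduce a developer tooling product",
    reasonably_expected=True,
    opt_out_offered=False,
    interest_outweighs_privacy=True,
)
print(record.unresolved())  # ['opt_out_offered']
```

An empty list from `unresolved()` means every factor has been answered "yes" on paper. It says nothing about whether a regulator would agree with those answers.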

CASL (Canada). Canada's Anti-Spam Legislation is among the strictest in the world. It requires express or implied consent before sending commercial electronic messages. "Implied consent" is narrowly defined. It covers existing business relationships, and an address that appears on a public website qualifies only when it is conspicuously published without a statement refusing unsolicited messages and your message relates to the recipient's business role.

CFAA (US Computer Fraud and Abuse Act). The CFAA is what makes automated scraping legally precarious. Courts have reached different conclusions on whether scraping publicly available websites violates the CFAA, but using bots to extract data from sites that explicitly prohibit it in their terms of service creates real legal exposure. In hiQ Labs v. LinkedIn, the Ninth Circuit held that scraping publicly available data likely does not violate the CFAA. That did not settle the wider question: hiQ was later found to have breached LinkedIn's user agreement, so scraping against a site's terms still carries contract risk.

Methods: Where Each Falls

Manually copying emails from company contact pages. Generally legal under all major frameworks. The email is publicly listed for business purposes. Under GDPR's legitimate interest, sending a relevant B2B email to a business address publicly listed for business contact is defensible. Under CAN-SPAM, you just need to follow the opt-out and header requirements.

Using a data broker API (PDL, Clearbit, ZoomInfo). Legal when the provider has proper data licensing agreements and compliance frameworks. The reputable providers have terms that require you to use data for legitimate business purposes and prohibit building databases for resale. GDPR compliance depends on your use case — the provider's compliance doesn't transfer to you automatically.

Automated scraping of websites with an email extraction bot. High risk. Likely violates the terms of service of most sites, potentially violates the CFAA, and violates GDPR if the emails belong to EU residents and you can't document a valid legal basis for processing them.

Buying lists from unknown third parties. High risk. You don't know how the emails were collected, whether consent was properly obtained, whether the list contains EU residents, or whether the data is current. If the list was compiled through illegal scraping, downstream use creates shared liability in some jurisdictions.

Social media enrichment through OSINT tools. Depends on implementation. Tools like Ziwa use the People Data Labs API, which aggregates publicly available data through licensed data partnerships rather than direct scraping. This approach has a cleaner compliance posture than bot-based scraping. You're still responsible for how you use the resulting email addresses.

Building a Compliant Email Collection Workflow

If you need to build an email list for B2B outreach without legal exposure, the workflow is:

  1. Define your ICP precisely. The narrower your target, the more defensible your legitimate interest claim under GDPR. "Everyone in SaaS" is not a legitimate interest. "CTOs at Series A SaaS companies in the UK with 10–50 employees who use Salesforce" has a clear business rationale.
  2. Use licensed data sources. Pull email addresses from providers with documented compliance programs. Use bulk enrichment tools that can process your ICP list efficiently.
  3. Validate before sending. Run every email through a validation tool (ZeroBounce, NeverBounce, or similar) before your first send. Remove invalid addresses to protect your domain reputation.
  4. Include opt-out in every email. Required under CAN-SPAM. Under GDPR, recipients can object to direct marketing at any time, so treat an opt-out as required there too. Process opt-out requests within 10 business days and update your suppression list immediately.
  5. Document your legal basis. If you're emailing EU residents, document why you have a legitimate interest in contacting each category of person. This becomes your defense if you receive a GDPR complaint.
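Steps 3 and 4 can be partly automated with a pre-send filter. The sketch below is a minimal illustration. The function `prepare_send_list` and the sample addresses are hypothetical, the regular expression is a deliberately loose syntax screen rather than a deliverability check, and lowercasing whole addresses assumes your mail systems treat them case-insensitively.

```python
import re

# Loose syntax screen: something@something.something, no spaces.
# It does not replace a validation service such as those named in step 3.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize(address: str) -> str:
    """Trim whitespace and lowercase (assumes case-insensitive matching)."""
    return address.strip().lower()


def prepare_send_list(candidates, suppression):
    """Return deduplicated, plausible addresses that are not suppressed."""
    suppressed = {normalize(a) for a in suppression}
    seen = set()
    result = []
    for raw in candidates:
        addr = normalize(raw)
        if not EMAIL_RE.match(addr):
            continue  # obviously malformed
        if addr in suppressed or addr in seen:
            continue  # opted out, or already queued
        seen.add(addr)
        result.append(addr)
    return result


candidates = [
    "CTO@example.co.uk ",
    "cto@example.co.uk",
    "not-an-email",
    "optout@example.com",
    "lead@example.org",
]
suppression = ["OptOut@example.com"]
print(prepare_send_list(candidates, suppression))
# ['cto@example.co.uk', 'lead@example.org']
```

Running the suppression check at send time, rather than once when the list is built, means an opt-out received yesterday is honored in today's batch.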

For finding email addresses from social profiles specifically, Ziwa's Prospects tool and batch extraction tools give you access to PDL's enrichment data with a pay-per-result model. You only pay when an email is found, which keeps costs proportional to your actual list size.

Frequently Asked Questions

Is email harvesting legal?
It depends on the method and the jurisdiction. Collecting publicly listed business emails from company websites is generally legal in the US. Scraping emails with automated bots often breaches site terms of service and can create exposure under the Computer Fraud and Abuse Act (CFAA) in the US and under GDPR in the EU. Using data from reputable providers is legal when the provider's data is properly licensed and you have your own valid legal basis for contacting the people in it.

What is the difference between email scraping and email harvesting?
The terms are often used interchangeably, but technically harvesting refers to any collection method, while scraping refers specifically to automated extraction of emails from websites using bots. Scraping is more likely to violate terms of service and potentially laws like the CFAA.

Does CAN-SPAM apply to cold B2B emails?
Yes. CAN-SPAM applies to all commercial email in the US, including B2B cold outreach. It requires accurate sender identification, a physical address, and a clear opt-out mechanism. GDPR is stricter for EU recipients and requires a valid legal basis for processing the email address.

What's the safest way to collect business emails for outreach?
Use a reputable data provider with clear compliance documentation. Collect emails only for people within your target ICP. Include opt-out links. Avoid scraping websites directly. This approach keeps you on the right side of CAN-SPAM and gives you defensible grounds under GDPR's legitimate interest provision for B2B outreach.
