Change Data Capture (CDC) for CRM Syncs: Why Polling Breaks and How to Fix It
A practical guide to change data capture — why timestamp-based sync fails, the patterns that actually work, and how to implement CDC in your automation stack.
Haroon Mohamed
AI Automation & Lead Generation
The problem: polling-based sync breaks
Most cross-system CRM syncs work like this:
- Every 15 minutes, a workflow runs
- Query System A: "give me contacts updated since 15 minutes ago"
- Update each one in System B
It feels reasonable. It's also fragile.
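The three-step loop above can be sketched in a few lines. `fetch_updated_since` and `sync_to_system_b` are hypothetical stand-ins for the two systems' APIs; the point is where the fragility lives:

```python
# Minimal sketch of timestamp-window polling, assuming hypothetical
# fetch/sync callables for System A and System B.
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(minutes=15)

def poll_once(fetch_updated_since, sync_to_system_b, now=None):
    now = now or datetime.now(timezone.utc)
    # Computed on OUR clock, not the CRM's -- this is where drift creeps in.
    window_start = now - POLL_INTERVAL
    for record in fetch_updated_since(window_start):
        sync_to_system_b(record)
    return window_start

# If the CRM's clock lags ours by a few seconds, a record the CRM stamps
# just before window_start is never returned by any window: silent drift.
```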
Problems:
- Clock drift: What's "15 minutes ago" in your automation tool vs. the CRM? They may differ by seconds.
- Race conditions: A contact updated during the sync window might be missed.
- Time zone issues: CRM reports UTC. Your automation tool runs in local time. One-hour gaps.
- Duplicate processing: If a sync fails and retries, the same records may process twice.
- Bulk updates miss data: A batch update might touch 1,000 records in a second; the sync might query the API before the update is visible.
After 3-6 months of polling-based sync, you find drift — records that are inconsistent between systems. Some records never synced. Some synced twice.
What CDC is
Change Data Capture (CDC) means reacting to specific changes as they happen, not polling for them.
Two ways to do CDC:
1. Webhook-based CDC
The source system pushes changes to your automation as they happen.
- GoHighLevel → webhook on "Contact Updated" → your flow processes the change
- HubSpot → webhook on "Contact Property Changed" → your flow processes
No polling. No time window. Just "when X changes, do Y."
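The receiving side is trivially simple. Stripped of the HTTP layer (in practice an n8n webhook node or a Supabase edge function), the core logic is just "parse the pushed change, react to it". `upserted` below is a hypothetical in-memory stand-in for the destination system:

```python
# Sketch of the receiver in webhook-based CDC: no window, no clock,
# just react to the pushed payload.
import json

upserted: list[dict] = []  # hypothetical stand-in for System B

def handle_webhook(body: bytes) -> dict:
    """Parse the pushed change and react to it immediately."""
    payload = json.loads(body)
    upserted.append(payload)  # "when X changes, do Y"
    return payload
```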
2. Log-based CDC
The source system publishes a stream of changes that downstream consumers read.
- Postgres → logical replication → Debezium → Kafka → your consumers
- Databases → binlog → CDC tools
This is enterprise-scale. It's overkill for most automation stacks, but worth knowing it exists.
Why webhook-based CDC is the right fit for most automation
For a CRM → CRM sync, database → CRM, or CRM → custom app, webhook-based CDC solves the problems of polling:
- No clock drift: the system tells you when a change happened.
- No race conditions: the webhook fires immediately on change.
- No duplicates: the webhook fires once per change (usually — more on this below).
- Real-time: changes propagate in seconds, not minutes.
CDC patterns
Pattern 1: Direct webhook → sync
System A changes → webhook to Make/n8n → update System B
Simplest case. Works for:
- New contact in GHL → create in HubSpot
- Deal stage change in HubSpot → update in custom dashboard
Works when: change volume is low to moderate, and the receiving side can handle each event synchronously.
Pattern 2: Webhook → queue → process
System A changes → webhook → message queue → workers process
Used when: high volume or slow downstream processing.
Implementation in automation stack:
- Webhook writes to Supabase/SQS/Redis queue
- Scheduled workflow processes items from queue
- Retries on failure
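The three steps above can be sketched with an in-memory queue standing in for Supabase/SQS/Redis. `receive_webhook` is what the webhook endpoint does (enqueue and return fast); `process_queue` is what the scheduled workflow does:

```python
# Sketch of Pattern 2 under simplifying assumptions: a deque stands in
# for the real queue, and failed items are re-enqueued for a later run.
from collections import deque

queue = deque()

def receive_webhook(event: dict) -> None:
    """Webhook endpoint: just enqueue and return 200 fast."""
    queue.append({"event": event, "attempts": 0})

def process_queue(handler, max_attempts=3):
    """Scheduled worker: drain the queue, re-enqueue failures."""
    processed, dead = 0, 0
    for _ in range(len(queue)):
        item = queue.popleft()
        try:
            handler(item["event"])
            processed += 1
        except Exception:
            item["attempts"] += 1
            if item["attempts"] < max_attempts:
                queue.append(item)  # retry on a later run
            else:
                dead += 1           # dead-letter in a real system
    return processed, dead
```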
Pattern 3: Webhook → event sourcing
System A changes → webhook → append to event log → projections rebuild state
Enterprise-grade. Keeps a full history of every change. Can rebuild any state from the log.
Usually overkill for small automation stacks. But worth noting: if compliance requires full audit, event sourcing is the pattern.
Implementation: CRM → custom dashboard CDC
Scenario: you want a custom dashboard that stays in sync with your CRM in real-time.
With GoHighLevel
- Create a GHL workflow: "Contact Updated" → Webhook Out
- Configure webhook URL: your Supabase edge function or n8n webhook endpoint
- Send the contact's key fields in the payload
- Receiver (Supabase edge function, Make, or n8n):
  - Parse the payload
  - UPSERT the contact in your Supabase contacts table
  - Update any dependent aggregate tables
With HubSpot
- Create a HubSpot workflow: property-change trigger
- Webhook action → POST to your endpoint
- Same receiver logic
With Stripe
Stripe webhooks are always CDC. Subscribe to events like customer.updated, invoice.paid — they fire automatically.
Handling webhook reliability
Webhooks aren't perfectly reliable. Key issues and mitigations:
Issue: webhook delivery fails
Your endpoint is down → webhook fails to deliver.
Mitigation: most providers retry with exponential backoff (Stripe retries for up to 3 days). But some providers, GoHighLevel among them, retry far less aggressively. Missed webhooks mean data drift.
Fallback: run a nightly polling sync as a safety net. Catches any missed webhooks without the timing issues of polling-only.
Issue: duplicate webhook delivery
Same event delivered twice (retry logic, network hiccups).
Mitigation: idempotency. Every webhook has an event ID. Track which event IDs you've processed. Skip duplicates.
-- Idempotency table
CREATE TABLE processed_events (
  event_id TEXT PRIMARY KEY,
  processed_at TIMESTAMP DEFAULT NOW()
);
Before processing: check if event_id exists. If yes, skip. If no, process and insert event_id.
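A slightly safer variant inserts first and lets the primary key do the duplicate detection, which avoids a check-then-insert race between two concurrent workers. Here's a sketch using sqlite3 in place of Postgres:

```python
# Insert-first idempotency guard: the PRIMARY KEY on event_id makes a
# duplicate insert fail atomically, so two workers can't both process
# the same event. sqlite3 stands in for Postgres here.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE processed_events ("
    "  event_id TEXT PRIMARY KEY,"
    "  processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
)

def process_once(event_id: str, handler) -> bool:
    """Returns True if processed, False if it was a duplicate."""
    try:
        db.execute("INSERT INTO processed_events (event_id) VALUES (?)",
                   (event_id,))
    except sqlite3.IntegrityError:
        return False  # already processed: skip
    handler()
    db.commit()
    return True
```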
Issue: out-of-order delivery
Event B fires at 10:00:01 and event A at 10:00:00. You receive B first.
Mitigation: use source timestamps. If the incoming event's timestamp is older than the last-processed event for that record, skip or merge carefully.
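The timestamp guard can be sketched as a simple compare-before-apply. `last_applied` stands in for the `last_updated_at` column on the contacts table:

```python
# Source-timestamp guard: apply an incoming event only if it is newer
# than the last event applied for that record. In-memory dict stands in
# for the database column.
from datetime import datetime

last_applied: dict[str, datetime] = {}

def apply_if_newer(record_id: str, event_ts: datetime, apply) -> bool:
    prev = last_applied.get(record_id)
    if prev is not None and event_ts <= prev:
        return False  # stale or duplicate event: skip
    apply()
    last_applied[record_id] = event_ts
    return True
```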
Issue: webhook spoofing
Malicious actor POSTs fake data to your webhook URL.
Mitigation: verify webhook signatures (Stripe, Shopify, GitHub support HMAC signatures). For providers without signatures (GHL), use shared secret in URL or headers.
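The HMAC check itself is a few lines. Header names and signing schemes vary per provider; this sketch assumes a generic `sha256=<hex>` header, not any one provider's exact format:

```python
# Generic HMAC-SHA256 webhook signature check. The "sha256=<hex>" header
# format is an illustrative assumption; consult your provider's docs for
# the exact scheme (Stripe, for example, also signs a timestamp).
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, header: str) -> bool:
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing attacks.
    return hmac.compare_digest(expected, header)
```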
Implementation: idempotent upsert pattern
This is the workhorse pattern for CDC ingestion:
INSERT INTO contacts (
  external_id, email, name, phone, source, last_updated_at
)
VALUES (
  'ghl_contact_abc123',
  'jane@example.com',
  'Jane Doe',
  '+15551234567',
  'facebook',
  '2026-04-24T15:30:45Z'
)
ON CONFLICT (external_id)
DO UPDATE SET
  email = EXCLUDED.email,
  name = EXCLUDED.name,
  phone = EXCLUDED.phone,
  source = EXCLUDED.source,
  last_updated_at = EXCLUDED.last_updated_at
WHERE contacts.last_updated_at < EXCLUDED.last_updated_at;
Key elements:
- ON CONFLICT handles duplicates gracefully (no error on the second insert)
- WHERE contacts.last_updated_at < EXCLUDED.last_updated_at ensures out-of-order events don't overwrite newer data
Bi-directional sync is harder
If changes can happen on both sides (CRM A ↔ CRM B), CDC is trickier:
Problem: infinite loops
Change in A → webhook to B → update in B → webhook to A → update in A → back to B...
Mitigation: mark updates as "sourced from sync." When webhook fires for an update marked as sync-sourced, skip it.
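The marker pattern can be sketched as follows, assuming both CRMs let you stamp a custom field (here a hypothetical `updated_by`) on writes, so the receiver can tell a human edit from a sync echo:

```python
# Loop prevention via a sync-source marker. The `updated_by` field is an
# illustrative assumption: any custom field both systems preserve works.
SYNC_MARKER = "sync-bot"

def handle_sync_webhook(event: dict, write_to_other_side) -> bool:
    if event.get("updated_by") == SYNC_MARKER:
        return False  # echo of our own write: stop the loop here
    payload = {**event, "updated_by": SYNC_MARKER}
    write_to_other_side(payload)
    return True
```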
Problem: conflict resolution
Same contact updated in both A and B within seconds. Which wins?
Mitigation: define a source of truth per field. E.g., CRM A owns contact fields, CRM B owns deal fields. Or last-write-wins with timestamp comparison.
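Per-field ownership can be sketched as a merge that never lets the non-owner clobber the owner's value. The ownership map below is illustrative:

```python
# Per-field source-of-truth merge: each field has an owning system, and
# a write from the non-owner is ignored for that field. The FIELD_OWNER
# map is a hypothetical example of a split like "CRM A owns contact
# fields, CRM B owns deal fields".
FIELD_OWNER = {"email": "crm_a", "phone": "crm_a", "deal_stage": "crm_b"}

def merge(current: dict, incoming: dict, source: str) -> dict:
    merged = dict(current)
    for field, value in incoming.items():
        owner = FIELD_OWNER.get(field)
        if owner is None or owner == source:
            merged[field] = value  # source owns this field (or no owner set)
    return merged
```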
Bidirectional sync is complex enough that many teams avoid it and run unidirectional syncs with clear ownership.
When NOT to use CDC
- Batch workloads: processing 50,000 records nightly is fine with a scheduled polling job.
- Cross-organization sync: if the source system can't push webhooks to you, polling is your only option.
- Analytics data: real-time CDC for dashboards is often unnecessary. 15-minute freshness is usually fine.
Tools for CDC in the automation stack
For small stacks
- Make.com / n8n / Zapier with webhook triggers: covers 90% of cases
- Supabase edge functions: HTTP endpoints that can receive webhooks and write to Postgres
For larger stacks
- Fivetran / Airbyte: managed CDC platforms that connect CRMs to data warehouses
- Segment: customer data platform with CDC-like event routing
- PostHog / Mixpanel: event streaming platforms
For custom apps
- Supabase realtime: Postgres → websocket updates for client apps
- Hasura: GraphQL subscriptions backed by Postgres
- Kafka / RabbitMQ: enterprise message queues
Migration from polling to CDC
If you're already running polling:
- Build the webhook-based sync alongside polling
- Run both in parallel for 2-4 weeks
- Compare results — ensure webhook version catches everything polling does
- Switch primary to webhook
- Keep polling as nightly fallback for missed webhooks
- After 3 months of clean operation, retire polling (or keep as insurance)
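The comparison step in the parallel run can be automated with a simple drift check. `webhook_synced` and `poll_synced` are hypothetical maps of external_id to last-synced timestamp, one per pipeline:

```python
# Drift check for the parallel-run phase: diff what each pipeline synced.
def find_drift(webhook_synced: dict, poll_synced: dict) -> dict:
    both = set(webhook_synced) & set(poll_synced)
    return {
        "missing_from_webhook": set(poll_synced) - set(webhook_synced),
        "missing_from_poll": set(webhook_synced) - set(poll_synced),
        "mismatched": {k for k in both
                       if webhook_synced[k] != poll_synced[k]},
    }
```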
Sources
Change Data Capture concepts are industry-standard, documented in database replication literature (Postgres logical replication docs at postgresql.org, Kafka Streams documentation, Debezium documentation). Webhook reliability patterns reference Stripe's webhook best practices (stripe.com/docs/webhooks/best-practices), GitHub's webhook guide, and similar provider docs. Implementation examples are standard patterns for Supabase / Postgres deployments.
Running into drift between your CRM and your custom database? Let's talk — migrating from polling to CDC is usually a 1-2 week engagement with dramatic reliability improvements.
Need This Built?
Ready to implement this for your business?
Everything in this article reflects real systems I've built and operated. Let's talk about yours.
Haroon Mohamed
Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.