Change Data Capture (CDC) for CRM Syncs: Why Polling Breaks and How to Fix It
A practical guide to change data capture — why timestamp-based sync fails, the patterns that actually work, and how to implement CDC in your automation stack.
Haroon Mohamed
AI Automation & Lead Generation
The problem: polling-based sync breaks
Most cross-system CRM syncs work like this:
- Every 15 minutes, a workflow runs
- Query System A: "give me contacts updated since 15 minutes ago"
- Update each one in System B
It feels reasonable. It's also fragile.
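The three-step loop above can be sketched in a few lines. `fetch_updated_since` and `sync_to_system_b` are hypothetical stand-ins for the two systems' APIs; the point is where the fragility lives:

```python
# Minimal sketch of timestamp-window polling, assuming hypothetical
# fetch/sync callables for System A and System B.
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(minutes=15)

def poll_once(fetch_updated_since, sync_to_system_b, now=None):
    now = now or datetime.now(timezone.utc)
    # Computed on OUR clock, not the CRM's -- this is where drift creeps in.
    window_start = now - POLL_INTERVAL
    for record in fetch_updated_since(window_start):
        sync_to_system_b(record)
    return window_start

# If the CRM's clock lags ours by a few seconds, a record the CRM stamps
# just before window_start is never returned by any window: silent drift.
```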
Problems:
- Clock drift: What's "15 minutes ago" in your automation tool vs. the CRM? They may differ by seconds.
- Race conditions: A contact updated during the sync window might be missed.
- Time zone issues: CRM reports UTC. Your automation tool runs in local time. One-hour gaps.
- Duplicate processing: If a sync fails and retries, the same records may process twice.
- Bulk updates miss data: A batch update might touch 1,000 records in a second; the sync might query the API before the update is visible.
After 3-6 months of polling-based sync, you find drift — records that are inconsistent between systems. Some records never synced. Some synced twice.
What CDC is
Change Data Capture (CDC) means reacting to specific changes as they happen, not polling for them.
Two ways to do CDC:
1. Webhook-based CDC
The source system pushes changes to your automation as they happen.
- GoHighLevel → webhook on "Contact Updated" → your flow processes the change
- HubSpot → webhook on "Contact Property Changed" → your flow processes
No polling. No time window. Just "when X changes, do Y."
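The receiving side is trivially simple. Stripped of the HTTP layer (in practice an n8n webhook node or a Supabase edge function), the core logic is just "parse the pushed change, react to it". `upserted` below is a hypothetical in-memory stand-in for the destination system:

```python
# Sketch of the receiver in webhook-based CDC: no window, no clock,
# just react to the pushed payload.
import json

upserted: list[dict] = []  # hypothetical stand-in for System B

def handle_webhook(body: bytes) -> dict:
    """Parse the pushed change and react to it immediately."""
    payload = json.loads(body)
    upserted.append(payload)  # "when X changes, do Y"
    return payload
```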
2. Log-based CDC
The source system publishes a stream of changes that downstream consumers read.
- Postgres → logical replication → Debezium → Kafka → your consumers
- Databases → binlog → CDC tools
This is enterprise-scale. It's overkill for most automation stacks, but worth knowing it exists.
Why webhook-based CDC is the right fit for most automation
For a CRM → CRM sync, database → CRM, or CRM → custom app, webhook-based CDC solves the problems of polling:
- No clock drift: the system tells you when a change happened.
- No race conditions: the webhook fires immediately on change.
- No duplicates: the webhook fires once per change (usually — more on this below).
- Real-time: changes propagate in seconds, not minutes.
CDC patterns
Pattern 1: Direct webhook → sync
System A changes → webhook to Make/n8n → update System B
Simplest case. Works for:
- New contact in GHL → create in HubSpot
- Deal stage change in HubSpot → update in custom dashboard
Works when: change volume is low to moderate, and the receiving side can handle each event synchronously.
Pattern 2: Webhook → queue → process
System A changes → webhook → message queue → workers process
Used when: high volume or slow downstream processing.
Implementation in automation stack:
- Webhook writes to Supabase/SQS/Redis queue
- Scheduled workflow processes items from queue
- Retries on failure
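The three steps above can be sketched with an in-memory queue standing in for Supabase/SQS/Redis. `receive_webhook` is what the webhook endpoint does (enqueue and return fast); `process_queue` is what the scheduled workflow does:

```python
# Sketch of Pattern 2 under simplifying assumptions: a deque stands in
# for the real queue, and failed items are re-enqueued for a later run.
from collections import deque

queue = deque()

def receive_webhook(event: dict) -> None:
    """Webhook endpoint: just enqueue and return 200 fast."""
    queue.append({"event": event, "attempts": 0})

def process_queue(handler, max_attempts=3):
    """Scheduled worker: drain the queue, re-enqueue failures."""
    processed, dead = 0, 0
    for _ in range(len(queue)):
        item = queue.popleft()
        try:
            handler(item["event"])
            processed += 1
        except Exception:
            item["attempts"] += 1
            if item["attempts"] < max_attempts:
                queue.append(item)  # retry on a later run
            else:
                dead += 1           # dead-letter in a real system
    return processed, dead
```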
Pattern 3: Webhook → event sourcing
System A changes → webhook → append to event log → projections rebuild state
Enterprise-grade. Keeps a full history of every change. Can rebuild any state from the log.
Usually overkill for small automation stacks. But worth noting: if compliance requires full audit, event sourcing is the pattern.
Implementation: CRM → custom dashboard CDC
Scenario: you want a custom dashboard that stays in sync with your CRM in real-time.
With GoHighLevel
- Create a GHL workflow: "Contact Updated" → Webhook Out
- Configure webhook URL: your Supabase edge function or n8n webhook endpoint
- Send the contact's key fields in the payload
- Receiver (Supabase edge function, Make, or n8n):
  - Parse the payload
  - UPSERT the contact in your Supabase contacts table
  - Update any dependent aggregate tables
With HubSpot
- Create a HubSpot workflow: property-change trigger
- Webhook action → POST to your endpoint
- Same receiver logic
With Stripe
Stripe webhooks are always CDC. Subscribe to events like customer.updated, invoice.paid — they fire automatically.
Handling webhook reliability
Webhooks aren't perfectly reliable. Key issues and mitigations:
Issue: webhook delivery fails
Your endpoint is down → webhook fails to deliver.
Mitigation: most providers retry with exponential backoff (Stripe retries for up to 3 days). But some providers, GoHighLevel among them, retry far less aggressively. Missed webhooks mean data drift.
Fallback: run a nightly polling sync as a safety net. Catches any missed webhooks without the timing issues of polling-only.
Issue: duplicate webhook delivery
Same event delivered twice (retry logic, network hiccups).
Mitigation: idempotency. Every webhook has an event ID. Track which event IDs you've processed. Skip duplicates.
-- Idempotency table
CREATE TABLE processed_events (
  event_id TEXT PRIMARY KEY,
  processed_at TIMESTAMP DEFAULT NOW()
);
Before processing: check if event_id exists. If yes, skip. If no, process and insert event_id.
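A slightly safer variant inserts first and lets the primary key do the duplicate detection, which avoids a check-then-insert race between two concurrent workers. Here's a sketch using sqlite3 in place of Postgres:

```python
# Insert-first idempotency guard: the PRIMARY KEY on event_id makes a
# duplicate insert fail atomically, so two workers can't both process
# the same event. sqlite3 stands in for Postgres here.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE processed_events ("
    "  event_id TEXT PRIMARY KEY,"
    "  processed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)"
)

def process_once(event_id: str, handler) -> bool:
    """Returns True if processed, False if it was a duplicate."""
    try:
        db.execute("INSERT INTO processed_events (event_id) VALUES (?)",
                   (event_id,))
    except sqlite3.IntegrityError:
        return False  # already processed: skip
    handler()
    db.commit()
    return True
```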
Issue: out-of-order delivery
Event B fires at 10:00:01 and event A at 10:00:00. You receive B first.
Mitigation: use source timestamps. If the incoming event's timestamp is older than the last-processed event for that record, skip or merge carefully.
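The timestamp guard can be sketched as a simple compare-before-apply. `last_applied` stands in for the `last_updated_at` column on the contacts table:

```python
# Source-timestamp guard: apply an incoming event only if it is newer
# than the last event applied for that record. In-memory dict stands in
# for the database column.
from datetime import datetime

last_applied: dict[str, datetime] = {}

def apply_if_newer(record_id: str, event_ts: datetime, apply) -> bool:
    prev = last_applied.get(record_id)
    if prev is not None and event_ts <= prev:
        return False  # stale or duplicate event: skip
    apply()
    last_applied[record_id] = event_ts
    return True
```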
Issue: webhook spoofing
Malicious actor POSTs fake data to your webhook URL.
Mitigation: verify webhook signatures (Stripe, Shopify, GitHub support HMAC signatures). For providers without signatures (GHL), use shared secret in URL or headers.
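The HMAC check itself is a few lines. Header names and signing schemes vary per provider; this sketch assumes a generic `sha256=<hex>` header, not any one provider's exact format:

```python
# Generic HMAC-SHA256 webhook signature check. The "sha256=<hex>" header
# format is an illustrative assumption; consult your provider's docs for
# the exact scheme (Stripe, for example, also signs a timestamp).
import hashlib
import hmac

def verify_signature(secret: bytes, body: bytes, header: str) -> bool:
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid timing attacks.
    return hmac.compare_digest(expected, header)
```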
Implementation: idempotent upsert pattern
This is the workhorse pattern for CDC ingestion:
INSERT INTO contacts (
  external_id, email, name, phone, source, last_updated_at
)
VALUES (
  'ghl_contact_abc123',
  'jane@example.com',
  'Jane Doe',
  '+15551234567',
  'facebook',
  '2026-04-24T15:30:45Z'
)
ON CONFLICT (external_id)
DO UPDATE SET
  email = EXCLUDED.email,
  name = EXCLUDED.name,
  phone = EXCLUDED.phone,
  source = EXCLUDED.source,
  last_updated_at = EXCLUDED.last_updated_at
WHERE contacts.last_updated_at < EXCLUDED.last_updated_at;
Key elements:
- ON CONFLICT handles duplicates gracefully (no error on the second insert)
- WHERE contacts.last_updated_at < EXCLUDED.last_updated_at ensures out-of-order events don't overwrite newer data
Bi-directional sync is harder
If changes can happen on both sides (CRM A ↔ CRM B), CDC is trickier:
Problem: infinite loops
Change in A → webhook to B → update in B → webhook to A → update in A → back to B...
Mitigation: mark updates as "sourced from sync." When webhook fires for an update marked as sync-sourced, skip it.
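The marker pattern can be sketched as follows, assuming both CRMs let you stamp a custom field (here a hypothetical `updated_by`) on writes, so the receiver can tell a human edit from a sync echo:

```python
# Loop prevention via a sync-source marker. The `updated_by` field is an
# illustrative assumption: any custom field both systems preserve works.
SYNC_MARKER = "sync-bot"

def handle_sync_webhook(event: dict, write_to_other_side) -> bool:
    if event.get("updated_by") == SYNC_MARKER:
        return False  # echo of our own write: stop the loop here
    payload = {**event, "updated_by": SYNC_MARKER}
    write_to_other_side(payload)
    return True
```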
Problem: conflict resolution
Same contact updated in both A and B within seconds. Which wins?
Mitigation: define a source of truth per field. E.g., CRM A owns contact fields, CRM B owns deal fields. Or last-write-wins with timestamp comparison.
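Per-field ownership can be sketched as a merge that never lets the non-owner clobber the owner's value. The ownership map below is illustrative:

```python
# Per-field source-of-truth merge: each field has an owning system, and
# a write from the non-owner is ignored for that field. The FIELD_OWNER
# map is a hypothetical example of a split like "CRM A owns contact
# fields, CRM B owns deal fields".
FIELD_OWNER = {"email": "crm_a", "phone": "crm_a", "deal_stage": "crm_b"}

def merge(current: dict, incoming: dict, source: str) -> dict:
    merged = dict(current)
    for field, value in incoming.items():
        owner = FIELD_OWNER.get(field)
        if owner is None or owner == source:
            merged[field] = value  # source owns this field (or no owner set)
    return merged
```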
Bidirectional sync is complex enough that many teams avoid it and run unidirectional syncs with clear ownership.
When NOT to use CDC
- Batch workloads: processing 50,000 records nightly is fine with a scheduled polling job.
- Cross-organization sync: if the source system can't push webhooks to you, polling is your only option.
- Analytics data: real-time CDC for dashboards is often unnecessary. 15-minute freshness is usually fine.
Tools for CDC in the automation stack
For small stacks
- Make.com / n8n / Zapier with webhook triggers: covers 90% of cases
- Supabase edge functions: HTTP endpoints that can receive webhooks and write to Postgres
For larger stacks
- Fivetran / Airbyte: managed CDC platforms that connect CRMs to data warehouses
- Segment: customer data platform with CDC-like event routing
- PostHog / Mixpanel: event streaming platforms
For custom apps
- Supabase realtime: Postgres → websocket updates for client apps
- Hasura: GraphQL subscriptions backed by Postgres
- Kafka / RabbitMQ: enterprise message queues
Migration from polling to CDC
If you're already running polling:
- Build the webhook-based sync alongside polling
- Run both in parallel for 2-4 weeks
- Compare results — ensure webhook version catches everything polling does
- Switch primary to webhook
- Keep polling as nightly fallback for missed webhooks
- After 3 months of clean operation, retire polling (or keep as insurance)
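The comparison step in the parallel run can be automated with a simple drift check. `webhook_synced` and `poll_synced` are hypothetical maps of external_id to last-synced timestamp, one per pipeline:

```python
# Drift check for the parallel-run phase: diff what each pipeline synced.
def find_drift(webhook_synced: dict, poll_synced: dict) -> dict:
    both = set(webhook_synced) & set(poll_synced)
    return {
        "missing_from_webhook": set(poll_synced) - set(webhook_synced),
        "missing_from_poll": set(webhook_synced) - set(poll_synced),
        "mismatched": {k for k in both
                       if webhook_synced[k] != poll_synced[k]},
    }
```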
Sources
Change Data Capture concepts are industry-standard, documented in database replication literature (Postgres logical replication docs at postgresql.org, Kafka Streams documentation, Debezium documentation). Webhook reliability patterns reference Stripe's webhook best practices (stripe.com/docs/webhooks/best-practices), GitHub's webhook guide, and similar provider docs. Implementation examples are standard patterns for Supabase / Postgres deployments.
Running into drift between your CRM and your custom database? Let's talk — migrating from polling to CDC is usually a 1-2 week engagement with dramatic reliability improvements.
Need This Built?
Ready to implement this for your business?
Everything in this article reflects real systems I've built and operated. Let's talk about yours.
Haroon Mohamed
Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.