Data Engineering | 7 min read | 1 May 2026

Importing CSV Data into Your CRM Without Breaking Everything

A practical checklist for CSV imports into GoHighLevel, HubSpot, or any CRM — the specific mistakes that cause data corruption, and the pre-import steps that prevent them.

Haroon Mohamed

AI Automation & Lead Generation

Why CSV imports go wrong

Most CRM CSV imports happen in one of these contexts:

  • Migrating from an old tool
  • Importing a purchased lead list
  • Uploading historical data from a spreadsheet
  • Bulk-loading leads from an event or campaign

Any of these can corrupt your CRM if done wrong:

  • Duplicates that undermine contact matching and skew your reports
  • Encoding errors that turn "José" into "Jos??"
  • Date fields that don't parse
  • Fields that end up in wrong columns
  • Contacts that trigger welcome emails to people who have already been customers for 2 years

Here's the checklist that prevents each.


The pre-import checklist

Before uploading:

1. Back up existing data

Export your current CRM contacts to CSV before any major import. If the import goes wrong, you can compare against the export or roll back.

2. Verify file encoding is UTF-8

Open the CSV in a text editor (VS Code, Sublime). Check the encoding indicator. If it says "UTF-16" or "ISO-8859-1," convert to UTF-8.

In VS Code: Bottom right → click encoding → "Save with Encoding" → UTF-8.

Command line (macOS/Linux; replace UTF-16 with the file's actual source encoding):

iconv -f UTF-16 -t UTF-8 input.csv > output.csv

Why: most CRMs expect UTF-8. Wrong encoding = garbled special characters (accents, emojis, non-Latin text).
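
If you would rather script the check, here is a minimal Python sketch. It tries UTF-8 first and falls back to a legacy encoding that you supply. The function name to_utf8 and the cp1252 default are assumptions for illustration; use whatever encoding your source system actually exported.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode raw CSV bytes, preferring UTF-8 and falling back to a named encoding."""
    try:
        # "utf-8-sig" also strips a leading byte-order mark if one is present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Assumption: the caller knows which legacy encoding the file uses.
        return raw.decode(fallback)


# A name exported from a legacy tool as Windows-1252 bytes.
legacy_bytes = "José,jose@example.com\n".encode("cp1252")
print(to_utf8(legacy_bytes).strip())  # José,jose@example.com
```

Write the returned text back out with encoding="utf-8" to get a file the CRM can read.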

3. Check headers match CRM fields

Open the CSV. First row = headers. Verify every header maps to an existing field in your CRM. If a header doesn't match, either:

  • Rename the CSV column to match the CRM field
  • Create a new custom field in the CRM before importing

Missing field mapping = data ends up in the wrong place or gets dropped silently.
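
A quick way to catch unmapped headers before uploading is to compare the CSV's first row against a list of your CRM's field names. This is a sketch only: the function name unmapped_headers and the example field names are hypothetical, and the real field list has to come from your CRM.

```python
import csv
import io


def unmapped_headers(csv_text: str, crm_fields: set[str]) -> list[str]:
    """Return CSV headers that have no matching CRM field (case-insensitive)."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # first row = headers; empty file gives []
    known = {field.strip().lower() for field in crm_fields}
    return [h for h in headers if h.strip().lower() not in known]


sample = "Email,Phone,Favorite Color\na@example.com,+15551234567,blue\n"
# Hypothetical CRM field names; export the real list from your CRM.
print(unmapped_headers(sample, {"email", "phone", "first_name"}))  # ['Favorite Color']
```

Anything the function returns needs either a renamed column or a new custom field before you import.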

4. Normalize phone numbers

Run a find-and-replace in Excel/Google Sheets:

  • Remove all non-digit characters
  • Add country code prefix if missing
  • Target: E.164 format (+15551234567)

Before:

(555) 123-4567
555.123.4567
1-555-123-4567

After:

+15551234567
+15551234567
+15551234567

Why: inconsistent phone format breaks dedup, SMS sends, and display.
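
For larger files, the same cleanup can be scripted. The sketch below assumes every number is a US/Canada number (10 national digits, country code 1); that assumption and the function name to_e164 are mine, not a general-purpose parser. For international lists, a library such as libphonenumber is the safer choice.

```python
import re
from typing import Optional


def to_e164(raw: str, country_code: str = "1") -> Optional[str]:
    """Normalize a US/Canada phone number to E.164, or return None if it doesn't fit."""
    digits = re.sub(r"\D", "", raw)  # remove all non-digit characters
    if len(digits) == 10:
        digits = country_code + digits  # add country code prefix if missing
    if len(digits) == 11 and digits.startswith(country_code):
        return "+" + digits
    return None  # flag for manual review instead of guessing


for raw in ["(555) 123-4567", "555.123.4567", "1-555-123-4567"]:
    print(to_e164(raw))  # +15551234567 each time
```

Returning None for anything unexpected keeps bad numbers out of the import rather than silently mangling them.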

5. Normalize emails

  • Lowercase all emails
  • Trim leading/trailing whitespace
  • Remove any invalid characters

Formula in Google Sheets:

=LOWER(TRIM(A2))

Why: JOHN@gmail.com and john@gmail.com are the same person, but a case-sensitive match can store them as two separate contacts.
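
The spreadsheet formula handles case and whitespace but not invalid characters. A small Python sketch can do all three. The pattern here is deliberately loose (one "@", no whitespace, a dot in the domain) and is an assumption about what counts as "valid enough" for an import; it is not a full email validator.

```python
import re
from typing import Optional

# Loose shape check: something@something.something, with no spaces or extra "@".
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def normalize_email(raw: str) -> Optional[str]:
    """Lowercase and trim an email; return None if it doesn't look like one."""
    email = raw.strip().lower()
    return email if EMAIL_RE.fullmatch(email) else None


print(normalize_email("  JOHN@gmail.com "))  # john@gmail.com
print(normalize_email("john at gmail"))      # None
```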

6. Check date formats

Pick one format (ISO 8601 recommended: YYYY-MM-DD) and make every date column consistent.

Watch out for:

  • 01/02/2026 — is that Jan 2 or Feb 1? Depends on locale. Convert to unambiguous 2026-01-02.
  • Excel auto-converting dates: "7/8" becomes July 8 or August 7 of the current year, depending on regional settings.

Preformat date columns as text in Excel before opening or pasting the data, so Excel doesn't reinterpret the values before you import them into the CRM.
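
If you know which format the source system used, converting in a script removes the ambiguity. This sketch assumes you pass the source format explicitly; the function name to_iso_date is hypothetical.

```python
from datetime import datetime


def to_iso_date(raw: str, source_format: str) -> str:
    """Parse a date using an explicit format and return it as YYYY-MM-DD."""
    # strptime raises ValueError if the value doesn't match the format,
    # which is better than silently guessing month vs. day.
    return datetime.strptime(raw.strip(), source_format).date().isoformat()


print(to_iso_date("01/02/2026", "%m/%d/%Y"))  # 2026-01-02 (US style)
print(to_iso_date("01/02/2026", "%d/%m/%Y"))  # 2026-02-01 (day first)
```

The same input produces two different dates depending on the format you declare, which is exactly why the format has to be stated rather than inferred.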

7. Remove obvious garbage

  • Empty rows
  • Rows with only 1-2 fields filled
  • Test data ("John Test", "asdf@asdf.com")
  • Rows with neither an email nor a phone number (you need at least one to identify the contact)
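
The filters above can be sketched as a single row check. The column names email and phone, the three-field minimum, and the TEST_EMAILS blocklist are all assumptions for illustration; adjust them to your own file.

```python
# Hypothetical blocklist of known test addresses; extend it for your own data.
TEST_EMAILS = {"asdf@asdf.com", "test@test.com"}


def is_importable(row: dict, min_filled: int = 3) -> bool:
    """Return True if a CSV row is worth importing."""
    values = {key: (value or "").strip() for key, value in row.items()}
    filled = sum(1 for value in values.values() if value)
    email = values.get("email", "").lower()
    # At least one identifier (email or phone) must be present.
    has_identifier = bool(email or values.get("phone"))
    return filled >= min_filled and has_identifier and email not in TEST_EMAILS


rows = [
    {"email": "ana@example.com", "phone": "", "first_name": "Ana", "last_name": "Ruiz"},
    {"email": "", "phone": "", "first_name": "No", "last_name": "Contact", "city": "Lima"},
    {"email": "asdf@asdf.com", "phone": "+15551234567", "first_name": "John"},
]
print([is_importable(r) for r in rows])  # [True, False, False]
```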

8. Dedupe against existing CRM contacts

Before importing, compare your CSV emails (normalized) to existing CRM contacts. Two approaches:

Approach A: dedupe before import

  • Export existing contacts from CRM
  • Compare CSV emails to existing emails (Excel VLOOKUP or pandas merge)
  • Remove CSV rows that already exist
  • Import only new rows

Approach B: let CRM dedupe

  • Most CRMs have an "update existing if duplicate" option on import
  • Map email as the matching key
  • CRM updates existing + adds new

Approach A gives more control. Approach B is faster.
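
Approach A can be sketched in plain Python without a spreadsheet. This version assumes each CSV row is a dict with an email key and that you have already pulled the existing emails out of your CRM export; the function name is hypothetical. It also catches duplicates inside the CSV itself.

```python
def split_new_and_existing(csv_rows: list[dict], existing_emails: set[str]):
    """Split CSV rows into (new_rows, duplicate_rows) by normalized email."""
    existing = {e.strip().lower() for e in existing_emails}
    seen = set()  # emails already accepted from this CSV
    new_rows, duplicate_rows = [], []
    for row in csv_rows:
        email = (row.get("email") or "").strip().lower()
        if email and (email in existing or email in seen):
            duplicate_rows.append(row)
        else:
            # Rows without an email can't be matched here, so they pass through.
            if email:
                seen.add(email)
            new_rows.append(row)
    return new_rows, duplicate_rows


rows = [{"email": "A@x.com"}, {"email": "b@x.com"}, {"email": "a@x.com"}]
new, dupes = split_new_and_existing(rows, {"B@x.com"})
print(len(new), len(dupes))  # 1 2
```

Keep the duplicate rows in a separate file instead of discarding them, in case they carry newer data you want to merge by hand.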

9. Limit import size

Even if your CSV has 50,000 rows, consider importing in batches of 5,000-10,000. Why:

  • Easier to spot errors
  • Faster to debug if something breaks
  • Easier to roll back a batch than 50k rows
  • Some CRMs slow down on very large imports
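
Splitting a file into batches is a few lines of Python. A minimal sketch, assuming the rows are already loaded into a list and using the 5,000-row figure from above as the default:

```python
from typing import Iterator


def batches(rows: list, size: int = 5000) -> Iterator[list]:
    """Yield consecutive slices of rows, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


all_rows = list(range(50_000))  # stand-in for 50,000 CSV rows
print([len(batch) for batch in batches(all_rows)])  # ten batches of 5000
```

Write each batch to its own file and import them one at a time, checking the result before moving on.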

10. Tag imports

Add a column (for example, source or import_batch) with a unique value (e.g., march-2026-csv-import). After import, every contact has this tag, so you can:

  • Find all imported contacts if you need to roll back
  • Trigger specific workflows only for imported contacts
  • Attribute revenue to the import
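
Adding the tag column by script avoids fill-down mistakes in a spreadsheet. A sketch using Python's csv module, with import_batch as the assumed column name and a hypothetical function name:

```python
import csv
import io


def add_batch_tag(csv_text: str, tag: str, column: str = "import_batch") -> str:
    """Return the CSV text with `column` set to `tag` on every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        fieldnames.append(column)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = tag  # overwrites any existing value in that column
        writer.writerow(row)
    return out.getvalue()


print(add_batch_tag("email\nana@example.com\n", "march-2026-csv-import"))
```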

The mistakes that cause the most damage

Mistake 1: Importing without dedup

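Consequence: 5,000 contacts become 8,500 contacts, with 3,500 duplicates. Cleanup takes a week.

Prevention: Dedupe the CSV against existing contacts before import, or map email as the matching key and use the CRM's "update existing" option.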

Mistake 2: Triggering welcome sequences on historical contacts

Consequence: 2,000 existing customers receive a "Welcome! We're excited to have you!" email. Many unsubscribe. Some churn.

Prevention: Before import, pause or disable welcome workflows for 24 hours. Alternatively, add a tag on import that excludes those contacts from new-contact workflows.

Mistake 3: Wrong encoding

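Consequence: Spanish/French/Arabic names display as gibberish. Once the garbled text is saved, the original characters can't be recovered from the CRM; the records have to be re-imported from a clean source.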

Prevention: Always verify UTF-8 before import. Spot-check a few rows with special characters after import.

Mistake 4: Phone numbers in wrong format

Consequence: SMS workflows fail for 60% of imported contacts. Twilio errors. No one notices until a client complains.

Prevention: Normalize to E.164 before import. Validate with libphonenumber or a regex check.
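
A regex check for this is short. The pattern below only tests the E.164 shape (a plus sign, a non-zero first digit, 15 digits at most); it does not confirm that a number is real or reachable, which is where libphonenumber goes further. The function name is hypothetical.

```python
import re

# E.164 shape: "+", then 2 to 15 digits, the first of which is not 0.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")


def invalid_phones(phones: list[str]) -> list[str]:
    """Return the values that do not have a valid E.164 shape."""
    return [p for p in phones if not E164_RE.fullmatch(p.strip())]


column = ["+15551234567", "(555) 123-4567", "+0123456789"]
print(invalid_phones(column))  # ['(555) 123-4567', '+0123456789']
```

Run it on the phone column right before upload; an empty result means every number at least has the right shape.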

Mistake 5: Dates in wrong format

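Consequence: Dates that fail to parse get dropped or misread, and "Date created" shows the import date for every contact instead of the original date. Reports break.

Prevention: Convert dates to ISO 8601 before import. If the CRM won't let you override created_at, map the original creation date to a custom field (like original_date_created).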


Field mapping best practices

When mapping CSV columns to CRM fields:

  • Email: Map to the email field (primary)
  • Phone: Map to the phone field (primary). If there are multiple phones, create custom fields (mobile, work, home).
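  • Name: Split into first name + last name in CRM. If your CSV has "Full Name," split first (the IFERROR wrapper handles single-word names, which have no space to find):
    First Name: =IFERROR(LEFT(A2, FIND(" ", A2)-1), A2)
    Last Name: =IFERROR(MID(A2, FIND(" ", A2)+1, LEN(A2)), "")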
    
  • Source: Don't map random "how you found us" text to source. Map to a canonical source value ("cold-import-march-2026").
  • Tags: Map multi-select tags via comma-separated values.
  • Custom fields: Map only if fields exist. Create CRM custom fields BEFORE import.
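
The full-name split from the list above can also be done in Python. Like the spreadsheet formulas, this sketch splits on the first space, so "Mary Ann Smith" becomes first name "Mary" and last name "Ann Smith". That is an assumption that won't suit every naming convention, and the function name is hypothetical.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Split a full name on the first run of whitespace into (first, last)."""
    parts = full_name.strip().split(None, 1)
    if not parts:
        return "", ""
    if len(parts) == 1:
        return parts[0], ""  # single-word name: no last name
    return parts[0], parts[1]


print(split_full_name("Haroon Mohamed"))  # ('Haroon', 'Mohamed')
print(split_full_name("Cher"))            # ('Cher', '')
```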

After the import

1. Verify count

Compare the row count in the CSV with the number of contacts the CRM reports as imported. The two should match, or the CSV total should equal new contacts created plus duplicates merged or skipped.
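
The count check is simple arithmetic, sketched here so it can sit at the end of an import script. The parameter names are assumptions; use whichever figures your CRM's import summary reports.

```python
def unaccounted_rows(csv_rows: int, created: int, updated: int, skipped: int) -> int:
    """Return how many CSV rows the import summary does not account for."""
    return csv_rows - (created + updated + skipped)


# 5,000 rows in the file; the CRM reports 3,500 created, 1,400 updated, 100 skipped.
print(unaccounted_rows(5000, 3500, 1400, 100))  # 0
```

Anything other than zero means rows went missing or were counted twice, and is worth investigating before the next batch.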

2. Spot check

Open 10-20 random contacts. Verify all fields populated correctly. Check non-Latin characters if present.

3. Check for unintended workflow triggers

Did any welcome sequence fire? Did any SMS go out? Check logs.

4. Tag and segment

Apply the import tag (if not done at import). Create a CRM list/segment for "imported this batch" so you can track performance.

5. Document

In your ops doc or Notion: record the import date, source, size, and any issues. Next time someone imports, they learn from this one.


CRM-specific notes

GoHighLevel

  • Imports via Contacts → Import Contacts
  • Supports "update existing" for dedup
  • Custom fields must exist before import
  • Large imports can take 10-30 minutes
  • Gotcha: phone numbers get auto-formatted, but verify E.164 output

HubSpot

  • Imports via Contacts → Import
  • Strong field mapping UI
  • Supports matching by email, phone, or contact ID
  • Handles up to 250,000 rows per import
  • Gotcha: custom properties must exist with correct type

Pipedrive

  • Imports via Contacts → Import Data
  • Less flexible field mapping
  • Supports CSV and Excel
  • Smaller bulk import size limits

Salesforce

  • Data Loader tool for large imports
  • Excel Connector for smaller imports
  • Very strict on field types and validation rules
  • Gotcha: triggers and validation rules run on imported records, so large imports may require a trigger bypass to be enabled in settings first

Automation to prevent this in the future

The best way to avoid CSV import headaches: don't rely on CSV imports. Get data into your CRM via:

  • Form integrations (webhook-based, real-time)
  • API integrations (direct tool-to-tool connection)
  • Scheduled sync (e.g., Make.com pulling leads from a data provider nightly)

CSV imports should be for one-time migrations, not ongoing operations.


Sources

CRM import behaviors verified against official docs for GoHighLevel, HubSpot, Pipedrive, and Salesforce as of April 2026. CSV encoding standards (RFC 4180 for CSV format, UTF-8 for encoding) are standard references. Field normalization patterns align with best practices in data engineering literature.

Running a large data import and nervous about breaking things? Let's talk — I've done dozens of CRM migrations and can help you plan yours.

Haroon Mohamed

Full-stack automation, AI, and lead generation specialist. 2+ years running 13+ concurrent client campaigns using GoHighLevel, multiple AI voice providers, Zapier, APIs, and custom data pipelines. Founder of HMX Zone.