Data loads are supposed to be the “easy” work. You export a CSV, you import a CSV, everyone goes home.
And then a flow fires, an integration wakes up, a validation rule you forgot exists blocks half the rows, and suddenly you’re in detective mode with 40 tabs open.
This is the checklist-y, slightly paranoid way I run loads now, because it’s faster than cleaning up the mess later.
The three questions you answer first
1) Where is truth?
If the “source of truth” is a spreadsheet, pause. Decide what system actually owns correctness (ERP, billing, marketing platform, etc.). Otherwise you’ll “fix Salesforce” today and re-break it on the next sync.
2) What’s the blast radius?
List what can fire on the objects you’ll touch:
- flows / Apex triggers
- assignment rules / auto-response rules
- rollups / scheduled jobs
- downstream integrations
If you can’t list them, your first run is a sandbox run.
3) What’s the rollback?
Rollback is not “we’ll fix it manually.” Rollback is a plan that works when you’re tired and the business is watching.
Minimum viable rollback:
- export the records you’ll touch with Id + every field you will change
- keep the export somewhere safe
- be able to update those values back
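Here is a minimal sketch of how I sanity-check that a pre-export really covers a load, assuming both files are plain CSVs with an Id column. The function name and the sample data are made up for illustration; only the “Id + every field you will change” rule comes from the list above.

```python
import csv
import io


def missing_from_pre_export(load_rows, export_rows):
    """Return (missing_ids, missing_fields) that the pre-export does not cover."""
    load_rows = list(load_rows)
    export_rows = list(export_rows)
    exported_ids = {row["Id"] for row in export_rows}
    exported_fields = set(export_rows[0].keys()) if export_rows else set()
    # Every Id in the load file needs a saved "before" row.
    missing_ids = sorted({row["Id"] for row in load_rows} - exported_ids)
    # Every column we are about to write needs a saved "before" value.
    load_fields = set(load_rows[0].keys()) if load_rows else set()
    missing_fields = sorted(load_fields - exported_fields)
    return missing_ids, missing_fields


# Tiny in-memory stand-ins for the two CSV files (hypothetical data).
load_csv = "Id,Status,Rating\n001A,Open,Hot\n001B,Closed,Cold\n"
export_csv = "Id,Status\n001A,New\n"

load = csv.DictReader(io.StringIO(load_csv))
export = csv.DictReader(io.StringIO(export_csv))
print(missing_from_pre_export(load, export))  # (['001B'], ['Rating'])
```

If either list comes back non-empty, the export is not a rollback yet.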
Add one field that pays for itself: Import Run Id
Add a text field on high-load objects:
Import_Run_Id__c
Populate it in every CSV with a unique run id:
- 2026-01-30_lead_cleanup_v2
- billing_backfill_2026w05
This gives you:
- quick validation (“show me everything we changed”)
- easy reporting (“how many records did we touch?”)
- targeted rollback (“undo this run”)
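Stamping the run id by hand is where typos creep in, so a small script helps. This is a sketch, assuming the load file is a CSV and the field is called Import_Run_Id__c as above; the helper name and the sample rows are hypothetical.

```python
import csv
import io

RUN_ID_FIELD = "Import_Run_Id__c"


def stamp_run_id(in_file, out_file, run_id):
    """Copy a CSV, adding (or overwriting) the run id column on every row."""
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames or [])
    if RUN_ID_FIELD not in fieldnames:
        fieldnames.append(RUN_ID_FIELD)
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in reader:
        row[RUN_ID_FIELD] = run_id
        writer.writerow(row)
        count += 1
    return count  # number of data rows written


# In-memory stand-ins for real files (hypothetical data).
source = io.StringIO("Id,Status\n00Q1,Open\n00Q2,Working\n")
stamped = io.StringIO()
rows = stamp_run_id(source, stamped, "2026-01-30_lead_cleanup_v2")
print(rows)
print(stamped.getvalue())
```

The returned count doubles as your “expected rows” number later.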
Matching rules that don’t create duplicates
Prefer Ids when you have them.
If you must match by External ID:
- keep the list of External ID fields small
- make it unique where possible
- document which system owns it
External IDs fail when multiple systems “help.”
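Before matching on an External ID, I check the load file itself for repeated or blank keys. A rough sketch follows; ERP_Id__c is a hypothetical field name, and the exact-string comparison is an assumption you should adjust to match how your field treats case.

```python
import csv
import io
from collections import Counter


def duplicate_keys(rows, key_field):
    """Return (sorted duplicate key values, number of rows with a blank key)."""
    counts = Counter()
    blanks = 0
    for row in rows:
        key = (row.get(key_field) or "").strip()
        if not key:
            blanks += 1
        else:
            counts[key] += 1
    dupes = sorted(k for k, n in counts.items() if n > 1)
    return dupes, blanks


# Hypothetical load file: one repeated key and one blank key.
sample = "ERP_Id__c,Name\nA-1,Acme\nA-2,Beta\nA-1,Acme Ltd\n,NoKey\n"
rows = csv.DictReader(io.StringIO(sample))
print(duplicate_keys(rows, "ERP_Id__c"))  # (['A-1'], 1)
```

Anything other than an empty list and zero blanks means the file is not ready to load.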
Automation: default ON, bypass only with control
Default stance:
If production automation can’t survive a normal data change, that automation needs work.
When bypass is justified, make bypass permission-based, not record-based.
Avoid:
- Bypass_All_Automation__c checkbox on records
- hardcoding integration usernames into every flow
Prefer:
- Custom Permission: Data_Load_Bypass
- Flow entry condition: NOT($Permission.Data_Load_Bypass)
Grant bypass temporarily. Remove it afterwards.
Stage the load (don’t be clever)
If you’re touching multiple dependent fields:
- Pass 1: run id + safe fields
- Pass 2: fields that trigger automation
- Pass 3: relationship fields (lookups) after you’ve validated keys
It’s slower. It’s stable.
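The three passes can be cut from one master file instead of being maintained by hand. This is only a sketch: the field names and the way they are grouped into passes are assumptions for illustration, and your own grouping depends on what fires on your objects.

```python
def split_into_passes(rows, passes, key_field="Id"):
    """Return {pass_name: rows}, each row keeping only the key plus that pass's fields."""
    result = {}
    for name, fields in passes.items():
        result[name] = [
            {key_field: row[key_field], **{f: row[f] for f in fields}}
            for row in rows
        ]
    return result


# Hypothetical master rows and pass plan.
master = [
    {
        "Id": "003A",
        "Import_Run_Id__c": "billing_backfill_2026w05",
        "Description": "cleaned",
        "Status__c": "Active",
        "AccountId": "001Z",
    }
]
plan = {
    "pass_1_safe": ["Import_Run_Id__c", "Description"],
    "pass_2_automation": ["Status__c"],
    "pass_3_lookups": ["AccountId"],
}

for name, pass_rows in split_into_passes(master, plan).items():
    print(name, pass_rows)
```

Each pass file carries the Id, so every pass can be validated on its own before the next one runs.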
Validation: counts + spot checks
Count checks:
- expected rows: X
- succeeded: X
- failed: 0
Spot checks (pick ~20 records across segments):
- lookups correct?
- picklists valid?
- automation didn’t overwrite your values?
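Both checks are easy to script. The sketch below assumes you already have the success and failure counts from your load tool and a list of loaded rows with some segment column; the function names and the round-robin sampling are my own illustration, not a standard method.

```python
import random
from collections import defaultdict


def reconcile(expected, succeeded, failed):
    """Return a list of problems; an empty list means the counts agree."""
    problems = []
    if succeeded + failed != expected:
        problems.append(
            f"{expected} rows sent but {succeeded + failed} results returned"
        )
    if failed:
        problems.append(f"{failed} rows failed")
    return problems


def spot_check_sample(rows, segment_field, total=20, seed=0):
    """Pick up to `total` rows, spread round-robin across segments."""
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for row in rows:
        by_segment[row[segment_field]].append(row)
    for bucket in by_segment.values():
        rng.shuffle(bucket)
    picked = []
    # Take one row per segment per round until we have enough or run out.
    while len(picked) < total and any(by_segment.values()):
        for bucket in by_segment.values():
            if bucket and len(picked) < total:
                picked.append(bucket.pop())
    return picked


print(reconcile(expected=100, succeeded=98, failed=2))  # ['2 rows failed']

# Hypothetical loaded rows: 30 in one segment, 5 in another.
loaded = [{"Id": f"A{i}", "Region": "EMEA"} for i in range(30)]
loaded += [{"Id": f"B{i}", "Region": "APAC"} for i in range(5)]
sample = spot_check_sample(loaded, "Region")
print(len(sample))  # 20
```

Round-robin picking keeps small segments from being drowned out by the big ones, which is usually where the surprises are.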
Rollback: be explicit
With Import_Run_Id__c:
- filter by run id
- update fields back from your pre-export
If you didn’t pre-export old values, you don’t have rollback. You have optimism.
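Turning the pre-export into a rollback file is mostly a join on Id. A minimal sketch, assuming you have the Ids returned by filtering on the run id and the pre-export rows in memory; the function name and sample values are hypothetical.

```python
def build_rollback_rows(pre_export_rows, touched_ids, fields_to_restore):
    """Return update rows that put the saved 'before' values back for one run."""
    touched = set(touched_ids)
    rollback = []
    for row in pre_export_rows:
        if row["Id"] in touched:
            rollback.append(
                {"Id": row["Id"], **{f: row[f] for f in fields_to_restore}}
            )
    # Refuse to produce a partial rollback silently.
    missing = touched - {r["Id"] for r in rollback}
    if missing:
        raise ValueError(f"no pre-export values for: {sorted(missing)}")
    return rollback


# Hypothetical pre-export and the Ids found by filtering on the run id.
pre_export = [
    {"Id": "00Q1", "Status": "New", "Rating": "Warm"},
    {"Id": "00Q2", "Status": "Open", "Rating": ""},
    {"Id": "00Q3", "Status": "Open", "Rating": "Hot"},
]
touched = ["00Q1", "00Q2"]

for row in build_rollback_rows(pre_export, touched, ["Status", "Rating"]):
    print(row)
```

Note the blank Rating on 00Q2: restoring an empty value may need your load tool’s null-handling option, so test that case in the sandbox run.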
Checklist
Before:
- sandbox dry-run completed
- run id column present
- pre-export saved (Id + impacted fields)
- automation approach agreed (on vs bypass)
- expected row counts agreed
After:
- reconcile counts
- spot-check records
- monitor integrations/logs for 1–2 hours
- remove bypass permissions (if used)
- write one paragraph in your change log (what/why)