1,500 customers. 17,700 interactions.
Every mutation witnessed.
A South American chemical manufacturer needed to structure three years of scattered customer data before importing it into their new ERP. Signal delivered a verified, tamper-evident dataset in two days.
The situation
The company had just completed a three-year ERP deployment. The system was live, but the operational intelligence it needed was scattered across files, spreadsheets, and people's heads, invisible to the new system. Customer records existed in multiple formats. Interaction history was fragmented across three legacy XLSX exports. Nobody could answer basic questions: which customers are active, which are dormant, where is engagement concentrated?
Before importing this data into the new ERP, they needed to know it was clean, structured, and provably unaltered. Not a guess. Not a spot check. Proof.
What we delivered
- ~1,500 customer records structured from 3 legacy spreadsheets
- ~17,700 interactions ingested with source traceability
- ~2,800 ledger entries in a tamper-evident hash chain
- 2 days from raw spreadsheets to verified, ERP-ready dataset
How it worked
1. Ingest and structure
Three legacy XLSX files parsed into a relational schema. Customer master, interaction history, market segments, and statistical groups normalized and deduplicated. Every parsing decision recorded in the ledger.
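A minimal sketch of what this ingest step can look like in Python, assuming pandas and SQLite. The file names, the tax_id business key, and the customer_master table are illustrative assumptions, not the project's actual schema.

```python
# Sketch of the ingest step: parse legacy exports, normalize, deduplicate,
# land in SQLite. Names below are hypothetical.
import sqlite3
import pandas as pd

SOURCES = ["export_2021.xlsx", "export_2022.xlsx", "export_2023.xlsx"]  # hypothetical files

frames = []
for path in SOURCES:
    df = pd.read_excel(path)
    # Normalize headers so the three exports share one schema
    df.columns = [str(c).strip().lower().replace(" ", "_") for c in df.columns]
    df["source_file"] = path  # keep traceability to the originating export
    frames.append(df)

customers = pd.concat(frames, ignore_index=True)
# Deduplicate on a stable business key (hypothetical: tax_id)
customers = customers.drop_duplicates(subset=["tax_id"], keep="first")

with sqlite3.connect("signal.db") as conn:
    customers.to_sql("customer_master", conn, if_exists="replace", index=False)
```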
2. Classify and enrich
Deterministic keyword matching first. A local LLM (Ollama, no cloud) only on the remainder. 83% market segment coverage, 95% statistical group coverage. Every classification traced to its source: keyword rule or model inference.
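A sketch of the two-tier classifier, assuming the local Ollama HTTP API (POST /api/generate on localhost:11434). The keyword rules, segment labels, and model name are illustrative assumptions; only the keyword-first, LLM-on-the-remainder ordering comes from the engagement.

```python
# Two-tier classification: deterministic rules first, local LLM as fallback.
import requests

KEYWORD_RULES = {            # hypothetical rule table
    "agro": "Agriculture",
    "tinta": "Paints & Coatings",
    "farma": "Pharma",
}

def classify(description: str) -> tuple[str | None, str]:
    """Return (segment, provenance), where provenance records how we decided."""
    text = description.lower()
    for keyword, segment in KEYWORD_RULES.items():
        if keyword in text:
            return segment, f"keyword:{keyword}"       # deterministic tier

    # Remainder: ask a local Ollama instance; no data leaves the machine.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",                          # assumed local model
            "prompt": f"Market segment for this customer, one word: {description}",
            "stream": False,
        },
        timeout=60,
    )
    answer = resp.json()["response"].strip()
    return (answer or None), "llm:ollama/llama3"        # model-inferred tier
```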
3. Surface findings
The pipeline surfaced data quality issues the team did not know existed: invalid tax IDs, missing commercial terms, contact gaps. Each finding proposed as a hypothesis, not asserted as fact. The client confirmed or rejected each one through an interactive dashboard.
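A sketch of one such check, framed as a hypothesis the client confirms or rejects rather than a silent correction. The 14-digit CNPJ rule and the status vocabulary are assumptions for illustration.

```python
# A data-quality finding is proposed, never asserted or auto-fixed.
import re
from dataclasses import dataclass

@dataclass
class Finding:
    customer_id: str
    issue: str
    evidence: str
    status: str = "proposed"   # proposed -> confirmed | rejected, set by the client

def check_tax_id(customer_id: str, tax_id: str) -> Finding | None:
    # Hypothetical rule: a Brazilian CNPJ has 14 digits; anything else is
    # flagged as a finding, not corrected in place.
    digits = re.sub(r"\D", "", tax_id or "")
    if len(digits) != 14:
        return Finding(customer_id, "invalid_tax_id", f"tax_id={tax_id!r}")
    return None
```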
4. Witness everything
Every data mutation, every classification, every client confirmation was recorded in a hash-chained ledger. Each entry references the one before it. The chain verified as VALID at delivery. The client can re-verify at any time. Nobody needs to trust anybody.
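A minimal sketch of the append-and-verify mechanics, assuming entries are canonicalized as sorted-key JSON before hashing. The field names are illustrative; the point is that each entry commits to its predecessor's SHA-256 hash, so any retroactive edit fails verification.

```python
# Append-only hash chain: each entry commits to the previous entry's hash.
import hashlib
import json
import time

GENESIS = "0" * 64

def entry_hash(entry: dict) -> str:
    # Canonical JSON so the hash is stable across runs and machines.
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append(ledger: list[dict], event: dict) -> None:
    prev = ledger[-1]["hash"] if ledger else GENESIS
    entry = {"ts": time.time(), "event": event, "prev": prev}
    entry["hash"] = entry_hash(entry)
    ledger.append(entry)

def verify(ledger: list[dict]) -> bool:
    prev = GENESIS
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or entry_hash(body) != entry["hash"]:
            return False           # chain broken: something was tampered with
        prev = entry["hash"]
    return True                    # VALID: every entry intact and in order
```

Canonical serialization is the quiet design choice here: two encodings of the same entry must produce the same bytes, or honest re-verification would fail on untampered data.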
5. Export to ERP
Clean, structured records exported in a format ready for direct import into the client's ERP system. The provenance chain travels with the data.
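A sketch of the export step under the same hypothetical schema as the earlier sketches: clean records plus their source pointers written to a flat file an ERP importer can consume. The table and column names are assumptions.

```python
# Export clean records; provenance travels with the data in source_file.
import sqlite3
import pandas as pd

with sqlite3.connect("signal.db") as conn:
    out = pd.read_sql_query(
        "SELECT tax_id, name, segment, source_file FROM customer_master", conn
    )
out.to_csv("erp_import.csv", index=False)
```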
Why provenance mattered
The client had lived through a three-year ERP migration. They knew the data quality problem existed but could not see it. More importantly, they needed to prove to their own leadership that the data being imported was clean and that nothing was fabricated or lost in the process.
The hash-chained ledger gave them that proof. Every record traceable to a source file. Every transformation witnessed. Every client review logged with a timestamp and reviewer identity. The chain either holds or it does not. No trust required.
Architecture
- Data residency: Everything ran on the client's infrastructure. Zero data left their environment.
- LLM: Local Ollama instance. No API calls, no cloud, no data exfiltration risk.
- Storage: SQLite. The client owns the database file. No vendor lock-in.
- Ledger: Append-only, hash-chained, SHA-256. Each entry references the previous hash. Independently verifiable (a re-verification sketch follows this list).
- Dashboard: Bilingual (Portuguese/English), light and dark mode, runs on localhost.
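A sketch of what independent re-verification could look like against the client-owned SQLite file, assuming a ledger table whose events are persisted as JSON text and hashed with the canonicalization shown under step 4. The table and column names are assumptions.

```python
# Anyone with the database file can replay the chain and check it.
import hashlib
import json
import sqlite3

GENESIS = "0" * 64

prev = GENESIS
ok = True
with sqlite3.connect("signal.db") as conn:
    rows = conn.execute("SELECT ts, event, prev, hash FROM ledger ORDER BY rowid")
    for ts, event_json, prev_hash, stored_hash in rows:
        body = {"ts": ts, "event": json.loads(event_json), "prev": prev_hash}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
        ).hexdigest()
        if prev_hash != prev or recomputed != stored_hash:
            ok = False   # a break anywhere invalidates everything after it
            break
        prev = stored_hash

print("VALID" if ok else "BROKEN")
```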
The same technology, on your machine
The hash-chained ledger that tracked 2,800+ mutations for this deployment is the same ledger Signal Provenance puts on your machine. Same cryptography. Same verification. Same proof.