Migration of broker, carrier, and program reference data. Most tables are legacy/unused - migrating as reference data only.
Migration of broker, carrier, and program data from DTX 1.0 to DTX 2.0. This is Phase 2 — independent of other areas (no dependencies).
Critical finding: DTX 2.0 currently has NO dedicated Broker, Carrier (entity), or Program models. The existing DTX 2.0 codebase stores this data as denormalized fields:
- Brokers/Producers → text fields on Certificate (agent_name, agent_contact, agent_address, agent_title)
- Carriers → text fields on Certificate (carrier_a through carrier_f) and coverage records (carrier_code, carrier_naic, carrier_name)
- Programs → integer field on Account (program_id)
These models do not exist in DTX 2.0 today. The team must decide whether to: 1. Create new models in DTX 2.0 for Broker, Carrier, and/or Program and migrate the data into them 2. Accept the denormalized approach and treat the source tables as reference data only
The decisions below assume option 2 unless the team decides otherwise. If new models are created, the field mappings and source data profiles in this document provide the basis for migration scripts.
Important: Full production data is not available. All observations are from 100-row samples.
| Table | Columns | Rows (sample) | Purpose |
|---|---|---|---|
broker |
18 | 100 | Broker/producer master data — name, address, phone, email (PHP serialized) |
broker_coverage |
2 | 100 | Coverage types handled by each broker |
carrier |
5 | 100 | Carrier lookup table — name, state, NAIC |
carrier_ratings |
13 | 100 | AM Best rating reference data — strength, size, NAIC |
program |
29 | 100 | Program/billing metadata — fees, services (HTML), emails (PHP serialized) |
None dedicated. Related models already populated by other areas:
| DTX 2.0 Model | Field(s) | Already Populated By |
|---|---|---|
Certificate |
agent_name, agent_title, agent_contact, agent_address |
Area 4 (from ProdName, ProdAddr, ProdPhone, etc.) |
Certificate |
carrier_a_name through carrier_f_naic |
Area 5 (Carrier A-F backfill) |
Account |
program_id |
Area 2 (direct integer mapping from customer.program_id) |
CarrierRating |
avii_or_above |
Areas 5 & 6 (from AMBestRatingOK flag) |
| Item | Estimate |
|---|---|
| Certificate agent enrichments | 50K-200K (if broker_id supplements empty agent fields) |
| All other source records | 0 — no migration target |
Priority: MEDIUM — affects scope of this area
Area 4 populated Certificate agent_* fields from the certificate's own producer fields (ProdName, ProdAddr1, ProdPhone, etc.). The broker table contains the broker's master contact data (canonical name, address, email).
Question for DTX 2.0 team:
1. Are the certificate producer fields (already migrated) sufficient?
2. Or should certificates with empty agent_* fields be backfilled from the broker master data?
3. Should broker data be stored anywhere else (e.g., Account-level field)?
Default plan: Don't enrich — the certificate's own producer fields are the authoritative source (what was printed on the ACORD form). Broker master data kept as reference CSV only.
Priority: LOW
DTX 1.0's carrier_ratings table is a comprehensive AM Best reference dataset (strength A++/A+/A/A-/B++/etc., size I-XV, NAIC codes). DTX 2.0 simplified this to a boolean CarrierRating.avii_or_above flag.
Question: Should the full AM Best reference data be imported into DTX 2.0 as a lookup table for future use (carrier validation, reporting)? This would require creating a new model.
Default plan: Don't import — keep as reference CSV. DTX 2.0's simplified boolean is sufficient for compliance.
| # | Issue | Impact | Mitigation |
|---|---|---|---|
| 1 | No target models for 4 of 5 tables — DTX 2.0 intentionally dropped Broker, Carrier (entity), and Program models. Migration has nowhere to write the data. | All source data preserved only as reference CSVs, not in DTX 2.0 database | Confirm with DTX 2.0 team this is intentional. Document data availability in reference CSVs. |
| 2 | Deferred broker_id references unresolvable — Areas 2 and 4 stored broker_id as reference, expecting Area 9 to create Broker entities. DTX 2.0 has no Broker model. |
Deferred FKs on Account and Certificate have no target | Drop the deferred references. Document resolution. See Decisions #1 and #2. |
| 3 | broker.email is PHP serialized — Email field stored as a:1:{i:0;s:23:"dpiermont@optonline.net";} format. Not directly usable. |
Cannot read broker emails without PHP deserialization | Parse for reference report. Don't migrate to any model field. |
| # | Issue | Impact | Mitigation |
|---|---|---|---|
| 4 | carrier table is single-customer — All 100 sample records have customer_id=1, suggesting this is a global reference table shared across all customers. |
Not customer-scoped — unusual FK pattern | Treat as global lookup. Don't migrate as per-Account data. |
| 5 | carrier_ratings CSV parsing issues — Some carrier names contain commas/quotes that break CSV column alignment. Strength field captures partial company names in some rows. | Data corruption risk if parsed naively | Use robust CSV parser. Validate column count per row. |
| 6 | program.services and program.suitability contain HTML — Fields have <strong> tags and HTML formatting. |
Not plain text — would need stripping if migrated | Don't migrate. Keep as reference. |
| 7 | program.type is a full sentence — Contains "ProSight Subcontractor Insurance Compliance Verification and Management – ANNUAL POLICY PROGRAM" instead of a type code. | Not a clean enum — 2 values: ANNUAL (81%) and PER PROJECT (19%) | Don't migrate. program_id on Account is sufficient. |
| # | Issue | Impact | Mitigation |
|---|---|---|---|
| 8 | Broker data quality — Misspelled cities ("Philaelphia"), truncated names ("Crystal & Compan"), placeholder ZIPs ("00000"), inconsistent data entry. | Poor quality master data | Don't migrate — reference only. |
| 9 | carrier_ratings: 36% unrated (NR) — 36 of 100 sample carriers have strength="NR" (Not Rated). |
Limited utility for rating validation | Document in reference report. |
Risk Level: Low
Area 2 stored customer.broker_id as a deferred reference, expecting Area 9 to link it. DTX 2.0's Account model has no broker FK.
Resolution: Drop the deferred broker_id reference. No action needed on Account.
Rationale: 1. DTX 2.0 was designed without a Broker entity model 2. Account has no field for broker FK 3. Broker-level information is not used in DTX 2.0's compliance workflow 4. The broker relationship was an operational tracking feature in DTX 1.0
Alternative (if DTX 2.0 team requests): Add a broker_name TextField to Account and populate from broker table. Requires a schema migration.
Risk Level: Low (pending Action #1)
Area 4 stored certificate.broker_id as a deferred reference. Certificate agent_* fields were populated from the certificate's own producer fields (ProdName, ProdAddr, etc.) in Area 4.
Default: No action needed. The producer fields on the certificate ARE the broker/agent info for that certificate.
Optional enrichment (only if Action #1 confirms):
For each Certificate where agent_name is NULL and broker_id is populated:
1. Look up broker_id in broker table
2. Set agent_name = broker.name (if not NULL)
3. Set agent_address = concatenate(broker.address_1, address_2, city, state, zip)
4. Set agent_contact = concatenate(broker.phone, broker.fax, parsed_email)
where parsed_email = PHP-deserialize(broker.email)
5. Log enrichment to report
Key distinction: Certificate producer fields = what was printed on the ACORD form. Broker master data = canonical broker contact info. These may differ (e.g., a specific agent at the brokerage vs. the brokerage itself).
Risk Level: Low
No target model in DTX 2.0.
Don't migrate. Keep CSV for reference.
broker table (18 columns, ~100 sample rows):
broker_id: Primary key
customer_id: FK to customer (varied: customer 305 has 21 brokers, 593 has 14)
name: Broker name (person or company). Quality issues: truncated, misspelled
address_1/2: Street address
city/state/zip: Location. Some placeholder ZIPs (00000, 56789)
country: Mostly NULL
phone: Phone number
phone_extension: Mostly NULL
fax: Fax number (mostly NULL)
email: PHP serialized array: a:1:{i:0;s:23:"dpiermont@optonline.net";}
contact_name: Contact person at broker (mostly NULL)
contact_title: Contact title (mostly NULL)
comments: Operational notes (mostly NULL)
create_time: Creation timestamp
update_time: Last update timestamp
Risk Level: Low
Maps broker_id → coverage type names. No target model.
Don't migrate. Keep CSV for reference.
broker_coverage table (2 columns, ~100 sample rows):
broker_id: FK to broker
policy_id: NOT a policy ID — contains coverage type descriptions:
"GL", "Auto", "WC", "UMB", "ERRORS & OMISSIONS", "PROPERTY",
"Professional Liability", "Cyber Liability", "Pollution",
"Earthquake Insurance", "Flood Coverage", "Garage Liability",
"Dealer Inventory", "Equipment Coverage", "Business Income"
Data quality: Inconsistent casing (GL vs Auto), trailing spaces ("PROPERTY "),
truncated names ("Bus Personal Prop &")
Cardinality: 1-9 coverage types per broker
Note: The policy_id field name is misleading — it stores coverage TYPE descriptions, not policy identifiers.
Risk Level: Low
Global carrier lookup table. All sample records have customer_id=1 (not customer-scoped).
Don't migrate. Keep CSV as NAIC reference for validating Area 5's Carrier A-F backfill.
carrier table (5 columns, ~100 sample rows):
carrier_id: Primary key
customer_id: Always "1" in sample — global/system-level table
name: Carrier name. Includes subsidiary indicators ("SI", "SI CAB", "(G)")
state: State of domicile (28 NULL — international or unknown)
naic_number: NAIC code (14 NULL — international carriers)
The carrier table provides a canonical NAIC → name mapping that can validate: 1. Area 5's Carrier A-F slot assignments (do the NAICs in coverage records match known carriers?) 2. Carrier name normalization (is "Continental Casualty Co." the same as NAIC 20443 in this table?)
Post-migration validation query:
For each unique (carrier_naic, carrier_name) on migrated coverage records:
Check if carrier_naic exists in carrier.csv
If yes: compare name similarity
Flag mismatches for review
Risk Level: Low (pending Action #2)
AM Best rating reference data. DTX 2.0 uses a simplified boolean (CarrierRating.avii_or_above) instead of the full strength/size rating system.
Don't migrate. The DTX 2.0 CarrierRating model (used in Areas 5 and 6) stores per-certificate and per-template boolean flags, not the full AM Best rating dataset.
carrier_ratings table (13 columns, ~100 sample rows):
amb_id: AM Best company identifier (e.g., "010244")
name: Carrier name
strength: AM Best rating: A++ (2), A+ (9), A (24), A- (10),
B++ (5), B+ (2), B (1), NR (36 = Not Rated)
strength_val: Numeric: 15(A), 14(A-), 13(B++), ..., 1(NR)
strength_modifier: "u" (under review) — 99 NULL
size: Roman numeral I-XV: XV=≥$2B, V=$10-25M, NULL=unrated
size_val: Numeric: 15(XV) → 0(unrated)
size_desc: "USD 100 Million to Less than 250 Million", etc.
country: All "United States" in sample
state: Full state name ("New York", "California", etc.)
naic: NAIC code (links to carrier table)
amb_rating_unit_number: AM Best unit number
update_time: Last update (2025 timestamps — recently refreshed)
DTX 1.0:
- carrier_ratings = global reference table of AM Best ratings per carrier
- Strength grades: A++, A+, A, A-, B++, B+, B, C++, C+, C, D, E, F, S, NR
- Size categories: I through XV (financial size)
DTX 2.0:
- CarrierRating = per-certificate or per-template boolean flag
- Only tracks avii_or_above (is the carrier rated A-VII or above?)
- No strength letter, no size category, no AM Best ID
These are fundamentally different concepts. The DTX 1.0 table is a carrier directory. The DTX 2.0 model is a compliance flag.
The carrier_ratings CSV can be used for:
1. Post-migration validation: For certificates with CarrierRating.avii_or_above=True, verify that the carriers on the certificate actually have A-VII+ ratings in the reference data
2. Future feature: If DTX 2.0 adds granular rating tracking, this data is available
3. NAIC cross-reference: Validate NAIC codes used in Area 5 carrier backfill
Risk Level: Low
Program/billing metadata. Account.program_id (IntegerField) already mapped in Area 2.
Don't migrate. Keep CSV for reference.
program table (29 columns, ~100 sample rows):
program_id: Primary key (Area 2 already mapped to Account.program_id)
customer_id: FK to customer (76 unique customers in 100 rows)
name: "MetroBuilders, Program Brokerage Corporation" (varies)
program_title: Full program description (very long)
type: 2 values:
- "...ANNUAL POLICY PROGRAM" (81%)
- "...PER PROJECT POLICY PROGRAM" (19%)
setup_fee: "Included" (consistent)
tracking_fee: "75" (consistent)
base_service_price: "2025"-"2700" range
base_service_terms: "20" (consistent)
additional_service_price/terms: Vary
sub_on_additional_project_full/limited: Pricing
training_hours: "4" (consistent)
training_rate: "150" (consistent)
services: HTML-formatted text with <strong> tags
suitability: HTML-formatted text
policy_number: NULL in all samples
policy_effDate: NULL in all samples
policy_expDate: NULL in all samples
envelope_id: FK to envelope (Area 8)
charge_id: FK to billing (Area 11)
parent_email: PHP serialized array
mga_email: PHP serialized array (often empty: a:0:{})
retailer_email: NULL in all samples
insured_email: NULL in all samples
former_prosight: NULL or legacy flag
report_master_parent_id: FK to reporting (Area 10)
create_time: Creation timestamp
Account.program_id already stores the program association (Area 2)Risk Level: Low
Summary of deferred broker/carrier/program references and their resolution:
| Area | Deferred Item | Original Plan | Resolution |
|---|---|---|---|
| Area 2 | customer.broker_id |
"Store as reference, link after broker migration" | Drop. No Broker model in DTX 2.0. Account has no broker field. |
| Area 4 | certificate.broker_id |
"Area 9 — store as reference" | Drop. Certificate agent_* fields already populated from producer fields. Optional enrichment per Decision #2. |
| Area 5 | carrier data on coverage records | Already stored as text fields | No action. Carrier data is already on coverage records. carrier table serves as reference only. |
| Area 6 | so_review.broker_id |
"Don't migrate — references Area 9" | Confirmed. so_review not migrated. |
All 5 source tables resolve to "don't migrate." Active migrations are limited to: 1. Optional Certificate agent field enrichment from broker data (Decision #2, pending Action #1) 2. Reference CSV preservation for all 5 tables
| Source Table | Reference File | Purpose |
|---|---|---|
broker |
reference_brokers.csv |
Broker master data — validate against certificate agent fields |
broker_coverage |
reference_broker_coverage.csv |
Coverage types per broker — operational reference |
carrier |
reference_carriers.csv |
NAIC → carrier name mapping — validate Area 5 backfill |
carrier_ratings |
reference_carrier_ratings.csv |
AM Best ratings — validate CarrierRating records |
program |
reference_programs.csv |
Program metadata — billing/operational reference |
| Source | Field(s) | Reason |
|---|---|---|
broker |
(entire table) | No target model in DTX 2.0 |
broker_coverage |
(entire table) | No target model |
carrier |
(entire table) | No target model. Data exists as text on coverage/certificate records |
carrier_ratings |
(entire table) | No target model. DTX 2.0 uses boolean CarrierRating instead |
program |
(entire table) | No target model. program_id already on Account (Area 2) |
| Area 2 deferred | customer.broker_id |
No Broker model to link to |
| Area 4 deferred | certificate.broker_id |
No Broker model to link to. agent_* fields already populated |
1. Validate prerequisites:
- Confirm DTX 2.0 team approves "don't migrate" decisions
- Resolve Action #1 (broker enrichment decision)
- Resolve Action #2 (carrier_ratings reference import decision)
2. Dry-run report
3. (If Action #1 approved) Enrich Certificate agent_* fields from broker data
4. Copy all 5 source CSVs to reference directory
5. Generate validation report (cross-reference carrier/carrier_ratings against Area 5 data)
=== DTX 1.0 Brokers, Carriers & Programs Migration - Dry Run Report ===
BROKER:
Total records: ???
Unique customer_ids: ???
With email (PHP serialized):???
With phone: ???
With address: ???
Data quality issues: ??? (misspelled, truncated, placeholder)
BROKER COVERAGE:
Total records: ???
Unique broker_ids: ???
Unique coverage types: [list all with count]
Avg types per broker: ???
CARRIER:
Total records: ???
customer_id = 1 (global): ???
With NAIC: ???
Without NAIC: ???
Unique states: ???
CARRIER RATINGS:
Total records: ???
Strength distribution: A++: ???, A+: ???, A: ???, A-: ???,
B++: ???, B+: ???, B: ???, NR: ???
Size distribution: [list by Roman numeral]
With NAIC: ???
CSV parsing errors: ???
PROGRAM:
Total records: ???
Unique customer_ids: ???
Type distribution: ANNUAL: ???, PER PROJECT: ???
With policy dates: ???
With HTML in services: ???
DEFERRED REFERENCE RESOLUTION:
Certificates with broker_id: ???
Certificates with empty agent_*: ???
Certificates enrichable: ??? (have broker_id AND empty agent fields)
Accounts with broker_id: ???
CROSS-VALIDATION (carrier data vs Area 5):
Unique NAICs on coverage records: ???
NAICs matching carrier.csv: ???
NAICs matching carrier_ratings.csv: ???
NAICs in coverage but NOT in carrier: ???
Issues flagged: ???
| Scenario | Behavior |
|---|---|
| PHP serialized email parse failure | Log and skip — reference only |
| CSV column misalignment (carrier_ratings) | Use robust parser, validate column count |
| broker_id on certificate not in broker table | Log mismatch — no action needed |
| carrier_ratings strength field contains partial name | Flag as CSV parse error |
| HTML in program.services/suitability | Note in report — don't strip |
| File | Purpose |
|---|---|
reference_brokers.csv |
Copy of source broker data |
reference_broker_coverage.csv |
Copy of source broker_coverage data |
reference_carriers.csv |
Copy of source carrier data |
reference_carrier_ratings.csv |
Copy of source carrier_ratings data |
reference_programs.csv |
Copy of source program data |
broker_certificate_enrichment.csv |
(If enrichment approved) broker_id → certificate_id → fields enriched |
carrier_naic_validation.csv |
Cross-reference of coverage NAICs vs carrier/carrier_ratings |
migration_report.txt |
Summary statistics and deferred reference resolutions |
| # | Check | Method |
|---|---|---|
| 1 | Reference CSVs preserved | All 5 source CSVs copied to reference directory |
| 2 | Deferred broker_id resolved | No pending "link after Area 9" references remain in migration tracking |
| 3 | Certificate agent fields | Spot-check 20 certificates — agent_* fields populated (from Area 4) |
| 4 | Account.program_id | Spot-check 10 accounts — program_id matches source customer.program_id |
| 5 | NAIC cross-validation | Coverage record NAICs cross-referenced against carrier table |
| 6 | CarrierRating validation | Certificates with avii_or_above=True checked against carrier_ratings AM Best data |
| 7 | (If enrichment done) | Enriched certificates have non-NULL agent_name matching broker.name |
| Depends On | What's Needed |
|---|---|
| Area 2: Customers → Accounts | Account.program_id already mapped. broker_id deferred reference to resolve. |
| Area 4: Certificates & Documents | Certificate agent_* fields already populated. broker_id deferred reference to resolve. |
| Area 5: Coverage & Insurance | Carrier A-F backfill already done. NAIC data for cross-validation. |
| Depended On By | For |
|---|---|
| (none) | No other areas depend on Area 9 entity creation (since no entities are created) |
Note: The README dependency chart shows Areas 7, 8 in Phase 4 and Areas 10, 11 in Phase 5. None of these depend on Area 9 for entity creation. The so_review table (Area 6) referenced broker_id but was not migrated.
The simplest area in the migration. DTX 2.0 intentionally dropped the Broker, Carrier (entity), and Program models. All 5 source tables resolve to "don't migrate as entities." The only potential active work is optional certificate agent enrichment.
| Factor | Impact |
|---|---|
| No target models for any source table | Eliminates all structural mapping work |
| Related data already migrated (Areas 2, 4, 5) | No new records to create |
| DTX 2.0 architecture decision (denormalized) | Intentional — not a gap |
| PHP serialized emails in broker/program | Parse for reference only, no migration target |
| carrier/carrier_ratings as reference data | Useful for validation, not migration |
| Optional broker enrichment | Conditional on team decision |
| # | Decision | Status | Approach |
|---|---|---|---|
| 1 | broker_id on Account | Resolved | Drop deferred reference. No Broker model in DTX 2.0 |
| 2 | broker_id on Certificate | Pending Action #1 | Default: drop. Optional: enrich empty agent_* fields from broker data |
| 3 | broker table | Resolved | Don't migrate. Keep as reference CSV |
| 4 | broker_coverage table | Resolved | Don't migrate. Keep as reference CSV |
| 5 | carrier table | Resolved | Don't migrate. Keep as NAIC reference for Area 5 validation |
| 6 | carrier_ratings table | Pending Action #2 | Default: don't migrate. Optional: import as reference lookup table |
| 7 | program table | Resolved | Don't migrate. program_id already on Account from Area 2 |
| 8 | Deferred references | Resolved | All broker_id deferrals from Areas 2 and 4 resolved (drop) |