📋 Migration Summary

Source Tables: 5
Target Models: 0 (denormalized)
Decisions: 8

Migration of broker, carrier, and program reference data. Most tables are legacy/unused - migrating as reference data only.

🎯 Key Actions

⚠️ Key Challenges

DTX 1.0 → DTX 2.0 Migration Plan: Brokers, Carriers & Programs

Overview

Migration of broker, carrier, and program data from DTX 1.0 to DTX 2.0. This is Phase 2 — independent of other areas (no dependencies).

Critical finding: DTX 2.0 currently has NO dedicated Broker, Carrier (entity), or Program models. The existing DTX 2.0 codebase stores this data as denormalized fields: - Brokers/Producers → text fields on Certificate (agent_name, agent_contact, agent_address, agent_title) - Carriers → text fields on Certificate (carrier_a through carrier_f) and coverage records (carrier_code, carrier_naic, carrier_name) - Programs → integer field on Account (program_id)

These models do not exist in DTX 2.0 today. The team must decide whether to: 1. Create new models in DTX 2.0 for Broker, Carrier, and/or Program and migrate the data into them 2. Accept the denormalized approach and treat the source tables as reference data only

The decisions below assume option 2 unless the team decides otherwise. If new models are created, the field mappings and source data profiles in this document provide the basis for migration scripts.

Important: Full production data is not available. All observations are from 100-row samples.

Source Tables (5)

Table Columns Rows (sample) Purpose
broker 18 100 Broker/producer master data — name, address, phone, email (PHP serialized)
broker_coverage 2 100 Coverage types handled by each broker
carrier 5 100 Carrier lookup table — name, state, NAIC
carrier_ratings 13 100 AM Best rating reference data — strength, size, NAIC
program 29 100 Program/billing metadata — fees, services (HTML), emails (PHP serialized)

Target Models

None dedicated. Related models already populated by other areas:

DTX 2.0 Model Field(s) Already Populated By
Certificate agent_name, agent_title, agent_contact, agent_address Area 4 (from ProdName, ProdAddr, ProdPhone, etc.)
Certificate carrier_a_name through carrier_f_naic Area 5 (Carrier A-F backfill)
Account program_id Area 2 (direct integer mapping from customer.program_id)
CarrierRating avii_or_above Areas 5 & 6 (from AMBestRatingOK flag)

Estimated Volume

Item Estimate
Certificate agent enrichments 50K-200K (if broker_id supplements empty agent fields)
All other source records 0 — no migration target

Pre-Migration Actions Required

Action 1: Decide Whether to Enrich Certificate Agent Fields from Broker Data

Priority: MEDIUM — affects scope of this area

Area 4 populated Certificate agent_* fields from the certificate's own producer fields (ProdName, ProdAddr1, ProdPhone, etc.). The broker table contains the broker's master contact data (canonical name, address, email).

Question for DTX 2.0 team: 1. Are the certificate producer fields (already migrated) sufficient? 2. Or should certificates with empty agent_* fields be backfilled from the broker master data? 3. Should broker data be stored anywhere else (e.g., Account-level field)?

Default plan: Don't enrich — the certificate's own producer fields are the authoritative source (what was printed on the ACORD form). Broker master data kept as reference CSV only.

Action 2: Decide Whether carrier_ratings Should Be Imported as Reference Table

Priority: LOW

DTX 1.0's carrier_ratings table is a comprehensive AM Best reference dataset (strength A++/A+/A/A-/B++/etc., size I-XV, NAIC codes). DTX 2.0 simplified this to a boolean CarrierRating.avii_or_above flag.

Question: Should the full AM Best reference data be imported into DTX 2.0 as a lookup table for future use (carrier validation, reporting)? This would require creating a new model.

Default plan: Don't import — keep as reference CSV. DTX 2.0's simplified boolean is sufficient for compliance.


Known Issues & Risks

High

# Issue Impact Mitigation
1 No target models for 4 of 5 tables — DTX 2.0 intentionally dropped Broker, Carrier (entity), and Program models. Migration has nowhere to write the data. All source data preserved only as reference CSVs, not in DTX 2.0 database Confirm with DTX 2.0 team this is intentional. Document data availability in reference CSVs.
2 Deferred broker_id references unresolvable — Areas 2 and 4 stored broker_id as reference, expecting Area 9 to create Broker entities. DTX 2.0 has no Broker model. Deferred FKs on Account and Certificate have no target Drop the deferred references. Document resolution. See Decisions #1 and #2.
3 broker.email is PHP serialized — Email field stored as a:1:{i:0;s:23:"dpiermont@optonline.net";} format. Not directly usable. Cannot read broker emails without PHP deserialization Parse for reference report. Don't migrate to any model field.

Medium

# Issue Impact Mitigation
4 carrier table is single-customer — All 100 sample records have customer_id=1, suggesting this is a global reference table shared across all customers. Not customer-scoped — unusual FK pattern Treat as global lookup. Don't migrate as per-Account data.
5 carrier_ratings CSV parsing issues — Some carrier names contain commas/quotes that break CSV column alignment. Strength field captures partial company names in some rows. Data corruption risk if parsed naively Use robust CSV parser. Validate column count per row.
6 program.services and program.suitability contain HTML — Fields have <strong> tags and HTML formatting. Not plain text — would need stripping if migrated Don't migrate. Keep as reference.
7 program.type is a full sentence — Contains "ProSight Subcontractor Insurance Compliance Verification and Management – ANNUAL POLICY PROGRAM" instead of a type code. Not a clean enum — 2 values: ANNUAL (81%) and PER PROJECT (19%) Don't migrate. program_id on Account is sufficient.

Low

# Issue Impact Mitigation
8 Broker data quality — Misspelled cities ("Philaelphia"), truncated names ("Crystal & Compan"), placeholder ZIPs ("00000"), inconsistent data entry. Poor quality master data Don't migrate — reference only.
9 carrier_ratings: 36% unrated (NR) — 36 of 100 sample carriers have strength="NR" (Not Rated). Limited utility for rating validation Document in reference report.

Decisions

Decision #1: broker_id on Customer/Account — Drop Deferred Reference

Risk Level: Low

Area 2 stored customer.broker_id as a deferred reference, expecting Area 9 to link it. DTX 2.0's Account model has no broker FK.

Resolution: Drop the deferred broker_id reference. No action needed on Account.

Rationale: 1. DTX 2.0 was designed without a Broker entity model 2. Account has no field for broker FK 3. Broker-level information is not used in DTX 2.0's compliance workflow 4. The broker relationship was an operational tracking feature in DTX 1.0

Alternative (if DTX 2.0 team requests): Add a broker_name TextField to Account and populate from broker table. Requires a schema migration.

Decision #2: broker_id on Certificate — Optionally Enrich Agent Fields

Risk Level: Low (pending Action #1)

Area 4 stored certificate.broker_id as a deferred reference. Certificate agent_* fields were populated from the certificate's own producer fields (ProdName, ProdAddr, etc.) in Area 4.

Default: No action needed. The producer fields on the certificate ARE the broker/agent info for that certificate.

Optional enrichment (only if Action #1 confirms):

For each Certificate where agent_name is NULL and broker_id is populated:
  1. Look up broker_id in broker table
  2. Set agent_name = broker.name (if not NULL)
  3. Set agent_address = concatenate(broker.address_1, address_2, city, state, zip)
  4. Set agent_contact = concatenate(broker.phone, broker.fax, parsed_email)
     where parsed_email = PHP-deserialize(broker.email)
  5. Log enrichment to report

Key distinction: Certificate producer fields = what was printed on the ACORD form. Broker master data = canonical broker contact info. These may differ (e.g., a specific agent at the brokerage vs. the brokerage itself).

Decision #3: broker Table — Don't Migrate

Risk Level: Low

No target model in DTX 2.0.

Don't migrate. Keep CSV for reference.

Source data profile

broker table (18 columns, ~100 sample rows):
  broker_id:       Primary key
  customer_id:     FK to customer (varied: customer 305 has 21 brokers, 593 has 14)
  name:            Broker name (person or company). Quality issues: truncated, misspelled
  address_1/2:     Street address
  city/state/zip:  Location. Some placeholder ZIPs (00000, 56789)
  country:         Mostly NULL
  phone:           Phone number
  phone_extension: Mostly NULL
  fax:             Fax number (mostly NULL)
  email:           PHP serialized array: a:1:{i:0;s:23:"dpiermont@optonline.net";}
  contact_name:    Contact person at broker (mostly NULL)
  contact_title:   Contact title (mostly NULL)
  comments:        Operational notes (mostly NULL)
  create_time:     Creation timestamp
  update_time:     Last update timestamp

Decision #4: broker_coverage Table — Don't Migrate

Risk Level: Low

Maps broker_id → coverage type names. No target model.

Don't migrate. Keep CSV for reference.

Source data profile

broker_coverage table (2 columns, ~100 sample rows):
  broker_id:   FK to broker
  policy_id:   NOT a policy ID — contains coverage type descriptions:
               "GL", "Auto", "WC", "UMB", "ERRORS & OMISSIONS", "PROPERTY",
               "Professional Liability", "Cyber Liability", "Pollution",
               "Earthquake Insurance", "Flood Coverage", "Garage Liability",
               "Dealer Inventory", "Equipment Coverage", "Business Income"

  Data quality: Inconsistent casing (GL vs Auto), trailing spaces ("PROPERTY "),
                truncated names ("Bus Personal Prop &")
  Cardinality: 1-9 coverage types per broker

Note: The policy_id field name is misleading — it stores coverage TYPE descriptions, not policy identifiers.

Decision #5: carrier Table — Don't Migrate (Keep as NAIC Reference)

Risk Level: Low

Global carrier lookup table. All sample records have customer_id=1 (not customer-scoped).

Don't migrate. Keep CSV as NAIC reference for validating Area 5's Carrier A-F backfill.

Source data profile

carrier table (5 columns, ~100 sample rows):
  carrier_id:    Primary key
  customer_id:   Always "1" in sample — global/system-level table
  name:          Carrier name. Includes subsidiary indicators ("SI", "SI CAB", "(G)")
  state:         State of domicile (28 NULL — international or unknown)
  naic_number:   NAIC code (14 NULL — international carriers)

Reference value

The carrier table provides a canonical NAIC → name mapping that can validate: 1. Area 5's Carrier A-F slot assignments (do the NAICs in coverage records match known carriers?) 2. Carrier name normalization (is "Continental Casualty Co." the same as NAIC 20443 in this table?)

Post-migration validation query:
  For each unique (carrier_naic, carrier_name) on migrated coverage records:
    Check if carrier_naic exists in carrier.csv
    If yes: compare name similarity
    Flag mismatches for review

Decision #6: carrier_ratings Table — Don't Migrate (Keep as AM Best Reference)

Risk Level: Low (pending Action #2)

AM Best rating reference data. DTX 2.0 uses a simplified boolean (CarrierRating.avii_or_above) instead of the full strength/size rating system.

Don't migrate. The DTX 2.0 CarrierRating model (used in Areas 5 and 6) stores per-certificate and per-template boolean flags, not the full AM Best rating dataset.

Source data profile

carrier_ratings table (13 columns, ~100 sample rows):
  amb_id:                AM Best company identifier (e.g., "010244")
  name:                  Carrier name
  strength:              AM Best rating: A++ (2), A+ (9), A (24), A- (10),
                         B++ (5), B+ (2), B (1), NR (36 = Not Rated)
  strength_val:          Numeric: 15(A), 14(A-), 13(B++), ..., 1(NR)
  strength_modifier:     "u" (under review) — 99 NULL
  size:                  Roman numeral I-XV: XV=≥$2B, V=$10-25M, NULL=unrated
  size_val:              Numeric: 15(XV) → 0(unrated)
  size_desc:             "USD 100 Million to Less than 250 Million", etc.
  country:               All "United States" in sample
  state:                 Full state name ("New York", "California", etc.)
  naic:                  NAIC code (links to carrier table)
  amb_rating_unit_number: AM Best unit number
  update_time:           Last update (2025 timestamps — recently refreshed)

Semantic mismatch with DTX 2.0

DTX 1.0: - carrier_ratings = global reference table of AM Best ratings per carrier - Strength grades: A++, A+, A, A-, B++, B+, B, C++, C+, C, D, E, F, S, NR - Size categories: I through XV (financial size)

DTX 2.0: - CarrierRating = per-certificate or per-template boolean flag - Only tracks avii_or_above (is the carrier rated A-VII or above?) - No strength letter, no size category, no AM Best ID

These are fundamentally different concepts. The DTX 1.0 table is a carrier directory. The DTX 2.0 model is a compliance flag.

Reference value

The carrier_ratings CSV can be used for: 1. Post-migration validation: For certificates with CarrierRating.avii_or_above=True, verify that the carriers on the certificate actually have A-VII+ ratings in the reference data 2. Future feature: If DTX 2.0 adds granular rating tracking, this data is available 3. NAIC cross-reference: Validate NAIC codes used in Area 5 carrier backfill

Decision #7: program Table — Don't Migrate

Risk Level: Low

Program/billing metadata. Account.program_id (IntegerField) already mapped in Area 2.

Don't migrate. Keep CSV for reference.

Source data profile

program table (29 columns, ~100 sample rows):
  program_id:          Primary key (Area 2 already mapped to Account.program_id)
  customer_id:         FK to customer (76 unique customers in 100 rows)
  name:                "MetroBuilders, Program Brokerage Corporation" (varies)
  program_title:       Full program description (very long)
  type:                2 values:
                       - "...ANNUAL POLICY PROGRAM" (81%)
                       - "...PER PROJECT POLICY PROGRAM" (19%)
  setup_fee:           "Included" (consistent)
  tracking_fee:        "75" (consistent)
  base_service_price:  "2025"-"2700" range
  base_service_terms:  "20" (consistent)
  additional_service_price/terms: Vary
  sub_on_additional_project_full/limited: Pricing
  training_hours:      "4" (consistent)
  training_rate:       "150" (consistent)
  services:            HTML-formatted text with <strong> tags
  suitability:         HTML-formatted text
  policy_number:       NULL in all samples
  policy_effDate:      NULL in all samples
  policy_expDate:      NULL in all samples
  envelope_id:         FK to envelope (Area 8)
  charge_id:           FK to billing (Area 11)
  parent_email:        PHP serialized array
  mga_email:           PHP serialized array (often empty: a:0:{})
  retailer_email:      NULL in all samples
  insured_email:       NULL in all samples
  former_prosight:     NULL or legacy flag
  report_master_parent_id: FK to reporting (Area 10)
  create_time:         Creation timestamp

Why not migrate

  1. Account.program_id already stores the program association (Area 2)
  2. Program metadata (fees, services, terms) is billing/operational data with no DTX 2.0 model target
  3. HTML content in services/suitability fields would need stripping
  4. Email fields are PHP serialized
  5. Multiple FKs to unmigrated areas (envelope, charge, report_master)
  6. The data is configuration/pricing, not compliance data — outside DTX 2.0's core domain

Decision #8: Resolve Deferred References from Other Areas

Risk Level: Low

Summary of deferred broker/carrier/program references and their resolution:

Area Deferred Item Original Plan Resolution
Area 2 customer.broker_id "Store as reference, link after broker migration" Drop. No Broker model in DTX 2.0. Account has no broker field.
Area 4 certificate.broker_id "Area 9 — store as reference" Drop. Certificate agent_* fields already populated from producer fields. Optional enrichment per Decision #2.
Area 5 carrier data on coverage records Already stored as text fields No action. Carrier data is already on coverage records. carrier table serves as reference only.
Area 6 so_review.broker_id "Don't migrate — references Area 9" Confirmed. so_review not migrated.

Field Mapping

Summary: No Active Field Migrations

All 5 source tables resolve to "don't migrate." Active migrations are limited to: 1. Optional Certificate agent field enrichment from broker data (Decision #2, pending Action #1) 2. Reference CSV preservation for all 5 tables

Reference Preservation

Source Table Reference File Purpose
broker reference_brokers.csv Broker master data — validate against certificate agent fields
broker_coverage reference_broker_coverage.csv Coverage types per broker — operational reference
carrier reference_carriers.csv NAIC → carrier name mapping — validate Area 5 backfill
carrier_ratings reference_carrier_ratings.csv AM Best ratings — validate CarrierRating records
program reference_programs.csv Program metadata — billing/operational reference

Consolidated: Fields Dropped

Source Field(s) Reason
broker (entire table) No target model in DTX 2.0
broker_coverage (entire table) No target model
carrier (entire table) No target model. Data exists as text on coverage/certificate records
carrier_ratings (entire table) No target model. DTX 2.0 uses boolean CarrierRating instead
program (entire table) No target model. program_id already on Account (Area 2)
Area 2 deferred customer.broker_id No Broker model to link to
Area 4 deferred certificate.broker_id No Broker model to link to. agent_* fields already populated

Migration Execution

Migration Order (5 steps)

1. Validate prerequisites:
   - Confirm DTX 2.0 team approves "don't migrate" decisions
   - Resolve Action #1 (broker enrichment decision)
   - Resolve Action #2 (carrier_ratings reference import decision)
2. Dry-run report
3. (If Action #1 approved) Enrich Certificate agent_* fields from broker data
4. Copy all 5 source CSVs to reference directory
5. Generate validation report (cross-reference carrier/carrier_ratings against Area 5 data)

Dry-Run Report

=== DTX 1.0 Brokers, Carriers & Programs Migration - Dry Run Report ===

BROKER:
  Total records:              ???
  Unique customer_ids:        ???
  With email (PHP serialized):???
  With phone:                 ???
  With address:               ???
  Data quality issues:        ??? (misspelled, truncated, placeholder)

BROKER COVERAGE:
  Total records:              ???
  Unique broker_ids:          ???
  Unique coverage types:      [list all with count]
  Avg types per broker:       ???

CARRIER:
  Total records:              ???
  customer_id = 1 (global):   ???
  With NAIC:                  ???
  Without NAIC:               ???
  Unique states:              ???

CARRIER RATINGS:
  Total records:              ???
  Strength distribution:      A++: ???, A+: ???, A: ???, A-: ???,
                              B++: ???, B+: ???, B: ???, NR: ???
  Size distribution:          [list by Roman numeral]
  With NAIC:                  ???
  CSV parsing errors:         ???

PROGRAM:
  Total records:              ???
  Unique customer_ids:        ???
  Type distribution:          ANNUAL: ???, PER PROJECT: ???
  With policy dates:          ???
  With HTML in services:      ???

DEFERRED REFERENCE RESOLUTION:
  Certificates with broker_id:      ???
  Certificates with empty agent_*:  ???
  Certificates enrichable:          ??? (have broker_id AND empty agent fields)
  Accounts with broker_id:          ???

CROSS-VALIDATION (carrier data vs Area 5):
  Unique NAICs on coverage records:     ???
  NAICs matching carrier.csv:           ???
  NAICs matching carrier_ratings.csv:   ???
  NAICs in coverage but NOT in carrier: ???

Issues flagged:               ???

Error Handling

Scenario Behavior
PHP serialized email parse failure Log and skip — reference only
CSV column misalignment (carrier_ratings) Use robust parser, validate column count
broker_id on certificate not in broker table Log mismatch — no action needed
carrier_ratings strength field contains partial name Flag as CSV parse error
HTML in program.services/suitability Note in report — don't strip

Output Files

File Purpose
reference_brokers.csv Copy of source broker data
reference_broker_coverage.csv Copy of source broker_coverage data
reference_carriers.csv Copy of source carrier data
reference_carrier_ratings.csv Copy of source carrier_ratings data
reference_programs.csv Copy of source program data
broker_certificate_enrichment.csv (If enrichment approved) broker_id → certificate_id → fields enriched
carrier_naic_validation.csv Cross-reference of coverage NAICs vs carrier/carrier_ratings
migration_report.txt Summary statistics and deferred reference resolutions

Verification (post-migration)

# Check Method
1 Reference CSVs preserved All 5 source CSVs copied to reference directory
2 Deferred broker_id resolved No pending "link after Area 9" references remain in migration tracking
3 Certificate agent fields Spot-check 20 certificates — agent_* fields populated (from Area 4)
4 Account.program_id Spot-check 10 accounts — program_id matches source customer.program_id
5 NAIC cross-validation Coverage record NAICs cross-referenced against carrier table
6 CarrierRating validation Certificates with avii_or_above=True checked against carrier_ratings AM Best data
7 (If enrichment done) Enriched certificates have non-NULL agent_name matching broker.name

Dependencies

Depends On What's Needed
Area 2: Customers → Accounts Account.program_id already mapped. broker_id deferred reference to resolve.
Area 4: Certificates & Documents Certificate agent_* fields already populated. broker_id deferred reference to resolve.
Area 5: Coverage & Insurance Carrier A-F backfill already done. NAIC data for cross-validation.
Depended On By For
(none) No other areas depend on Area 9 entity creation (since no entities are created)

Note: The README dependency chart shows Areas 7, 8 in Phase 4 and Areas 10, 11 in Phase 5. None of these depend on Area 9 for entity creation. The so_review table (Area 6) referenced broker_id but was not migrated.


Difficulty Assessment

Overall: LOW

The simplest area in the migration. DTX 2.0 intentionally dropped the Broker, Carrier (entity), and Program models. All 5 source tables resolve to "don't migrate as entities." The only potential active work is optional certificate agent enrichment.

Factor Impact
No target models for any source table Eliminates all structural mapping work
Related data already migrated (Areas 2, 4, 5) No new records to create
DTX 2.0 architecture decision (denormalized) Intentional — not a gap
PHP serialized emails in broker/program Parse for reference only, no migration target
carrier/carrier_ratings as reference data Useful for validation, not migration
Optional broker enrichment Conditional on team decision

Decision Summary

# Decision Status Approach
1 broker_id on Account Resolved Drop deferred reference. No Broker model in DTX 2.0
2 broker_id on Certificate Pending Action #1 Default: drop. Optional: enrich empty agent_* fields from broker data
3 broker table Resolved Don't migrate. Keep as reference CSV
4 broker_coverage table Resolved Don't migrate. Keep as reference CSV
5 carrier table Resolved Don't migrate. Keep as NAIC reference for Area 5 validation
6 carrier_ratings table Pending Action #2 Default: don't migrate. Optional: import as reference lookup table
7 program table Resolved Don't migrate. program_id already on Account from Area 2
8 Deferred references Resolved All broker_id deferrals from Areas 2 and 4 resolved (drop)