📋 Migration Summary

Source Tables: 8
Target Models: 0 (not implemented)
Decisions: 8

DTX 2.0 has NO reporting models - no Report, ReportCategory, ReportData, or ReportSchedule. Only manual Excel exports exist. Team must decide reporting strategy before migration.

🎯 Key Actions

⚠️ Key Challenges

Area 10: Reporting

Overview

Migration of reporting system from DTX 1.0 to DTX 2.0. This is Phase 5 — depends on all previous areas.

DTX 1.0 has a full-featured reporting subsystem: a dynamic report builder using SQL fragment templates (report_field_definition), composable report definitions (report, report_definition), a 93-column denormalized materialized view (report_data), and scheduled report delivery with email distribution (report_master_schedule).

DTX 2.0 has no reporting models. There are no Report, ReportDefinition, ReportCategory, ReportData, ReportSchedule, or Dashboard models anywhere in the codebase. The only export functionality that exists is: - Manual Excel export for Records and Accounts (docutrax_hierarchy/export_utils.py) - Payment history CSV/PDF export (docutrax_billing/views.py) - Real-time compliance status calculations (no materialized views)

This means all 8 tables in this area have no direct DTX 2.0 target model. The team must decide the reporting strategy for DTX 2.0 before any data migration can occur.

Source Tables (8)

DTX 1.0 Table Rows (sample) Type DTX 2.0 Target
report 100 Configuration No target model
report_category 9 Lookup No target model
report_data 100 Materialized view No target model
report_definition 100 Junction (report × fields) No target model
report_field_definition 100 Metadata/data dictionary No target model
report_master_log 100 Execution audit No target model
report_master_schedule 100 Schedule config No target model
report_master_schedule_log 100 Execution audit No target model

Existing DTX 2.0 Export Capabilities

Component File Capability
export_record_to_xlsx() docutrax_hierarchy/export_utils.py:70–158 Single Record → Excel
export_account_to_xlsx() docutrax_hierarchy/export_utils.py:161–269 Single Account → Excel
DirectoryPaymentsExportView docutrax_billing/views.py:3130–3175 Payment history → CSV
DirectoryPaymentsPDFExportView docutrax_billing/views.py:3178–3210 Payment history → PDF

Dependencies

Depends On What's Needed
Area 2: Customers/Accounts report_data.SO_ID → Account mapping; report.customer_id → Account
Area 3: Vendors/Records report_data.PH_ID → Record mapping
Area 4: Certificates/Documents Compliance status data for report_data
Area 5: Coverage/Insurance Coverage compliance data
Area 6: Requirements/Compliance COIRiskProfile templates (compliance metrics)
Area 7: Extraction/OCR Extraction metrics
Area 8: Notifications Notification metrics
Area 11: File Repository Repo/billing metrics
Depended On By What It Provides
(None) This is the final area — nothing depends on it

DTX 1.0 Reporting Architecture

The reporting system has three distinct layers:

Layer 1: Configuration (what reports exist)

report_category (9 rows, 3 columns) — Simple lookup:

ID Name Notes
1 COI Certificate reports
2 DOC Document reports
3 Management Internal management reports
4 Accounts Account-level reports
5 Custom User-created ad-hoc reports
6 Custom Management Custom management reports
7 Scheduled Scheduled delivery reports
8 Special Request - SO Dashboard Client-specific (customer_type='A')
9 Auto Scheduled Automatically scheduled reports

report (100 rows, 15 columns) — Report templates: - Two tiers: system reports (customer_id=0, internal_name like "COI-Expired", "ACCT-Master") and custom reports (customer-specific, no internal_name). - Prefix convention: ACCT-*, COI-*, DOC-*, MAN-*, SRS-* (scheduled). - Format flags: csv_format (always 1), pdf_format (rarely 1). - industry_sector: Only "Construction" appears; most are NULL.

report_field_definition (100+ rows, 14 columns) — Data dictionary with SQL fragments: - Defines every available report field with: category, name, label, description, display_format, and raw SQL expressions (select_expression, table_name, table_join). - SQL uses template placeholders: __CUSTOMER_SELECTOR_FROM__, __CUSTOMER_SELECTOR_WHERE__, __NUM_DAYS_COMPLIANT__, __BETWEEN_INCLUSION__. - Field categories: Account Profile, Account Data, Account Data - Ping, Broker Profile, Carrier, Client Profile. - These SQL fragments reference DTX 1.0 table and column names. They cannot be used directly in DTX 2.0.

report_definition (100+ rows, 4 columns) — Junction table mapping report_id × report_field_id with ordering. Defines which fields appear in each report and in what column order.

Layer 2: Data (what reports query)

report_data (100+ rows, 93 columns) — Denormalized materialized view: - Flattens the full hierarchy: SO (Service Org) → CH (Client/Certificate Holder) → PH (Policyholder/Vendor) - Contains every compliance status, alert date, custom field, boolean flag, and count pre-computed for fast reporting. - Key column groups: - SO fields (7): ID, Name, Status, Industry, Account Manager - CH fields (6): ID, Name, Status, Create/Inactivated times - PH fields (60+): ID, Name, all status timestamps, COI compliance, DOC compliance, alert dates, boolean flags, custom fields 1–15, counts - Sentinel values: PH_DOC_RiskProfile_ID = -1 (no profile assigned) - Mixed boolean encoding: "yes"/"no" strings AND "1"/"0" integers in different columns

Layer 3: Scheduling & Audit (when reports run)

report_master_schedule (100 rows, 16 columns) — Scheduled report delivery: - recurrence: monthly_recurrence, daily_recurrence, weekly_recurrence, does_not_repeat_recurrence, or empty - tod: Time of day (hour, 0–13) - dow_bitmap: Day of week bitmask (same encoding as notification dow_bitmap) - dom: Day of month (for monthly) - email: PHP serialized email recipient arrays - do_not_send_email: PHP serialized exclusion lists - start_date: Many rows have 0000-00-00 (MySQL zero-date)

report_master_log (100 rows, 8 columns) — Manual execution audit: - Tracks every report run: user_id, customer_id, ip_address, timestamp - scheduled flag (always 0 in sample — all manual runs)

report_master_schedule_log (100 rows, 5 columns) — Scheduled execution audit: - Tracks scheduled report deliveries: schedule_id, create_time, sent (0/1), sent_time - Early entries show failures (sent=0), later entries show success — suggests system startup issues


Decisions

Decision #1: Reporting system architecture — Team decision required

DTX 2.0 has no reporting models today. The team must decide the reporting strategy before any migration can occur. The options are:

Option Description Migration Impact
A. Build DTX 2.0 reporting system Create Report, ReportDefinition, ReportSchedule models. Build a new report builder UI. Migrate report templates (Layer 1) as seed data for the new system. Report field definitions need complete rewrite (SQL fragments reference DTX 1.0 schema).
B. Use external BI tool Integrate Metabase, Looker, Power BI, or similar against DTX 2.0's PostgreSQL database. No model migration needed. Configure BI tool to query DTX 2.0 tables directly. Recreate report definitions in the BI tool's native format.
C. Build ad-hoc export system Extend existing Excel/CSV export to support configurable field selection and filtering. Minimal migration. Use report_field_definition as a reference for which fields users need. No schedule/materialized view needed.
D. Defer entirely Launch DTX 2.0 without reporting. Add it later based on user demand. No migration. Archive all 8 tables for reference.

This decision blocks all other decisions in this area.


Decision #2: report + report_category → Report templates

The report table (15 columns) contains report templates — both system-wide templates (categories 1–4, 7) and customer-specific custom reports (category 5).

If Option A (build reporting system): - Migrate system reports (customer_id=0) as seed data for the new system. - Custom reports (customer_id≠0) need customer_id → Account remapping. - report_category (9 rows) can serve as seed data for report categorization. - report_internal_name prefix convention (ACCT-/COI-/DOC-/MAN-/SRS-) provides a natural taxonomy.

If Option B/C/D: - Archive for reference. Report definitions would be recreated natively in the chosen tool/framework.


Decision #3: report_field_definition + report_definition → Report builder configuration

The report_field_definition table (14 columns) is the data dictionary powering the dynamic report builder. Each row defines a reportable field with its SQL expression, display format, and category. The report_definition junction table maps which fields appear in each report.

Critical limitation: The SQL fragments in select_expression, table_name, and table_join reference DTX 1.0 table names (rd.PH_Name, v.address_1, __CUSTOMER_SELECTOR_FROM__). These are not portable to DTX 2.0's Django ORM schema.

If Option A (build reporting system): - The field metadata (name, label, description, display_format, category) is valuable and should inform the new report field definitions. - The SQL fragments must be completely rewritten against DTX 2.0's schema. - The report_definition junction data shows which fields users actually used in their reports — useful for prioritizing which fields to implement first.

If Option B (BI tool): - Use field metadata as a requirements spec for BI tool configuration. - SQL fragments are irrelevant (BI tool generates its own queries).

If Option C/D: - Archive for reference.


Decision #4: report_data → Don't migrate (materialized view)

The report_data table (93 columns) is a denormalized materialized view — a pre-computed snapshot of the entire hierarchy with all compliance statuses, alert dates, boolean flags, and custom fields flattened into a single wide table. It is refreshed periodically (batch create_time timestamps visible in sample).

Don't migrate regardless of architecture decision. This is a computed cache, not source data. All the underlying data is being migrated through Areas 2–9. DTX 2.0 should either: - Query source tables directly (with proper indexes) - Build its own materialized views from DTX 2.0 models - Use BI tool caching/aggregation

The 93-column schema is useful as a requirements document — it shows exactly what data points the reporting system needs access to.


Decision #5: report_master_schedule → Scheduled report delivery

The report_master_schedule table (16 columns) defines scheduled report delivery: recurrence pattern, time of day, day of week/month, and email recipients (PHP serialized).

If Option A (build reporting system): - Migrate schedule configurations with: - recurrence → new schedule model recurrence field - tod / dow_bitmap / dom → timing fields - email → deserialize PHP arrays to JSON email lists - report_id → remap to new report IDs - customer_id → remap to Account - start_date → handle 0000-00-00 zero-dates (map to NULL or migration date) - do_not_send_email → deserialize or drop (exclusion lists)

If Option B/C/D: - Archive. Scheduled delivery would be configured natively in the BI tool or deferred.

Data quality notes: - PHP serialized email and do_not_send_email columns need deserialization. - Many start_date values are 0000-00-00 (MySQL zero-date sentinel). - Some rows have empty recurrence fields (likely one-time or manual triggers). - Bulk update timestamps (2025-07-09, 2023-06-29) suggest batch migrations touched many schedules.


Decision #6: report_master_log + report_master_schedule_log → Don't migrate

report_master_log (8 columns) — Audit trail of manual report executions (who ran which report, when, from where).

report_master_schedule_log (5 columns) — Execution log of scheduled report deliveries (was it sent successfully, when).

Don't migrate. These are historical operational logs with no value in a new reporting system. They document DTX 1.0 usage patterns, not data that needs to be preserved.

Keep CSVs as archive. The log data could be useful for understanding which reports were most used (to prioritize DTX 2.0 report implementation).

Usage insights from the logs: - Most popular reports: report_id 1 (Account Master), 55 (Link Customer Compliance), 127 (DOC-Not in Compliance), 135 (COI-Not in Compliance) - Execution pattern: Heavy manual usage during US business hours; overnight scheduled runs - Scheduled report failures in early Jan 2023 suggest system startup issues (resolved within 2 days)


Decision #7: Client-specific report rights from Area 1

Area 1 (Users & Auth) identified and dropped several client-specific report rights:

Right ID Description Area 1 Decision
1000 Special report for Vornado Dropped (per-client hack)
1002 Special report for ProSight Dropped
1003 Special report for Ferrandino Dropped
1004 Special report for ELS Dropped
1005 Special report for Toyota Financial Dropped
1008 Edit Management Reports Dropped
1009 Special filters for Vornado Dropped
217 View complete report field definition Dropped

Confirmed: All client-specific report rights were correctly dropped in Area 1. These were per-client customizations baked into the rights system. DTX 2.0's permission model (Permission + UserPermission) handles access control at a more granular level.

If DTX 2.0 builds a reporting system, report access should be controlled via the standard permission system (resource='reports', action='read'/'manage') rather than per-client right IDs.


Decision #8: Deferred references from other areas

Source Area Reference Resolution
Area 7: Extraction/OCR "Extraction metrics for reports" Resolved — Extraction data migrated in Area 7. Available for reporting via Extraction model queries.
Area 8: Notifications "Notification metrics for reports" Resolved — Notification data will be generated by DTX 2.0's notification system. No historical notification metrics to migrate.
Area 11: File Repository "Repo metrics, billing data for reports" Resolved — repo_throughput and billing_data not migrated (denormalized snapshots). DTX 2.0 calculates metrics from source data.
Area 6: Requirements "Compliance metrics depend on COIRiskProfile templates" Resolved — COIRiskProfile templates migrated in Area 6. Available for compliance calculations.

Known Issues

# Issue Impact Mitigation
1 No reporting models in DTX 2.0 Entire reporting subsystem must be built or replaced. Blocks all migration decisions. Decision #1 must be resolved first.
2 SQL fragments in report_field_definition reference DTX 1.0 schema Cannot be directly ported. select_expression, table_name, table_join columns contain old table/column names. If building DTX 2.0 reporting, rewrite all field definitions against Django ORM. Use metadata (name, label, category) as requirements.
3 PHP serialized emails in report_master_schedule (email, do_not_send_email) Requires deserialization. Some contain "N;" (PHP null). Use phpserialize Python package.
4 MySQL zero-dates in report_master_schedule.start_date (0000-00-00) Invalid for PostgreSQL. Map to NULL or migration date.
5 93-column report_data table with mixed encodings "yes"/"no" strings AND "1"/"0" integers for booleans. -1 sentinel for no risk profile. Not migrated (Decision #4), but useful as field requirements reference.
6 report_field_definition SQL template placeholders (__CUSTOMER_SELECTOR_FROM__, etc.) Proprietary template system tied to DTX 1.0 query builder. Document for reference; do not attempt to port.
7 Custom reports are customer-specific Custom reports (category 5) have customer_id≠0 and no internal_name. Need Account remapping if migrated. Only relevant if Option A chosen (Decision #1).

Pre-Migration Actions

# Action Blocking?
1 Team decision on reporting architecture (Decision #1: Option A/B/C/D) Yes — blocks all other migration work in this area.
2 Inventory most-used reports: Analyze report_master_log for report execution frequency to prioritize which reports to build/configure first in DTX 2.0. No — but valuable for prioritization.
3 Extract field requirements from report_field_definition: Create a clean field inventory (name, label, category, display_format) stripped of SQL fragments, as a requirements spec for the new reporting system. No — but useful for any option.
4 Identify active scheduled reports: Query report_master_schedule for non-empty recurrence + recent schedule_log activity to find schedules that are still actively delivering reports. No — but needed if Option A is chosen.

Migration Order

Step 0: Team decides reporting architecture (Decision #1) — BLOCKING
         ↓
    Option A: Build DTX 2.0 reporting
         Step 1: Create new Report/ReportSchedule models
         Step 2: Migrate report + report_category as seed data
         Step 3: Rewrite report_field_definition against DTX 2.0 schema
         Step 4: Migrate report_definition (field ordering)
         Step 5: Migrate report_master_schedule (with PHP deserialize)
         Step 6: Skip report_data, report_master_log, report_master_schedule_log

    Option B: External BI tool
         Step 1: Configure BI tool against DTX 2.0 PostgreSQL
         Step 2: Recreate report definitions natively
         Step 3: Archive all 8 tables

    Option C/D: Defer or ad-hoc export
         Step 1: Archive all 8 tables
         Step 2: Use field inventory as future requirements

Cross-Area References

Area Reference Status
Area 1: Users & Auth Client-specific report rights (1000–1009, 217) Resolved — all dropped in Area 1
Area 2: Customers/Accounts report.customer_id, report_data.SO_ID → Account Resolved — use Area 2 mapping if Option A
Area 3: Vendors/Records report_data.PH_ID → Record Resolved — use Area 3 mapping
Area 6: Requirements Compliance metrics depend on COIRiskProfile Resolved — templates migrated in Area 6
Area 7: Extraction Extraction metrics for reports Resolved — data available via Extraction model
Area 8: Notifications Notification metrics Resolved — generated by DTX 2.0 notification system
Area 11: File Repository repo_throughput, billing_data Resolved — not migrated (computed snapshots)

Decision Summary

# Decision Status Approach
1 Reporting system architecture — Team decision required Documented See decision details below
2 report + report_category → Report templates Documented See decision details below
3 report_field_definition + report_definition → Report builder configuration Documented See decision details below
4 report_data → Don't migrate (materialized view) Documented See decision details below
5 report_master_schedule → Scheduled report delivery Documented See decision details below
6 report_master_log + report_master_schedule_log → Don't migrate Documented See decision details below
7 Client-specific report rights from Area 1 Documented See decision details below
8 Deferred references from other areas Documented See decision details below