DTX 2.0 has NO reporting models - no Report, ReportCategory, ReportData, or ReportSchedule. Only manual Excel exports exist. Team must decide reporting strategy before migration.
Migration of reporting system from DTX 1.0 to DTX 2.0. This is Phase 5 — depends on all previous areas.
DTX 1.0 has a full-featured reporting subsystem: a dynamic report builder using SQL fragment templates (report_field_definition), composable report definitions (report, report_definition), a 93-column denormalized materialized view (report_data), and scheduled report delivery with email distribution (report_master_schedule).
DTX 2.0 has no reporting models. There are no Report, ReportDefinition, ReportCategory, ReportData, ReportSchedule, or Dashboard models anywhere in the codebase. The only export functionality that exists is:
- Manual Excel export for Records and Accounts (docutrax_hierarchy/export_utils.py)
- Payment history CSV/PDF export (docutrax_billing/views.py)
- Real-time compliance status calculations (no materialized views)
This means all 8 tables in this area have no direct DTX 2.0 target model. The team must decide the reporting strategy for DTX 2.0 before any data migration can occur.
| DTX 1.0 Table | Rows (sample) | Type | DTX 2.0 Target |
|---|---|---|---|
report |
100 | Configuration | No target model |
report_category |
9 | Lookup | No target model |
report_data |
100 | Materialized view | No target model |
report_definition |
100 | Junction (report × fields) | No target model |
report_field_definition |
100 | Metadata/data dictionary | No target model |
report_master_log |
100 | Execution audit | No target model |
report_master_schedule |
100 | Schedule config | No target model |
report_master_schedule_log |
100 | Execution audit | No target model |
| Component | File | Capability |
|---|---|---|
export_record_to_xlsx() |
docutrax_hierarchy/export_utils.py:70–158 |
Single Record → Excel |
export_account_to_xlsx() |
docutrax_hierarchy/export_utils.py:161–269 |
Single Account → Excel |
DirectoryPaymentsExportView |
docutrax_billing/views.py:3130–3175 |
Payment history → CSV |
DirectoryPaymentsPDFExportView |
docutrax_billing/views.py:3178–3210 |
Payment history → PDF |
| Depends On | What's Needed |
|---|---|
| Area 2: Customers/Accounts | report_data.SO_ID → Account mapping; report.customer_id → Account |
| Area 3: Vendors/Records | report_data.PH_ID → Record mapping |
| Area 4: Certificates/Documents | Compliance status data for report_data |
| Area 5: Coverage/Insurance | Coverage compliance data |
| Area 6: Requirements/Compliance | COIRiskProfile templates (compliance metrics) |
| Area 7: Extraction/OCR | Extraction metrics |
| Area 8: Notifications | Notification metrics |
| Area 11: File Repository | Repo/billing metrics |
| Depended On By | What It Provides |
|---|---|
| (None) | This is the final area — nothing depends on it |
The reporting system has three distinct layers:
report_category (9 rows, 3 columns) — Simple lookup:
| ID | Name | Notes |
|---|---|---|
| 1 | COI | Certificate reports |
| 2 | DOC | Document reports |
| 3 | Management | Internal management reports |
| 4 | Accounts | Account-level reports |
| 5 | Custom | User-created ad-hoc reports |
| 6 | Custom Management | Custom management reports |
| 7 | Scheduled | Scheduled delivery reports |
| 8 | Special Request - SO Dashboard | Client-specific (customer_type='A') |
| 9 | Auto Scheduled | Automatically scheduled reports |
report (100 rows, 15 columns) — Report templates:
- Two tiers: system reports (customer_id=0, internal_name like "COI-Expired", "ACCT-Master") and custom reports (customer-specific, no internal_name).
- Prefix convention: ACCT-*, COI-*, DOC-*, MAN-*, SRS-* (scheduled).
- Format flags: csv_format (always 1), pdf_format (rarely 1).
- industry_sector: Only "Construction" appears; most are NULL.
report_field_definition (100+ rows, 14 columns) — Data dictionary with SQL fragments:
- Defines every available report field with: category, name, label, description, display_format, and raw SQL expressions (select_expression, table_name, table_join).
- SQL uses template placeholders: __CUSTOMER_SELECTOR_FROM__, __CUSTOMER_SELECTOR_WHERE__, __NUM_DAYS_COMPLIANT__, __BETWEEN_INCLUSION__.
- Field categories: Account Profile, Account Data, Account Data - Ping, Broker Profile, Carrier, Client Profile.
- These SQL fragments reference DTX 1.0 table and column names. They cannot be used directly in DTX 2.0.
report_definition (100+ rows, 4 columns) — Junction table mapping report_id × report_field_id with ordering. Defines which fields appear in each report and in what column order.
report_data (100+ rows, 93 columns) — Denormalized materialized view:
- Flattens the full hierarchy: SO (Service Org) → CH (Client/Certificate Holder) → PH (Policyholder/Vendor)
- Contains every compliance status, alert date, custom field, boolean flag, and count pre-computed for fast reporting.
- Key column groups:
- SO fields (7): ID, Name, Status, Industry, Account Manager
- CH fields (6): ID, Name, Status, Create/Inactivated times
- PH fields (60+): ID, Name, all status timestamps, COI compliance, DOC compliance, alert dates, boolean flags, custom fields 1–15, counts
- Sentinel values: PH_DOC_RiskProfile_ID = -1 (no profile assigned)
- Mixed boolean encoding: "yes"/"no" strings AND "1"/"0" integers in different columns
report_master_schedule (100 rows, 16 columns) — Scheduled report delivery:
- recurrence: monthly_recurrence, daily_recurrence, weekly_recurrence, does_not_repeat_recurrence, or empty
- tod: Time of day (hour, 0–13)
- dow_bitmap: Day of week bitmask (same encoding as notification dow_bitmap)
- dom: Day of month (for monthly)
- email: PHP serialized email recipient arrays
- do_not_send_email: PHP serialized exclusion lists
- start_date: Many rows have 0000-00-00 (MySQL zero-date)
report_master_log (100 rows, 8 columns) — Manual execution audit:
- Tracks every report run: user_id, customer_id, ip_address, timestamp
- scheduled flag (always 0 in sample — all manual runs)
report_master_schedule_log (100 rows, 5 columns) — Scheduled execution audit: - Tracks scheduled report deliveries: schedule_id, create_time, sent (0/1), sent_time - Early entries show failures (sent=0), later entries show success — suggests system startup issues
DTX 2.0 has no reporting models today. The team must decide the reporting strategy before any migration can occur. The options are:
| Option | Description | Migration Impact |
|---|---|---|
| A. Build DTX 2.0 reporting system | Create Report, ReportDefinition, ReportSchedule models. Build a new report builder UI. | Migrate report templates (Layer 1) as seed data for the new system. Report field definitions need complete rewrite (SQL fragments reference DTX 1.0 schema). |
| B. Use external BI tool | Integrate Metabase, Looker, Power BI, or similar against DTX 2.0's PostgreSQL database. | No model migration needed. Configure BI tool to query DTX 2.0 tables directly. Recreate report definitions in the BI tool's native format. |
| C. Build ad-hoc export system | Extend existing Excel/CSV export to support configurable field selection and filtering. | Minimal migration. Use report_field_definition as a reference for which fields users need. No schedule/materialized view needed. |
| D. Defer entirely | Launch DTX 2.0 without reporting. Add it later based on user demand. | No migration. Archive all 8 tables for reference. |
This decision blocks all other decisions in this area.
The report table (15 columns) contains report templates — both system-wide templates (categories 1–4, 7) and customer-specific custom reports (category 5).
If Option A (build reporting system):
- Migrate system reports (customer_id=0) as seed data for the new system.
- Custom reports (customer_id≠0) need customer_id → Account remapping.
- report_category (9 rows) can serve as seed data for report categorization.
- report_internal_name prefix convention (ACCT-/COI-/DOC-/MAN-/SRS-) provides a natural taxonomy.
If Option B/C/D: - Archive for reference. Report definitions would be recreated natively in the chosen tool/framework.
The report_field_definition table (14 columns) is the data dictionary powering the dynamic report builder. Each row defines a reportable field with its SQL expression, display format, and category. The report_definition junction table maps which fields appear in each report.
Critical limitation: The SQL fragments in select_expression, table_name, and table_join reference DTX 1.0 table names (rd.PH_Name, v.address_1, __CUSTOMER_SELECTOR_FROM__). These are not portable to DTX 2.0's Django ORM schema.
If Option A (build reporting system):
- The field metadata (name, label, description, display_format, category) is valuable and should inform the new report field definitions.
- The SQL fragments must be completely rewritten against DTX 2.0's schema.
- The report_definition junction data shows which fields users actually used in their reports — useful for prioritizing which fields to implement first.
If Option B (BI tool): - Use field metadata as a requirements spec for BI tool configuration. - SQL fragments are irrelevant (BI tool generates its own queries).
If Option C/D: - Archive for reference.
The report_data table (93 columns) is a denormalized materialized view — a pre-computed snapshot of the entire hierarchy with all compliance statuses, alert dates, boolean flags, and custom fields flattened into a single wide table. It is refreshed periodically (batch create_time timestamps visible in sample).
Don't migrate regardless of architecture decision. This is a computed cache, not source data. All the underlying data is being migrated through Areas 2–9. DTX 2.0 should either: - Query source tables directly (with proper indexes) - Build its own materialized views from DTX 2.0 models - Use BI tool caching/aggregation
The 93-column schema is useful as a requirements document — it shows exactly what data points the reporting system needs access to.
The report_master_schedule table (16 columns) defines scheduled report delivery: recurrence pattern, time of day, day of week/month, and email recipients (PHP serialized).
If Option A (build reporting system):
- Migrate schedule configurations with:
- recurrence → new schedule model recurrence field
- tod / dow_bitmap / dom → timing fields
- email → deserialize PHP arrays to JSON email lists
- report_id → remap to new report IDs
- customer_id → remap to Account
- start_date → handle 0000-00-00 zero-dates (map to NULL or migration date)
- do_not_send_email → deserialize or drop (exclusion lists)
If Option B/C/D: - Archive. Scheduled delivery would be configured natively in the BI tool or deferred.
Data quality notes:
- PHP serialized email and do_not_send_email columns need deserialization.
- Many start_date values are 0000-00-00 (MySQL zero-date sentinel).
- Some rows have empty recurrence fields (likely one-time or manual triggers).
- Bulk update timestamps (2025-07-09, 2023-06-29) suggest batch migrations touched many schedules.
report_master_log (8 columns) — Audit trail of manual report executions (who ran which report, when, from where).
report_master_schedule_log (5 columns) — Execution log of scheduled report deliveries (was it sent successfully, when).
Don't migrate. These are historical operational logs with no value in a new reporting system. They document DTX 1.0 usage patterns, not data that needs to be preserved.
Keep CSVs as archive. The log data could be useful for understanding which reports were most used (to prioritize DTX 2.0 report implementation).
Usage insights from the logs: - Most popular reports: report_id 1 (Account Master), 55 (Link Customer Compliance), 127 (DOC-Not in Compliance), 135 (COI-Not in Compliance) - Execution pattern: Heavy manual usage during US business hours; overnight scheduled runs - Scheduled report failures in early Jan 2023 suggest system startup issues (resolved within 2 days)
Area 1 (Users & Auth) identified and dropped several client-specific report rights:
| Right ID | Description | Area 1 Decision |
|---|---|---|
| 1000 | Special report for Vornado | Dropped (per-client hack) |
| 1002 | Special report for ProSight | Dropped |
| 1003 | Special report for Ferrandino | Dropped |
| 1004 | Special report for ELS | Dropped |
| 1005 | Special report for Toyota Financial | Dropped |
| 1008 | Edit Management Reports | Dropped |
| 1009 | Special filters for Vornado | Dropped |
| 217 | View complete report field definition | Dropped |
Confirmed: All client-specific report rights were correctly dropped in Area 1. These were per-client customizations baked into the rights system. DTX 2.0's permission model (Permission + UserPermission) handles access control at a more granular level.
If DTX 2.0 builds a reporting system, report access should be controlled via the standard permission system (resource='reports', action='read'/'manage') rather than per-client right IDs.
| Source Area | Reference | Resolution |
|---|---|---|
| Area 7: Extraction/OCR | "Extraction metrics for reports" | Resolved — Extraction data migrated in Area 7. Available for reporting via Extraction model queries. |
| Area 8: Notifications | "Notification metrics for reports" | Resolved — Notification data will be generated by DTX 2.0's notification system. No historical notification metrics to migrate. |
| Area 11: File Repository | "Repo metrics, billing data for reports" | Resolved — repo_throughput and billing_data not migrated (denormalized snapshots). DTX 2.0 calculates metrics from source data. |
| Area 6: Requirements | "Compliance metrics depend on COIRiskProfile templates" | Resolved — COIRiskProfile templates migrated in Area 6. Available for compliance calculations. |
| # | Issue | Impact | Mitigation |
|---|---|---|---|
| 1 | No reporting models in DTX 2.0 | Entire reporting subsystem must be built or replaced. Blocks all migration decisions. | Decision #1 must be resolved first. |
| 2 | SQL fragments in report_field_definition reference DTX 1.0 schema | Cannot be directly ported. select_expression, table_name, table_join columns contain old table/column names. |
If building DTX 2.0 reporting, rewrite all field definitions against Django ORM. Use metadata (name, label, category) as requirements. |
| 3 | PHP serialized emails in report_master_schedule (email, do_not_send_email) |
Requires deserialization. Some contain "N;" (PHP null). |
Use phpserialize Python package. |
| 4 | MySQL zero-dates in report_master_schedule.start_date (0000-00-00) |
Invalid for PostgreSQL. | Map to NULL or migration date. |
| 5 | 93-column report_data table with mixed encodings | "yes"/"no" strings AND "1"/"0" integers for booleans. -1 sentinel for no risk profile. |
Not migrated (Decision #4), but useful as field requirements reference. |
| 6 | report_field_definition SQL template placeholders (__CUSTOMER_SELECTOR_FROM__, etc.) |
Proprietary template system tied to DTX 1.0 query builder. | Document for reference; do not attempt to port. |
| 7 | Custom reports are customer-specific | Custom reports (category 5) have customer_id≠0 and no internal_name. Need Account remapping if migrated. | Only relevant if Option A chosen (Decision #1). |
| # | Action | Blocking? |
|---|---|---|
| 1 | Team decision on reporting architecture (Decision #1: Option A/B/C/D) | Yes — blocks all other migration work in this area. |
| 2 | Inventory most-used reports: Analyze report_master_log for report execution frequency to prioritize which reports to build/configure first in DTX 2.0. | No — but valuable for prioritization. |
| 3 | Extract field requirements from report_field_definition: Create a clean field inventory (name, label, category, display_format) stripped of SQL fragments, as a requirements spec for the new reporting system. | No — but useful for any option. |
| 4 | Identify active scheduled reports: Query report_master_schedule for non-empty recurrence + recent schedule_log activity to find schedules that are still actively delivering reports. | No — but needed if Option A is chosen. |
Step 0: Team decides reporting architecture (Decision #1) — BLOCKING
↓
Option A: Build DTX 2.0 reporting
Step 1: Create new Report/ReportSchedule models
Step 2: Migrate report + report_category as seed data
Step 3: Rewrite report_field_definition against DTX 2.0 schema
Step 4: Migrate report_definition (field ordering)
Step 5: Migrate report_master_schedule (with PHP deserialize)
Step 6: Skip report_data, report_master_log, report_master_schedule_log
Option B: External BI tool
Step 1: Configure BI tool against DTX 2.0 PostgreSQL
Step 2: Recreate report definitions natively
Step 3: Archive all 8 tables
Option C/D: Defer or ad-hoc export
Step 1: Archive all 8 tables
Step 2: Use field inventory as future requirements
| Area | Reference | Status |
|---|---|---|
| Area 1: Users & Auth | Client-specific report rights (1000–1009, 217) | Resolved — all dropped in Area 1 |
| Area 2: Customers/Accounts | report.customer_id, report_data.SO_ID → Account |
Resolved — use Area 2 mapping if Option A |
| Area 3: Vendors/Records | report_data.PH_ID → Record |
Resolved — use Area 3 mapping |
| Area 6: Requirements | Compliance metrics depend on COIRiskProfile | Resolved — templates migrated in Area 6 |
| Area 7: Extraction | Extraction metrics for reports | Resolved — data available via Extraction model |
| Area 8: Notifications | Notification metrics | Resolved — generated by DTX 2.0 notification system |
| Area 11: File Repository | repo_throughput, billing_data | Resolved — not migrated (computed snapshots) |
| # | Decision | Status | Approach |
|---|---|---|---|
| 1 | Reporting system architecture — Team decision required | Documented | See decision details below |
| 2 | report + report_category → Report templates | Documented | See decision details below |
| 3 | report_field_definition + report_definition → Report builder configuration | Documented | See decision details below |
| 4 | report_data → Don't migrate (materialized view) | Documented | See decision details below |
| 5 | report_master_schedule → Scheduled report delivery | Documented | See decision details below |
| 6 | report_master_log + report_master_schedule_log → Don't migrate | Documented | See decision details below |
| 7 | Client-specific report rights from Area 1 | Documented | See decision details below |
| 8 | Deferred references from other areas | Documented | See decision details below |