Investigate how to integrate new FRA report type into data parsing pipeline #3400

lhuxraft · 2025-01-02T16:52:19Z

Background

This spike determines the best approach for integrating the new FRA report type into the data parsing pipeline. It focuses on the requirements for handling the new report type and on how the current parsing logic must change to accommodate it. The investigation should clarify whether new parsing functions or updates to existing logic are necessary, and how to handle reparsing of FRA files gracefully.

Acceptance Criteria

A clear understanding of the changes required to the parsing pipeline to handle the new FRA report type.
A proposed approach for implementing the parsing logic for FRA files (e.g., adding a dedicated function vs. extending the current one).
A list of potential challenges or technical concerns, including how to handle reparsing.
Documentation of any recommended changes to the DataFile model or related components.
A decision on how a new FRA schema definition is implemented for parsing.

Tasks

Review the current data parsing pipeline and identify areas that need to be modified for the new FRA report type.
Investigate the best approach for integrating the new FRA sections into the existing parsing functions (e.g., should parse_FRA() be created, or should we extend parse_datafile()?).
Determine if reparsing logic needs to be updated to handle FRA files differently from other data files.
Explore whether the existing RowSchema and Field classes can support creation of FRA schema definitions, or whether modifications or new classes are needed.
Document any technical constraints, limitations, or challenges that may impact the implementation.
Explore creating a separate Python package for schema_defs to support versioning. This relates to "Enhance seed_db to be schema-aware and support versioning" #3168: add versioning to records (T1 records, ParserError, other objects).

Supporting Documentation

Error Report Mockup. Note that the validation rules section (below) is for reference only and is not a specification.

Frontend screenshots:

Open Questions

Does creating a parse_FRA() function make more sense than trying to shoehorn FRA functionality into parse_datafile()?
- Alternative: refactor parse_datafile() to be modular, with a specific parameter/identifier for FRA or TANF, and split off more functions underneath parse_datafile() to handle the differing types.
Is adding an FRA schema definition sufficient for parsing, given the three undefined report types?
- Creating specific subclasses for ParserError, Field, etc. appears to be necessary.
- These subclasses may need to call upon a parent class for the file type and/or the schema_def itself.
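
To make the first question concrete, here is a minimal sketch of the "modular parse_datafile()" option: one entry point that dispatches on a program-type identifier to type-specific functions. Everything here is an assumption for illustration. The function names `parse_tanf_lines` and `parse_fra_lines`, the `PARSERS` registry, and the simplified signature (a string identifier plus a list of lines) are hypothetical and do not reflect the current signature of parse_datafile().

```python
from typing import Callable, Dict, List


def parse_tanf_lines(lines: List[str]) -> dict:
    """Hypothetical placeholder for the existing TANF-style parsing path."""
    return {"program": "TANF", "records": len(lines)}


def parse_fra_lines(lines: List[str]) -> dict:
    """Hypothetical placeholder for a dedicated FRA parsing path."""
    return {"program": "FRA", "records": len(lines)}


# Registry mapping a program-type identifier to its parsing function.
# Supporting another report type means adding one entry here.
PARSERS: Dict[str, Callable[[List[str]], dict]] = {
    "TANF": parse_tanf_lines,
    "FRA": parse_fra_lines,
}


def parse_datafile(program_type: str, lines: List[str]) -> dict:
    """Single entry point that delegates to the type-specific function."""
    try:
        parser = PARSERS[program_type]
    except KeyError:
        raise ValueError(f"Unsupported program type: {program_type}")
    return parser(lines)
```

With this shape, callers keep using one function, and the choice between "a dedicated parse_FRA()" and "extending parse_datafile()" becomes a question of what sits behind the registry rather than what callers see.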

Ticket/work pairing

Tech Memo on refactoring parse_datafile() to move away from the functional programming approach toward a class-based one, using a parent "Parser" class to handle datafile types

Cover the broad changes needed in unit testing
Brief documentation of how error report generation will differ for FRA using Version A (inline errors on existing upload): different rows/columns, etc.
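
As a starting point for that memo, the class-based direction could look roughly like the sketch below: a parent class owns the shared loop and error collection, and each datafile type overrides only the per-line logic. The class names (`BaseParser`, `FixedWidthParser`, `FRAParser`), the two-column comma-separated FRA layout, and the two-character record-type prefix are all assumptions made for illustration, not the actual file formats or the project's classes.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class BaseParser(ABC):
    """Hypothetical parent class: shared iteration and error bookkeeping."""

    def __init__(self, lines: List[str]):
        self.lines = list(lines)
        self.errors: List[Tuple[int, str]] = []  # (line number, message)

    def parse(self) -> List[dict]:
        records = []
        for line_number, line in enumerate(self.lines, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                records.append(self.parse_line(line))
            except ValueError as exc:
                self.errors.append((line_number, str(exc)))
        return records

    @abstractmethod
    def parse_line(self, line: str) -> dict:
        """Turn one raw line into a record, or raise ValueError."""


class FixedWidthParser(BaseParser):
    # Assumption: the first two characters identify the record type.
    def parse_line(self, line: str) -> dict:
        if len(line) < 2:
            raise ValueError("line too short")
        return {"record_type": line[:2], "body": line[2:]}


class FRAParser(BaseParser):
    # Assumption: FRA rows are comma-separated with exactly two columns.
    def parse_line(self, line: str) -> dict:
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            raise ValueError(f"expected 2 columns, got {len(parts)}")
        return {"first": parts[0], "second": parts[1]}
```

One consequence for unit testing: the shared behavior in `parse()` can be tested once against a trivial subclass, while each subclass only needs tests for its own `parse_line()`.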

Tech memo on parser engine versioning (via python package or other)

Microservice app exploration? Separate git repo, divorced from TANF-app, versioning managed by DRF?
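
Independent of where the parser engine ends up living, versioning the records themselves might be sketched as follows: each parsed record is stamped with the schema version that produced it, and reparsing is triggered when that version falls behind. The names `VersionedRecord`, `CURRENT_SCHEMA_VERSION`, and `needs_reparse`, as well as the "major version bump means reparse" rule, are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical version of the schema definitions currently deployed.
CURRENT_SCHEMA_VERSION = "2.0.0"


@dataclass
class VersionedRecord:
    record_type: str
    data: dict
    schema_version: str  # version of the schema that parsed this record


def major(version: str) -> int:
    """Return the major component of a 'major.minor.patch' string."""
    return int(version.split(".")[0])


def needs_reparse(record: VersionedRecord,
                  current: str = CURRENT_SCHEMA_VERSION) -> bool:
    # Assumed rule: only a major version bump invalidates stored records.
    return major(record.schema_version) < major(current)


def stale_records(records: List[VersionedRecord]) -> List[VersionedRecord]:
    """Select the records that were parsed under an older major version."""
    return [r for r in records if needs_reparse(r)]
```

The same stamp could be applied to error objects so that an error report can state which version of the rules produced it.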

Tech Memo on new RowSchema and Field classes, along with other new subclasses and rework
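
To illustrate what a schema definition for a delimited report might need from these classes, here is a deliberately simplified sketch. `SketchField` and `SketchRowSchema` are hypothetical stand-ins written for this example. They are not the project's RowSchema and Field classes, and the column-index addressing and "required" check are assumptions about what an FRA-style schema could require.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SketchField:
    name: str
    position: int          # column index within a delimited row
    required: bool = True


@dataclass
class SketchRowSchema:
    fields: List[SketchField]

    def parse_row(self, row: List[str]) -> Tuple[dict, List[str]]:
        """Return (record, errors) for one already-split row."""
        record, errors = {}, []
        for f in self.fields:
            # Missing trailing columns are treated as empty values.
            value = row[f.position].strip() if f.position < len(row) else ""
            if f.required and not value:
                errors.append(f"{f.name} is required")
            record[f.name] = value or None
        return record, errors
```

A usage example under the same assumptions:

```python
schema = SketchRowSchema([
    SketchField("a", 0),
    SketchField("b", 1, required=False),
])
record, errors = schema.parse_row(["x"])
```

If the existing classes address fields differently from a column index, that difference is the kind of gap the memo would need to cover.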

lhuxraft added dev spike labels Jan 2, 2025

lhuxraft assigned andrew-jameson Jan 2, 2025

lhuxraft changed the title ~~Investigate correct method for parsing FRA~~ Investigate how to integrate new FRA report type into data parsing pipeline Jan 2, 2025

lhuxraft added office hours Refined labels Jan 3, 2025

andrew-jameson removed office hours Refined labels Jan 8, 2025

andrew-jameson mentioned this issue Jan 13, 2025

[Tech Memo] Refactoring backend parsing logic for FRA report integration #3416

Open

8 tasks

lhuxraft closed this as completed Jan 14, 2025

This was referenced Jan 15, 2025

[Tech memo] Subclass improvements for parsing modularity to support FRA #3426

Open

[Tech memo] Parser versioning #3429

Open
