Extract Tracker

These tables support extract tracking and handling. Links to related subjects are provided where noted.

Extract Dataset Type

This table tracks the relationship between extract files and dataset types.

extract_dataset_type
Column Name      Column Type  Column Description
extract_id       Integer      Foreign key to the Extract Tracking table.
dataset_type_id  Integer      Foreign key to the Dataset Type table.

Extract Dependency

This table tracks the interdependencies between extract files.

extract_dependency
Column Name        Column Type  Column Description
parent_extract_id  Integer      Foreign key to the Extract Tracking table. The parent extract of the relationship.
child_extract_id   Integer      Foreign key to the Extract Tracking table. The child extract of the relationship.
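
As a rough illustration of how the parent/child relationship can be queried, the sketch below builds the two tables in an in-memory SQLite database and looks up the parents of a given extract. It is not the ProcessTracker implementation: the use of `sqlite3`, the composite primary key, the helper functions `add_extract` and `parent_filenames`, and the sample filenames are all assumptions; only the table and column names come from this page.

```python
import sqlite3

# In-memory database; only the columns needed for the example are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE extract_tracking (
        extract_id INTEGER PRIMARY KEY AUTOINCREMENT,
        extract_filename VARCHAR(750) NOT NULL UNIQUE
    );
    CREATE TABLE extract_dependency (
        parent_extract_id INTEGER NOT NULL REFERENCES extract_tracking (extract_id),
        child_extract_id INTEGER NOT NULL REFERENCES extract_tracking (extract_id),
        PRIMARY KEY (parent_extract_id, child_extract_id)  -- assumed key
    );
""")


def add_extract(filename):
    """Insert an extract row and return its generated extract_id."""
    cursor = conn.execute(
        "INSERT INTO extract_tracking (extract_filename) VALUES (?)", (filename,)
    )
    return cursor.lastrowid


# Hypothetical files: order_items.csv depends on orders.csv.
parent_id = add_extract("orders.csv")
child_id = add_extract("order_items.csv")
conn.execute(
    "INSERT INTO extract_dependency (parent_extract_id, child_extract_id) VALUES (?, ?)",
    (parent_id, child_id),
)


def parent_filenames(child_filename):
    """Return the filenames of every extract the given extract depends on."""
    rows = conn.execute(
        """
        SELECT parent.extract_filename
        FROM extract_dependency AS dep
        JOIN extract_tracking AS parent ON parent.extract_id = dep.parent_extract_id
        JOIN extract_tracking AS child ON child.extract_id = dep.child_extract_id
        WHERE child.extract_filename = ?
        ORDER BY parent.extract_filename
        """,
        (child_filename,),
    ).fetchall()
    return [row[0] for row in rows]


parents = parent_filenames("order_items.csv")
```

Because both columns point back at the Extract Tracking table, the query joins that table twice, once per side of the relationship.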

Extract Process Tracking

This table tracks the association between extract files and process runs.

extract_process_tracking
Column Name                      Column Type         Column Description
extract_tracking_id              Integer             Foreign key to the Extract Tracking table.
process_tracking_id              Integer             Foreign key to the Process Tracking table.
extract_process_status_id        Integer             Status of the extract from the process run. Foreign key to the Extract Status table.
extract_process_event_date_time  Datetime/timestamp  The date/time of the status change for the extract.

Extract Status

This table is a lookup of system and user provided extract statuses.

extract_status_lkup
Column Name          Column Type                         Column Description
extract_status_id    Auto incrementing integer sequence  System key for the extract status
extract_status_name  String(75)                          Unique name of the extract status type

Some default extract status types are provided on initialization.

Default Extract Status Types
Extract Status Type  Description
initializing         The extract file is being written to and/or is not ready for use.
ready                The extract file is ready to be used.
loading              The extract file is being used/loaded by a process run.
loaded               The extract file has successfully been loaded by a process run.
archived             The extract file has successfully been archived and can only be reprocessed if moved back out of the archive location.
deleted              The extract file has successfully been removed from the archive and can no longer be retrieved.
error                Something went wrong in the writing/processing of the extract file. Until resolved, the file is unusable.

Custom extract status types can be added, but the ProcessTracker framework cannot currently use them.
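
Seeding the default status types might look something like the sketch below. It assumes an SQLite backend and a hypothetical helper named `seed_extract_statuses`; the framework's actual initialization routine is not shown on this page and may differ. The table name, column names, and status names are taken from the tables above.

```python
import sqlite3

# Default status names, in the order listed in the table above.
DEFAULT_EXTRACT_STATUSES = [
    "initializing",
    "ready",
    "loading",
    "loaded",
    "archived",
    "deleted",
    "error",
]


def seed_extract_statuses(conn):
    """Create the lookup table if needed and insert any missing defaults."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS extract_status_lkup (
            extract_status_id INTEGER PRIMARY KEY AUTOINCREMENT,
            extract_status_name VARCHAR(75) NOT NULL UNIQUE
        )
        """
    )
    # INSERT OR IGNORE skips names that already exist, so reruns are harmless.
    conn.executemany(
        "INSERT OR IGNORE INTO extract_status_lkup (extract_status_name) VALUES (?)",
        [(name,) for name in DEFAULT_EXTRACT_STATUSES],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
seed_extract_statuses(conn)
seed_extract_statuses(conn)  # second call adds nothing

status_names = [
    row[0]
    for row in conn.execute(
        "SELECT extract_status_name FROM extract_status_lkup ORDER BY extract_status_id"
    )
]
```

The unique constraint on `extract_status_name` is what makes the seeding safe to repeat.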

Extract Tracking

This table is the core of the extract tracking subsystem.

extract_tracking
Column Name                     Column Type                         Column Description
extract_id                      Auto incrementing integer sequence  System key for the extract file
extract_filename                String(750)                         The unique filename of the extract file
extract_location_id             Integer                             Where the extract file can be located. Foreign key to Location
extract_status_id               Integer                             The current status of the extract file. Foreign key to Extract Status
extract_registration_date_time  Datetime/timestamp                  The date/time that the extract was initially registered into the system.
extract_write_low_date_time     Datetime/timestamp                  The earliest derived datetime for data processed in this extract at write. Optional audit field.
extract_write_high_date_time    Datetime/timestamp                  The latest derived datetime for data processed in this extract at write. Optional audit field.
extract_write_record_count      Integer                             For the given extract file at write, the total number of records processed. Optional audit field.
extract_read_low_date_time      Datetime/timestamp                  The earliest derived datetime for data processed in this extract at read. Optional audit field.
extract_read_high_date_time     Datetime/timestamp                  The latest derived datetime for data processed in this extract at read. Optional audit field.
extract_read_record_count       Integer                             For the given extract file at read, the total number of records processed. Optional audit field.
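
To make the optional audit fields more concrete, here is a minimal sketch that registers an extract and then records the write-side low/high datetimes and record count. The SQLite backend, the helper names `register_extract` and `record_write_audit`, the ISO-8601 string storage, and the sample ids and filename are assumptions made for illustration; only the column names come from the table above.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Foreign key constraints are omitted to keep the sketch self-contained.
conn.execute(
    """
    CREATE TABLE extract_tracking (
        extract_id INTEGER PRIMARY KEY AUTOINCREMENT,
        extract_filename VARCHAR(750) NOT NULL UNIQUE,
        extract_location_id INTEGER,
        extract_status_id INTEGER,
        extract_registration_date_time TIMESTAMP NOT NULL,
        extract_write_low_date_time TIMESTAMP,
        extract_write_high_date_time TIMESTAMP,
        extract_write_record_count INTEGER,
        extract_read_low_date_time TIMESTAMP,
        extract_read_high_date_time TIMESTAMP,
        extract_read_record_count INTEGER
    )
    """
)


def register_extract(filename, location_id, status_id):
    """Insert a new extract row stamped with the current UTC time."""
    registered_at = datetime.now(timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO extract_tracking (extract_filename, extract_location_id, "
        "extract_status_id, extract_registration_date_time) VALUES (?, ?, ?, ?)",
        (filename, location_id, status_id, registered_at),
    )
    return cursor.lastrowid


def record_write_audit(extract_id, record_datetimes):
    """Store the earliest/latest datetime and count of the records written.

    record_datetimes must be a non-empty list of datetime objects.
    """
    conn.execute(
        "UPDATE extract_tracking SET extract_write_low_date_time = ?, "
        "extract_write_high_date_time = ?, extract_write_record_count = ? "
        "WHERE extract_id = ?",
        (
            min(record_datetimes).isoformat(),
            max(record_datetimes).isoformat(),
            len(record_datetimes),
            extract_id,
        ),
    )


# Hypothetical usage: location 1 and status 1 stand in for real lookup rows.
extract_id = register_extract("orders_2024_01.csv", location_id=1, status_id=1)
record_write_audit(
    extract_id,
    [datetime(2024, 1, 1), datetime(2024, 1, 31), datetime(2024, 1, 15)],
)
audit_row = conn.execute(
    "SELECT extract_write_low_date_time, extract_write_high_date_time, "
    "extract_write_record_count FROM extract_tracking WHERE extract_id = ?",
    (extract_id,),
).fetchone()
```

The read-side audit fields would follow the same pattern, populated by the process that consumes the extract rather than the one that writes it.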

Location

This table tracks extract file locations.

location_lkup
Column Name          Column Type                         Column Description
location_id          Auto incrementing integer sequence  System key for the file location
location_name        String(750)                         Unique optional name of the location. Will be derived from the filepath if not provided.
location_path        String(750)                         Unique filepath.
location_type_id     Integer                             The type of location for the given filepath. Foreign key to Location Type.
location_file_count  Integer                             The number of files currently in the given location.
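
This page does not say how a missing location name is derived from the filepath. One plausible approach, shown here purely as an assumption, is to fall back to the final component of the path. The function name `derive_location_name` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def derive_location_name(location_path, location_name=None):
    """Return the given name, or fall back to the last component of the path.

    Assumed rule for illustration; the framework's real derivation may differ.
    """
    if location_name:
        return location_name
    # PurePosixPath ignores a trailing slash, so "/data/extracts/" -> "extracts".
    last_component = PurePosixPath(location_path).name
    # A bare root such as "/" has no final component; keep the path itself.
    return last_component or location_path


derived_name = derive_location_name("/data/extracts/")
explicit_name = derive_location_name("/data/extracts", "Nightly Extracts")
```

Since `location_name` is unique, a real derivation would also need to handle two different paths that end in the same directory name, which this sketch does not attempt.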

Location Type

This table tracks extract file location types.

location_type_lkup
Column Name         Column Type                         Column Description
location_type_id    Auto incrementing integer sequence  System key for the location type
location_type_name  String(25)                          The unique name of the type of location.

Some default location types are provided on initialization.

Default Location Types
Location Type     Description
S3                S3 bucket location
Local Filesystem  Local filesystem location
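
A filepath could be mapped to one of these default types along the following lines. This is only a sketch under an assumed rule (an `s3://` scheme means S3, anything else is treated as local); the function name `guess_location_type` and the sample paths are hypothetical, and the framework's own detection logic is not described on this page.

```python
from urllib.parse import urlparse


def guess_location_type(location_path):
    """Map a filepath to one of the two default location type names.

    Assumed rule: an "s3" URL scheme means S3; everything else is local.
    """
    scheme = urlparse(location_path).scheme.lower()
    return "S3" if scheme == "s3" else "Local Filesystem"


s3_type = guess_location_type("s3://my-bucket/extracts")
local_type = guess_location_type("/data/extracts")
```

The returned strings match the `location_type_name` values in the table above, so they could be used to look up the corresponding `location_type_id`.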