DETECT 5-Year Follow-Up Evaluation

R
FastLink
Public Health
Elder Mistreatment
EMS
APS
Record Linkage
Data Cleaning
Fuzzy Matching
Reproducible Analysis
Health Informatics
Manuscript
Data Analysis
Work Product
Academic Research
Long
Mentorship
Extended public health informatics project analyzing the DETECT elder mistreatment screening tool embedded in EMS software. Developed a scalable fuzzy matching pipeline for fragmented APS records, linked subjects across the EMS and APS systems, and conducted stratified agreement and temporal trend analyses. Co-authored an academic manuscript and mentored team members through high-volume data wrangling and linkage design.
Author

Morrigan M.

Published

August 1, 2025

Modified

July 14, 2025

Project Summary

This project continued the evaluation of DETECT, a screening questionnaire that is embedded in MedStar’s EMS electronic patient care record (ePCR) software and designed to help medics identify potential elder mistreatment (EM). The analysis covered EMS records from July 1, 2019 to May 31, 2023, matched to Texas Adult Protective Services (APS) data that was originally spread across 5 relational CSV files. The goal was to evaluate screening fidelity, reporting intent, and APS follow-through over a longer time horizon.

Tech Stack & Project Constraints

All work was completed on a single local workstation with no access to server-grade infrastructure or high-performance computing. This required careful memory optimization, modular workflow design, and rigorous reproducibility — particularly while managing over 500K APS records and 92K EMS records.

Core Tools & Libraries (R)

  • tidyverse, here, janitor — wrangling, reproducible structure
  • fastLink — high-volume probabilistic record linkage
  • codebookr, data.table, readxl — documentation and input handling
  • irr — interrater reliability & agreement metrics
  • ggplot2, patchwork, consort — statistical visualizations and diagrams

Collaboration & Cloud Tools

  • GitHub — team-based code development and versioning
  • OneDrive — secure PHI-compliant cloud data storage
  • Quarto — reproducible and annotated reports
  • Microsoft Word, Canva, Visio — manuscript drafting and presentation

My Contributions

APS Data Cleaning & Standardization

  • Cleaned and validated APS subject identifiers
  • Applied USPS-linked ZIP validation and state-county mappings
  • Parsed and restructured race/ethnicity and address strings
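
As a rough illustration of the ZIP cleaning step, the base R sketch below normalizes ZIP and ZIP+4 strings to five digits and flags anything else as missing. It only checks the format; it does not perform the USPS lookup described above. The helper name `normalize_zip` and the sample values are hypothetical.

```r
# Hypothetical helper: keep the 5-digit ZIP from ZIP or ZIP+4 strings,
# and return NA for anything that does not look like a ZIP code.
normalize_zip <- function(x) {
  x <- trimws(as.character(x))
  is_zip <- grepl("^[0-9]{5}(-?[0-9]{4})?$", x)
  ifelse(is_zip, substr(x, 1, 5), NA_character_)
}

# Made-up inputs covering common formats and failure cases
raw_zip <- c("76104", "76104-1234", " 761041234", "7610", "ABCDE", NA)
clean_zip <- normalize_zip(raw_zip)
```

Values that fail the format check become NA rather than being silently kept, so they can be reviewed or excluded before matching.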

Subject Identification & Deduplication

  • Performed within-set fastLink matching on APS data to form unique APS subject IDs
  • Chunked matching by identifier completeness to manage memory constraints
  • Reduced 378K APS identifiers to 370K deduplicated IDs (~2% reduction)
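
One way to picture chunking by identifier completeness is sketched below in base R: each record is labeled with the set of identifiers it actually has, and records are split on that label so every chunk can be matched on a consistent set of variables. This is a minimal sketch under assumptions, not the project's code; the column names, the `completeness` label, and the toy records are all hypothetical.

```r
# Toy records with missing identifiers (all values are made up)
records <- data.frame(
  row_id     = 1:6,
  first_name = c("ANN", "ANN", NA, "BOB", "BOB", "CARL"),
  last_name  = c("LEE", "LEE", "LEE", "RAY", "RAY", NA),
  dob        = as.Date(c("1940-01-01", NA, "1940-01-01",
                         "1938-05-02", "1938-05-02", NA)),
  stringsAsFactors = FALSE
)

id_vars <- c("first_name", "last_name", "dob")

# Logical matrix: TRUE where an identifier is present
present <- !is.na(records[id_vars])

# Label each row with the identifiers it has, e.g. "first_name+last_name"
records$completeness <- unname(apply(
  present, 1, function(x) paste(id_vars[x], collapse = "+")
))

# One data frame per completeness pattern; each chunk could then be
# passed to the matcher separately, using only the variables it has
chunks <- split(records, records$completeness)
```

Smaller chunks keep each matching run within memory, at the cost of a later step that reconciles IDs across chunks.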

EMS–APS Linkage

  • Benchmarked cross-set matching variables and created optimized maps between EMS and APS subjects
  • Built temporal linkage models connecting EMS responses to APS intakes and investigations
  • Consolidated investigation-level allegations and standardized APS outcome flags
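
The temporal linkage idea can be sketched as a join on the shared subject ID followed by a date-window filter. The base R example below is illustrative only: the 30-day window, the column names, and the toy data are assumptions, not the rules used in the project.

```r
# Toy EMS responses and APS intakes keyed by a shared subject ID (made up)
ems <- data.frame(
  subject_id    = c(1, 1, 2),
  response_id   = c("r1", "r2", "r3"),
  response_date = as.Date(c("2020-01-10", "2020-06-01", "2021-03-15")),
  stringsAsFactors = FALSE
)
aps <- data.frame(
  subject_id  = c(1, 2, 2),
  intake_id   = c("i1", "i2", "i3"),
  intake_date = as.Date(c("2020-01-12", "2021-03-10", "2021-04-01")),
  stringsAsFactors = FALSE
)

window_days <- 30  # hypothetical follow-up window

# All response/intake pairs for the same subject
pairs <- merge(ems, aps, by = "subject_id")

# Days from the EMS response to the APS intake (negative = intake came first)
pairs$days_to_intake <- as.integer(
  difftime(pairs$intake_date, pairs$response_date, units = "days")
)

# Keep intakes that occur on or after the response, within the window
linked <- pairs[pairs$days_to_intake >= 0 &
                  pairs$days_to_intake <= window_days, ]
```

Keeping `days_to_intake` on the linked rows makes it easy to test how sensitive downstream results are to the chosen window.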

Subject-Level Data Construction

  • Consolidated demographic fields using reviewed heuristics (most common value, then first-in)
  • Derived subject-level outcomes (e.g. any reported mistreatment)
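
A minimal base R sketch of a "most common value, then first-in" rule follows. It assumes "first-in" means that ties are broken by the value recorded first, which is one reading of the heuristic; the function name `consolidate_value` and the toy data are hypothetical.

```r
# Pick the most frequent non-missing value; on ties, keep the value
# that appears first in the input order.
consolidate_value <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  # Levels follow first appearance, so which.max() breaks ties "first-in"
  counts <- table(factor(x, levels = unique(x)))
  names(counts)[which.max(counts)]
}

# Toy record-level data (made up): several rows per subject
records <- data.frame(
  subject_id = c(1, 1, 1, 2, 2, 3),
  race       = c("White", "Black", "White", "Asian", "Black", NA),
  stringsAsFactors = FALSE
)

# One consolidated value per subject
subject_race <- vapply(
  split(records$race, records$subject_id),
  consolidate_value,
  character(1)
)
```

Subjects with no recorded value stay NA instead of receiving an imputed category.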

Analytical Work

  • Explored response patterns, demographic stratifications, and temporal trends
  • Assessed agreement between screening results, medic intent, APS intake, and APS outcome
  • Constructed multi-layer consort flow diagrams
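
To make the agreement step concrete, the sketch below computes an unweighted Cohen's kappa between two binary indicators in base R. In practice the `irr` package listed in the tool stack provides `kappa2()` for this; the hand-rolled version here just shows the arithmetic. The function name, the variable names, and the data are hypothetical.

```r
# Unweighted Cohen's kappa for two ratings of the same subjects
cohen_kappa <- function(a, b) {
  lv  <- sort(unique(c(a, b)))
  tab <- table(factor(a, levels = lv), factor(b, levels = lv))
  p   <- tab / sum(tab)
  po  <- sum(diag(p))                  # observed agreement
  pe  <- sum(rowSums(p) * colSums(p))  # agreement expected by chance
  (po - pe) / (1 - pe)
}

# Toy indicators (made up): positive screen vs. linked APS intake
screen_positive <- c(1, 1, 0, 0, 1, 0, 1, 0, 0, 0)
aps_intake      <- c(1, 0, 0, 0, 1, 0, 1, 1, 0, 0)

kappa_screen_intake <- cohen_kappa(screen_positive, aps_intake)
```

Here the two indicators agree on 8 of 10 subjects, and kappa discounts that by the agreement expected from the marginal rates alone.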

Documentation & Mentorship

  • Authored full codebooks for derived data sets
  • Co-authored academic manuscript (currently in revision)
  • Mentored junior colleague in data wrangling, documentation, and writing process

Code Highlights

A junior colleague cleaned the MedStar data and assigned within-set subject IDs, using the process I had previously developed in the DETECT 1-Year Pilot Evaluation project.

Data Cleaning & ID Creation

File | Description
data_unique_person_01_aps_01_cleaning.qmd | APS cleaning & address standardization
data_unique_person_01_aps_03_fl_chunk_cleaning.qmd | Deduplicated APS fuzzy-match chunks
data_unique_person_01_aps_04_fl_chunk_folding.qmd | Merged final APS subject IDs

EMS–APS Matching & Linkage

File | Description
data_unique_person_02_ms-aps_01_fl_generation.qmd | Cross-dataset fuzzy matching
data_unique_person_02_ms-aps_02_fl_cleaning.qmd | Map creation between ID tiers
data_record_linkage_01_medstar_aps.qmd | Temporal linkage of EMS and APS investigations/intakes

Analysis

File | Focus | Description
analysis_01_demographics.qmd | Demographics | Cleaned, consolidated subject characteristics
analysis_02_response_patterns.qmd | Screening Responses | Stratified by demographics, intent, outcome
analysis_03_agreement.qmd | Agreement | Kappa, correlations across screening/reporting layers
analysis_04_temporal_trends.qmd | Temporal Trends | Longitudinal patterns in screening and reporting
analysis_05_consort_diagrams.qmd | Cohort Flow | Multi-layer consort tables and diagrams

Reflections

This project pushed the limits of public health data infrastructure—requiring thoughtful deduplication, linkage theory, and scalable analysis under compute constraints. I developed chunked fuzzy matching workflows and robust mapping strategies to integrate fragmented APS records with EMS screening data. As a secondary author on the manuscript, I also mentored a junior colleague through the analysis and writing process—making this not just a technical milestone, but a leadership one.
