
POINT: Pipeline for Offline Conversion and Integration of Geocodes and Neighborhood Data


AUTHORS

Guo KS, McCoy AB, Reese T, Wright A, Rosenbloom T, Liu S, Russo EM, Steitz BD. Applied Clinical Informatics. 2023 8 4; ().

ABSTRACT

Objective: Geocoding, the process of converting addresses into precise geographic coordinates, allows researchers and health systems to obtain neighborhood-level estimates of social determinants of health. This information supports opportunities to personalize care and interventions for individual patients based on the environments where they live. We developed an integrated offline geocoding pipeline that streamlines the process of obtaining address-based variables and can be integrated into existing data processing pipelines.

Materials and Methods: POINT is a web-based, containerized application for geocoding addresses that can be deployed offline and made available to multiple users across an organization. The application can be used through either a graphical user interface (GUI) or an application programming interface (API) to query geographic variables by census tract without exposing sensitive patient data. We evaluated the application's performance using two datasets: one consisting of 1 million nationally representative addresses sampled from Open Addresses, and the other consisting of 3,096 previously geocoded patient addresses.

Results: In the Open Addresses and patient address datasets, 99.4% and 99.8% of addresses, respectively, were geocoded successfully. Tract assignment was concordant with the reference in more than 90% of addresses for both datasets. Among successful geocodes, median (IQR) distances from reference coordinates were 52.5 (26.5-119.4) meters and 14.5 (10.9-24.6) meters for the two datasets, respectively.

Discussion: POINT successfully geocodes more addresses than existing solutions, including the United States Census Bureau's official geocoder, and yields similar accuracy. Addresses are protected health information and cannot be shared with common online geocoding services. POINT is an offline solution that scales to multiple users and integrates downstream mapping to neighborhood-level variables, with a pipeline that allows users to incorporate additional datasets as they become available.

Conclusion: As health systems and researchers continue to explore and improve health equity, it is essential to quickly and accurately obtain neighborhood variables in a HIPAA-compliant way.
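The evaluation metrics reported in the Results (geocoding success rate, tract concordance, and median (IQR) distance from reference coordinates) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The record layout, the function names `haversine_m` and `summarize`, the use of great-circle (haversine) distance, and the choice to compute tract concordance among successfully geocoded addresses are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import statistics

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize(records):
    """Summarize geocoding results against reference data.

    Each record is a dict with hypothetical keys:
      'geocoded'        (lat, lon) tuple, or None if geocoding failed
      'reference'       (lat, lon) tuple of the reference coordinates
      'tract'           census tract assigned by the geocoder (or None)
      'reference_tract' census tract from the reference data
    Needs at least two successful geocodes to compute quartiles.
    """
    successes = [r for r in records if r["geocoded"] is not None]
    distances = [
        haversine_m(*r["geocoded"], *r["reference"]) for r in successes
    ]
    # Quartiles of the distance distribution (Q1, median, Q3).
    q1, median, q3 = statistics.quantiles(distances, n=4)
    # Assumption: concordance is measured among successful geocodes only.
    concordant = sum(r["tract"] == r["reference_tract"] for r in successes)
    return {
        "success_rate": len(successes) / len(records),
        "tract_concordance": concordant / len(successes),
        "median_m": median,
        "iqr_m": (q1, q3),
    }
```

Calling `summarize` on a list of such records returns the four summary values in one dictionary. Quartile conventions differ between tools, so the IQR from `statistics.quantiles` may not match other software exactly on small samples.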


