---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r setup, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
library(npi)
library(tibble)
data("npis")
nyc <- npis
```
# npi <img src="man/figures/logo.png" align="right" height="139" />
> Access the U.S. National Provider Identifier Registry API
<!-- badges: start -->
[![Status at rOpenSci Software Peer Review](https://badges.ropensci.org/505_status.svg)](https://github.com/ropensci/software-review/issues/505)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![lifecycle](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R-CMD-check](https://github.com/ropensci/npi/workflows/R-CMD-check/badge.svg)](https://github.com/ropensci/npi/actions)
[![Codecov test coverage](https://codecov.io/gh/ropensci/npi/branch/master/graph/badge.svg)](https://app.codecov.io/gh/ropensci/npi?branch=master)
[![DOI](https://zenodo.org/badge/122857655.svg)](https://zenodo.org/badge/latestdoi/122857655)
[![CRAN status](https://www.r-pkg.org/badges/version/npi)](https://CRAN.R-project.org/package=npi)
<!-- badges: end -->
Use R to access the U.S. National Provider Identifier (NPI) Registry API (v2.1) by the Centers for Medicare & Medicaid Services (CMS): https://npiregistry.cms.hhs.gov/. Obtain rich administrative data linked to a specific individual or organizational healthcare provider, or perform advanced searches based on provider name, location, type of service, credentials, and many other attributes. `npi` provides convenience functions for data extraction so you can spend less time wrangling data and more time putting data to work.
Analysts working with healthcare and public health data frequently need to join data from multiple sources to answer their business or research questions. Unfortunately, joining data in healthcare is hard because so few entities have unique, consistent identifiers across organizational boundaries. NPI numbers, however, do not suffer from these limitations, as all U.S. providers meeting certain common criteria must have an NPI number in order to be reimbursed for the services they provide. This makes NPI numbers incredibly useful for joining multiple datasets by provider, which is the primary motivation for developing this package.
## Installation
There are three ways to install the `npi` package:
1. Install from CRAN:
```{r install-cran, eval = FALSE}
install.packages("npi")
library(npi)
```
2. Install from [R-universe](https://ropensci.org/r-universe/):
```{r install-r-universe, eval = FALSE}
install.packages("npi", repos = "https://ropensci.r-universe.dev")
library(npi)
```
3. Install from GitHub using the `devtools` package:
```{r install-github, eval = FALSE}
devtools::install_github("ropensci/npi")
library(npi)
```
## Usage
`npi` exports four functions, all of which match the pattern "npi_*":
* `npi_search()`: Search the NPI Registry and return the response as a [tibble](https://tibble.tidyverse.org/) with high-cardinality data organized into list columns.
* `npi_summarize()`: A method for displaying a nice overview of results from `npi_search()`.
* `npi_flatten()`: A method for flattening one or more list columns from a search result, joined by NPI number.
* `npi_is_valid()`: Check the validity of one or more NPI numbers using the official NPI enumeration standard.
### Search the registry
`npi_search()` exposes nearly all of the NPPES API's [search parameters](https://npiregistry.cms.hhs.gov/registry/help-api). Let's say we wanted to find up to 10 providers with primary locations in New York City:
```{r search-nyc, eval = FALSE}
nyc <- npi_search(city = "New York City")
```
```{r print-nyc}
# Your results may differ since the data in the NPPES database changes over time
nyc
```
The full search results have four regular vector columns (`npi`, `enumeration_type`, `created_date`, and `last_updated_date`) and seven list columns. Each list column is a collection of related data:
* `basic`: Basic profile information about the provider
* `other_names`: Other names used by the provider
* `identifiers`: Other provider identifiers and credential information
* `taxonomies`: Service classification and license information
* `addresses`: Location and mailing address information
* `practice_locations`: Provider's practice locations
* `endpoints`: Details about provider's endpoints for health information exchange
A full list of the possible fields within these list columns can be found on the [NPPES API Help page](https://npiregistry.cms.hhs.gov/registry/Json-Conversion-Field-Map).
If you're comfortable [working with list columns](https://r4ds.had.co.nz/many-models.html), this may be all you need from the package. However, `npi` also provides functions that can help you summarize and transform your search results.
## Working with search results
`npi` has two main helper functions for working with search results: `npi_summarize()` and `npi_flatten()`.
### Summarizing results
Run `npi_summarize()` on your search results to see a more human-readable overview. Specifically, the function returns the NPI number, provider's name, enumeration type (individual or organizational provider), primary address, phone number, and primary taxonomy (area of practice):
```{r summarize-nyc}
npi_summarize(nyc)
```
### Flattening results
As seen above, the data frame returned by `npi_search()` has a nested structure. Although all the data in a single row relates to one NPI, each list column contains a list of one or more values corresponding to the NPI for that row. For example, a provider's NPI record may have multiple associated addresses, phone numbers, taxonomies, and other attributes, all of which live in the same row of the data frame.
Because nested structures can be a little tricky to work with, `npi` includes `npi_flatten()`, a function that transforms the data frame into a flatter (i.e., unnested and merged) structure that's easier to use. `npi_flatten()` performs the following transformations:
* unnest the list columns
* prefix the name of each unnested column with the name of its original list column
* left-join the data together by NPI
`npi_flatten()` supports a variety of approaches to flattening the results from `npi_search()`. One extreme is to flatten everything at once:
```{r flatten-all}
npi_flatten(nyc)
```
However, due to the number of fields and the large number of potential combinations of values, this approach is best suited to small datasets. More likely, you'll want to flatten a small number of list columns from the original data frame in one pass, repeating the process with other list columns you want and merging after the fact. For example, to flatten basic provider and provider taxonomy information, supply the corresponding list columns as a vector of names to the `cols` argument:
```{r flatten-two}
# Flatten basic provider info and provider taxonomy, preserving the relationship
# of each to NPI number and discarding other list columns.
npi_flatten(nyc, cols = c("basic", "taxonomies"))
```
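The "merge after the fact" step can be sketched with base R's `merge()`. The two toy data frames below are hypothetical stand-ins for the results of two separate `npi_flatten()` calls; the NPI values and column contents are made up for illustration, and real output has many more columns.

```r
# Hypothetical stand-ins for two separately flattened results.
# In practice these might come from calls such as
# npi_flatten(nyc, cols = "basic") and npi_flatten(nyc, cols = "taxonomies").
basic_flat <- data.frame(
  npi = c(1111111111, 2222222222),        # made-up NPI values
  basic_last_name = c("DOE", "ROE"),
  stringsAsFactors = FALSE
)
taxonomy_flat <- data.frame(
  npi = c(1111111111, 1111111111, 2222222222),
  taxonomies_code = c("A", "B", "C"),     # placeholder codes
  stringsAsFactors = FALSE
)

# Left-join on the shared NPI column. A provider with several taxonomies
# appears once per taxonomy in the merged result.
merged <- merge(basic_flat, taxonomy_flat, by = "npi", all.x = TRUE)
merged
```

Flattening one list column at a time and joining on `npi` keeps each intermediate table small, which makes it easier to see where row counts multiply.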
### Validating NPIs
Just like credit card numbers, NPI numbers can be mistyped or corrupted in transit. Also like credit card numbers, officially issued NPI numbers include a [check digit](https://en.wikipedia.org/wiki/Check_digit) for error-checking purposes. Use `npi_is_valid()` to check whether an NPI number you've encountered is validly constructed:
```{r valid_npi_ex}
# Validate NPIs
npi_is_valid(1234567893)
npi_is_valid(1234567898)
```
Note that this function doesn't check whether the NPI numbers are activated or deactivated (see [#22](https://github.com/ropensci/npi/issues/22#issuecomment-787642817)). It merely checks for the number's consistency with the NPI specification. As such, it can help you detect and handle data quality issues early.
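For intuition, the check-digit test can be sketched in base R. NPIs use the Luhn algorithm, computed as if the number carried the prefix 80840, which amounts to adding a constant of 24 to the Luhn sum of the first nine digits. `luhn_npi_ok()` is a hypothetical helper written for this illustration; it is not part of `npi`, and `npi_is_valid()` may be implemented differently.

```r
# Hypothetical helper (not part of npi): Luhn check for 10-digit NPI strings.
luhn_npi_ok <- function(x) {
  x <- as.character(x)
  vapply(x, function(s) {
    # Anything that is not exactly 10 digits fails immediately.
    if (!grepl("^[0-9]{10}$", s)) return(FALSE)
    d <- as.integer(strsplit(s, "")[[1]])
    body <- d[1:9]
    # Double every second digit, starting from the rightmost of the first nine.
    idx <- seq(9, 1, by = -2)
    doubled <- body[idx] * 2
    # Summing the digits of a two-digit product equals subtracting 9.
    doubled <- ifelse(doubled > 9, doubled - 9, doubled)
    # 24 accounts for the implied 80840 prefix.
    total <- 24 + sum(doubled) + sum(body[-idx])
    check <- (10 - total %% 10) %% 10
    check == d[10]
  }, logical(1), USE.NAMES = FALSE)
}

luhn_npi_ok(c("1234567893", "1234567898"))
#> [1]  TRUE FALSE
```

Passing NPIs as character strings avoids any surprises from numeric formatting.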
## Set your own user agent
A [user agent](https://en.wikipedia.org/wiki/User_agent) is a way for the software interacting with an API to tell it who or what is making the request. This helps the API's maintainers understand what systems are using the API. By default, when `npi` makes a request to the NPPES API, the request header references the name of the package and the URL for the repository (e.g., '`r paste(paste0("npi/", utils::packageVersion("npi")), "(https://github.com/ropensci/npi)")`'). If you want to set a custom user agent, update the value of the `npi_user_agent` option. For example, for version 1.0.0 of an app called "my_app", you could run the following code:
```{r, eval = FALSE}
options(npi_user_agent = "my_app/1.0.0")
```
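Because `npi_user_agent` is an ordinary R option, base R's `options()` and `getOption()` can be used to confirm the value and to restore the previous setting afterwards. A minimal sketch, reusing the same hypothetical "my_app/1.0.0" string:

```r
# Set the option and keep the previous value so it can be restored later.
old <- options(npi_user_agent = "my_app/1.0.0")

# Confirm the value currently stored in the option.
current_agent <- getOption("npi_user_agent")
current_agent

# Restore whatever was set before (or unset the option if it was not set).
options(old)
```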
## Package Website
`npi` has a [website](https://docs.ropensci.org/npi/) with release notes, documentation on all user functions, and examples showing how the package can be used.
## Reporting Bugs
Did you spot a bug? I'd love to hear about it at the [issues page](https://github.com/ropensci/npi/issues).
## Code of Conduct
Please note that this package is released with a [Contributor
Code of Conduct](https://ropensci.org/code-of-conduct/).
By contributing to this project, you agree to abide by its terms.
## Contributing
Interested in learning how you can contribute to npi? Head over to the [contributor guide](https://docs.ropensci.org/npi/CONTRIBUTING.html)—and thanks for considering!
## How to cite this package
For the latest citation, see the [Authors and Citation](https://docs.ropensci.org/npi/authors.html) page on the package website.
## License
MIT (c) [Frank Farach](https://github.com/frankfarach)
This package's logo is licensed under [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/deed.en) and co-created by [Frank Farach](https://github.com/frankfarach) and [Sam Parmar](https://github.com/parmsam). The logo uses a modified version of an [image](https://commons.wikimedia.org/wiki/File:Rod_of_Asclepius_(Search).svg) of the [Rod of Asclepius](https://en.wikipedia.org/wiki/Rod_of_Asclepius) and a magnifying glass that is attributed to Evanherk, GFDL.