Unveiling the Power of get_cms_meta_data() in healthyR.data

Categories: code, rtip, healthyrdata
Author: Steven P. Sanderson II, MPH
Published: May 28, 2024

Introduction

Hey, R users! 🌟 Today, we’re looking at a new addition to the healthyR.data package: the get_cms_meta_data() function. It retrieves metadata about the datasets published by CMS (Centers for Medicare & Medicaid Services) and returns it in a form that is easy to filter and analyze. Whether you’re a healthcare analyst, data scientist, or R programming fan, you should find it useful. Let’s break it down and explore how it works.

Overview of get_cms_meta_data()

The get_cms_meta_data() function retrieves metadata about CMS datasets. You can narrow the search with several parameters so the result contains only the datasets you care about. Here’s the syntax:

get_cms_meta_data(
  .title = NULL,
  .modified_date = NULL,
  .keyword = NULL,
  .identifier = NULL,
  .data_version = "current",
  .media_type = "all"
)

Arguments:

  • .title: Search by title.
  • .modified_date: Search by modified date (format: “YYYY-MM-DD”).
  • .keyword: Search by keyword.
  • .identifier: Search by identifier.
  • .data_version: Choose between “current”, “archive”, or “all”. Default is “current”.
  • .media_type: Filter by media type (“all”, “csv”, “API”, “other”). Default is “all”.

Return Value:

A tibble containing data links and relevant metadata about the datasets.

Details:

The function fetches JSON data from the CMS data URL (https://data.cms.gov/data.json, as recorded in the url attribute of the result) and extracts the relevant fields into a tidy tibble. It selects specific columns, unnests nested lists, cleans column names, and converts dates and media types into forms that are easier to analyze. The columns in the returned tibble include:

  • title
  • description
  • landing_page
  • modified
  • keyword
  • described_by
  • fn
  • has_email
  • identifier
  • start
  • end
  • references
  • distribution_description
  • distribution_title
  • distribution_modified
  • distribution_start
  • distribution_end
  • media_type
  • data_link
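To make those steps concrete, here is a minimal sketch of the same kind of pipeline run on a tiny, made-up JSON string. It is not the package's actual implementation: the JSON content, the field names mediaType and downloadURL, and the object names json_txt and catalog are assumptions chosen for illustration, and the renaming is done by hand rather than with a name-cleaning helper.

```r
library(jsonlite)
library(dplyr)
library(tidyr)

# A tiny, made-up stand-in for a catalog JSON file (not real CMS data)
json_txt <- '{
  "dataset": [
    {
      "title": "Example Dataset A",
      "modified": "2024-01-29",
      "keyword": ["Medicare", "Example"],
      "distribution": [
        {"title": "Example Dataset A", "mediaType": "text/csv",
         "downloadURL": "https://example.org/a.csv"}
      ]
    },
    {
      "title": "Example Dataset B",
      "modified": "2024-04-23",
      "keyword": ["Medicaid"],
      "distribution": [
        {"title": "Example Dataset B", "mediaType": "text/csv",
         "downloadURL": "https://example.org/b.csv"},
        {"title": "Example Dataset B (older)", "mediaType": "text/csv",
         "downloadURL": "https://example.org/b-old.csv"}
      ]
    }
  ]
}'

catalog <- fromJSON(json_txt)$dataset |>
  as_tibble() |>
  # One row per distribution; nested columns get a "distribution_" prefix
  unnest(distribution, names_sep = "_") |>
  # Tidy up the camelCase names by hand
  rename(
    media_type = distribution_mediaType,
    data_link  = distribution_downloadURL
  ) |>
  # Turn the date strings into Date objects
  mutate(modified = as.Date(modified))

glimpse(catalog)
```

The keyword column stays a list column here, just as it does in the tibble returned by get_cms_meta_data(), because each dataset can carry a different number of keywords.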

Practical Examples

Let’s see the get_cms_meta_data() function in action with a couple of examples.

Example 1: Basic Usage

First, we’ll load the necessary libraries and fetch some metadata:

# Library Loads
library(healthyR.data)
library(dplyr)

# Get data
cms_data <- get_cms_meta_data()
glimpse(cms_data)
Rows: 107
Columns: 19
$ title                    <chr> "Accountable Care Organization Participants",…
$ description              <chr> "The Accountable Care Organization Participan…
$ landing_page             <chr> "https://data.cms.gov/medicare-shared-savings…
$ modified                 <date> 2024-01-29, 2024-04-23, 2024-01-12, 2024-01-…
$ keyword                  <list> <"Medicare", "Value-Based Care", "Coordinate…
$ described_by             <chr> "https://data.cms.gov/resources/accountable-c…
$ fn                       <chr> "Shared Savings Program - CM", "Shared Saving…
$ has_email                <chr> "[email protected]", "SharedSa…
$ identifier               <chr> "https://data.cms.gov/data-api/v1/dataset/976…
$ start                    <date> 2014-01-01, 2017-01-01, 2021-01-01, 2021-01-…
$ end                      <date> 2024-12-31, 2024-12-31, 2021-12-31, 2021-12-…
$ references               <chr> "https://data.cms.gov/resources/acos-aco-part…
$ distribution_description <chr> "latest", "latest", "latest", "latest", "late…
$ distribution_title       <chr> "Accountable Care Organization Participants",…
$ distribution_modified    <date> 2024-01-29, 2024-04-23, 2024-01-12, 2024-01-…
$ distribution_start       <date> 2024-01-01, 2024-01-01, 2021-01-01, 2021-01-…
$ distribution_end         <date> 2024-12-31, 2024-12-31, 2021-12-31, 2021-12-…
$ media_type               <chr> "API", "API", "API", "API", "API", "API", "AP…
$ data_link                <chr> "https://data.cms.gov/data-api/v1/dataset/976…
# Attributes
atb <- attributes(cms_data)
atb$names
 [1] "title"                    "description"             
 [3] "landing_page"             "modified"                
 [5] "keyword"                  "described_by"            
 [7] "fn"                       "has_email"               
 [9] "identifier"               "start"                   
[11] "end"                      "references"              
[13] "distribution_description" "distribution_title"      
[15] "distribution_modified"    "distribution_start"      
[17] "distribution_end"         "media_type"              
[19] "data_link"               
atb$class
[1] "cms_meta_data" "tbl_df"        "tbl"           "data.frame"   
atb$url
[1] "https://data.cms.gov/data.json"
atb$date_retrieved
[1] "2024-05-28 10:20:18 EDT"
atb$parameters
$.data_version
[1] "current"

$.media_type
[1] "all"

$.title
NULL

$.modified_date
NULL

$.keyword
NULL

$.identifier
NULL

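In this example, we call get_cms_meta_data() with no arguments, so the defaults apply: .data_version = "current" and .media_type = "all". At the time of writing, this returned 107 rows and 19 columns. The glimpse() function from the dplyr package provides a quick overview of the data structure. The attributes of the result also record its class (cms_meta_data), the source URL, the retrieval time, and the parameters used for the call.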
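Because keyword is a list column, a common next step is to unnest it before counting or filtering. The sketch below uses a small made-up tibble (cms_like is a hypothetical name, and the rows are not real CMS data) that mimics a few of the columns shown above, so it runs without a network connection.

```r
library(dplyr)
library(tidyr)

# Made-up rows that mimic a few columns of the returned tibble
cms_like <- tibble(
  title = c("Dataset A", "Dataset B", "Dataset C"),
  keyword = list(
    c("Medicare", "Value-Based Care"),
    c("Medicare"),
    c("Medicaid", "Counties")
  ),
  media_type = c("API", "API", "csv"),
  data_link = c(
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c.csv"
  )
)

# One row per dataset-keyword pair, then count datasets per keyword
keyword_counts <- cms_like |>
  unnest(keyword) |>
  count(keyword, sort = TRUE)

# Keep only the links for rows whose media type is "API"
api_links <- cms_like |>
  filter(media_type == "API") |>
  pull(data_link)

keyword_counts
api_links
```

The same two pipelines should work on the real cms_data object, since it has keyword, media_type, and data_link columns of the same types.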

Example 2: Custom Search by Keyword and Title

Now, let’s refine our search by specifying a keyword and title:

get_cms_meta_data(
  .keyword = "nation",
  .title = "Market Saturation & Utilization State-County"
) |>
  glimpse()
Rows: 1
Columns: 19
$ title                    <chr> "Market Saturation & Utilization State-County"
$ description              <chr> "The Market Saturation and Utilization State-…
$ landing_page             <chr> "https://data.cms.gov/summary-statistics-on-u…
$ modified                 <date> 2024-04-02
$ keyword                  <list> <"National", "States & Territories", "Countie…
$ described_by             <chr> "https://data.cms.gov/resources/market-satur…
$ fn                       <chr> "Market Saturation - CPI"
$ has_email                <chr> "[email protected]"
$ identifier               <chr> "https://data.cms.gov/data-api/v1/dataset/89…
$ start                    <date> 2023-10-01
$ end                      <date> 2023-12-31
$ references               <chr> "https://data.cms.gov/resources/market-satura…
$ distribution_description <chr> "latest"
$ distribution_title       <chr> "Market Saturation & Utilization StateCounty"
$ distribution_modified    <date> 2024-04-02
$ distribution_start       <date> 2023-10-01
$ distribution_end         <date> 2023-12-31
$ media_type               <chr> "API"
$ data_link                <chr> "https://data.cms.gov/data-api/v1/dataset/890…

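In this example, we filter the metadata by the keyword “nation” and the title “Market Saturation & Utilization State-County”. Note that “nation” matched a dataset whose keyword is “National”, which suggests the keyword search is a case-insensitive partial match rather than an exact one. The native pipe operator (|>) passes the result directly into glimpse() for a quick preview.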
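If you want the same style of matching on a keyword list column in your own code, one way to do it is sketched below. This is an assumption about the matching behavior inferred from the output above, not the package's internal code; has_keyword() and the keywords list are hypothetical names with made-up values.

```r
# Made-up keyword list column, shaped like the `keyword` column above
keywords <- list(
  c("National", "States & Territories", "Counties"),
  c("Medicare", "Value-Based Care"),
  c("Hospitals", "Nationwide")
)

# TRUE when any keyword of a dataset contains the pattern, ignoring case
has_keyword <- function(kw_list, pattern) {
  vapply(
    kw_list,
    function(kw) any(grepl(pattern, kw, ignore.case = TRUE)),
    logical(1)
  )
}

has_keyword(keywords, "nation")
# TRUE FALSE TRUE
```

The logical vector it returns can be used directly inside dplyr::filter() to keep the matching rows of a tibble.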

Breaking Down the Code

Let’s walk through the first example step by step:

Basic Usage

  1. Load Libraries:

    library(healthyR.data)
    library(dplyr)

    We load the healthyR.data package to access the get_cms_meta_data() function and the dplyr package, which provides glimpse().

  2. Fetch Metadata:

    cms_data <- get_cms_meta_data()

    We call get_cms_meta_data() with no arguments, so it returns metadata for the current datasets across all media types.

  3. Preview Data:

    glimpse(cms_data)

    The glimpse() function gives us a quick look at the structure and contents of the fetched metadata.
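Once the metadata is in a tibble, ordinary dplyr verbs take over. As a small sketch of a possible fourth step, here is how you might keep only recently modified datasets and sort them newest first. The tibble is made up for illustration (cms_like and the cutoff date are assumptions), but the modified column is a Date in the real result too.

```r
library(dplyr)

# Made-up rows that mimic three columns of the returned tibble
cms_like <- tibble(
  title = c("Dataset A", "Dataset B", "Dataset C"),
  modified = as.Date(c("2024-01-29", "2024-04-23", "2024-01-12")),
  data_link = c(
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c"
  )
)

# Datasets modified on or after a cutoff date, newest first
recent <- cms_like |>
  filter(modified >= as.Date("2024-01-20")) |>
  arrange(desc(modified)) |>
  select(title, modified, data_link)

recent
```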

Conclusion

The get_cms_meta_data() function is a flexible tool for finding CMS datasets and their metadata from within R. Whether you’re looking for a specific dataset or just exploring what’s available, it returns everything in one tidy tibble that you can filter, sort, and summarize.

Try out get_cms_meta_data() in your next R project and explore the potential of CMS data with ease! Happy coding! 🚀