Distribution Statistics with {TidyDensity}

code
rtip
tidydensity
Author

Steven P. Sanderson II, MPH

Published

December 21, 2022

Introduction

If you’re working with statistical distributions in R, you may be interested in the {TidyDensity} package. This package provides a set of functions for creating, manipulating, and visualizing probability distributions in a tidy format. One of these functions is tidy_chisquare(), which allows you to create a chi-square distribution with a specified number of degrees of freedom and a non-centrality parameter.

Once you’ve created a chi-square distribution using tidy_chisquare(), you may want to get some summary statistics about the distribution. This is where the util_chisquare_stats_tbl() function comes in handy. This function takes a chi-square distribution (created with tidy_chisquare()) as input and returns a tibble with several statistics about the distribution.

Some of the statistics included in the table are:

  • Mean: The mean of the chi-square distribution, also known as the expected value.
  • Variance: The variance of the chi-square distribution, which is a measure of how spread out the data is.
  • Skewness: The skewness of the chi-square distribution, which is a measure of the symmetry of the data.
  • Kurtosis: The kurtosis of the chi-square distribution, which is a measure of the peakedness of the data.

To use the util_chisquare_stats_tbl() function, you’ll need to install and load the {TidyDensity} package first. Then, you can create a chi-square distribution using tidy_chisquare() and pass it to util_chisquare_stats_tbl() like this:

# install and load TidyDensity
install.packages("TidyDensity")
library(TidyDensity)
library(dplyr)

# create a chi-square distribution with 5 degrees of freedom
distribution <- tidy_chisquare(.df = 5)

# get statistics about the distribution
util_chisquare_stats_tbl(distribution) |>
  glimpse()

The output will be a table with the mean, variance, skewness, and kurtosis of the chi-square distribution. These statistics can be useful for understanding the characteristics of the distribution and making statistical inferences.

Overall, the {TidyDensity} package is a useful tool for working with statistical distributions in R. The util_chisquare_stats_tbl() function is just one of many functions available in the package that can help you analyze and understand your data. Give it a try and see how it can help with your statistical analysis!

Function

Let’s take a look at the full function call.

util_chisquare_stats_tbl(.data)

Let’s take a look at the arguments that get supplied to the function parameters.

  • .data - The data being passed from a tidy_ distribution function.

Example

Now for a full example with output.

library(TidyDensity)
library(dplyr)

tidy_chisquare() %>%
  util_chisquare_stats_tbl() %>%
  glimpse()
Rows: 1
Columns: 17
$ tidy_function     <chr> "tidy_chisquare"
$ function_call     <chr> "Chisquare c(1, 1)"
$ distribution      <chr> "Chisquare"
$ distribution_type <chr> "continuous"
$ points            <dbl> 50
$ simulations       <dbl> 1
$ mean              <dbl> 1
$ median            <dbl> 0.3333333
$ mode              <chr> "undefined"
$ std_dv            <dbl> 1.414214
$ coeff_var         <dbl> 1.414214
$ skewness          <dbl> 2.828427
$ kurtosis          <dbl> 15
$ computed_std_skew <dbl> 1.132669
$ computed_std_kurt <dbl> 3.894553
$ ci_lo             <dbl> 0.002189912
$ ci_hi             <dbl> 6.521727

Voila!