Takes a numeric column from the supplied data and returns a tibble with the winsorized values added as a new column.

Usage

hai_winsorized_truncate_augment(.data, .value, .fraction, .names = "auto")

Arguments

.data

The data being passed that will be augmented by the function.

.value

This is passed to rlang::enquo() to capture the vectors you want to augment.

.fraction

A positive fraction between 0 and 0.5 that is passed to the probs parameter of stats::quantile().

.names

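The default is "auto". With the default, the new column is named by prefixing the original column name with winsor_trunc_, as in the example output (winsor_trunc_a).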

Value

An augmented tibble: the original data with the winsorized column added.

Details

Takes a numeric vector and returns a winsorized vector in which values that fall below a lower quantile or above an upper quantile, both determined by .fraction, are truncated to that quantile. The intent of winsorization is to limit the effect of extreme values.
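
The truncation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name winsorize_truncate_sketch is hypothetical, and it assumes the cutoffs are the quantiles at .fraction and 1 - .fraction.

```r
# Hypothetical helper for illustration only; not part of the package.
# Assumption: cutoffs are the quantiles at `fraction` and `1 - fraction`.
winsorize_truncate_sketch <- function(x, fraction = 0.05) {
  stopifnot(is.numeric(x), fraction > 0, fraction < 0.5)

  # Lower and upper cutoffs from stats::quantile(), via its probs argument
  limits <- stats::quantile(
    x,
    probs = c(fraction, 1 - fraction),
    na.rm = TRUE,
    names = FALSE
  )

  # Values below the lower cutoff are raised to it;
  # values above the upper cutoff are lowered to it
  pmin(pmax(x, limits[1]), limits[2])
}

x <- c(-10, 1, 2, 3, 4, 5, 6, 7, 8, 50)
out <- winsorize_truncate_sketch(x, fraction = 0.1)
out
```

In this sketch only the two extreme values (-10 and 50) are pulled in toward the cutoffs; the values between the cutoffs are returned unchanged.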

Author

Steven P. Sanderson II, MPH

Examples

suppressPackageStartupMessages(library(dplyr))

len_out <- 24
by_unit <- "month"
start_date <- as.Date("2021-01-01")

data_tbl <- tibble(
  date_col = seq.Date(from = start_date, length.out = len_out, by = by_unit),
  a = rnorm(len_out),
  b = runif(len_out)
)

hai_winsorized_truncate_augment(data_tbl, a, .fraction = 0.05)
#> # A tibble: 24 × 4
#>    date_col        a     b winsor_trunc_a
#>    <date>      <dbl> <dbl>          <dbl>
#>  1 2021-01-01  1.11  0.220          1.11 
#>  2 2021-02-01 -0.285 0.138         -0.285
#>  3 2021-03-01  0.373 0.593          0.373
#>  4 2021-04-01 -0.167 0.510         -0.167
#>  5 2021-05-01  0.184 0.165          0.184
#>  6 2021-06-01 -0.442 0.122         -0.442
#>  7 2021-07-01 -0.404 0.676         -0.404
#>  8 2021-08-01  0.895 0.860          0.895
#>  9 2021-09-01 -0.354 0.142         -0.354
#> 10 2021-10-01 -0.128 0.533         -0.128
#> # ℹ 14 more rows