
The function calculates the ratio between two variables, the numerator and the denominator. Numerator: by default, all NAs are changed to 0. If numerator_NA_to_0 is set to FALSE, rows with a missing numerator are filtered out instead.

Denominator: all rows with a missing value are filtered out (this cannot be changed). In addition, by default, all rows with a value equal to 0 are filtered out; if filter_denominator_0 is set to FALSE, they are kept.
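The filtering rules above can be sketched in base R. This is a minimal, unweighted illustration only: `ratio_sketch` and the `toy` data frame are hypothetical names introduced here, and the sketch ignores the survey design, weights, and confidence intervals that the real function handles.

```r
# Hypothetical helper, not part of the package: an unweighted illustration of
# which rows are kept before the ratio is computed.
ratio_sketch <- function(df, num, den,
                         numerator_NA_to_0 = TRUE,
                         filter_denominator_0 = TRUE) {
  if (numerator_NA_to_0) {
    # Missing numerators are counted as 0.
    df[[num]][is.na(df[[num]])] <- 0
  } else {
    # Otherwise rows with a missing numerator are dropped.
    df <- df[!is.na(df[[num]]), , drop = FALSE]
  }
  # Rows with a missing denominator are always dropped.
  df <- df[!is.na(df[[den]]), , drop = FALSE]
  if (filter_denominator_0) {
    # By default, rows with a denominator of 0 are dropped as well.
    df <- df[df[[den]] != 0, , drop = FALSE]
  }
  list(ratio = sum(df[[num]]) / sum(df[[den]]), n = nrow(df))
}

# Toy data mirroring two columns of the school example in the Examples section.
toy <- data.frame(
  num_children = c(3, 0, 2, NA),
  num_attending = c(1, NA, NA, NA)
)

default_res <- ratio_sketch(toy, "num_attending", "num_children")
keep_na_res <- ratio_sketch(toy, "num_attending", "num_children",
                            numerator_NA_to_0 = FALSE)
keep_zero_res <- ratio_sketch(toy, "num_attending", "num_children",
                              filter_denominator_0 = FALSE)
```

With the defaults, two rows remain and the ratio is 1/5. With numerator_NA_to_0 = FALSE, only one row remains and the ratio is 1/3. With filter_denominator_0 = FALSE, the ratio is unchanged, but three rows are counted.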

Usage

create_analysis_ratio(
  design,
  group_var = NA,
  analysis_var_numerator,
  analysis_var_denominator,
  numerator_NA_to_0 = TRUE,
  filter_denominator_0 = TRUE,
  level = 0.95
)

Arguments

design

survey design object, e.g. created with srvyr::as_survey

group_var

dependent variable(s), i.e. the variable(s) to group by. If there is no dependent variable, it should be NA or an empty string. If there is more than one variable, it should be a single string with the variables separated by commas, e.g. "groupa, groupb" to group by groupa and groupb. NA is the default (no grouping).

analysis_var_numerator

character string with the numerator column name.

analysis_var_denominator

character string with the denominator column name.

numerator_NA_to_0

If TRUE (default), all NAs in the numerator are turned into 0. If FALSE, rows with a missing numerator are filtered out.

filter_denominator_0

If TRUE (default), all rows with a 0 in the denominator are removed. If FALSE, they are kept.

level

the confidence level. 0.95 is the default.
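As a small illustration of the group_var format described above, a comma-separated string can be split into individual column names with base R. This is only a sketch of how such a string might be parsed; the package's internal handling is not shown in this page and may differ, and `group_cols` is a name introduced here.

```r
# One string holding two grouping variables, as described for group_var.
group_var <- "groupa, groupb"

# Split on the comma and strip surrounding whitespace to get column names.
group_cols <- trimws(strsplit(group_var, ",")[[1]])
```

Here `group_cols` holds the two names "groupa" and "groupb", with the space after the comma removed.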

Value

a data frame with the ratio for each group

Examples

school_ex <- data.frame(
  hh = c("hh1", "hh2", "hh3", "hh4"),
  num_children = c(3, 0, 2, NA),
  num_enrolled = c(3, NA, 0, NA),
  num_attending = c(1, NA, NA, NA),
  group = c("a", "a", "b", "b")
)
me_design <- srvyr::as_survey(school_ex)
# With the default values, the ratio is 0.2, as 1 child out of 5 is attending school.
# numerator: 1 child from hh1 and 0 from hh3. denominator: 3 from hh1 and 2 from hh3. In
# hh3, num_attending is NA because of a skip logic: there cannot be a child attending as
# none are enrolled. By default, the function has the argument numerator_NA_to_0 set to TRUE to
# turn that NA into a 0.
# n and n_total are 2 as 2 households were included in the calculation. hh2 was not included in
# the calculation of totals: the argument filter_denominator_0 set to TRUE removes that row.
# hh4 is removed because its denominator is NA.
create_analysis_ratio(me_design,
                      analysis_var_numerator = "num_attending",
                      analysis_var_denominator = "num_children"
)
#> # A tibble: 1 × 13
#>   analysis_type analysis_var  analysis_var_value group_var group_var_value  stat
#>   <chr>         <chr>         <chr>              <chr>     <chr>           <dbl>
#> 1 ratio         num_attendin… NA %/% NA          NA        NA                0.2
#> # ℹ 7 more variables: stat_low <dbl>, stat_upp <dbl>, n <int>, n_total <dbl>,
#> #   n_w <dbl>, n_w_total <dbl>, analysis_key <chr>
# If numerator_NA_to_0 is set to FALSE, the ratio will be 1/3, as hh3 with 2 children and NA for
# attending will be removed with na.rm = TRUE inside the survey_ratio calculation.
# n and n_total are 1 as only 1 household was used.
create_analysis_ratio(me_design,
                      analysis_var_numerator = "num_attending",
                      analysis_var_denominator = "num_children",
                      numerator_NA_to_0 = FALSE
)
#> Warning: There were 2 warnings in `dplyr::summarise()`.
#> The first warning was:
#>  In argument: `srvyr::survey_ratio(...)`.
#> Caused by warning in `qt()`:
#> ! NaNs produced
#>  Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
#> # A tibble: 1 × 13
#>   analysis_type analysis_var  analysis_var_value group_var group_var_value  stat
#>   <chr>         <chr>         <chr>              <chr>     <chr>           <dbl>
#> 1 ratio         num_attendin… NA %/% NA          NA        NA              0.333
#> # ℹ 7 more variables: stat_low <dbl>, stat_upp <dbl>, n <int>, n_total <dbl>,
#> #   n_w <dbl>, n_w_total <dbl>, analysis_key <chr>
# If filter_denominator_0 is set to FALSE, the ratio will be 0.2 as 1 child out of 5 is
# attending school.
# The number of households counted, n and n_total, is equal to 3 instead of 2. The household
# with 0 children is counted in the totals.
# (1 + 0 + 0) / (3 + 0 + 2)
create_analysis_ratio(me_design,
                      analysis_var_numerator = "num_attending",
                      analysis_var_denominator = "num_children",
                      filter_denominator_0 = FALSE
)
#> # A tibble: 1 × 13
#>   analysis_type analysis_var  analysis_var_value group_var group_var_value  stat
#>   <chr>         <chr>         <chr>              <chr>     <chr>           <dbl>
#> 1 ratio         num_attendin… NA %/% NA          NA        NA                0.2
#> # ℹ 7 more variables: stat_low <dbl>, stat_upp <dbl>, n <int>, n_total <dbl>,
#> #   n_w <dbl>, n_w_total <dbl>, analysis_key <chr>
# For weights and group:
set.seed(8988)
somedata <- data.frame(
  groups = rep(c("a", "b"), 50),
  children_518 = sample(0:5, 100, replace = TRUE),
  children_enrolled = sample(0:5, 100, replace = TRUE)
) %>%
  dplyr::mutate(children_enrolled = ifelse(children_enrolled > children_518,
                                           children_518,
                                           children_enrolled
  ))
somedata[["weights"]] <- ifelse(somedata$groups == "a", 1.33, .67)
create_analysis_ratio(srvyr::as_survey(somedata, weights = weights, strata = groups),
                      group_var = NA,
                      analysis_var_numerator = "children_enrolled",
                      analysis_var_denominator = "children_518",
                      level = 0.95
)
#> # A tibble: 1 × 13
#>   analysis_type analysis_var  analysis_var_value group_var group_var_value  stat
#>   <chr>         <chr>         <chr>              <chr>     <chr>           <dbl>
#> 1 ratio         children_enr… NA %/% NA          NA        NA              0.639
#> # ℹ 7 more variables: stat_low <dbl>, stat_upp <dbl>, n <int>, n_total <dbl>,
#> #   n_w <dbl>, n_w_total <dbl>, analysis_key <chr>
create_analysis_ratio(srvyr::as_survey(somedata, weights = weights, strata = groups),
                      group_var = "groups",
                      analysis_var_numerator = "children_enrolled",
                      analysis_var_denominator = "children_518",
                      level = 0.95
)
#> # A tibble: 2 × 13
#>   analysis_type analysis_var  analysis_var_value group_var group_var_value  stat
#>   <chr>         <chr>         <chr>              <chr>     <chr>           <dbl>
#> 1 ratio         children_enr… NA %/% NA          groups    a               0.670
#> 2 ratio         children_enr… NA %/% NA          groups    b               0.578
#> # ℹ 7 more variables: stat_low <dbl>, stat_upp <dbl>, n <int>, n_total <dbl>,
#> #   n_w <dbl>, n_w_total <dbl>, analysis_key <chr>