library(pollster)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(knitr)
library(ggplot2)

It’s common to want to view a crosstab of two variables by a third
variable, for instance educational attainment by sex and
marital status. The function crosstab_3way accomplishes
this. Row and cell percents are both supported; column percents are
not.
illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                remove = c("widow/divorced/sep"),
                n = FALSE) %>%
  kable(digits = 0, caption = "Educational attainment by sex and marital status among Illinois residents ages 35+",
        format = "html")

| sex | maritalstatus | LT HS | HS | Some Col | AA | BA | Post-BA |
|---|---|---|---|---|---|---|---|
| Male | Married | 7 | 28 | 16 | 8 | 24 | 17 | 
| Male | Never Married | 13 | 35 | 19 | 11 | 15 | 8 | 
| Female | Married | 6 | 28 | 16 | 10 | 24 | 16 | 
| Female | Never Married | 11 | 27 | 21 | 8 | 17 | 15 | 
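
To make the row percents in a table like this concrete, here is a minimal base R sketch of the underlying arithmetic. It does not call pollster, and the `toy` data frame with its column names is hypothetical. It sums the weights in each cell and then scales each sex and marital status combination so that it adds up to 100 across education levels.

```r
# Hypothetical respondents with survey weights (not the illinois dataset)
toy <- data.frame(
  sex    = c("Male", "Male", "Male", "Female",
             "Female", "Female", "Male", "Female"),
  status = c("Married", "Married", "Never Married", "Married",
             "Never Married", "Never Married", "Never Married", "Married"),
  educ   = c("HS", "BA", "HS", "BA", "HS", "BA", "BA", "BA"),
  weight = c(1.5, 0.5, 2.0, 1.0, 1.0, 3.0, 2.0, 1.0)
)

# Sum the weights in every sex x status x educ cell
wtd <- xtabs(weight ~ sex + status + educ, data = toy)

# Row percents: each sex/status combination sums to 100 across educ
row_pct <- 100 * prop.table(wtd, margin = c(1, 2))

# Flatten to one row per sex/status pair for display
ftable(round(row_pct, 1), row.vars = c("sex", "status"))
```

In this toy data, married men carry a weight of 1.5 in HS and 0.5 in BA, so their row reads 75 percent HS and 25 percent BA.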

Three-way crosstabs plot well as small multiples using ggplot2 facets. Set format = "long" so that each percent occupies its own row.
illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                remove = c("widow/divorced/sep"), 
                format = "long") %>%
  ggplot(aes(educ6, pct, fill = maritalstatus)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  facet_wrap(facets = vars(sex)) +
  labs(title = "Educational attainment by sex and marital status",
       subtitle = "Illinois residents ages 35+") +
  theme(legend.position = "top")

The same plot can also show margins of error. (See the “crosstabs” vignette for a more detailed discussion of margins of error.)
illinois %>%
  # filter for recent years & limited ages
  filter(year > 2009,
         age > 34) %>%
  moe_crosstab_3way(x = sex, y = educ6, z = maritalstatus, weight = weight,
                    remove = c("widow/divorced/sep"), format = "long") %>%
  ggplot(aes(educ6, pct, fill = maritalstatus)) +
  geom_bar(stat = "identity", position = position_dodge(),
           alpha = 0.5) +
  geom_errorbar(aes(ymin = (pct - moe), ymax = (pct + moe),
                    color = maritalstatus),
                position = position_dodge()) +
  facet_wrap(facets = vars(sex)) +
  labs(title = "Educational attainment by sex and marital status",
       subtitle = "Illinois residents ages 35+",
       caption = "Current Population Survey, 2010-2018") +
  theme(legend.position = "top")
#> Your data includes weights equal to zero. These are removed before calculating the design effect.

If the z-variable in your crosstab uniquely identifies survey waves
for which the weights were independently generated, it is best practice
to calculate the design effect independently for each wave.
moe_wave_crosstab_3way does just that. All of the arguments
remain the same as in moe_crosstab_3way.
moe_wave_crosstab_3way(df = illinois, x = sex, y = educ6, z = year, weight = weight)
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Your data includes weights equal to zero. These are removed before calculating the design effect.
#> Joining with `by = join_by(year)`
#> # A tibble: 144 × 6
#>     year sex    educ6      pct   moe        n
#>    <dbl> <fct>  <fct>    <dbl> <dbl>    <dbl>
#>  1  1996 Male   LT HS    15.1   1.80 3889089.
#>  2  1996 Male   HS       32.5   2.35 3889089.
#>  3  1996 Male   Some Col 20.3   2.02 3889089.
#>  4  1996 Male   AA        6.11  1.20 3889089.
#>  5  1996 Male   BA       17.7   1.91 3889089.
#>  6  1996 Male   Post-BA   8.38  1.39 3889089.
#>  7  1996 Female LT HS    14.2   1.65 4193383.
#>  8  1996 Female HS       34.8   2.25 4193383.
#>  9  1996 Female Some Col 22.8   1.98 4193383.
#> 10  1996 Female AA        6.72  1.18 4193383.
#> # … with 134 more rows
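
To illustrate why a per-wave design effect matters, here is a rough base R sketch. It assumes Kish's approximation for the design effect and a normal-approximation margin of error; pollster's internal calculation may differ in its details. The helper names `kish_deff` and `moe_pct` and the `waves` data are hypothetical.

```r
# Kish's approximate design effect from a vector of weights.
# Zero weights are dropped first, mirroring the message printed above.
kish_deff <- function(w) {
  w <- w[w > 0]
  length(w) * sum(w^2) / sum(w)^2
}

# Margin of error (in percentage points) for a weighted percent `pct`,
# inflated by the square root of the design effect.
moe_pct <- function(pct, w, z = 1.96) {
  w <- w[w > 0]
  p <- pct / 100
  100 * z * sqrt(kish_deff(w)) * sqrt(p * (1 - p) / length(w))
}

# Hypothetical two-wave data: wave 2 has far more variable weights
waves <- data.frame(
  wave   = rep(c(1, 2), each = 4),
  weight = c(1, 1, 1, 1, 0.2, 0.2, 0.2, 3.4)
)

# One design effect per wave rather than one for the pooled data
deff_by_wave <- tapply(waves$weight, waves$wave, kish_deff)
deff_by_wave

# Pooled design effect, for comparison
kish_deff(waves$weight)
```

In this toy data the equally weighted wave has a design effect of 1 and the unevenly weighted wave has a much larger one. The pooled value falls between them, so it would overstate the margin of error for the first wave and understate it for the second.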