Chapter 6 More about Factors

This was part of an impromptu session on learning about factors.

6.1 Making a factor variable out of disease

We’re adding a fourth value, BRCA, to our levels here.
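The code for this step is not shown here, so what follows is only a minimal sketch of the idea. It assumes the three observed disease values are LUSC, BLCA, and CESC, and it uses a small made-up vector in place of the real disease column; the names `disease` and `disease_factor` are hypothetical.

```r
# Toy stand-in for the disease column (the real data is not shown here).
disease <- c("LUSC", "BLCA", "CESC", "LUSC", "BLCA")

# Declare four levels: the three observed values plus BRCA,
# which does not occur in the data.
disease_factor <- factor(disease, levels = c("LUSC", "BLCA", "CESC", "BRCA"))

levels(disease_factor)  # all four levels, including the unused BRCA
table(disease_factor)   # BRCA is listed with a count of 0
```

A level can exist without any observations, which is why BRCA still appears in the tabulation.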

6.4 Another thing about factors

Factor levels also specify the permissible values.

In this example, LUSC and BRCA are the permissible values. When we build a factor from a character vector, any value that is not among those levels (here BLCA and CESC) is recoded as NA.

## [1] LUSC LUSC BRCA <NA> BRCA <NA> <NA>
## Levels: LUSC BRCA
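One way to reproduce output like the above is sketched below. The input vector is an assumption, chosen so that the result matches the printed values; the original input is not shown.

```r
# Hypothetical input, chosen so the result matches the printed output above.
disease <- c("LUSC", "LUSC", "BRCA", "BLCA", "BRCA", "CESC", "CESC")

# Only LUSC and BRCA are permitted; every other value becomes NA.
disease_factor <- factor(disease, levels = c("LUSC", "BRCA"))
disease_factor

# Count the values that fell outside the permitted levels.
sum(is.na(disease_factor))
```

Note that factor() does this silently, with no warning, so it is worth checking the NA count after restricting levels.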

6.7 fct_collapse

fct_collapse() lets you collapse multiple categories into one category.

##  disease_collapse   n   percent
##              LUSC 836 0.7256944
##             other 316 0.2743056
##              BRCA   0 0.0000000
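A table shaped like the one above could come from a call along these lines. This is a sketch on made-up data, so the counts will not match; it assumes that BLCA and CESC are the categories being merged into "other" and that BRCA is an unused fourth level. It uses base table() for the tabulation, which may differ from the function that produced the output above.

```r
library(forcats)

# Toy factor with an unused BRCA level (the real data is not shown here).
disease <- factor(
  c("LUSC", "LUSC", "BLCA", "CESC", "LUSC"),
  levels = c("LUSC", "BLCA", "CESC", "BRCA")
)

# Merge BLCA and CESC into one new level called "other".
disease_collapse <- fct_collapse(disease, other = c("BLCA", "CESC"))

levels(disease_collapse)
table(disease_collapse)  # BRCA stays as a level with a count of 0
```

Levels that are not named in the call, such as LUSC and BRCA here, are left unchanged.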

6.8 Other really useful forcats functions

fct_recode() - lets you recode values manually.

fct_other() - lets you define which categories are kept and which are lumped into an "other" level.
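As a rough illustration of both functions, here is a sketch on a made-up factor. The new label `lung_squamous` and the variable names are hypothetical and chosen only for the example.

```r
library(forcats)

# Toy factor; the default levels are sorted alphabetically.
disease <- factor(c("LUSC", "BLCA", "CESC", "LUSC"))

# fct_recode(): rename a level by hand, written as new_name = "old_name".
recoded <- fct_recode(disease, lung_squamous = "LUSC")
levels(recoded)

# fct_other(): keep LUSC and lump every other level into "Other".
lumped <- fct_other(disease, keep = "LUSC")
levels(lumped)
```

fct_other() also accepts `drop` in place of `keep`, for when it is easier to list the levels to lump than the ones to preserve.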