Dealing with Factors

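Factors are R's data type for categorical variables. They matter in statistical modeling because modeling functions such as lm() and glm() treat them specially, expanding each factor into indicator (dummy) variables.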
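As a rough illustration of what "treated specially" means, the sketch below uses model.matrix(), which lm() calls internally to build its design matrix. The variable name is only illustrative, and the result assumes R's default treatment contrasts.

```r
# a two-level factor, used here purely for illustration
gender <- factor(c("male", "female", "female", "male", "female"))

# build the design matrix that a formula like y ~ gender would use
mm <- model.matrix(~ gender)

# the factor becomes a 0/1 indicator column; "female" (the first level)
# is the reference level and gets no column of its own
as.vector(mm[, "gendermale"])
## [1] 1 0 0 1 0
```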

#Creating, Converting & Inspecting Factors

# create a factor from a character vector
gender <- factor(c("male", "female", "female", "male", "female"))
gender
## [1] male   female female male   female
## Levels: female male

# inspect to see if it is a factor class
class(gender)
## [1] "factor"

# show that factors are just built on top of integers
typeof(gender)
## [1] "integer"

# see the underlying representation of a factor
unclass(gender)
## [1] 2 1 1 2 1
## attr(,"levels")
## [1] "female" "male"

# what are the factor levels?
levels(gender)
## [1] "female" "male"

# show summary of counts
summary(gender)
## female   male 
##      3      2

If we have a vector of character strings or integers, we can convert it to a factor with as.factor() or factor().
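A minimal sketch of such conversions follows. The vectors and labels are made-up examples, not data from the original text. The last part shows a common pitfall when converting a factor back to numbers.

```r
# character vector -> factor
group <- c("control", "treatment", "control")
group_f <- as.factor(group)
class(group_f)
## [1] "factor"

# integer vector -> factor, attaching a readable label to each code
codes <- c(1L, 2L, 2L, 1L)
codes_f <- factor(codes, levels = c(1, 2), labels = c("low", "high"))
as.integer(codes_f)
## [1] 1 2 2 1

# pitfall: as.integer() returns the internal level codes,
# not the original numbers, so go through as.character() first
num_f <- factor(c(10, 20, 10))
as.integer(num_f)
## [1] 1 2 1
as.numeric(as.character(num_f))
## [1] 10 20 10
```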

#Ordering, Revaluing, & Dropping Factor Levels

Ordering Levels:

Create ordinal factors by passing the ordered = TRUE argument to factor(), together with levels in the desired order.
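For example, an ordinal factor could be built as in the sketch below. The satisfaction variable and its levels are hypothetical and chosen only to show that the order given in levels, not alphabetical order, determines the ranking.

```r
# levels are listed from lowest to highest
satisfaction <- factor(c("low", "high", "medium", "low"),
                       levels = c("low", "medium", "high"),
                       ordered = TRUE)

is.ordered(satisfaction)
## [1] TRUE

# internal codes follow the order given in 'levels'
as.integer(satisfaction)
## [1] 1 3 2 1

# ordered factors support comparison operators
satisfaction < "high"
## [1]  TRUE FALSE  TRUE  TRUE
```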

Revalue Levels:

To recode factor levels, I usually use the revalue() function from the plyr package.

Using the :: notation (plyr::revalue()) lets you call the function without attaching the plyr package via library().
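A short sketch of recoding with plyr::revalue() is shown below. The new labels "M" and "F" are arbitrary choices for illustration. The base R fallback is an addition for readers who do not have plyr installed, not part of the approach described above.

```r
gender <- factor(c("male", "female", "female", "male", "female"))

if (requireNamespace("plyr", quietly = TRUE)) {
  # named vector: names are the old levels, values are the new ones
  gender_short <- plyr::revalue(gender, c("male" = "M", "female" = "F"))
} else {
  # assumed base R equivalent: look up each existing level in a mapping
  mapping <- c(female = "F", male = "M")
  gender_short <- gender
  levels(gender_short) <- unname(mapping[levels(gender_short)])
}

as.character(gender_short)
## [1] "M" "F" "F" "M" "F"
```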

Dropping Levels:

Subsetting a factor keeps all of its original levels, even those that no longer occur in the data. When you want to drop unused factor levels, use droplevels().
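The sketch below shows the effect on a subset of the gender factor from earlier. The name females is just an illustrative choice.

```r
gender <- factor(c("male", "female", "female", "male", "female"))

# keep only the "female" observations; the "male" level is still recorded
females <- gender[gender == "female"]
nlevels(females)
## [1] 2

# drop the levels that no longer appear in the data
females <- droplevels(females)
nlevels(females)
## [1] 1
levels(females)
## [1] "female"
```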
