Use of ~ (tilde) in R programming Language

Ankita picture Ankita · Feb 20, 2013 · Viewed 137.5k times · Source

I saw in a tutorial about regression modeling the following command :

myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

What exactly does this command do, and what is the role of ~ (tilde) in the command?

Answer

Spacedman picture Spacedman · Feb 20, 2013

The thing on the right of <- is a formula object. It is often used to denote a statistical model, where the thing on the left of the ~ is the response and the things on the right of the ~ are the explanatory variables. So in English you'd say something like "Species depends on Sepal Length, Sepal Width, Petal Length and Petal Width".

The myFormula <- part of that line stores the formula in an object called myFormula so you can use it in other parts of your R code.


Other common uses of formula objects in R

The lattice package uses them to specify the variables to plot.
The ggplot2 package uses them to specify panels for plotting.
The dplyr package uses them for non-standard evaulation.