The dataset can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/
I'm getting the following error:
formula(formula, data = data) :
invalid model formula in ExtractVars
Using the following code:
install.packages("rpart")
library("rpart")
# you'll need to change the following from windows to work on a linux box:
mydata <- read.csv(file="c:/Users/md7968/downloads/winequality-red.csv")
# grow tree
fit <- rpart(YouSweetBoy ~ "residual sugar" + "citric acid", method = "class", data = mydata)
Mind you, I've changed the delimiters in the CSV file to commas.
Perhaps it's not reading the data correctly. Forgive me, I'm new to R and not a very good programmer.
Look at names(mydata). When you create a data.frame, read.table() will turn "bad" column names into good ones. You can't (well, shouldn't) have a space in a column name, so R changes spaces to periods. Plus, you should never have quoted strings in a formula. Try
fit <- rpart(quality ~ residual.sugar + citric.acid, method = "class", data = mydata)
(I have no idea what "YouSweetBoy" was supposed to be, since that wasn't in the dataset, so I changed it to "quality".)
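If you want to see the renaming for yourself, here is a minimal sketch that uses only base R. The two-row inline CSV is made up for illustration and is not taken from the real file, and the variable names (csv_text, fixed, raw) are just placeholders. It relies on read.csv() passing its check.names argument through to read.table(), which by default runs the headers through make.names():

```r
# Hypothetical two-row sample standing in for winequality-red.csv
csv_text <- "residual sugar,citric acid,quality\n1.9,0.00,5\n2.6,0.04,6"

# Default behaviour (check.names = TRUE): headers are passed through
# make.names(), so the spaces become periods
fixed <- read.csv(text = csv_text)
print(names(fixed))

# check.names = FALSE keeps the headers exactly as written, spaces included
raw <- read.csv(text = csv_text, check.names = FALSE)
print(names(raw))

# make.names() is the function that does the renaming
print(make.names(c("residual sugar", "citric acid")))
```

Running names() on your own mydata the same way tells you exactly which names to put in the formula.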