Unable to convert data frame to h2o object

Boudewijn Aasman · Jul 16, 2015

I am running the h2o package in RStudio version 0.99.447 on OS X 10.9.5.

I would like to set up a local cluster within R, following the steps of this tutorial: http://blenditbayes.blogspot.co.uk/2014/07/things-to-try-after-user-part-1-deep.html

The first step does not seem to be a problem. What does seem to be a problem is converting my data frame to a proper h2o object.

library(mlbench)
dat = BreastCancer[,-1] #reading in data set from mlbench package

library(h2o)
localH2O <- h2o.init(ip = "localhost", port = 54321, startH2O = TRUE) #sets up the cluster
dat_h2o <- as.h2o(localH2O, dat, key = 'dat') #this returns an error message

The as.h2o call above results in the following error message:

Error in as.h2o(localH2O, dat, key = "dat") : 
unused argument (key = "dat")

If I remove the "key" parameter, letting the data reside in the H2O key-value store under a machine-generated name, I get the following error message instead:

Error in .h2o.doSafeREST(conn = conn, h2oRestApiVersion = h2oRestApiVersion,  
Unexpected CURL error: Empty reply from server

This question asks the same thing, but its solution leads me to the same error.

Does anyone have experience with this problem? I'm not entirely sure how to approach this.

Answer

Amy Wang · Jul 20, 2015

The syntax for importing a frame from R into H2O changed between the last stable release of H2O-Classic and the latest stable release of H2O-3.0. I believe you are using an H2O-3.0 release, which means some of the function arguments have changed: the ambiguous "key" argument has been renamed to "destination_frame".

H2O-3.0 also behaves differently in that it notes that the first 5 columns of the R data frame are ordered factors, and at the moment we don't have a way of preserving the order of categorical columns. For now, to reproduce the results posted at http://blenditbayes.blogspot.co.uk/2014/07/things-to-try-after-user-part-1-deep.html, you'll have to write the frame to disk as a CSV and import it into H2O.

library(mlbench)
dat = BreastCancer[,-1] #reading in data set from mlbench package

library(h2o)
localH2O <- h2o.init(ip = "localhost", port = 54321, startH2O = TRUE)

#dat_h2o <- as.h2o(dat, destination_frame = 'dat') 
## Will return a "Provided column type c("ordered", "enum") is unknown." error

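pathToData <- file.path(normalizePath("~/Downloads"), "dat.csv")
## sep = "," makes the file a real CSV (write.table defaults to a space separator)
write.table(x = dat, file = pathToData, sep = ",", row.names = FALSE, col.names = TRUE)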
dat_h2o <- h2o.importFile(path = pathToData, destination_frame = "dat")
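
To see why the round trip through disk sidesteps the ordered-factor problem, here is a minimal sketch in plain R (no H2O involved). The toy data frame and its column names are made up for illustration; they stand in for BreastCancer[,-1]. It only shows that the "ordered" attribute lives in R and is not stored in a text file; how H2O's parser then types the columns is not shown here.

```r
# Hypothetical toy data frame with one ordered and one plain factor
df <- data.frame(
  severity = factor(c("low", "high", "mid"),
                    levels = c("low", "mid", "high"), ordered = TRUE),
  class    = factor(c("benign", "malignant", "benign"))
)
is.ordered(df$severity)        # TRUE before writing to disk

# Write to a temporary CSV and read it back
tmp <- tempfile(fileext = ".csv")
write.csv(df, file = tmp, row.names = FALSE)
df_back <- read.csv(tmp, stringsAsFactors = TRUE)

sapply(df_back, is.ordered)    # FALSE for every column: the ordering is gone
```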

For R data.frames that do not have ordered factor columns, you can simply use h2o_frame <- as.h2o(object = df), where class(df) is "data.frame".
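
A quick way to check whether a data frame has ordered factor columns before calling as.h2o is sketched below in base R. The helper name ordered_factor_cols and the toy data are hypothetical. The last step drops the ordering in R itself; whether as.h2o then accepts the frame in your H2O version is an assumption that the answer above does not confirm, so treat it as something to try rather than a guaranteed fix.

```r
# Hypothetical helper: names of the ordered factor columns in a data frame
ordered_factor_cols <- function(df) {
  names(df)[vapply(df, is.ordered, logical(1))]
}

# Toy data frame (made up for illustration)
df <- data.frame(
  thickness = factor(c(1, 5, 3), levels = 1:10, ordered = TRUE),
  class     = factor(c("benign", "malignant", "benign"))
)

cols <- ordered_factor_cols(df)
print(cols)

# Assumption: converting to unordered factors may avoid the column type error.
# Note that factor() also drops unused levels here.
df_plain <- df
df_plain[cols] <- lapply(df_plain[cols], factor, ordered = FALSE)
length(ordered_factor_cols(df_plain)) == 0
```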