How to use 'hclust' as function call in R

neversaint · Dec 3, 2013

I tried to wrap the clustering method in a function as follows:

mydata <- mtcars

# Here I construct hclust as a function
hclustfunc <- function(x) hclust(as.matrix(x),method="complete")

# Define distance metric
distfunc <- function(x) as.dist((1-cor(t(x)))/2)

# Obtain distance
d <- distfunc(mydata)

# Call that hclust function
fit<-hclustfunc(d)

# Later I'd do
# plot(fit)

But why does it give the following error?

Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") : 
  missing value where TRUE/FALSE needed

What's the right way to do it?

Answer

Gavin Simpson · Dec 3, 2013

Do read the help for functions you use. ?hclust is pretty clear that the first argument d is a dissimilarity object, not a matrix:

Arguments:

       d: a dissimilarity structure as produced by ‘dist’.
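
To make the distinction concrete, here is a minimal sketch using the built-in mtcars data and the correlation-based distance from the question. It only inspects the two objects; the likely cause of the error is that as.matrix() drops the "dist" class and the "Size" attribute that hclust relies on, though the exact error message may differ between R versions.

```r
# Correlation-based distance from the question, computed on mtcars
d <- as.dist((1 - cor(t(mtcars))) / 2)

class(d)         # "dist": the kind of object hclust() expects
attr(d, "Size")  # number of observations, stored on the dist object

m <- as.matrix(d)
is.matrix(m)     # TRUE: a plain square matrix, no longer a "dist" object
attr(m, "Size")  # NULL: the attribute is gone after conversion

# Passing the dist object directly works
fit <- hclust(d, method = "complete")
```

If you only have a square distance matrix, as.dist() converts it back into a dissimilarity object before it goes to hclust.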

Update

As the OP has now updated their question, what is needed is

hclustfunc <- function(x) hclust(x, method="complete")
distfunc <- function(x) as.dist((1-cor(t(x)))/2)
d <- distfunc(mydata)
fit <- hclustfunc(d)
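
As a quick sanity check, the sketch below repeats those definitions in self-contained form (assuming mydata is mtcars, as in the question) and confirms that the rows are what get clustered. The t() call matters here: cor() correlates columns, so transposing makes the distance apply to the cars rather than to the variables.

```r
mydata <- mtcars

hclustfunc <- function(x) hclust(x, method = "complete")
distfunc <- function(x) as.dist((1 - cor(t(x))) / 2)

d <- distfunc(mydata)
fit <- hclustfunc(d)

# One leaf per row of the data, labelled with the row names
attr(d, "Size") == nrow(mydata)
identical(fit$labels, rownames(mydata))

# plot(fit) then draws the dendrogram, as in the question
```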

Original

What you want is

hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {    
    hclust(dist(x, method = dmeth), method = method)
}

and then

fit <- hclustfunc(mydata)

works as expected. Note that you can now pass in the dissimilarity coefficient method as dmeth and the clustering method as method.
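
For illustration, a short usage sketch of that wrapper on mtcars. The choice of "average" and "manhattan" is arbitrary; any values accepted by hclust and dist should work the same way.

```r
hclustfunc <- function(x, method = "complete", dmeth = "euclidean") {
    hclust(dist(x, method = dmeth), method = method)
}

# Defaults: Euclidean distance, complete linkage
fit_default <- hclustfunc(mtcars)

# Override both the clustering method and the distance
fit_alt <- hclustfunc(mtcars, method = "average", dmeth = "manhattan")

# The fitted object records which settings were used
fit_alt$method
fit_alt$dist.method
```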