Graphing results of dbscan in R

Question 1

r visualization cluster-analysis data-visualization dbscan

droops · Jul 26, 2011 · Viewed 7k times · Source

Answer

The help page (?dbscan) is organized like every other R help page, into sections labeled Description, Usage, Arguments, Details and Value. The Value section describes what dbscan returns: in this case, a list (a standard R data type) with a few components.

The cluster component is an integer vector whose length is equal to the number of rows in your data; each element indicates which cluster the corresponding observation belongs to. You can therefore use this vector to subset your data, extract only the clusters you want, and plot just those data points.
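Before subsetting, it can help to see how many points fall under each cluster label. The following is a minimal sketch in base R; the vector cl is a made-up stand-in for ds$cluster, and the size threshold of 5 is an arbitrary choice for illustration.

```r
# Hypothetical cluster vector standing in for ds$cluster
cl <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 0, 0)

# Number of observations per cluster label
sizes <- table(cl)
print(sizes)

# Labels with fewer than 5 points (the threshold is arbitrary)
small <- as.integer(names(sizes)[sizes < 5])
print(small)
```

With a real result you would replace cl with ds$cluster and pick a threshold that suits your data.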

For example, if we use the first example from the help page:

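library(fpc)  # provides dbscan()

set.seed(665544)
n <- 600
# 10 random centers (recycled across n points) plus Gaussian noise
x <- cbind(runif(10, 0, 10) + rnorm(n, sd = 0.2),
           runif(10, 0, 10) + rnorm(n, sd = 0.2))
ds <- dbscan(x, 0.2)  # eps = 0.2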

we can then use the result, ds, to plot only the points in clusters 1-3:

#Plot only clusters 1, 2 and 3
plot(x[ds$cluster %in% 1:3,])
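The same idea works in reverse for the case in the question, where one dominant cluster should be left out. Below is a minimal sketch in base R. The data x and the vector cl are made up for illustration (cl stands in for ds$cluster), and it assumes that label 0 marks noise points, which is how fpc codes them as far as I know.

```r
# Stand-in data: in practice x is your data matrix and cl is ds$cluster
set.seed(1)
x  <- cbind(rnorm(12), rnorm(12))
cl <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 3, 0, 0)  # 0 assumed to mean noise

# Keep everything except the dominant cluster (1) and noise (0)
keep <- !(cl %in% c(0, 1))

# Plot the remaining points, colored by cluster label; the axes are fixed
# to the full data range so positions match a plot of all points
plot(x[keep, , drop = FALSE],
     col = cl[keep], pch = 19,
     xlim = range(x[, 1]), ylim = range(x[, 2]),
     xlab = "x1", ylab = "x2")
legend("topright", legend = sort(unique(cl[keep])),
       col = sort(unique(cl[keep])), pch = 19, title = "cluster")
```

Using drop = FALSE keeps the subset a matrix even if only one row survives the filter, so plot() still receives two columns.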

Question 2

Your comments, suggestions, or solutions are/will be greatly appreciated, thank you.

I'm using the fpc package in R to do a dbscan analysis of some very dense data (3 sets of 40,000 points in the range -3 to 6).

I've found some clusters, and I need to graph just the significant ones. The problem is that I have a single cluster (the first) with about 39,000 points in it. I need to graph all the other clusters, but not this one.

The dbscan() function creates a special data type to store all of this cluster data. It's not indexed like a data frame would be (but maybe there is a way to represent it as such?).

I can graph the dbscan type using a basic plot() call. But, like I said, this will graph the irrelevant 39,000 points.

tl;dr: how do I graph only specific clusters of a dbscan data type?
