r: Plotting each column against each column

Jonathan Rhein · Apr 12, 2016

I have a data frame ("data") with 7 columns (2 factor, 5 numeric). The first column contains the names of 7 different countries, and the following columns hold data on different parameters (population, GDP, etc.) characterizing each country. The last column is a factor variable giving the continent each country belongs to.

The data looks like this:

data <- structure(list(Country = structure(c(5L, 4L, 7L, 2L, 1L, 6L, 
3L), .Label = c("Brazil", "Chile", "China", "France", "Germany", 
"India", "Netherlands"), class = "factor"), GDP = c(0.46, 0.57, 
0.75, 0.56, 0.28, 0.88, 1), Population = c(0.18, 0.09, 0.54, 
0.01, 0.02, 0.17, 0.84), Birth.rate = c(87.21, 18.34, 63.91, 
14.21, 5.38, 51.19, 209.26), Income = c(43.89, 18.23, 63.91, 
12.3, 0.1, 14.61, 160.82), Savings = c(43.32, 0.11, 0, 1.91, 
5.29, 36.58, 50.38), Continent = structure(c(2L, 2L, 2L, 3L, 
3L, 1L, 1L), .Label = c("Asia", "Europe", "South America"), class = "factor")), .Names = c("Country", 
"GDP", "Population", "Birth.rate", "Income", "Savings", "Continent"
), class = "data.frame", row.names = c(NA, -7L))

I need a loop that plots (e.g. as a scatter plot) every numeric column against every other one, so that each column except the first and the last (the two factor variables) ends up plotted against all the others, with each pair in its own chart rather than all plots in one figure. Ideally, the plots would be saved to a folder on my local machine.

It would also be great if the x and y axes were labeled with the names of the two columns being plotted. It would also be convenient to have a label next to each point showing the country name. Finally, it would be nice to color the points by continent, with one color for each of the three continents.

So far I only have a piece of code that goes like

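for (i in seq_along(data)) {
   # plot column i against the row index (one position per country)
   plot(data[, i], ylab = names(data)[i], xlab = "Country")
   # label each position with the country name
   text(seq_len(nrow(data)), as.numeric(data[, i]),
        labels = as.character(data$Country), pos = 4, cex = .5)
}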

As you can see, it only plots each column against the first column ("Country"), which is not what I want in the end.

Do you have any idea how I could achieve this? Thank you!

Answer

Worice · Apr 15, 2016

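You can use pairs(), which ships with base R, to draw every column against every other column as a scatterplot matrix in a single figure. Note that dt represents your dataset (it is defined at the end of this answer).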

pairs(dt)
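
Called on the whole data frame, pairs() also draws the two factor columns, converted to their integer codes. A minimal sketch of a variant that keeps only the numeric columns and colors the points by continent follows. The data frame is rebuilt here so the snippet runs on its own, and the names num_cols and pt_col as well as the choice of palette colors are illustrative assumptions, not part of the original answer.

```r
# Example data, rebuilt from the dput() output in the question
dt <- data.frame(
  Country    = factor(c("Germany", "France", "Netherlands", "Chile",
                        "Brazil", "India", "China")),
  GDP        = c(0.46, 0.57, 0.75, 0.56, 0.28, 0.88, 1),
  Population = c(0.18, 0.09, 0.54, 0.01, 0.02, 0.17, 0.84),
  Birth.rate = c(87.21, 18.34, 63.91, 14.21, 5.38, 51.19, 209.26),
  Income     = c(43.89, 18.23, 63.91, 12.3, 0.1, 14.61, 160.82),
  Savings    = c(43.32, 0.11, 0, 1.91, 5.29, 36.58, 50.38),
  Continent  = factor(c("Europe", "Europe", "Europe", "South America",
                        "South America", "Asia", "Asia"))
)

# Keep only the numeric columns (drops Country and Continent)
num_cols <- names(dt)[sapply(dt, is.numeric)]

# One color per continent, taken from the default palette (assumed choice)
pt_col <- as.integer(dt$Continent) + 1L

# Scatterplot matrix of the numeric columns, colored by continent
pairs(dt[num_cols], col = pt_col, pch = 19)
```

This still produces one combined figure, so it does not by itself give one file per pair of columns.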

dt <- structure(list(Country = structure(c(5L, 4L, 7L, 2L, 1L, 6L, 
3L), .Label = c("Brazil", "Chile", "China", "France", "Germany", 
"India", "Netherlands"), class = "factor"), GDP = c(0.46, 0.57, 
0.75, 0.56, 0.28, 0.88, 1), Population = c(0.18, 0.09, 0.54, 
0.01, 0.02, 0.17, 0.84), Birth.rate = c(87.21, 18.34, 63.91, 
14.21, 5.38, 51.19, 209.26), Income = c(43.89, 18.23, 63.91, 
12.3, 0.1, 14.61, 160.82), Savings = c(43.32, 0.11, 0, 1.91, 
5.29, 36.58, 50.38), Continent = structure(c(2L, 2L, 2L, 3L, 
3L, 1L, 1L), .Label = c("Asia", "Europe", "South America"), class = "factor")), .Names = c("Country", 
"GDP", "Population", "Birth.rate", "Income", "Savings", "Continent"
), class = "data.frame", row.names = c(NA, -7L))
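
If one file per pair of columns is needed, as the question asks, a loop over the column pairs is one possible approach. The following is only a sketch under stated assumptions: the output folder (a "plots" directory inside tempdir()), the PDF format, the file naming scheme and the variable names are all made up for illustration, and each unordered pair is plotted once (GDP vs. Population, but not Population vs. GDP as well). Another device such as png() could be swapped in for pdf().

```r
# Example data, rebuilt from the dput() output above
dt <- data.frame(
  Country    = factor(c("Germany", "France", "Netherlands", "Chile",
                        "Brazil", "India", "China")),
  GDP        = c(0.46, 0.57, 0.75, 0.56, 0.28, 0.88, 1),
  Population = c(0.18, 0.09, 0.54, 0.01, 0.02, 0.17, 0.84),
  Birth.rate = c(87.21, 18.34, 63.91, 14.21, 5.38, 51.19, 209.26),
  Income     = c(43.89, 18.23, 63.91, 12.3, 0.1, 14.61, 160.82),
  Savings    = c(43.32, 0.11, 0, 1.91, 5.29, 36.58, 50.38),
  Continent  = factor(c("Europe", "Europe", "Europe", "South America",
                        "South America", "Asia", "Asia"))
)

# Assumption: plots are written to a "plots" folder in the session temp dir
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

num_cols  <- names(dt)[sapply(dt, is.numeric)]     # the five numeric columns
col_pairs <- combn(num_cols, 2, simplify = FALSE)  # each unordered pair once
pt_col    <- as.integer(dt$Continent) + 1L         # one palette color per continent

for (p in col_pairs) {
  x <- p[1]
  y <- p[2]
  # One PDF file per pair, e.g. "GDP_vs_Population.pdf"
  pdf(file.path(out_dir, paste0(x, "_vs_", y, ".pdf")))
  # Axis labels are the names of the two columns being plotted
  plot(dt[[x]], dt[[y]], xlab = x, ylab = y, col = pt_col, pch = 19)
  # Country name next to each point; xpd = NA lets labels extend past the plot region
  text(dt[[x]], dt[[y]], labels = as.character(dt$Country),
       pos = 4, cex = 0.7, xpd = NA)
  legend("topleft", legend = levels(dt$Continent),
         col = seq_along(levels(dt$Continent)) + 1L, pch = 19)
  dev.off()
}
```

With five numeric columns this writes ten files. Changing out_dir to a fixed path on your machine keeps the plots after the R session ends.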