I have an R matrix named ddd. When I enter this, everything works fine:
i <- 1
shapiro.test(ddd[,y])
ad.test(ddd[,y])
stem(ddd[,y])
print(y)
The calls to Shapiro Wilk, Anderson Darling, and stem all work, and extract the same column.
If I put this code in a "for" loop, the calls to Shapiro Wilk, and Anderson Darling stop working, while the the stem & leaf call and the print call continue to work.
for (y in 7:10) {
shapiro.test(ddd[,y])
ad.test(ddd[,y])
stem(ddd[,y])
print(y)
}
The decimal point is 1 digit(s) to the right of the |
0 | 0
0 | 899999
1 | 0
[1] 7
The same thing happens if I try and write a function. SW & AD do not work. The other calls do.
> D <- function (y) {
+ shapiro.test(ddd[,y])
+ ad.test(ddd[,y])
+ stem(ddd[,y])
+ print(y) }
> D(9)
The decimal point is at the |
9 | 000
9 |
10 | 00000
[1] 9
Why don't all the calls behave the same way?
In a loop, automatic printing is turned off, as it is inside a function. You need to explicitly print
something in both cases if you want to see the output. The [1] 9
things you are getting is because you are explicitly printing the values of y
.
Here is an example of how you might want to consider going about doing this.
> DF <- data.frame(A = rnorm(100), B = rlnorm(100))
> y <- 1
> shapiro.test(DF[,y])
Shapiro-Wilk normality test
data: DF[, y]
W = 0.9891, p-value = 0.5895
So we have automatic printing. In the loop we would have to do this:
for(y in 1:2) {
print(shapiro.test(DF[,y]))
}
If you want to print more tests out, then just add them as extra lines in the loop:
for(y in 1:2) {
writeLines(paste("Shapiro Wilks Test for column", y))
print(shapiro.test(DF[,y]))
writeLines(paste("Anderson Darling Test for column", y))
print(ad.test(DF[,y]))
}
But that isn't very appealing unless you like reading through reams of output. Instead, why not save the fitted test objects and then you can print them and investigate them, maybe even process them to aggregate the test statistics and p-values into a table? You can do that using a loop:
## object of save fitted objects in
obj <- vector(mode = "list", length = 2)
## loop
for(y in seq_along(obj)) {
obj[[y]] <- shapiro.test(DF[,y])
}
We can then look at the models using
> obj[[1]]
Shapiro-Wilk normality test
data: DF[, y]
W = 0.9891, p-value = 0.5895
for example, or using lapply
, which takes care of setting up the object we use to store the results for us:
> obj2 <- lapply(DF, shapiro.test)
> obj2[[1]]
Shapiro-Wilk normality test
data: X[[1L]]
W = 0.9891, p-value = 0.5895
Say now I wanted to extract the W
and p-value
data, we can process the object storing all the results to extract the bits we want, e.g.:
> tab <- t(sapply(obj2, function(x) c(x$statistic, x$p.value)))
> colnames(tab) <- c("W", "p.value")
> tab
W p.value
A 0.9890621 5.894563e-01
B 0.4589731 1.754559e-17
Or for those with a penchant for significance stars:
> tab2 <- lapply(obj2, function(x) c(W = unname(x$statistic),
+ `p.value` = x$p.value))
> tab2 <- data.frame(do.call(rbind, tab2))
> printCoefmat(tab2, has.Pvalue = TRUE)
W p.value
A 0.9891 0.5895
B 0.4590 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This has got to be better than firing output to the screen that you then have to pour through?