Expressionset - phenodata

Avin · Sep 9, 2011

I must start by saying that I'm just beginning to program in R. I'm unable to create an ExpressionSet from my data. When I try to combine the assayData and phenoData into an ExpressionSet, I get an error:

Error in validObject(.Object) : invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

Please take a look at the sample data, the phenoData table I made, and my R program. I guess that the phenoData should be modified to get this working.

Please let me know how to solve this and alter phenodata.

AssayData                                                       
    0h-1    0h-2    6h-1    6h-2    12h-1   12h-2   24h-1   24h-2   48h-1   48h-2   72h-1   72h-2   96h-1   96h-2
171407  4.021342514 4.021342514 6.847201005 6.847201005 3.189312274 3.189312274 3.322687671 3.322687671 4.929574559 4.929574559 4.040127938 4.040127938 3.181587044 3.181587044
171415  267.8091012 267.8091012 358.8511895 358.8511895 266.4562608 266.4562608 210.259177  210.259177  243.1496956 243.1496956 248.2780935 248.2780935 235.7079055 235.7079055
171426  13.3620332  13.3620332  5.581083074 5.581083074 12.5236932  12.5236932  8.433621131 8.433621131 13.07390505 13.07390505 12.94673202 12.94673202 23.43214156 23.43214156
171453  37.65310777 37.65310777 27.88942772 27.88942772 54.7409581  54.7409581  78.86045287 78.86045287 63.61655487 63.61655487 67.31327606 67.31327606 62.35426899 62.35426899

PhenoData
        condition   time   rep
0h-1    Control     0      1
0h-2    Control     0      2
6h-1    treatment   6      1
6h-2    treatment   6      2
12h-1   treatment   12     1
12h-2   treatment   12     2
24h-1   treatment   24     1
24h-2   treatment   24     2
48h-1   treatment   48     1
48h-2   treatment   48     2
72h-1   treatment   72     1
72h-2   treatment   72     2
96h-1   treatment   96     1
96h-2   treatment   96     2

My Code:

library("Biobase")
library("betr")
exprs <- as.matrix(read.table("Timecourse-Assaydata.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE))
pData <- read.table("Timecourse-Phenodata.txt", row.names=1, header=TRUE, sep="\t")
metadata <- data.frame(labelDescription = c("Hour of treatment", "Treatment time", "number of replicates"), row.names = c("condition", "time", "rep"))
phenoData <- new("AnnotatedDataFrame", data = pData, varMetadata = metadata)

exprspop <- new("ExpressionSet", exprs = exprs, phenoData = phenoData)

Error in validObject(.Object) : invalid class "ExpressionSet" object: sampleNames differ between assayData and phenoData

Answer

Martin Morgan · Sep 9, 2011

The correct place for this question is on the Bioconductor support site. It's better to provide a reproducible example that captures the essence of the problem; creating the reproducible example often helps to identify the reason for the problem.

library(Biobase)

exprs <- matrix(0, nrow=5, ncol=3,
                dimnames=list(letters[1:5], LETTERS[1:3]))
pData <- data.frame(id=c("foo", "bar", "baz"),
                    row.names=c("x", "y", "z"))
phenoData <- AnnotatedDataFrame(data=pData)

leading to

> ExpressionSet(exprs, phenoData=phenoData)
Error in validObject(.Object) : 
  invalid class "ExpressionSet" object: sampleNames differ between assayData and
phenoData

The problem is that the colnames of exprs (i.e., the names of the samples in the experiment) differ from the row.names of pData (i.e., the descriptions of the samples)

> row.names(pData)
[1] "x" "y" "z"
> colnames(exprs)
[1] "A" "B" "C"

and the solution is to make them the same

> colnames(exprs) <- row.names(pData)
> eset <- ExpressionSet(exprs, phenoData=phenoData)
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 5 features, 3 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: x y z
  varLabels: id
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  
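
Overwriting the column names is fine for this toy example, where the names are arbitrary. With real data it may be safer to check which names disagree and then reorder the columns by name, so that values are never attached to the wrong sample. A minimal base-R sketch, assuming both objects describe the same samples; expr_mat, pheno and their values are hypothetical and only for illustration:

```r
# Hypothetical toy data: same three samples, but in a different order
expr_mat <- matrix(1:6, nrow = 2,
                   dimnames = list(c("g1", "g2"), c("s2", "s1", "s3")))
pheno <- data.frame(id = c("foo", "bar", "baz"),
                    row.names = c("s1", "s2", "s3"))

# Names present on one side only; both calls should return character(0)
only_in_expr  <- setdiff(colnames(expr_mat), row.names(pheno))
only_in_pheno <- setdiff(row.names(pheno), colnames(expr_mat))

# The two name sets agree, so reorder the columns to follow pheno's rows
stopifnot(setequal(colnames(expr_mat), row.names(pheno)))
expr_mat <- expr_mat[, row.names(pheno), drop = FALSE]

identical(colnames(expr_mat), row.names(pheno))   # TRUE
```

If either setdiff() result is non-empty, the names themselves differ (not just their order) and need to be fixed at the source before building the ExpressionSet.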

Additional elements can be added to an existing ExpressionSet using the assayDataElement() replacement function, e.g.,

> assayDataElement(eset, "foo") <- sqrt(exprs)
> eset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 5 features, 3 samples 
  element names: exprs, foo 
protocolData: none
phenoData
  sampleNames: x y z
  varLabels: id
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  

or from the start

> env = new.env()
> env$exprs = exprs
> env$sqrt = sqrt(exprs)
> lockEnvironment(env)
> ExpressionSet(env, pData=pData)
ExpressionSet (storageMode: environment)
assayData: 5 features, 3 samples 
  element names: exprs, sqrt 
protocolData: none
phenoData: none
featureData: none
experimentData: use 'experimentData(object)'
Annotation:
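
For the files in the question, one likely cause (an assumption, since the files themselves are not shown) is that read.table() with header=TRUE passes column names through make.names() by default. A header such as "0h-1" is then read as "X0h.1", while the row names of the phenodata table stay "0h-1". The sketch below uses a short inline string as a stand-in for Timecourse-Assaydata.txt; the "probe" header field and the rounded values are made up for illustration:

```r
# Inline stand-in for the tab-separated assay file (two samples only)
assay_txt <- "probe\t0h-1\t0h-2\n171407\t4.02\t4.02\n171415\t267.81\t267.81\n"

# Default check.names = TRUE rewrites syntactically invalid names
mangled <- as.matrix(read.table(text = assay_txt, header = TRUE,
                                sep = "\t", row.names = 1))
colnames(mangled)   # "X0h.1" "X0h.2"

# check.names = FALSE keeps the header exactly as written in the file
kept <- as.matrix(read.table(text = assay_txt, header = TRUE,
                             sep = "\t", row.names = 1,
                             check.names = FALSE))
colnames(kept)      # "0h-1" "0h-2"
```

If this is what is happening, adding check.names=FALSE to the read.table() call for the assay data should make colnames(exprs) agree with row.names(pData).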