I am working with the plm package and have a problem with the random and within models: both fail with the error "empty model". However, the model is not empty. The source code of plm.fit, where the error originates, says something like this (writing from memory):
X <- model.matrix(formula,data, lhs=1,...)
if (ncol(X) == 0) stop("empty model")
However, if I try to replicate this behaviour by calling model.matrix with the same formula and data that I pass to plm, ncol(X) comes out as 17 or so.
My code is (data deleted...):
library(sampleSelection)
library(foreign)
library(censReg)
library(plm)
library(micEcon)
library(ggplot2)
data <- read.dta('kpfull1.dta')
summary(data)
attach(data)
data$profit_share <- p91/tnsvp
data$debt_assets <- d91/naba
data$naba3 <- naba^3
data$difprofit <- p91-p90
data$agri <- (mind==1)*1
data$hi <- (mind==2)*1
data$li <- (mind==3)*1
data$constr <- (mind==4)*1
data$trans <- (mind==5)*1
data$trade <- (mind==6)*1
data$rd <- (mind==7)*1
data$ser <- (mind==8)*1
data$fin <- (mind==9)*1
data$for1 <- data[,7]
detach(data)
data1 <- data
panel <- pdata.frame(data, c("num","rnd"))
testovaci <- plm(tb ~ profit_share + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, data = panel, model = "within")
summary(testovaci)
model.matrix(tb ~ profit_share + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, data)
model.matrix(tb ~ profit_share + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, panel)
Thanks, Tomáš Křehlík.
Answered in R-help by Giovanni Millo:
Dear Tomas, dear list,
a follow-up, as in the meantime I got the data by private communication. The problem, as I suspected, is hidden in some lack of time variability in the data. In fact, OLS works fine:
% fm is the formula, data is the dataset, panel is the dataset transformed as a pdata.frame
summary(lm(fm, data))
% (output ok, omitted)
as does plm "pooling"
summary(plm(fm, panel, model="pooling"))
Oneway (individual) effect Pooling Model
% (output ok, omitted)
but FE fails:
summary(plm(fm, panel, model="within"))
Error in plm.fit(formula, data, model, effect, random.method, inst.method) : empty model
as do the various RE methods
summary(plm(fm, panel, model="random"))
Error in plm.fit(formula, data, model = "within", effect = effect) : empty model
...and if you look at the error message, it is clear that it is the within/FE part that has problems (RE methods are based on FE for estimating the error components). In fact, trying to panel-difference any right-hand side variable results in all zeros (NaNs are for variance shares, which are 0/0), e.g. the first one:
summary(diff(panel$profit_share))
total sum of squares : 0
  id time
 NaN  NaN
but it is really the same for each. So the (within transformed) model is actually empty, as the original error message says. Now I don't have time to look deeply into the data, but the rhs variables all look time-constant to me...
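To see why a time-constant regressor disappears from the within model, here is a minimal base-R sketch on made-up toy data (not the data from this thread). The within transformation is written by hand as "subtract each unit's own mean"; the names toy, x_const, x_vary and within_transform are invented for the illustration, and this is not a copy of what plm does internally.

```r
# Toy panel: 3 firms observed over 4 periods (made-up numbers).
toy <- data.frame(
  id      = rep(1:3, each = 4),
  time    = rep(1:4, times = 3),
  x_const = rep(c(0.25, 0.5, 0.75), each = 4),      # never changes within a firm
  x_vary  = c(1, 2, 3, 4, 2, 4, 6, 8, 5, 5, 6, 7)   # changes over time
)

# Hand-written within (fixed-effects) transformation:
# subtract each firm's own mean; ave() returns the group mean by default.
within_transform <- function(x, id) x - ave(x, id)

w_const <- within_transform(toy$x_const, toy$id)
w_vary  <- within_transform(toy$x_vary,  toy$id)

# The time-constant variable is wiped out completely ...
print(all(w_const == 0))
# ... while the time-varying one keeps some variation to estimate from.
print(sum(w_vary^2) > 0)
```

If every right-hand-side variable behaves like x_const, nothing is left to estimate after the transformation, which matches the "empty model" message even though model.matrix on the untransformed data still shows all the columns.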
The takeaway for panel guys on the list, therefore, is: main cause for data-induced error is bad indices, second one is bad data variability; first step to diagnose it is running lm() and then plm(..., model="pooling"). lm() fails => bad data, bad formula; plm(..., "pooling") fails => something basically wrong with indices; other panel methods fail => most likely data variability problems.
Best, Giovanni
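The two data problems named in the takeaway above (bad indices, no time variability) can also be looked for directly in the data frame before fitting anything. The following is only a rough base-R sketch: panel_precheck is a hypothetical helper written for this note, not a plm function, and the column names in the toy data are made up.

```r
# Hypothetical helper (not part of plm): checks a long-format panel held in a
# plain data.frame. `id` and `time` are column names, `vars` the regressors.
panel_precheck <- function(df, id, time, vars) {
  # Index problems: missing index values or repeated (id, time) pairs.
  bad_index <- anyNA(df[[id]]) || anyNA(df[[time]]) ||
    anyDuplicated(df[, c(id, time)]) > 0
  # Variability problems: regressors that never deviate from their unit mean.
  no_within_var <- vapply(vars, function(v) {
    unit_mean <- ave(df[[v]], df[[id]], FUN = function(z) mean(z, na.rm = TRUE))
    dev <- df[[v]] - unit_mean
    isTRUE(all.equal(sum(dev^2, na.rm = TRUE), 0))
  }, logical(1))
  list(bad_index = bad_index, time_constant = vars[no_within_var])
}

# Made-up toy panel: 3 units, 2 periods each.
toy <- data.frame(
  id     = rep(1:3, each = 2),
  time   = rep(1:2, times = 3),
  size   = rep(c(10, 20, 30), each = 2),  # constant within each unit
  profit = c(1, 2, 4, 3, 5, 7)            # varies within each unit
)

res <- panel_precheck(toy, "id", "time", c("size", "profit"))
print(res$bad_index)      # are the indices suspicious?
print(res$time_constant)  # regressors a within model cannot use
```

This does not replace the lm() / plm(..., "pooling") ladder; it only points at which variables or index columns to inspect first when one of those steps fails.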
PS I tried to circumvent the issue by ML estimation of an RE model but no way, I got a singular matrix error: so the data really are ill-conditioned
library(nlme)
remod <- lme(tb ~ profit_share + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, random = ~1|num, data = data)
Error in solve.default(estimates[dimE[1] - (p:1), dimE[2] - (p:1), drop = FALSE]) : system is computationally singular: reciprocal condition number = 3.93401e-25