Problems with within and random models in plm package

krhlk · Jul 10, 2012

I am working with the plm package and have a problem with the random and within models: both fail with an error that says "empty model". However, the model is not empty. The source code of plm.fit, where the error originates, contains something like this (quoting from memory):

X <- model.matrix(formula,data, lhs=1,...)
if (ncol(X) == 0) stop("empty model")

However, if I try to replicate this behaviour with the same arguments I pass to the original function, ncol(X) comes out as 17 or thereabouts.

My code is (data deleted...):

library(sampleSelection)
library(foreign)
library(censReg)
library(plm)
library(micEcon)
library(ggplot2)

data <- read.dta('kpfull1.dta')
summary(data)
attach(data)

data$profit_share <- p91/tnsvp
data$debt_assets <- d91/naba
data$naba3 <- naba^3
data$difprofit <- p91-p90
data$agri <- (mind==1)*1
data$hi <- (mind==2)*1
data$li <- (mind==3)*1
data$constr <- (mind==4)*1
data$trans <- (mind==5)*1
data$trade <- (mind==6)*1
data$rd <- (mind==7)*1
data$ser <- (mind==8)*1
data$fin <- (mind==9)*1
data$for1 <- data[,7]
detach(data)
data1 <- data
panel <- pdata.frame(data, c("num","rnd"))

testovaci <- plm(tb ~ profit_share  + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, data = panel, model = "within")
summary(testovaci)

model.matrix(tb ~ profit_share  + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, data)

model.matrix(tb ~ profit_share  + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, panel)

Thanks, Tomáš Křehlík.

Answer

krhlk · Jul 10, 2012

Answered in R-help by Giovanni Millo:

Dear Tomas, dear list,

a follow-up, as in the meantime I got the data by private communication. The problem, as I suspected, is hidden in some lack of time variability in the data. In fact, OLS works fine:

% fm is the formula, data is the dataset, panel is the dataset transformed as a pdata.frame

summary(lm(fm, data))

% (output ok, omitted)

as does plm "pooling"

summary(plm(fm, panel, model="pooling"))

Oneway (individual) effect Pooling Model

% (output ok, omitted)

but FE fails:

summary(plm(fm, panel, model="within"))

Error in plm.fit(formula, data, model, effect, random.method, inst.method) : empty model

as do the various RE methods

summary(plm(fm, panel, model="random"))

Error in plm.fit(formula, data, model = "within", effect = effect) : empty model

...and if you look at the error message, it is clear that it is the within/FE part that has problems (RE methods are based on FE for estimating the error components). In fact, trying to panel-difference any right-hand side variable results in all zeros (NaNs are for variance shares, which are 0/0), e.g. the first one:

summary(diff(panel$profit_share))

total sum of squares : 0
 id time
NaN  NaN

but it is really the same for each. So the (within transformed) model is actually empty, as the original error message says. Now I don't have time to look deeply into the data, but the rhs variables all look time-constant to me...
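To illustrate why time-constant regressors leave nothing to estimate, here is a minimal base-R sketch on made-up data. The toy data frame, its column names (size, sales) and the helper within_transform are hypothetical and are not taken from the original dataset or from plm's internals; the sketch only assumes that the within transformation subtracts each individual's own mean.

```r
# Hypothetical toy panel: 3 firms ("num") observed in 2 rounds ("rnd").
toy <- data.frame(
  num   = rep(1:3, each = 2),
  rnd   = rep(1:2, times = 3),
  size  = rep(c(10, 20, 40), each = 2),  # constant over time within each firm
  sales = c(5, 6, 9, 11, 20, 18)         # varies over time within each firm
)

# Within transformation (assumed here): subtract each individual's own mean.
within_transform <- function(x, id) x - ave(x, id)

size_w  <- within_transform(toy$size,  toy$num)
sales_w <- within_transform(toy$sales, toy$num)

# A time-constant column collapses to zeros, so it carries no information
# after the transformation; a time-varying column does not.
no_within_variation <- c(
  size  = all(size_w == 0),
  sales = all(sales_w == 0)
)
print(no_within_variation)
```

If every right-hand-side variable behaves like size here, nothing is left on the right-hand side after the transformation, which would match the "empty model" message.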

The takeaway for panel guys on the list, therefore, is: the main cause of data-induced errors is bad indices, and the second is bad data variability. The first step in diagnosing them is to run lm() and then plm(..., model="pooling"). If lm() fails, suspect bad data or a bad formula; if plm(..., "pooling") fails, something is basically wrong with the indices; if the other panel methods fail, the problem is most likely data variability.
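The diagnostic sequence above could be scripted roughly as follows. This is only a sketch on made-up data: the toy panel, its column names and the helper try_fit are hypothetical, and the only plm calls assumed are pdata.frame() and plm() with the model argument, as used elsewhere in this thread.

```r
library(plm)

# Hypothetical toy panel: 3 individuals, 2 periods; x is constant over time.
toy <- data.frame(
  id = rep(1:3, each = 2),
  t  = rep(1:2, times = 3),
  x  = rep(c(1, 2, 4), each = 2),
  y  = c(1.0, 1.3, 2.1, 1.8, 4.2, 3.9)
)
ptoy <- pdata.frame(toy, index = c("id", "t"))

# Hypothetical helper: evaluate a fit and return "ok" or the error message,
# so that one failing step does not stop the whole sequence.
try_fit <- function(expr) {
  tryCatch({ force(expr); "ok" }, error = function(e) conditionMessage(e))
}

# Run the steps in the suggested order: lm, then pooling, then within.
steps <- c(
  lm      = try_fit(lm(y ~ x, data = toy)),
  pooling = try_fit(plm(y ~ x, data = ptoy, model = "pooling")),
  within  = try_fit(plm(y ~ x, data = ptoy, model = "within"))
)
print(steps)
```

The first step that does not report "ok" points to the likely category of problem: data or formula, indices, or data variability.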

Best, Giovanni

PS: I tried to circumvent the issue by ML estimation of an RE model, but no way: I got a singular matrix error, so the data really are ill-conditioned.

library(nlme)
remod <- lme(tb ~ profit_share + debt_assets + naba + naba3 + for1 + dom + difprofit + agri + hi + li + constr + trans + trade + rd + ser + fin, random = ~1|num, data = data)

Error in solve.default(estimates[dimE[1] - (p:1), dimE[2] - (p:1), drop = FALSE]) : system is computationally singular: reciprocal condition number = 3.93401e-25