Getting an error "(subscript) logical subscript too long" while training SVM from e1071 package in R

Question 1

r svm

Ayush Raj Singh · Jun 14, 2013 · Viewed 54k times · Source

Answer

There might be a difference in the number of levels in one of the factors in the 'test' dataset.

Run str(test) and check that the factor variables have the same levels as the corresponding variables in the 'train' dataset.

For example, the output below shows that my.test$foo has only 4 levels, while my.train$foo has 5:

str(my.train)
'data.frame':   554 obs. of  7 variables:
 ....
 $ foo: Factor w/ 5 levels "C","Q","S","X","Z": 2 2 4 3 4 4 4 4 4 4 ...

str(my.test)
'data.frame':   200 obs. of  7 variables:
 ...
 $ foo: Factor w/ 4 levels "C","Q","S","X": 3 3 3 3 1 3 3 3 3 3 ...
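
If a level mismatch like the one above turns out to be the cause, one possible fix is to re-declare each factor in the test set using the levels from the training set. The following is only a minimal sketch in base R: the two small data frames are hypothetical stand-ins for my.train and my.test, and the column name foo is taken from the example output above.

```r
# Hypothetical toy data standing in for my.train / my.test
my.train <- data.frame(foo = factor(c("C", "Q", "S", "X", "Z")))
my.test  <- data.frame(foo = factor(c("S", "C", "X", "Q")))

# Factor columns of the training set that also exist in the test set
is_fac <- vapply(my.train, is.factor, logical(1))
factor_cols <- intersect(names(my.train)[is_fac], names(my.test))

# Report which of those columns have differing level sets
differs <- vapply(
  factor_cols,
  function(v) !identical(levels(my.train[[v]]), levels(my.test[[v]])),
  logical(1)
)
mismatched <- factor_cols[differs]
print(mismatched)

# Rebuild each test factor with the training levels.
# Note: a test value that never occurs in training becomes NA here.
for (v in factor_cols) {
  my.test[[v]] <- factor(my.test[[v]], levels = levels(my.train[[v]]))
}

str(my.test)
```

After this, str(my.test) should report the same number of levels for foo as str(my.train). Values that appear only in the test data are turned into NA by factor(), so it is worth checking for new NAs before calling predict().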

Question 2

I am training an SVM on my training data using the e1071 package in R. Here is the structure of my data:

> str(train)
'data.frame':   891 obs. of  10 variables:
$ survived: int  0 1 1 1 0 0 0 0 1 1 ...
$ pclass  : int  3 1 3 1 3 3 1 3 3 2 ...
$ name    : Factor w/ 15 levels "capt","col","countess",..: 12 13 9 13 12 12 12 8 13 13 
$ sex     : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
$ age     : num  22 38 26 35 35 ...
$ ticket  : Factor w/ 533 levels "110152","110413",..: 516 522 531 50 473 276 86 396 
$ fare    : num  7.25 71.28 7.92 53.1 8.05 ...
$ cabin   : Factor w/ 9 levels "a","b","c","d",..: 9 3 9 3 9 9 5 9 9 9 ...
$ embarked: Factor w/ 4 levels "","C","Q","S": 4 2 4 4 4 3 4 4 4 2 ...
$ family  : int  1 1 0 1 0 0 0 4 2 1 ...

I train the model as follows:

library(e1071)
model1 <- svm(survived ~ ., data = train, type = "C-classification")

This runs without problems. But when I predict with:

pred <- predict(model1,test)

I get the following error:

Error in newdata[, object$scaled, drop = FALSE] : 
(subscript) logical subscript too long

I also tried removing the "ticket" predictor from both the train and test data, but I still get the same error. What is the problem?
