My goal is to take 'matr', sort it by column c1, and keep unique(c1) where c2 = 1. For example, from this code...
c1 = c("a",'a','a','b','b','b','b','c','c')
c2 = c(1,1,0,1,1,0,1,0,0)
matr = as.data.frame(cbind(c1,c2))
one = sqldf('select distinct(c1),c2 from matr where c2 = 1')
vs = sqldf('select distinct(c1),c0,c2 from matr group by c1')
sqldf('select matr.*,one.* from matr
left outer join one
where one.c1 = matr.c1')
arrive at:
c1 c2
a 1
b 1
c 0
For some reason, i lose row c in the left join. I can only arrive at (through other joins)
c1 c2
a 1
b 1
You want on
rather than where
in the last sql statement. With where
it is performing the join and then afterwards applies the where
clause whereas with on
it does the join relative to the on
condition.
> sqldf('select matr.*, one.* from matr left outer join one on one.c1 = matr.c1')
c1 c2 c1 c2
1 a 1 a 1
2 a 1 a 1
3 a 0 a 1
4 b 1 b 1
5 b 1 b 1
6 b 0 b 1
7 b 1 b 1
8 c 0 <NA> <NA>
9 c 0 <NA> <NA>