I have a table with an auto-increment primary key:
create table rt_table
(
rtID int PRIMARY KEY AUTO_INCREMENT,
rt_user_id BIGINT, /*user being retweeted*/
rt_user_name varchar(70), /*user name of rt_user_id*/
source_user_id BIGINT, /*user tweeting rt_user_id*/
source_user_name varchar(70), /*user name of source_user_id*/
tweet_id BIGINT, /*fk to table tweets*/
FOREIGN KEY (tweet_id) references tweets(tweet_id)
);
I wish to populate this table from parts of another table:
insert into rt_table
select rt_user_id, (select user_name from users u where u.user_id = t.rt_user_id),
source_user_id, (select user_name from users u where u.user_id = t.source_user_id),
tweet_id
from tweets t
where rt_user_id != -1;
I get an error which says the number of columns do not match up, which is because of the primary key (which is an auto-incremented value and thus does not need to be set). How do I get around this?
You need to explicitly list the columns in the insert
statement:
insert into rt_table (rt_user_id, rt_user_name, source_user_id, source_user_name, tweet_id)
select rt_user_id, (select user_name from users u where u.user_id = t.rt_user_id),
source_user_id, (select user_name from users u where u.user_id = t.source_user_id),
tweet_id
from tweets t
where rt_user_id != -1;
Also, I think it is better form to use explicit joins, rather than nested selects:
insert into rt_table (rt_user_id, rt_user_name, source_user_id, source_user_name, tweet_id)
select t.rt_user_id, u.user_name, t.source_user_id, su.user_name, t.tweet_id
from tweets t left outer join
users u
on t.rt_user_id = u.user_id left outer join
users su
on t.source_user_id = su.user_id
where rt_user_id != -1;
This often (but not always) helps the optimizer find the best query plan.