I have data in two tables.
The first table has a primary key called PKID:
PKID  DATA
0     myData0
1     myData1
2     myData2
The second table has the PKID column from table 1 as a foreign key:
PKID_FROM_TABLE_1  U_DATA
0                  unique0
0                  unique1
0                  unique2
1                  unique3
1                  unique4
1                  unique5
2                  unique6
2                  unique7
2                  unique8
The basic SELECT statement I am using now is:
SELECT a.PKID, a.DATA, b.U_DATA
FROM table1 AS a
INNER JOIN table2 AS b
    ON a.PKID = b.PKID_FROM_TABLE_1
This produces a table like this:
PKID  DATA     U_DATA
0     myData0  unique0
0     myData0  unique1
0     myData0  unique2
1     myData1  unique3
1     myData1  unique4
1     myData1  unique5
2     myData2  unique6
2     myData2  unique7
2     myData2  unique8
What I would like is the following table:
PKID  DATA     U_DATA1  U_DATA2  U_DATA3
0     myData0  unique0  unique1  unique2
1     myData1  unique3  unique4  unique5
2     myData2  unique6  unique7  unique8
If it helps, each PKID will have exactly 3 entries in table2.
Is something like this possible in MySQL?
This is one way to get the result.
This approach uses correlated subqueries. Each subquery uses an ORDER BY
clause to sort the related rows from table2, and a LIMIT offset,count
clause (LIMIT 0,1 then LIMIT 1,1 then LIMIT 2,1) to retrieve the 1st, 2nd and 3rd rows respectively.
SELECT a.PKID
     , a.DATA
     , ( SELECT b1.U_DATA FROM table2 b1
          WHERE b1.PKID_FROM_TABLE_1 = a.PKID
          ORDER BY b1.U_DATA LIMIT 0,1
       ) AS U_DATA1
     , ( SELECT b2.U_DATA FROM table2 b2
          WHERE b2.PKID_FROM_TABLE_1 = a.PKID
          ORDER BY b2.U_DATA LIMIT 1,1
       ) AS U_DATA2
     , ( SELECT b3.U_DATA FROM table2 b3
          WHERE b3.PKID_FROM_TABLE_1 = a.PKID
          ORDER BY b3.U_DATA LIMIT 2,1
       ) AS U_DATA3
  FROM table1 a
 ORDER BY a.PKID
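To try the query against the sample rows from the question, a setup along these lines should work. This is only a sketch: the column types and lengths are assumptions, since the question does not state them.

```sql
-- Assumed column types; the question only gives column names and sample values.
CREATE TABLE table1 (
    PKID INT NOT NULL PRIMARY KEY,
    DATA VARCHAR(30)
);

CREATE TABLE table2 (
    PKID_FROM_TABLE_1 INT NOT NULL,
    U_DATA            VARCHAR(30),
    FOREIGN KEY (PKID_FROM_TABLE_1) REFERENCES table1 (PKID)
);

-- Sample rows copied from the question.
INSERT INTO table1 (PKID, DATA) VALUES
    (0, 'myData0'), (1, 'myData1'), (2, 'myData2');

INSERT INTO table2 (PKID_FROM_TABLE_1, U_DATA) VALUES
    (0, 'unique0'), (0, 'unique1'), (0, 'unique2'),
    (1, 'unique3'), (1, 'unique4'), (1, 'unique5'),
    (2, 'unique6'), (2, 'unique7'), (2, 'unique8');
```

With these rows, the three subqueries pick unique0, unique1 and unique2 for PKID 0 (and likewise for PKID 1 and 2), because the ORDER BY on U_DATA sorts the values as strings.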
FOLLOWUP
@gliese581g points out that this approach may have performance issues when the outer query returns a large number of rows, since each subquery in the SELECT list gets executed once for each row returned by the outer query.
It should go without saying that this approach cries out for an index:
ON table2 (PKID_FROM_TABLE_1, U_DATA)
-or, at a minimum-
ON table2 (PKID_FROM_TABLE_1)
It's likely the latter index already exists, if there's a foreign key defined. The former index would allow the query to be satisfied entirely from the index pages ("Using index"), without the need for a sort operation ("Using filesort").
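As a sketch, the covering index could be created and checked like this. The index name table2_ix1 is made up for illustration, and the exact EXPLAIN output depends on the MySQL version and the data.

```sql
-- Hypothetical index name; any unused name will do.
CREATE INDEX table2_ix1 ON table2 (PKID_FROM_TABLE_1, U_DATA);

-- Inspect the plan: check the Extra column of the subquery rows for
-- "Using index" and for the absence of "Using filesort".
EXPLAIN
SELECT a.PKID
     , a.DATA
     , ( SELECT b1.U_DATA FROM table2 b1
          WHERE b1.PKID_FROM_TABLE_1 = a.PKID
          ORDER BY b1.U_DATA LIMIT 0,1
       ) AS U_DATA1
  FROM table1 a
 ORDER BY a.PKID;
```

Only the first subquery is shown in the EXPLAIN to keep it short; the other two differ only in the LIMIT offset.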
@gliese581g is quite right to point out that performance of this approach can be problematic on "large" sets.
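For larger sets, one alternative that avoids the per-row subqueries is to number the table2 rows once and then pivot them with conditional aggregation. This is not the approach shown above, just a sketch of a different technique, and it assumes MySQL 8.0 or later, since ROW_NUMBER() is a window function that earlier versions do not support.

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT a.PKID
     , a.DATA
     , MAX(CASE WHEN r.rn = 1 THEN r.U_DATA END) AS U_DATA1
     , MAX(CASE WHEN r.rn = 2 THEN r.U_DATA END) AS U_DATA2
     , MAX(CASE WHEN r.rn = 3 THEN r.U_DATA END) AS U_DATA3
  FROM table1 a
  LEFT
  JOIN ( -- number the table2 rows within each PKID, ordered by U_DATA
         SELECT b.PKID_FROM_TABLE_1
              , b.U_DATA
              , ROW_NUMBER() OVER (PARTITION BY b.PKID_FROM_TABLE_1
                                       ORDER BY b.U_DATA) AS rn
           FROM table2 b
       ) r
    ON r.PKID_FROM_TABLE_1 = a.PKID
 GROUP BY a.PKID, a.DATA
 ORDER BY a.PKID;
```

The LEFT JOIN keeps table1 rows that have no related rows in table2 (with NULL in the U_DATA columns), which matches what the correlated subqueries return in that case.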