How to construct a SQLite query to GROUP by ORDER?

Eric picture Eric · Sep 26, 2012 · Viewed 39k times · Source

I've got a rather interesting situation. I have a SQLite database full of addresses and messages (addresses are not unique; messages are). Each message also has a date associated with it. What I want to do is select the first message's address, message, date, and how many messages are associated with the address.

So, I thought, "I can just GROUP by the address to only get one message per address, then ORDER these by the date, and also fetch the COUNT of the address column."

I did that, and it works... kinda. It fetches the correct count, fetches only one message per address, and orders them by date--but it does not select the most recent message for the address. It appears to be arbitrary.

As an example, I have three messages (earliest to latest) A, B, C from address Y, and three messages D, E, F from address Z. The query may fetch messages B and E, then sort them by date. It should fetch messages C and F, and sort those by date.

Here is what I have so far:

// Expanded version:
Cursor cursor = db.query(
    /* FROM */ "messages_database",
    /* SELECT */ new String[]{ "*", "COUNT(address) AS count" },
    /* WHERE */ null,
    /* WHERE args */ null,
    /* GROUP BY */ "address",
    /* HAVING */ null,
    /* ORDER BY */ "date DESC"
);

// Or, same code on one line:
Cursor cursor = db.query("messages_database", new String[]{ "*", "COUNT(address) AS count" }, null, null, "address", null, "date DESC");

I feel like this may have to do with the HAVING clause, but I really don't know. I've used MySQL a lot with PHP, but never had to touch HAVING before. I tried setting my HAVING clause to "MAX(date)", but it had no effect. If I set my GROUP BY clause to be "address, date", then they are sorted by date, but of course they are all individual instead of grouped (since the dates differ).

Google searches have proved fruitless; queries like "android sqlite order before group" and "android sqlite group by order" yield no related results.

How can I select the one latest message for each address without removing my GROUP clause (as COUNT() relies upon this)? Do I need two queries?

Edit: Based on the answer @Adrian linked me to in the comments, I came up with two queries which both produced the same result; one row, in which the count was 7 (which is the total number of addresses, not messages per address), and the address shown was not that of the latest message.

The two queries were:

Cursor cursor = db.rawQuery(
      "SELECT t.*, COUNT(t.message_content) AS count "
    + "FROM messages_database t "
    + "INNER JOIN ("
    + "    SELECT address, MAX(date) AS maxdate "
    + "    FROM messages_database "
    + "    GROUP BY address "
    + ") ss ON t.address = ss.address AND t.date = ss.maxdate",
    null
);
Cursor cursor = db.rawQuery(
      "SELECT t1.*, COUNT(t1.message_content) AS count "
    + "FROM messages_database t1 "
    + "LEFT OUTER JOIN messages_database t2 "
    + "ON (t1.address = t2.address AND t1.date < t2.date) "
    + "WHERE t2.address IS NULL",
    null
);

Answer

CL. picture CL. · Sep 26, 2012

SQLite has an extension that makes problems much easier:
If you are using the MAX() or MIN() aggregate functions, and if you are selecting other columns at the same time without using them in an aggregate function or grouping by them, then the resulting values for those columns are guaranteed to come out of the same record that has the maximum/minimum value. (This is not allowed in other SQL dialects, and was introduced in SQLite 3.7.11.)

So, for your problem, you can use a query like this:

SELECT *, COUNT(address) AS count, MAX(date)
FROM messages_database
GROUP BY address

If you don't have SQLite 3.7.11 (which is likely on most Android versions) or using another SQL engine, the following query will work:

SELECT *,
       (SELECT COUNT(address) AS count
        FROM messages_database m2
        WHERE m1.address = m2.address)
FROM messages_database m1
WHERE date = (SELECT MAX(date)
              FROM messages_database m3
              WHERE m1.address = m3.address)
GROUP BY address