MySQL SUM Query is extremely slow

Hamed Kamrava · Apr 6, 2017 · Viewed 9.9k times

There is a table called transactions with ~6 million rows. The query below computes the current user's balance. Here is the slow-query log entry after I enabled slow_query_log = 'ON':

# Time: 170406  9:51:48
# User@Host: root[root] @  [xx.xx.xx.xx]
# Thread_id: 13  Schema: main_db  QC_hit: No
# Query_time: 38.924823  Lock_time: 0.000034  Rows_sent: 1  Rows_examined: 773550
# Rows_affected: 0
SET timestamp=1491456108;
SELECT SUM(`Transaction`.`amount`) as total
    FROM `main_db`.`transactions` AS `Transaction`
    WHERE `Transaction`.`user_id` = 1008
      AND `Transaction`.`confirmed` = 1
    LIMIT 1;

As you can see, it took ~38 seconds!

Here is the EXPLAIN for the transactions table:

[EXPLAIN output was attached as an image]

This query sometimes runs fast (about ~1 second) and sometimes really slowly!

Any help would be greatly appreciated.

P.S:

It's InnoDB, and the transactions table has frequent INSERT and SELECT operations.

I tried running the query with SQL_NO_CACHE, but it is still sometimes fast, sometimes slow.

transactions table schema:

CREATE TABLE `transactions` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(10) unsigned NOT NULL,
  `ref_id` varchar(40) COLLATE utf8_persian_ci NOT NULL,
  `payment_id` tinyint(3) unsigned NOT NULL,
  `amount` decimal(10,1) NOT NULL,
  `created` datetime NOT NULL,
  `private_note` varchar(6000) COLLATE utf8_persian_ci NOT NULL,
  `public_note` varchar(200) COLLATE utf8_persian_ci NOT NULL,
  `confirmed` tinyint(3) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=13133663 DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci

MySQL is running on a VPS with 12 GB RAM and 9 logical CPU cores.

Here is part of my.cnf:

# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
default_storage_engine  = InnoDB
# you can't just change log file size, requires special procedure
innodb_buffer_pool_size = 9G
innodb_log_buffer_size  = 8M
innodb_file_per_table   = 1
innodb_open_files       = 400
innodb_io_capacity      = 400
innodb_flush_method     = O_DIRECT
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64


# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address           = 127.0.0.1
#
# * Fine Tuning
#
max_connections         = 500
connect_timeout         = 5
wait_timeout            = 600
max_allowed_packet      = 16M
thread_cache_size       = 128
sort_buffer_size        = 4M
bulk_insert_buffer_size = 16M
tmp_table_size          = 32M
max_heap_table_size     = 32M

Answer

Rick James · Apr 7, 2017

(Yes, I am adding another answer. Justification: It addresses the underlying problem a different way.)

The underlying problem seems to be that there is an ever-growing transactions table from which various statistics are derived, such as SUM(amount). The performance of this will only get worse as the table grows.

The basis for this Answer will be to look at the data in two ways: "History" and "Current". Transactions is the History; a new table would hold the Current totals for each user. I see multiple ways to do that. Each involves some form of subtotal(s) so as to avoid adding up 773K rows to get the answer.

  • The traditional banking way... Each night tally up the day's Transactions and add them to Current.
  • The Materialized View way... Each time a row is added to Transactions, increment Current.
  • Hybrid: Keep daily subtotals in a "Summary Table". Sum those subtotals to get the SUM through last night.

More discussion in my blog on Summary Tables.
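
A rough sketch of the hybrid way, assuming a hypothetical daily summary table (balance_daily and its columns are illustrative, not part of the original schema); the nightly rollup could be run from cron or a MySQL EVENT:

-- Hypothetical daily summary table (names are illustrative, not from the post).
CREATE TABLE `balance_daily` (
  `user_id` int(10) unsigned NOT NULL,
  `day`     date NOT NULL,
  `amount`  decimal(14,1) NOT NULL,
  PRIMARY KEY (`user_id`, `day`)
) ENGINE=InnoDB;

-- Nightly job: fold yesterday's confirmed transactions into the summary.
INSERT INTO `balance_daily` (`user_id`, `day`, `amount`)
SELECT `user_id`, DATE(`created`), SUM(`amount`)
  FROM `transactions`
 WHERE `confirmed` = 1
   AND `created` >= CURDATE() - INTERVAL 1 DAY
   AND `created` <  CURDATE()
 GROUP BY `user_id`, DATE(`created`)
ON DUPLICATE KEY UPDATE `amount` = VALUES(`amount`);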

Note that the up-to-the-second balance for the banking or hybrid way is a little tricky:

  1. Get last night's amount
  2. Add any Transactions that occurred during the day.
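
A minimal sketch of that two-step lookup, again using the hypothetical balance_daily table from above:

-- Up-to-the-second balance = summarized History + today's confirmed rows.
SELECT
    ( SELECT COALESCE(SUM(`amount`), 0)
        FROM `balance_daily`
       WHERE `user_id` = 1008 )
  + ( SELECT COALESCE(SUM(`amount`), 0)
        FROM `transactions`
       WHERE `user_id` = 1008
         AND `confirmed` = 1
         AND `created` >= CURDATE() ) AS total;

Only today's transactions need to be added at query time; the rest comes from the pre-computed subtotals.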

Any of these approaches will be a lot faster than scanning all 773K rows for the user, but the code will be more complex.