How much does using LIMIT in a SQL statement improve performance?

Oscar Mederos · Apr 19, 2011 · Viewed 8.5k times

Let's suppose I have a table in my database with 1,000,000 records.

If I execute:

SELECT * FROM [Table] LIMIT 1000

Will this query take the same time as it would if that table had only 1,000 records and I just ran:

SELECT * FROM [Table]

?

I'm not asking whether the times will be exactly equal. I just want to know whether the first one will take much more time to execute than the second one.

I said 1,000,000 records, but it could be 20,000,000. That was just an example.

Edit:
Of course, comparing the same table with and without LIMIT, the query using LIMIT should execute faster, but that's not what I'm asking...

To make it generic:

Table1: X records
Table2: Y records

(X << Y)

What I want to compare is:

SELECT * FROM Table1

and

SELECT * FROM Table2 LIMIT X

Edit 2:
Here is why I'm asking this:

I have a database with 5 tables and relationships between some of them. One of those tables will (I'm 100% sure) contain about 5,000,000 records. I'm using SQL Server CE 3.5, Entity Framework as the ORM, and LINQ to SQL to build the queries.

I basically need to perform three kinds of non-trivial queries, and I was thinking about showing the user a limited number of records (just like lots of websites do). If the user wants to see more records, the option he/she has is to narrow the search further.

So the question came up because I was weighing two options: limiting each query to X records, or storing only the X most recent results in the database, which would require doing some deletions. I was just thinking it over...

So, that table could contain 5,000,000 records or more, and what I don't want is to show the user only 1,000 or so while the query is still as slow as if it were returning all 5,000,000 rows.
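For example, what I had in mind is something like this (Orders and OrderDate are just placeholder names; SQL Server CE 3.5 uses TOP with parentheses rather than LIMIT, and this is roughly what LINQ's .Take(1000) translates to):

SELECT TOP (1000) *
FROM Orders
ORDER BY OrderDate DESC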

Answer

RichardTheKiwi · Apr 19, 2011

TAKE 1000 from a table of 1,000,000 records will be roughly 1,000,000/1,000 (= 1,000) times faster, because it only needs to look at (and return) 1,000 of the 1,000,000 records. Since it does less, it is naturally faster.
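("TAKE" here is generic; the actual keyword depends on the engine. A quick sketch against a hypothetical Orders table; in LINQ it would be query.Take(1000).)

-- MySQL / PostgreSQL / SQLite
SELECT * FROM Orders LIMIT 1000

-- SQL Server / SQL Server CE 3.5 (parentheses required in CE)
SELECT TOP (1000) * FROM Orders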

The result will be pretty (pseudo-)random, since you haven't specified any order in which to TAKE. However, if you do introduce an order, then one of the two cases below applies (see the sketch after the list):

  1. The ORDER BY clause follows an index - the above statement is still true.
  2. The ORDER BY clause cannot use any index - it will be only marginally faster than without the TAKE, because
    • it has to inspect ALL records and sort them by the ORDER BY columns,
    • then deliver only a subset (the TAKE count),
    • so it is not faster in the first step, but the second step involves less IO/network than returning ALL records.
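A minimal sketch of the two cases, assuming a hypothetical Orders table with an index on OrderDate but none on CustomerName:

-- Case 1: the engine can walk the OrderDate index and stop after 1000 rows
SELECT * FROM Orders ORDER BY OrderDate LIMIT 1000

-- Case 2: the engine must read and sort ALL rows, then discard everything past 1000
SELECT * FROM Orders ORDER BY CustomerName LIMIT 1000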

If you TAKE 1,000 records from a table of 1,000 records, it will be equivalent (with no significant difference) to TAKE-ing 1,000 records from a table of 1 billion, as long as you are in case (1) no ORDER BY, or (2) ORDER BY against an index.
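If you want to verify which case you are in, most engines can show the query plan; a sketch in MySQL syntax, with the same hypothetical names as above:

EXPLAIN SELECT * FROM Orders ORDER BY OrderDate LIMIT 1000

If the plan reports a filesort, you are in case (2); if it walks an index, you are in case (1) and the TAKE pays off fully.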