How can I use Nhibernate to retrieve data when the "WHERE IN()" have thousands of values? (too many parameters in the sql)

Kenneth_hj picture Kenneth_hj · Jun 9, 2011 · Viewed 9.6k times · Source

The problem: Nhibernate parses each value in the "WHERE IN()" sql as parameters and MS SQL server doesn't support enough parameters (over 2000).

I am using Nhibernate with Linq to retrive my data from the SQL server and I need to load alot of entities based on already known ID's.

My code looks something like this:

int[] knownIds = GetIDsFromFile();
var loadedEntities = _Repository.GetAll()
                                .Where(x => knownIds.Contains(x.ID))
                                .ToList();

Which give an sql like this:

SELECT id, name FROM MyTable 
WHERE id IN (1 /* @p0 */,2 /* @p1 */,3 /* @p2 */,4 /* @p3 */, 5 /* @p4 */)

If there is too many values in knownIds, then this code will throw an Exception because of the many parameters that NHibernate uses.

I think the best solution would be if I could make NHibernate use only 1 parameter for the whole "WHERE IN()", but I don't know how to do this:

SELECT id, name FROM MyTable WHERE id IN (1, 2, 3, 4, 5 /* @p0 */)

I'll be glad to hear any ideas of how to solve this - either by extending the LINQ provider or by other means. One solution is to simple do the query x times (knownIds.Count / 1000), but I rather want a generic solution which would work for all my entities.

I have tried looking at extending the LINQ provider by searching google and Stackoverflow, however I can't find a solution and I dont have any experience with either HQL or the treebuilder. Here are a few of the sites I have been at:

UPDATE: I know it ain't good practice by having so many values in the IN clause, but I don't know a better solution for what I want to do.
Consider a company where all the customers pay for the company's services one time each month. The company don't handle the payments itself, but have another company to collect the money. One time each month the company receives a file containing the status of these payments: if they have been paid or not. The file only contains the ID of the specific payment, and not the ID of the customer. A company with 3000 monthly customers, will then make 3000 LogPayments each month, where the status need to be updated. After 1 year there will be around 36.000 LogPayments, so just loading them all doesn't seem like a good solution either.

MY SOLUTION: Thanks for all the usefull answers. In the end I choosed to use a combination of the answers. For this specific case I did something like Fourth suggested, as that would increase performance a great deal. However I have allso implemented the generic method Stefan Steinegger suggested, because I like that I can do this, if that is what I really want. Besides, I don't want my program to crash with an exception, so in the future I will allso use this ContainsAlot-method as a safeguard.

Answer

Stefan Steinegger picture Stefan Steinegger · Jun 9, 2011

See this similar question: NHibernate Restrictions.In with hundreds of value

I usually set up several queries, which all get for instance 1000 entries. Just split you array of ids into several pieces.

Something like this:

// only flush the session once. I have a using syntax to disable
// autoflush within a limited scope (without direct access to the
// session from the business logic)
session.Flush();
session.FlushMode = FlushMode.Never;

for (int i = 0; i < knownIds; i += 1000)
{
  var page = knownIds.Skip(i).Take(1000).ToArray();
  loadedEntities.AddRange(
    Repository.GetAll()
      .Where(x => page.Contains(x.ID)));
}

session.FlushMode = FlushMode.Auto;

Generic implementation using criteria (only filtering a single property, which is a common case):

public IList<T> GetMany<TEntity, TProp>(
  Expression<Func<TEntity, TProp>> property,
  IEnumerable<TProp> values)
{
    string propertyName = ((System.Linq.Expressions.MemberExpression)property.Body).Member.Name;

    List<T> loadedEntities = new List<T>();

    // only flush the session once. 
    session.Flush();
    var previousFlushMode = session.FlushMode;
    session.FlushMode = FlushMode.Never;

    for (int i = 0; i < knownIds; i += 1000)
    {
      var page = knownIds.Skip(i).Take(1000).ToArray();

      loadedEntities.AddRange(session
        .CreateCriteria(typeof(T))
        .Add(Restriction.PropertyIn(propertyName, page)
        .List<TEntity>();
    }

    session.FlushMode = previousFlushMode;
    return loadedEntities;
}

To be used like this:

int[] ids = new [] {1, 2, 3, 4, 5 ....};
var entities = GetMany((MyEntity x) => x.Id, ids);

string[] names = new [] {"A", "B", "C", "D" ... };
var users = GetMany((User x) => x.Name, names);