I am wondering about the state of connection and impact on code performance by 'yield' while iterating over data reader object

Sanjeev Rai picture Sanjeev Rai · Mar 13, 2013 · Viewed 18.5k times · Source

Here is my sample code that I am using to fetch data from database: on DAO layer:

public IEnumerable<IDataRecord> GetDATA(ICommonSearchCriteriaDto commonSearchCriteriaDto)
{
    using(DbContext)
    {
        DbDataReader reader = DbContext.GetReader("ABC_PACKAGE.GET_DATA", oracleParams.ToArray(), CommandType.StoredProcedure);
        while (reader.Read())
        {
            yield return reader;
        }
    }
}

On BO layer I am calling the above method like:

List<IGridDataDto> GridDataDtos = MapMultiple(_costDriversGraphDao.GetGraphData(commonSearchCriteriaDto)).ToList();

on mapper layer MapMultiple method is defined like:

public IGridDataDto MapSingle(IDataRecord dataRecord)
{
    return new GridDataDto
    {
        Code = Convert.ToString(dataRecord["Code"]),
        Name = Convert.ToString(dataRecord["Name"]),
        Type = Convert.ToString(dataRecord["Type"])     
    };
}
public IEnumerable<IGridDataDto> MapMultiple(IEnumerable<IDataRecord> dataRecords)
{
    return dataRecords.Select(MapSingle);
}

The above code is working well and good but I am wondering about two concerns with the above code.

  1. How long data reader’s connection will be opened?
  2. When I consider code performance factor only, Is this a good idea to use ‘yield return’ instead of adding record into a list and returning the whole list?

Answer

Marc Gravell picture Marc Gravell · Mar 13, 2013
  1. your code doesn't show where you open/close the connection; but the reader here will actually only be open while you are iterating the data. Deferred execution, etc. The only bit of your code that does this is the .ToList(), so it'll be fine. In the more general case, yes: the reader will be open for the amount of time you take to iterate it; if you do a .ToList() that will be minimal; if you do a foreach and (for every item) make an external http request and wait 20 seconds, then yes - it will be open for longer.
  2. Both have their uses; the non-buffered approach is great for huge results that you want to process as a stream, without ever having to load them into a single in-memory list (or even have all of them in memory at a time); returning a list keeps the connection closed quickly, and makes it easy to avoid accidentally using the connection while it already has an open reader, but is not ideal for large results

If you return an iterator block, the caller can decide what is sane; if you always return a list, they don't have much option. A third way (that we do in dapper) is to make the choice theirs; we have an optional bool parameter which defaults to "return a list", but which the caller can change to indicate "return an iterator block"; basically:

bool buffered = true

in the parameters, and:

var data = QueryInternal<T>(...blah...);
return buffered ? data.ToList() : data;

in the implementation. In most cases, returning a list is perfectly reasonable and avoids a lot of problems, hence we make that the default.