Considering the code below:
Dataview someView = new DataView(sometable)
someView.RowFilter = someFilter;
if(someView.count > 0) { …. }
Quite a number of articles which say Datatable.Select() is better than using DataViews, but these are prior to VS2008.
Solved: The Mystery of DataView's Poor Performance with Large Recordsets
Array of DataRecord vs. DataView: A Dramatic Difference in Performance
Googling on this topic I found some articles/forum topics which mention Datatable.Select() itself is quite buggy(not sure on this) and underperforms in various scenarios.
On this(Best Practices ADO.NET) topic on msdn it is suggested that if there is primary key defined on a datatable the findrows() or find() methods should be used insted of Datatable.Select().
This article here (.NET 1.1) benchmarks all the three approaches plus a couple more. But this is for version 1.1 so not sure if these are valid still now. Accroding to this DataRowCollection.Find() outperforms all approaches and Datatable.Select() outperforms DataView.RowFilter.
So I am quite confused on what might be the best approach on finding rows in a datatable. Or there is no single good way to do this, multiple solutions exist depending upon the scenario?
You are looking for the "best approach on finding rows in a datatable", so I first have to ask: "best" for what? I think, any technique has scenarios where it might fit better then the others.
First, let's look at DataView.RowFilter
: A DataView has some advantages in Data Binding. Its very view-oriented so it has powerful sorting, filtering or searching features, but creates some overhead and is not optimized for performance. I would choose the DataView.RowFilter
for smaller recordsets and/or where you take advantage of the other features (like, a direct data binding to the view).
Most facts about the DataView, which you can read in older posts, still apply.
Second, you should prefer DataTable.Rows.Find
over DataTable.Select
if you want just a single hit. Why? DataTable.Rows.Find returns only a single row. Essentially, when you specify the primary key, a binary tree is created. This has some overhead associated with it, but tremendously speeds up the retrieval.
DataTable.Select
is slower, but can come very handy if you have multiple criteria and don't care about indexed or unindexed rows: It can find basically everything but is not optimized for performance. Essentially, DataTable.Select has to walk the entire table and compare every record to the criteria that you passed in.
I hope you find this little overview helpful.
I'd suggest to take a look at this article, it was helpful for me regarding performance questions. This post contains some quotes from it.
A little UPDATE: By the way, this might seem a little out of scope of your question, but its nearly always the fastest solution to do the filtering and searching on the backend. If you want the simplicity and have an SQL Server as backend and .NET3+ on client, go for LINQ-to-SQL. Searching Linq objects is very comfortable and creates queries which are performed on server side. While LINQ-to-Objects is also a very comfortable but also slower technique. In case you didn't know already....