I am trying to read in a table from an Access database, and then sort the data in that table out to a number of text files. The clincher is that the filename to write to is dependant upon values in each record. This is officially my first C# application so you can consider me "green". I should also mention that I'm working off an Access database just until I can get the code hammered out, ultimately it'll be pulling from a SQL server with millions of records.
I have code working now, but the problem is that there are a ton of File Open / Close operations. I want to only open each file once for writing since it will be writing these files to a network drive. This is essentially a glue app running on a server - so there are some other restrictions too - I can't save to a local drive then copy to the network. I can't sort the query prior to pulling. I can't adversely affect server resources while running.
Probably the best way to do this is with a Hash table. Check if the file has been opened, if not, open it and save the file handle in the Hash Table. Then close them all at once when finished. However I cannot find an example of how to use multiple StreamWriter objects simultaneously.
I expected to find the answer to this relatively easily but I can't seem to find a solution tot his. My suspicion is that StreamWriter is the wrong class to be using for this.
The closest previous question I've been able to find is from a CodeProject page. On that page they say that the practice of keeping file hands open is bad and should be avoided, but the page doesn't explain why nor offer example alternatives. There is a suggestion to load the entire data set into memory and then operate on it, but that's not an option for me as there will be too much data in the tables.
Here's what I have so far.
String strConnection;
String strQuery;
String strPunchFileNameTemplate;
// Define our Variables
strConnection = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=ClockData.accdb";
strQuery = @"SELECT * FROM ClockPunches";
strPunchFileNameTemplate = @"C:\PUNCHES\%READER%.TXT";
// OleDbConnection implements iDisposable interface, so we must scope out its usage.
// Set up Connection to our data source
using (OleDbConnection ConnObj = new OleDbConnection(strConnection)) {
// Create a Command with our Query String
OleDbCommand CmdObj = new OleDbCommand(strQuery,ConnObj);
// Open our Connection
ConnObj.Open();
// OledbDataReader implements iDisposable interface, so we must scope out its usage.
// Execute our Reader
using (OleDbDataReader ReaderObj = CmdObj.ExecuteReader(CommandBehavior.KeyInfo)) {
// Load the source table's schema into memory (a DataTable object)
DataTable TableObj = ReaderObj.GetSchemaTable();
// Parse through each record in the Reader Object
while(ReaderObj.Read()) {
// Extract PunchTime, CardNumber, and Device to separate variables
DateTime dtTime = ReaderObj.GetDateTime(ReaderObj.GetOrdinal("PunchTime"));
Int16 intID = ReaderObj.GetInt16(ReaderObj.GetOrdinal("CardNumber"));
String strReader = ReaderObj.GetString(ReaderObj.GetOrdinal("Device"));
// Translate the device name into a designated filename (external function)
strReader = GetDeviceFileName(strReader);
// Put our dynamic filename into the path template
String pathStr = strPunchFileNameTemplate.Replace("%READER%",strReader);
// Check to see if the file exists. New files need an import Header
Boolean FileExistedBool = File.Exists(pathStr);
// StreamWrite implements iDisposable interface, so we must scope out its usage.
// Create a Text File for each Device, Append if it exists
using (StreamWriter outSR = new StreamWriter(pathStr, true)) {
// Write our Header if required
if (FileExistedBool == false) {
outSR.WriteLine("EXAMPLE FILE HEADER");
}
// Set up our string we wish to write to the file
String outputStr = dtTime.ToString("MM-dd-yyyy HH:mm:ss") + " " + intID.ToString("000000");
// Write the String
outSR.WriteLine(outputStr);
// End of StreamWriter Scope - should automatically close
}
}
// End of OleDbDataREader Scope - should automatically close
}
// End of OleDbConnection Scope - should automatically close
}
It is a quite interesting problem you got your self into.
The problem with caching the file handlers is that a huge amount of file handlers can drain the system for resources making the program and windows perform bad.
If the number of devices in your database is not too high (less than 100) i think it would be safe to cache the handles.
Alternatively you could cache a million records distribute them to the different devices and save some and then read some more records.
You could place the records in a Dictionary like this:
class PunchInfo
{
public PunchInfo(DateTime time, int id)
{
Id = id;
Time = time;
}
public DateTime Time;
public int Id;
}
Dictionary<string, List<PunchInfo>> Devices;
int Count = 0;
const int Limit = 1000000;
const int LowerLimit = 90 * Limit / 100;
void SaveRecord(string device, int id, DateTime time)
{
PunchInfo info = new PunchInfo(time, id);
List<PunchInfo> list;
if (!Devices.TryGetValue(device, out list))
{
list = new List<PunchInfo>();
Devices.Add(device, list);
}
list.Add(info);
Count++;
if (Count >= Limit)
{
List<string> writeDevices = new List<string>();
foreach(KeyValuePair<string, List<PunchInfo>> item in Devices)
{
writeDevices.Add(item.Key);
Count -= item.Value.Count;
if (Count < LowerLimit) break;
}
foreach(string device in writeDevices)
{
List<PunchInfo> list = Devices[device];
Devices.Remove(device);
SaveDevices(device, list);
}
}
}
void SaveAllDevices()
{
foreach(KeyValuePair<string, List<PunchInfo>> item in Devices)
SaveDevices(item.Key, item.Value);
Devices.Clear();
}
This way you will avoid opening and closing files and have a lot of open files.
One million records takes up 20 MB memory, you could easily raise that to 10 million records without problems.