Edit Sept 26
See below for the full background. tl;dr: A data grid control is causing odd exceptions, and I am looking for help isolating the cause and finding a solution.
I've narrowed this down a bit further. I have been able to reproduce the behavior in a smaller test app with more reliable triggering of the erratic behavior.
I can definitely rule out both threading and (I think) memory issues. The new app uses no Tasks or other threading/asynchronous features, and I can trigger the unhandled exception simply by adding properties that return a constant to the class of objects shown in the DataGrid. This indicates to me that the problem is either in unmanaged resource exhaustion or something I haven't thought of yet.
The revised program is structured like this. I have created a user control called EntityCollectionGridView
which has a label and a data grid. In the control's Loaded event handler, I assign a List<TestClass>
to the data grid with 1000 or 10000 rows, letting the grid generate the columns. This user control is instantiated 2-4 times in MainPage.xaml in the page's OnNavigatedTo
event (or Loaded
, it doesn't seem to matter). If an exception occurs, it occurs immediately after MainPage is shown.
The interesting thing is, the behavior doesn't seem to vary with the number of rows shown (it will work reliably with 10000 rows or fail reliably with only 1000 rows in each grid) but rather with the total number of columns in all the grids loaded at a given time. With 20 properties to show, 4 grids works fine. With 35 properties and 4 grids, the exception is thrown. But if I eliminate two grids, the same class with 35 properties will work fine.
Note that all of the properties I add to TestClass
to jump from 20 to 35 columns are of the form:
public string StringXYZ { get { return "asdfasdfasdfasdfasf"; } }
So, there's no additional memory in the backing data (and again, I don't think memory pressure is the problem anyway).
What do you all think? Again, the handles/user objects/etc in Task Manager look good, but is there something else I might be missing?
Original post
I have been working on a port of the Silverlight Toolkit DataGrid to WinRT, and it has done well enough in simple tests (a variety of configurations and up to 10000 rows). However, as I have tried to embed it into another WinRT app I have run into a sporadic exception (of type System.Exception, raised in the App.UnhandledException handler) that is proving very difficult to debug.
Not enough quota is available to process this command. (Exception from HRESULT: 0x80070718)
The error is consistently reproducible, but not deterministically. That is, I can make it happen every time I run the App, but it doesn't always happen by performing the same exact set of steps the same number of times. The error seems to occur on page transitions (whether navigating to a new page forward or back to a previous page), and not (for instance) when changing the ItemsSource of the datagrid.
The application structure is basically recursive access through a hierarchy, with a page shown at each hierarchy level. On the page for the current node in the hierarchy, each child node and some grandchild nodes are shown, and for any subnode a datagrid may be shown. In practice, I consistently reproduce this with the following navigation structure:
Root page: shows no datagrid
Child page: shows one datagrid and a few listviews
Grandchild page: shows two datagrids, one bound to the
same source as Child page, the other one empty
A typical test scenario is, start at Root, move to Child, move to Grandchild, move back to Child, and then when I try to navigate to Grandchild again, it fails with the exception I mentioned above. But it might fail the first time I hit Grandchild, or it might let me move back and forth a few times before failing.
The call stack has only one managed frame on it, which is the unhandled exception event handler. This is very unhelpful. Switching to mixed mode debugging, I get the following:
WinRTClient.exe!WinRTClient.App.InitializeComponent.AnonymousMethod__14(object sender, Windows.UI.Xaml.UnhandledExceptionEventArgs e) Line 50 + 0x20 bytes C#
[Native to Managed Transition]
Windows.UI.Xaml.dll!DirectUI::CFTMEventSource<Windows::UI::Xaml::IUnhandledExceptionEventHandler,Windows::UI::Xaml::IApplication,Windows::UI::Xaml::IUnhandledExceptionEventArgs>::Raise(Windows::UI::Xaml::IApplication * pSource, Windows::UI::Xaml::IUnhandledExceptionEventArgs * pArgs) Line 327 C++
Windows.UI.Xaml.dll!DirectUI::Application::RaiseUnhandledExceptionEventHelper(long hrEncountered, unsigned short * pszErrorMessage, unsigned int * pfIsHandled) Line 920 + 0xa bytes C++
Windows.UI.Xaml.dll!DirectUI::ErrorHelper::CallAUHandler(unsigned int errorCode, unsigned int * pfIsHandled, wchar_t * * pbstrErrorMessage) Line 39 + 0x14 bytes C++
Windows.UI.Xaml.dll!DirectUI::ErrorHelper::ProcessUnhandledErrorForUserCode(long error) Line 82 + 0x10 bytes C++
Windows.UI.Xaml.dll!AgCoreCallbacks::CallAUHandler(unsigned int errorCode) Line 1104 + 0x8 bytes C++
Windows.UI.Xaml.dll!CCoreServices::ReportUnhandledError(long errorXR) Line 6582 C++
Windows.UI.Xaml.dll!CXcpDispatcher::Tick() Line 1126 + 0xb bytes C++
Windows.UI.Xaml.dll!CXcpDispatcher::OnReentrancyProtectedWindowMessage(HWND__ * hwnd, unsigned int msg, unsigned int wParam, long lParam) Line 653 C++
Windows.UI.Xaml.dll!CXcpDispatcher::WindowProc(HWND__ * hwnd, unsigned int msg, unsigned int wParam, long lParam) Line 401 + 0x24 bytes C++
user32.dll!_InternalCallWinProc@20() + 0x23 bytes
user32.dll!_UserCallWinProcCheckWow@36() + 0xbd bytes
user32.dll!_DispatchMessageWorker@8() + 0xf8 bytes
user32.dll!_DispatchMessageW@4() + 0x10 bytes
Windows.UI.dll!Windows::UI::Core::CDispatcher::ProcessMessage(int bDrainQueue, int * pbAnyMessages) Line 121 C++
Windows.UI.dll!Windows::UI::Core::CDispatcher::ProcessEvents(Windows::UI::Core::CoreProcessEventsOption options) Line 184 + 0x10 bytes C++
Windows.UI.Xaml.dll!CJupiterWindow::RunCoreWindowMessageLoop() Line 416 + 0xb bytes C++
Windows.UI.Xaml.dll!CJupiterControl::RunMessageLoop() Line 714 + 0x5 bytes C++
Windows.UI.Xaml.dll!DirectUI::DXamlCore::RunMessageLoop() Line 2539 + 0x5 bytes C++
Windows.UI.Xaml.dll!DirectUI::FrameworkView::Run() Line 91 C++
twinapi.dll!`Windows::ApplicationModel::Core::CoreApplicationViewAgileContainer::RuntimeClassInitialize'::`55'::<lambda_A2234BA2CCD64E2C>::operator()(void * pv) Line 560 C++
twinapi.dll!`Windows::ApplicationModel::Core::CoreApplicationViewAgileContainer::RuntimeClassInitialize'::`55'::<lambda_A2234BA2CCD64E2C>::<helper_func>(void * pv) Line 613 + 0xe bytes C++
SHCore.dll!_SHWaitForThreadWithWakeMask@12() + 0xceab bytes
kernel32.dll!@BaseThreadInitThunk@12() + 0xe bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
This indicates to me that whatever I'm doing wrong doesn't register until after at least one cycle in the app's message loop (and I also tried breaking on all thrown exceptions using "Debug | Exceptions..." -- as far as I can tell, nothing is thrown and swallowed). The interesting stack frames I see are WindowProc
, OnReentrancyProtectedWindowMessage
, and Tick
. The msg
is 0x402 (1026), which doesn't mean anything to me. This page lists that message as used in the following contexts:
CBEM_SETIMAGELIST
DDM_CLOSE
DM_REPOSITION
HKM_GETHOTKEY
PBM_SETPOS
RB_DELETEBAND
SB_GETTEXTA
TB_CHECKBUTTON
TBM_GETRANGEMAX
WM_PSD_MINMARGINRECT
...but that doesn't mean anything much to me, either (it might not even be relevant).
The three theories I can come up with are these:
I have plenty of disk space free, and am not using the disk for anything other than configuration files anyway (and all that worked fine before adding the datagrids).
My next thought was to try to step through some of the potential 'interesting' parts of the datagrid code and jump over any that were questionable. I did try that with the datagrid's ArrangeOverride, but the exception didn't seem to care whether I did that or not. Also, I am not sure this is a useful strategy. Since the exception isn't being thrown until after a cycle on the message loop, and since I can't know for sure when it's about to happen, I would need to cover a huge number of permutations, running each permutation a whole lot of times, to isolate the problem code.
The error is thrown in both Debug and Release modes. And, as a final background note, the amount of data we're dealing with here is small, much smaller than my 10000-row runs of the datagrid in isolation. It's probably on the order of 50-100 rows, with perhaps 30-40 columns. And before the exception is thrown, the data and the grids seem to work and respond fine.
So, that's why I come to you. My two questions are:
Many thanks in advance for any help you can provide!
OK, with some critical input from Tim Heuer [MSFT], I figured out what was going on and how to get around this problem.
Surprisingly, none of my three initial guesses were correct. This wasn't about memory, threading, or system resources. Instead, it was about limitations in the Windows messaging system. Apparently it is a little like a stack overflow exception, in that when you make too many changes to the visual tree all at once, the asynchronous update queue gets so long that it trips a wire and the exception gets thrown.
In this case, the problem is that there are enough UIElements going into the data grid I am working with that allowing the grid to generate all its own columns all at once can in some cases exceed the limit. I was using a number of grids all at once, and all loading in response to page navigation events, which made it all the trickier to nail down.
Thankfully, the limitations I was running into were NOT limitations in the visual tree or the XAML UI subsystem itself, just in the messaging used to update it. This means that if I could spread out the same operations over multiple ticks of the dispatcher's clock, I could accomplish the same end result.
What I ended up doing was to instruct my data grid not to autogenerate its own columns. Instead, I embedded the grid into a user control that, when the data was loaded, would parse out the columns needed and load them into a list. Then, I called the following method:
void LoadNextColumns(List<ColumnDisplaySetup> colDef, int startIdx, int numToLoad)
{
for (int idx = startIdx; idx < startIdx + numToLoad && idx < colDef.Count; idx++)
{
DataGridTextColumn newCol = new DataGridTextColumn();
newCol.Header = colDef[idx].Header;
newCol.Binding = new Binding() { Path = new PropertyPath(colDef[idx].Property) };
dgMainGrid.Columns.Add(newCol);
}
if (startIdx + numToLoad < colDef.Count)
{
Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
{
LoadNextColumns(colDef, startIdx + numToLoad, numToLoad);
});
}
}
(ColumnDisplaySetup
is a trivial type used to house the parsed-out configuration or a configuration loaded from a file.)
This method is called with the following arguments, respectively: list of columns, 0, and my arbitrary guess of 5 as a fairly safe number of columns to load at a time; but this number is based on testing and the expectation that a good number of grids could be loading simultaneously. I asked Tim for more information that might inform this part of the process, and will report back here if I learn more about how to determine how much is safe.
In practice, this seems to work adequately, although it results in the sort of progressive rendering you'd expect, with the columns visibly popping in. I expect this could be improved both by using the maximum possible value for numToLoad
and by other UI sleight-of-hand. I may investigate hiding the grid while the columns are generated and only showing the result when everything is ready. Ultimately the decision will come down to which feels more 'fast and fluid'.
Again, I will update this answer with more information if I get it, but I hope this helps someone facing similar problems in the future. After pouring more time than I'd care to admit into the bug hunt, I don't want anyone else to have to kill themselves over this.