Research:
Similar Issue with workaround, but not actual solution to existing problem
Similar issue pointing to Microsoft End Point update as culprit
The above links are the most suited to my problem, I have also viewed every similar question listed by Stack Overflow upon creating this post, and only the above referenced questions fit my issue.
Background:
I have been using UserPrincipal.GetAuthorizationGroups
for permissions for specific page access running IIS 7.5 on Server 2008 R2 in a C#.NET 4.0 web forms site for 2 and a half years. On May 15 2013 we removed a primary Domain controller running Server 2008 (not r2) and replaced it with a Server 2012 Domain Controller. The next day we started receiving the exception listed below.
I use Principal Context for Forms Authentication. The username/pass handshake succeeds and the auth cookie is properly set, but the subsequent Principal Context call that also calls UserPrincipal.GetAuthorizationGroups
fails intermittently. We've resolved a few BPA issues that appeared in the Server 2012 Domain Controller but this has yet to resolve the issue. I also instituted a cron that runs on two separate servers. The two servers will fail at Group SID resolution at different times though they are running the same code base. (A dev environment and production environment).
The issue resolves itself temporarily upon web server reboot, and also on the dev server it will resolve itself after 12 hours of not functioning. The production server will usually stop functioning properly until a reboot without resolving itself.
At this point I am trying to refine the cron targeting specific Domain Controllers in the network as well as the new DC and using the standard LDAP query that is currently failing to yield more targeted exception times. Thus far we've found on one web server that there is no pattern to the days at which it fails, but it will recover within roughly 12 hours. The latest results show Group SID resolution failure between 8AM-8PM then it recovers, several days later it will fail at 8pm and recover at 8am then run fine for another 12 hours and fail again. We are hoping to see if it is just a specific server communication issue or to see if it is the entire set of Domain Controllers.
Exception:
Exception information:
Exception type: PrincipalOperationException
Exception message: An error (1301) occurred while enumerating the groups.
The group's SID could not be resolved.
at System.DirectoryServices.AccountManagement.SidList.TranslateSids(String target, IntPtr[] pSids)
at System.DirectoryServices.AccountManagement.SidList..ctor(SID_AND_ATTR[] sidAndAttr)
at System.DirectoryServices.AccountManagement.AuthZSet..ctor(Byte[] userSid, NetCred credentials, ContextOptions contextOptions, String flatUserAuthority, StoreCtx userStoreCtx, Object userCtxBase)
at System.DirectoryServices.AccountManagement.ADStoreCtx.GetGroupsMemberOfAZ(Principal p)
at System.DirectoryServices.AccountManagement.UserPrincipal.GetAuthorizationGroups()
Question:
Given the above information, does anyone have any idea why decommissioning the Windows Server 2008 (not r2) and implementing a new Server 2012 DC would cause UserPrincipal.GetAuthorizationGroups
to fail with the 1301 SID resolution error?
Ideas on eliminating possible causes would also be appreciated.
Disclaimer:
This is my first post to Stack Overflow, I often research here but have not joined in discussions until now. Forgive me if I should have posted elsewhere and feel free to point out better steps before posting.
UPDATE 13-JUN-2013:
On the 12th of June I addressed the possibility of items not disposed causing the issue. The time frame has been too short to determine if the adjusted code has fixed the issue, but I will continue to update as we work towards a resolution such that maybe with any luck someone here can lend a hand.
Original Code
public bool isGroupMember(string userName, ArrayList groupList)
{
bool valid = false;
PrincipalContext ctx = new PrincipalContext(ContextType.Domain, domain_server + ".domain.org:636", null, ContextOptions.Negotiate | ContextOptions.SecureSocketLayer);
// find the user in the identity store
UserPrincipal user =
UserPrincipal.FindByIdentity(
ctx,
userName);
// get the groups for the user principal and
// store the results in a PrincipalSearchResult object
PrincipalSearchResult<Principal> groups =
user.GetAuthorizationGroups();
// display the names of the groups to which the
// user belongs
foreach (Principal group in groups)
{
foreach (string groupName in groupList)
{
if (group.ToString() == groupName)
{
valid = true;
}
}
}
return valid;
}
Updated Code
public bool isGroupMember(string userName, ArrayList groupList, string domain_server)
{
bool valid = false;
try
{
using (PrincipalContext ctx = new PrincipalContext(ContextType.Domain, domain_server + ".domain.org:636", null, ContextOptions.Negotiate | ContextOptions.SecureSocketLayer))
{
// find the user in the identity store
UserPrincipal user =
UserPrincipal.FindByIdentity(
ctx,
userName);
try
{
// get the groups for the user principal and
// store the results in a PrincipalSearchResult object
using (PrincipalSearchResult<Principal> groups = user.GetAuthorizationGroups())
{
// display the names of the groups to which the
// user belongs
foreach (Principal group in groups)
{
foreach (string groupName in groupList)
{
if (group.ToString() == groupName)
{
valid = true;
}
}
group.Dispose();
}
}//end using-2
}
catch
{
log_gen("arbitrary info");
return false;
}
}//end using-1
}
catch
{
log_gen("arbitrary info");
return false;
}
return valid;
}
I have just run into this same issue and the info I have managed to track down may be helpful; as above we have seen this problem where the domain controller is running Server 2012 - firstly with a customer deployment and then replicated on our own network.
After some experimentation we found that our code would run fine on Server 2012, but hit the 1301 error code when the client system was running Server 2008. The key information about what was happening was found here:
MS blog translated from German
The hotfix referred to in the link below has fixed the problem on our test system
SID S-1-18-1 and SID S-1-18-2 can't be mapped
Hope this is helpful for someone! As many have noted this method call seems rather fragile and we will probably look at implementing some alternative approach before we hit other issues.
Gary