I am using C# and ASP.NET for this.
We receive a lot of "strange" requests on our IIS 6.0 servers and I want to log and catalog these by domain.
Eg. we get some strange requests like these:
http://www.poker.winner4ever.example.com/
http://www.hotgirls.example.com/
http://santaclaus.example.com/
the latter three are kinda obvious, but I would like to sort them all into one as "example.com" IS hosted on our servers. The rest isn't, sorry :-)
So I am looking for some good ideas for how to retrieve example.com from the above. Secondly I would like to match the m., wap., iphone etc into a group, but that's probably just a quick lookup in a list of mobile shortcuts.I could handcode this list for a start.
But is regexp the answer here or is pure string manipulation the easiest way? I was thinking of "splitting" the URL string by "." and the look for item[0] and item[1]...
Any ideas?
You can use the following nuget Nager.PublicSuffix package. It uses the same data source that browser vendors use.
nuget
PM> Install-Package Nager.PublicSuffix
Example
var domainParser = new DomainParser(new WebTldRuleProvider());
var domainName = domainParser.Get("sub.test.co.uk");
//domainName.Domain = "test";
//domainName.Hostname = "sub.test.co.uk";
//domainName.RegistrableDomain = "test.co.uk";
//domainName.SubDomain = "sub";
//domainName.TLD = "co.uk";