I'm writing some code that processes URLs, and I want to make sure I'm not leaving out some strange case...
Are there any valid characters for a host other than: A-Z, 0-9, "-" and "."?
(This includes anything that can appear in subdomains, etc. Essentially, anything between :// and the first /)
Thanks!
Please see Restrictions on valid host names:
Hostnames are composed of a series of labels concatenated with dots, as are all domain names. For example, "en.wikipedia.org" is a hostname. Each label must be between 1 and 63 characters long, and the entire hostname has a maximum of 255 characters.
RFCs mandate that a hostname's labels may contain only the ASCII letters 'a' through 'z' (case-insensitive), the digits '0' through '9', and the hyphen. Hostname labels cannot begin or end with a hyphen. No other symbols, punctuation characters, or blank spaces are permitted.
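As a rough illustration, the rules quoted above could be checked with something like the following Python sketch. The names `is_valid_hostname`, `LABEL_RE` and `MAX_HOSTNAME_LEN` are made up for this example, and the 255 limit is taken directly from the quoted text (some implementations use 253 for the dotted text form).

```python
import re

# Length limit as stated in the quoted text; an assumption of this sketch.
MAX_HOSTNAME_LEN = 255

# One label: 1 to 63 ASCII letters, digits or hyphens,
# not starting and not ending with a hyphen.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(hostname: str) -> bool:
    """Return True if hostname follows the label rules quoted above."""
    if not hostname or len(hostname) > MAX_HOSTNAME_LEN:
        return False
    # Every dot-separated label must match in full; an empty label
    # (for example from "a..b" or a trailing dot) is rejected.
    return all(LABEL_RE.fullmatch(label) for label in hostname.split("."))


print(is_valid_hostname("en.wikipedia.org"))
print(is_valid_hostname("-bad.example.com"))
```

Note that this only covers the hostname itself. The part of a URL between :// and the first / can also carry a port or user info, and the host can be an IP address literal, so in practice you would probably pull the host out first (for example with `urllib.parse.urlsplit(url).hostname`) and validate that.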