What are the safe characters for making URLs

Paulo picture Paulo · Mar 29, 2009 · Viewed 169.1k times · Source

I am making a website with articles, and I need the articles to have "friendly" URLs, based on the title.

For example, if the title of my article is "Article Test", I would like the URL to be http://www.example.com/articles/article_test.

However, article titles (as any string) can contain multiple special characters that would not be possible to put literally in my URL. For instance, I know that ? or # need to be replaced, but I don't know all the others.

What characters are permissible in URLs? What is safe to keep?

Answer

Skip Head picture Skip Head · Mar 29, 2009

To quote section 2.3 of RFC 3986:

"Characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde."

ALPHA  DIGIT  "-" / "." / "_" / "~"

Note that RFC 3986 lists fewer reserved punctuation marks than the older RFC 2396.