What does it mean to escape a string?

Brett · May 18, 2012 · Viewed 83.5k times

I was reading "Does $_SESSION['username'] need to be escaped before getting into an SQL query?" and it said "You need to escape every string you pass to the sql query, regardless of its origin". Now I know something like this is really basic. A Google search turned up over 20,000 results. Stack Overflow alone had 20 pages of results, but no one actually explains what escaping a string is or how to do it. It is just assumed. Can you help me? I want to learn because, as always, I am making a web app in PHP.

I have looked at: "Inserting Escape Characters", "What are all the escape characters in Java?", "Cant escape a string with addcslashes()", "Escape character", "what does mysql_real_escape_string() really do?", "How can i escape double quotes from a string in php?", "MySQL_real_escape_string not adding slashes?", "remove escape sequences from string in php". I could go on, but I am sure you get the point. This is not laziness.

Answer

Sampson · May 18, 2012

Escaping a string means removing ambiguity about quotes (and other special characters) used in that string, so they are read as part of its value rather than as syntax. For instance, when you're defining a string, you typically surround it in either double quotes or single quotes:

"Hello World."

But what if my string had double quotes within it?

"Hello "World.""

Now I have ambiguity: the interpreter sees the second double quote as the end of the string and doesn't know what to do with the rest. If I want to keep my double quotes, I have a couple of options. I could use single quotes around my string:

'Hello "World."'

Or I can escape my quotes:

"Hello \"World.\""

Any quote that is preceded by a backslash is escaped, and understood to be part of the value of the string.
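
As a quick check in PHP itself, here is a minimal sketch (the variable names are only for illustration) showing that the escaped double-quoted form and the single-quoted form produce the same value. The backslashes exist only in the source code:

```php
<?php
// Two ways of writing the same string value in PHP source code.
$escaped = "Hello \"World.\"";   // double-quoted, inner quotes escaped with a backslash
$single  = 'Hello "World."';     // single-quoted, inner double quotes need no escaping

// The backslashes are not stored; both variables hold: Hello "World."
var_dump($escaped === $single);  // bool(true)
echo $escaped, PHP_EOL;          // Hello "World."
```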

When it comes to queries, MySQL has certain reserved words that we cannot use as names in our queries without causing some confusion. Suppose we had a table where a column was named "select", and we wanted to select that column:

SELECT select FROM myTable

We've now introduced some ambiguity into our query, because the parser cannot tell the column name from the keyword. We can remove that ambiguity by wrapping the column name in backticks:

SELECT `select` FROM myTable

This removes the confusion we've introduced by using poor judgment in selecting field names.
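
If a column name has to be placed into a query from PHP, the backtick quoting can be sketched as a small helper. Note that quoteIdentifier is a hypothetical function written for this example, not something PHP or MySQL provides. It relies on MySQL's rule that a backtick inside a quoted identifier is written as two backticks:

```php
<?php
// Hypothetical helper (not part of PHP or MySQL): wrap an identifier in
// backticks, doubling any backtick inside the name so it cannot end the quoting.
function quoteIdentifier(string $name): string
{
    return '`' . str_replace('`', '``', $name) . '`';
}

$column = quoteIdentifier('select');
$query  = "SELECT $column FROM myTable";

echo $query, PHP_EOL;  // SELECT `select` FROM myTable
```

This only covers the quoting itself. Identifiers that come from user input would still be better checked against a list of known column names.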

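For the values you put into a query (as opposed to column or table names), a lot of this can be handled for you by simply passing them through mysql_real_escape_string(). Keep in mind that the mysql_* functions were removed in PHP 7.0; mysqli_real_escape_string() is the equivalent in the mysqli extension. In the example below you can see that we're passing user-submitted data through this function to ensure it won't cause any problems for our query: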

// Build the query, escaping each user-supplied value before it goes between the quotes
$query = sprintf("SELECT * FROM users WHERE user='%s' AND password='%s'",
            mysql_real_escape_string($user),
            mysql_real_escape_string($password));
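
To make the effect visible without a database connection, here is a rough sketch that builds the same query with and without escaping. It uses addslashes() purely as a stand-in, since mysql_real_escape_string() needs a live connection. The sample input values are made up for the illustration:

```php
<?php
// Illustration only: addslashes() stands in for a connection-aware escaping
// function so that this example runs without a database.
$template = "SELECT * FROM users WHERE user='%s' AND password='%s'";
$user     = "admin";
$password = "' OR '1'='1";   // hostile input containing quotes

// Unescaped: the quotes in $password end the string literal early and
// change the meaning of the query.
$unsafe = sprintf($template, $user, $password);

// Escaped: each quote is preceded by a backslash and stays inside the value.
$safe = sprintf($template, addslashes($user), addslashes($password));

echo $unsafe, PHP_EOL;  // ... AND password='' OR '1'='1'
echo $safe, PHP_EOL;    // ... AND password='\' OR \'1\'=\'1'
```

In the first query the password condition has turned into an OR that is always true. In the second, assuming MySQL's default handling of backslash escapes, the whole input is compared as one literal string. addslashes() is not aware of the connection's character set, so real queries should use the escaping function that belongs to the database extension in use.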

Other methods exist for escaping strings, such as addslashes, addcslashes, quotemeta, and more, though you'll find that when the goal is to run a safe query, by and large developers prefer mysql_real_escape_string or pg_escape_string (in the context of PostgreSQL).