Why do <%= note.html_safe %> and <%= h note.html_safe %> give the same result in Rails 3?

nonopolarity · Nov 1, 2010 · Viewed 10.4k times

It feels like html_safe adds an abstraction to the String class that you have to understand before you can predict the output. For example:

<%= '1 <b>2</b>' %>      # gives 1 &lt;b&gt;2&lt;/b&gt; in the HTML source code

<%= h '1 <b>2</b>' %>    # exactly the same as above

<%= '1 <b>2</b>'.html_safe %>      #  1 <b>2</b>  in HTML source code

<%= h '1 <b>2</b>'.html_safe %>    #  exactly the same as above

<%= h (h '1 <b>2</b>') %>  #  1 &lt;b&gt;2&lt;/b&gt;   won't escape twice

On line 4, we are saying that we trust the string and that it is safe, but why can't we still escape it? It seems that for h to escape a string, the string has to be unsafe.

So on line 1, if the string is not escaped by h, it will be automatically escaped. On line 5, h cannot escape the string twice -- in other words, after < is changed to &lt;, it can't escape it one more time to &amp;lt;.
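For contrast, here is a minimal sketch of what a plain escape function with no safety flag does when applied twice. It uses ERB::Util.html_escape from Ruby's standard library, run outside Rails (Rails replaces this method with a flag-aware version); the variable names are just for illustration.

```ruby
require 'erb'

# Plain Ruby, no Rails loaded: html_escape knows nothing about html_safe,
# so it escapes whatever it is given, every time.
once  = ERB::Util.html_escape('1 <b>2</b>')
twice = ERB::Util.html_escape(once)

puts once   # 1 &lt;b&gt;2&lt;/b&gt;
puts twice  # 1 &amp;lt;b&amp;gt;2&amp;lt;/b&amp;gt;
```

The second call turns each & produced by the first call into &amp;, which is exactly the double escaping that line 5 avoids.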

So what's happening? At first, I thought html_safe just sets a flag on the string, saying it is safe. So then, why does h not escape it? It seems that h and html_safe actually co-operate on using the flag:

1) If a string is html_safe, then h will not escape it

2) If a string is not html_safe, then when the string is added to the output buffer, it will be automatically escaped by h.

3) If h has already escaped a string, the result is marked html_safe, so escaping it one more time with h has no effect. (This is what happens on line 5, and the behavior is the same even in Rails 2.3.10, whereas in Rails 2.3.5 h can actually escape it twice. So in Rails 2.3.5 h is a simple escape method, but somewhere along the way to 2.3.10 it became less simple. Rails 2.3.10 won't auto-escape a string, yet for some reason the html_safe method already exists there. For what purpose?)
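The three guessed rules can be modeled in a few lines of plain Ruby. This is only a sketch of the idea, not Rails's actual code: SketchSafeString, sketch_safe?, sketch_h, and sketch_render are hypothetical names, and ERB::Util.html_escape from the standard library stands in for the real escaping.

```ruby
require 'erb'

# Hypothetical stand-in for a "safe" string: a String subclass
# whose only job here is to answer html_safe? with true.
class SketchSafeString < String
  def html_safe?
    true
  end
end

# Plain strings count as unsafe unless they carry the flag.
def sketch_safe?(str)
  str.respond_to?(:html_safe?) && str.html_safe?
end

# Rules 1 and 3: leave flagged strings alone, and flag the escaped result.
def sketch_h(str)
  return str if sketch_safe?(str)
  SketchSafeString.new(ERB::Util.html_escape(str))
end

# Rule 2: anything written to the output goes through sketch_h first.
def sketch_render(value)
  sketch_h(value).to_s
end

raw     = '1 <b>2</b>'
trusted = SketchSafeString.new(raw)  # plays the role of raw.html_safe

puts sketch_render(raw)                      # line 1: 1 &lt;b&gt;2&lt;/b&gt;
puts sketch_render(sketch_h(raw))            # line 2: same as line 1
puts sketch_render(trusted)                  # line 3: 1 <b>2</b>
puts sketch_render(sketch_h(trusted))        # line 4: same as line 3
puts sketch_render(sketch_h(sketch_h(raw)))  # line 5: escaped only once
```

Under this model the five template lines from the question produce the same outputs as observed, which is consistent with the flag-based explanation but does not prove it is how Rails works internally.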

Is that exactly how it works? I think nowadays, when we don't get what we want in the output, we sometimes immediately add html_safe to our variable. That can be quite dangerous, because it can introduce an XSS vulnerability, so understanding exactly how it works can be quite important. The above is only a guess at how it works. Could it actually be a different mechanism, and is there any documentation that supports it?

Answer

Ben Hughes · Nov 1, 2010

As you can see, calling html_safe on a string turns it into an HTML-safe SafeBuffer:

http://github.com/rails/rails/blob/89978f10afbad3f856e2959a811bed1982715408/activesupport/lib/active_support/core_ext/string/output_safety.rb#L87

Any operation on a SafeBuffer that could affect the string's safety, such as appending another string to it, will pass the unsafe part through h().
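A rough sketch of that idea in plain Ruby, assuming appending is the operation in question: SketchBuffer is a hypothetical miniature, not the Rails class, and ERB::Util.html_escape stands in for h.

```ruby
require 'erb'

# Hypothetical miniature of a safe buffer; not the Rails class itself.
class SketchBuffer < String
  def html_safe?
    true
  end

  # Appending is where unescaped text could sneak in, so anything
  # that is not flagged safe is escaped on the way in.
  def concat(value)
    safe = value.respond_to?(:html_safe?) && value.html_safe?
    super(safe ? value : ERB::Util.html_escape(value))
  end
  alias_method :<<, :concat

  # Build a new buffer instead of mutating the receiver.
  def +(other)
    dup.concat(other)
  end
end

buffer = SketchBuffer.new('<p>')
buffer << '<script>' << SketchBuffer.new('</p>')
puts buffer  # <p>&lt;script&gt;</p>
```

The plain string '<script>' is escaped when it is appended, while the flagged '</p>' is appended untouched, so the buffer as a whole can keep claiming to be safe.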

h uses this flag to avoid double escaping:

http://github.com/rails/rails/blob/89978f10afbad3f856e2959a811bed1982715408/activesupport/lib/active_support/core_ext/string/output_safety.rb#L18

The behavior did change, and I think you are mostly correct about how it works. In general, you should not call html_safe unless you're sure that the string is already sanitized. Like anything, you have to be careful when using it.
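To illustrate the risk, here is a small sketch in plain Ruby with a hypothetical user-supplied note. It only builds the strings; in a Rails view, the danger arises when a string like unsafe_markup is then marked with html_safe.

```ruby
require 'erb'

# Hypothetical user-supplied value; imagine it came from a form field.
note = '<script>alert(1)</script>'

# Risky: untrusted text is mixed into markup as-is. Marking this whole
# string html_safe in a view would send the script tag to the browser.
unsafe_markup = "<p>#{note}</p>"

# Safer: escape the untrusted part first, then wrap it in trusted markup.
safe_markup = "<p>#{ERB::Util.html_escape(note)}</p>"

puts unsafe_markup  # <p><script>alert(1)</script></p>
puts safe_markup    # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Only the second string is something you could reasonably vouch for, because every untrusted piece was escaped before the markup was assembled.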