Set UTF-8 as default for Ruby 1.9.3

Fellow Stranger picture Fellow Stranger · Dec 11, 2013 · Viewed 27k times · Source

I'm on Rails 4 and Ruby 1.9.3

I use "strange" characters very often, so I have to declare UTF-8 encoding at the top of all .rb files.

Is there any way to set UTF-8 as the default encoding for Ruby 1.9.3?


I tried all answers, but when running rake db:seed and creating an object whose attributes contain non US-ASCII valid characters, I still receive this error:

`block in trace_on': invalid byte sequence in US-ASCII (ArgumentError)

Answer

Holger Just picture Holger Just · Dec 11, 2013

To change the source encoding (i.e. the encoding your actual written source code is in), you have to use the magic comment currently:

# encoding: utf-8

It is not enough to either set the internal encoding (the encoding of the internal string representation after conversion) or the external encoding (the assumed encoding of read files). You actually have to set the magic encoding comment on top of files to set the source encoding.

In ChiliProject we have a rake task which sets the correct encoding header in all files automatically before a release.

As for encoding defaults:

  • Ruby 1.8 and below didn't knew the concept of string encodings at all. Strings were more or less byte arrays.
  • Ruby 1.9: default string encoding is US_ASCII everywhere.
  • Ruby 2.0 and above: default string encoding is UTF-8.

Thus, if you use Ruby 2.0, you could skip the encoding comment and correctly assume UTF-8 encoding everywhere by default.