Right single apostrophe vs. apostrophe?

TripShock · Jul 15, 2011

Right single quotation mark (U+2019) vs. Apostrophe (U+0027)

What is the difference between these two characters?

I ran into this issue where I use CAtlString to load a string from a resource file, and on some Windows installations, the LoadString fails when trying to load a string that contains U+2019, but it works on some other Windows installations. The U+2019 character appears in strings in my resource file that I copied from Word, and U+0027 appears in strings that I hand coded. Why does LoadString (sometimes) choke on this?

Answer

bobince · Jul 15, 2011

What is the difference between these two characters?

Arguable!

Going by the names, one would imagine that the curly ‹’› is only for use as a quotation mark, and that the straight ‹'› is only for use as a real apostrophe, an indicator of omitted letters.

However, traditional typesetting practice in English is always to use a curly ‹’› to render an apostrophe. Personally, and I may be alone here, I don't like this. It can make for more ambiguous reading:

“He said, ‘It’s fish ’n’ chips’...”

With the apostrophes being straight, it's (marginally) clearer where the quotation ends:

“He said, ‘It's fish 'n' chips’...”

and the apostrophe being ‘straight’ makes more sense to me because its purpose of indicating omitted letters has no inherent directionality, whereas quotation marks are clearly asymmetrical in purpose.

In traditional ASCII, of course, there are no smart quotes, so the apostrophe is always used for both...
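To make the distinction concrete, here is a small sketch in Python (used purely for illustration; it is not part of the original question). It reports each character's code point and Unicode name, and whether it fits in ASCII. The helper name `describe` is made up for this example.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, fits-in-ASCII) for a single character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.isascii())

straight = describe("'")       # the typewriter apostrophe
curly = describe("\u2019")     # the curly mark that Word inserts

print(straight)
print(curly)
```

The straight character is the only one of the two that exists in ASCII; the curly one needs an encoding that covers more than the first 128 code points.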

on some Windows installations, the LoadString fails when trying to load a string that contains U+2019, but it works on some other Windows installations.

Here you are meeting the horror of the ‘ANSI’ code page. This is a default character encoding that differs between Windows install locales. So reading a resource on a machine set to a Western European locale can give different results from reading the same resource on a Japanese Windows.
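As a rough illustration of why the code page matters, the Python sketch below encodes the same text under two Windows code pages, 1252 (Western European) and 932 (Japanese), picked here only as examples. It is not a reproduction of what LoadString does internally; it only shows that bytes written under one code page do not mean the same thing under another.

```python
text = "It\u2019s"

# Code page 1252 stores U+2019 as the single byte 0x92.
western_bytes = text.encode("cp1252")

# Code page 932 uses a two-byte sequence for the same character.
japanese_bytes = text.encode("cp932")

# Reading the Western bytes as if they were code page 932 does not
# give the original text back: 0x92 is a lead byte there, not a quote.
misread = western_bytes.decode("cp932", errors="replace")

print(western_bytes, japanese_bytes, misread == text)
```

Any string that stays within ASCII encodes to identical bytes under both code pages, which is why the hand-typed U+0027 never causes trouble.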

It is highly unfortunate that Windows has varying default code pages instead of using a single global encoding like UTF-8, but it's too late to fix now. If you compile your whole application as a Unicode app (so you'll be using LoadStringW rather than LoadStringA) then you can cope with non-ASCII characters like the smart quotes much better.

If you can't move to a Unicode application you're a bit stuck. You won't be able to handle non-ASCII characters like the smart quotes globally, so stick with ASCII characters like the straight apostrophe ‹'› alone.
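If you do have to stay ASCII-only, one option is to clean up the strings before they go into the resource file. The following is a minimal Python sketch of such a pre-processing step; the function names and the choice of which characters to map are assumptions for illustration, not part of any Windows tooling.

```python
# Map the common "smart" quotes to their ASCII equivalents.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / curly apostrophe
    "\u201C": '"',   # left double quotation mark
    "\u201D": '"',   # right double quotation mark
})

def straighten_quotes(text):
    """Replace curly quotes with straight ASCII ones."""
    return text.translate(SMART_TO_ASCII)

def non_ascii_chars(text):
    """Return (index, char) pairs for every character outside ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0x7F]

cleaned = straighten_quotes("It\u2019s fish \u2019n\u2019 chips")
print(cleaned, non_ascii_chars(cleaned))
```

Running `non_ascii_chars` over each resource string is a cheap way to spot anything pasted from Word before it reaches a machine with a different code page.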

The U+2019 character appears in strings in my resource file that I copied from Word

Yes, Word has an annoying AutoCorrect feature that replaces all apostrophes you type with smart quotes. This is especially undesirable when you are dealing with code, where ‹’› will break the program; but it's also wrong even for plain old English, as it's not possible to correctly guess the desired direction of the quote. (It'll get one of the apostrophes in “fish 'n' chips” the wrong way round, for example.)
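To see why guessing the direction is hard, consider a deliberately naive rule, sketched below in Python. This is a hypothetical heuristic written for illustration, not Word's actual algorithm: a straight quote at the start of the text or after whitespace becomes an opening mark, and any other becomes a closing mark.

```python
def naive_smart_quotes(text):
    """Hypothetical rule: quote after whitespace (or at start) opens, otherwise closes."""
    out = []
    for i, ch in enumerate(text):
        if ch == "'":
            opens = i == 0 or text[i - 1].isspace()
            out.append("\u2018" if opens else "\u2019")
        else:
            out.append(ch)
    return "".join(out)

guessed = naive_smart_quotes("fish 'n' chips")
# Both marks stand for omitted letters, so both should be apostrophes.
wanted = "fish \u2019n\u2019 chips"

print(guessed, guessed == wanted)
```

The rule turns the first mark into an opening quote, because nothing in the surrounding characters distinguishes a leading apostrophe from the start of a quotation.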

I suggest turning off the automatic-replace-with-smart-quotes feature. If you want the smart quotes, it's better to type them deliberately. Unfortunately they are inconvenient to type on most keyboard layouts, often requiring obscure Alt+numpad sequences. Personally I use this one to drop them onto Alt+[] keys.