General string quoting for TCL

greggo · Mar 14, 2011

I'm writing a utility (which happens to be in Python) that generates its output in the form of a TCL script. Given some arbitrary string variable (not unicode) in the Python code, I want to produce a TCL line like

set s something

... which will set TCL variable 's' to that exact string, regardless of what strange characters are in it. Without getting too weird, I don't want to make the output messier than needed. I believe a decent approach is

  1. if the string is not empty and contains only alphanumerics, and some characters like .-_ (but definitely not $"{}\) then it can be used as-is;

  2. if it contains only printable characters and no double-quotes or curly braces (and does not end in a backslash) then simply put {} around it;

  3. otherwise, put "" around it after using \ escapes for " { } \ $ [ ] and \nnn escapes for non-printing characters.
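
The three rules above can be sketched in Python roughly as follows. This only illustrates the rules as stated and does not settle whether the escape set is complete. The function name `tcl_quote` is made up for the example, "printable" is assumed to mean ASCII 0x20 to 0x7E, and `\nnn` is assumed to be a three-digit octal escape, which limits the sketch to code points up to 255.

```python
import re

# Rule 1: non-empty, only alphanumerics and . - _ (assumed "safe" set).
_BARE = re.compile(r"[A-Za-z0-9._-]+")

# Rule 3: characters that get a backslash escape inside double quotes.
_NEEDS_BACKSLASH = set('"{}\\$[]')


def _printable(ch):
    """Assumption: 'printable' means ASCII 0x20..0x7E."""
    return 0x20 <= ord(ch) <= 0x7E


def tcl_quote(s):
    """Return s quoted as a single Tcl word, following the three rules."""
    # Rule 1: use the string as-is.
    if _BARE.fullmatch(s):
        return s
    # Rule 2: printable only, no " { }, no trailing backslash -> braces.
    if (all(_printable(c) for c in s)
            and not any(c in '"{}' for c in s)
            and not s.endswith("\\")):
        return "{" + s + "}"
    # Rule 3: double quotes with backslash and octal escapes.
    out = []
    for ch in s:
        if ord(ch) > 0xFF:
            raise ValueError("sketch only handles code points up to 255")
        if ch in _NEEDS_BACKSLASH:
            out.append("\\" + ch)
        elif _printable(ch):
            out.append(ch)
        else:
            out.append("\\%03o" % ord(ch))  # always three octal digits
    return '"' + "".join(out) + '"'
```

A generator would then emit a line with something like `print("set s " + tcl_quote(value))`. Note that with this version the empty string falls under rule 2 and comes out as `{}`.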

Question: is that the full set of characters which need escaping inside double quotes? I can't find this in the docs. And did I miss anything? (I almost missed that strings for (2) can't end in \, for instance.)

I know there are many other strings which can be quoted by {}, but it seems difficult to identify them easily. Also, it looks like non-printing characters (in particular, newline) are OK with (2) if you don't mind them being literally present in the TCL output.

Answer

Byron Whitlock · Mar 14, 2011

You really only need two rules:

  • Escape curly braces
  • Wrap the output in curly braces

You don't need to worry about newlines, non-printable characters, etc. They are valid in a literal string, and TCL has excellent Unicode support.

set s { 
this is
a 
long 
string. I have $10 [10,000 cents] only curly braces \{ need \} to be escaped.
\t is not  a real tab, but '    ' is. "quoting somthing" :
{matchin` curly braces are okay, list = string in tcl}
}
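
Because matching braces are allowed inside a braced word, a generator can test whether a given string may be placed between braces unchanged. The following is a minimal Python sketch of such a test. It assumes Tcl's documented brace rules: braces nest, a brace preceded by a backslash is not counted when matching, and backslash-newline is the one substitution performed inside braces. The function name `can_brace_verbatim` is hypothetical.

```python
def can_brace_verbatim(s):
    """Return True if '{' + s + '}' should yield exactly s as a Tcl word.

    Sketch under the assumptions stated in the text above.
    """
    depth = 0
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == "\\":
            # A trailing backslash would quote the closing brace, and
            # backslash-newline is substituted even inside braces.
            if i + 1 >= len(s) or s[i + 1] == "\n":
                return False
            i += 2  # the character after a backslash is not counted
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False  # close brace with no matching open brace
        i += 1
    return depth == 0
```

The check only decides whether the text can go inside braces as it is; it does not rewrite anything. Strings that fail it need a different quoting form.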

Edit: In light of your comment, you can do the following:

  • escape [] {} and $
  • wrap the whole output in set s [subst { $output } ], where $output stands for the escaped text

The beauty of Tcl is that it has a very simple grammar. No characters other than the three kinds listed above need to be escaped.

Edit 2: One last try.

If you pass subst some options, you will only need to escape \ and {}:

set s [subst -nocommands -novariables { $output } ]

You would, however, need to come up with a regex to convert non-printable characters to their escape codes.
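
One possible shape for that regex, as a hedged Python sketch: a single pattern matches backslashes, braces, and anything outside printable ASCII, and a callback turns each match into a backslash escape. The names `tcl_set_via_subst` and `_escape_char` are invented for the example, non-printable characters are assumed to be written as three-digit octal escapes, and the variable name is assumed to be a plain identifier. Unlike the line above, the sketch puts no spaces inside the braces, since those spaces would become part of the value.

```python
import re

# Backslash and braces, or anything outside printable ASCII (0x20..0x7E).
_ESCAPE = re.compile(r"[\\{}]|[^\x20-\x7e]")


def _escape_char(match):
    """Map one matched character to a Tcl backslash sequence."""
    ch = match.group(0)
    if ch in "\\{}":
        return "\\" + ch
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("sketch only handles code points up to 255")
    return "\\%03o" % code  # always three octal digits


def tcl_set_via_subst(name, value):
    """Build a 'set' line whose value is decoded by subst at run time."""
    body = _ESCAPE.sub(_escape_char, value)
    return "set %s [subst -nocommands -novariables {%s}]" % (name, body)
```

For example, `tcl_set_via_subst("s", "cost: $10\n")` produces `set s [subst -nocommands -novariables {cost: $10\012}]`; the `$` is left alone because variable substitution is switched off.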

Good luck!