How can I include special characters (tab, newline) in a Python doctest result string?

hobs · Jan 12, 2012 · Viewed 26k times

Given the following Python script:

# dedupe.py
import re

def dedupe_whitespace(s,spacechars='\t '):
    """Merge repeated whitespace characters.
    Example:
    >>> dedupe_whitespace(r"Green\t\tGround")  # doctest: +REPORT_NDIFF
    'Green\tGround'
    """
    for w in spacechars:
        s = re.sub(r"("+w+"+)", w, s)
    return s

The function works as intended in the Python interpreter:

$ python
>>> import dedupe
>>> dedupe.dedupe_whitespace('Purple\t\tHaze')
'Purple\tHaze'
>>> print(dedupe.dedupe_whitespace('Blue\t\tSky'))
Blue    Sky

However, the doctest example fails because doctest converts tab characters to spaces before comparing the actual output with the expected result string:

>>> import doctest, dedupe
>>> doctest.testmod(dedupe)

gives

Failed example:
    dedupe_whitespace(r"Green           Ground")  #doctest: +REPORT_NDIFF
Differences (ndiff with -expected +actual):
    - 'Green  Ground'
    ?       -
    + 'Green Ground'
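
The diff above is consistent with doctest expanding tabs while it parses the docstring. The following is a minimal sketch of that effect, not part of dedupe.py: the variable names are made up for illustration, and the text string stands in for a non-raw docstring in which Python has already turned \t into a real tab.

```python
import doctest

# What doctest receives from a NON-raw docstring: the \t escape in the
# expected-output line has already become a real tab character.
text = ">>> 'Green' + chr(9) + 'Ground'\n'Green\tGround'\n"

parser = doctest.DocTestParser()
example = parser.get_examples(text)[0]

# The parser expands tabs in the whole docstring, so the expected output
# now holds spaces where the tab used to be.
expected = example.want
has_tab = "\t" in expected
print(repr(expected))
```

Because the expected output no longer contains a tab, it cannot match the repr of a string that does.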

How can I encode tab characters in a docstring so that doctest compares the test result correctly?

Answer

wutz · Jan 13, 2012

I've gotten this to work by making the docstring a raw string (note the r prefix), so that \t reaches doctest as a backslash and a t rather than as a real tab character:

def join_with_tab(iterable):
    r"""
    >>> join_with_tab(['1', '2'])
    '1\t2'
    """

    return '\t'.join(iterable)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
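
Applied to the function from the question, the same idea might look like the sketch below. It assumes two changes to the original: the docstring gets an r prefix, and the example call uses a plain string instead of r"...", since inside a raw docstring an r-prefixed argument would hand the function literal backslashes rather than tabs. The runner code at the bottom is only one illustrative way to check the result.

```python
import doctest
import re


def dedupe_whitespace(s, spacechars='\t '):
    r"""Merge repeated whitespace characters.

    Example:
    >>> dedupe_whitespace("Green\t\tGround")
    'Green\tGround'
    """
    for w in spacechars:
        # Collapse each run of the character w into a single w.
        s = re.sub(r"(" + w + "+)", w, s)
    return s


# Build a doctest from the raw docstring and run it without touching
# any other module-level tests.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    dedupe_whitespace.__doc__,
    {"dedupe_whitespace": dedupe_whitespace},
    "dedupe_whitespace",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

With the raw docstring, the escape sequences are interpreted only when doctest executes the example, so the expected repr and the actual repr both contain the two characters \t.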