Can you format pandas integers for display, like `pd.options.display.float_format` for floats?

Michael K picture Michael K · Apr 16, 2015 · Viewed 14.3k times · Source

I've seen this and this on formatting floating-point numbers for display in pandas, but I'm interested in doing the same thing for integers.

Right now, I have:

pd.options.display.float_format = '{:,.2f}'.format

That works on the floats in my data, but will either leave annoying trailing zeroes on integers that are cast to floats, or I'll have plain integers that don't get formatted with commas.

The pandas docs mention a SeriesFormatter class about which I haven't been able to find any information.

Alternatively, if there's a way to write a single string formatter that will format floats as '{:,.2f}' and floats with zero trailing decimal as '{:,d}', that'd work too.

Answer

unutbu picture unutbu · Apr 16, 2015

You could monkey-patch pandas.io.formats.format.IntArrayFormatter:

import contextlib
import numpy as np
import pandas as pd
import pandas.io.formats.format as pf
np.random.seed(2015)

@contextlib.contextmanager
def custom_formatting():
    orig_float_format = pd.options.display.float_format
    orig_int_format = pf.IntArrayFormatter

    pd.options.display.float_format = '{:0,.2f}'.format
    class IntArrayFormatter(pf.GenericArrayFormatter):
        def _format_strings(self):
            formatter = self.formatter or '{:,d}'.format
            fmt_values = [formatter(x) for x in self.values]
            return fmt_values
    pf.IntArrayFormatter = IntArrayFormatter
    yield
    pd.options.display.float_format = orig_float_format
    pf.IntArrayFormatter = orig_int_format


df = pd.DataFrame(np.random.randint(10000, size=(5,3)), columns=list('ABC'))
df['D'] = np.random.random(df.shape[0])*10000

with custom_formatting():
    print(df)

yields

      A     B     C        D
0 2,658 2,828 4,540 8,961.77
1 9,506 2,734 9,805 2,221.86
2 3,765 4,152 4,583 2,011.82
3 5,244 5,395 7,485 8,656.08
4 9,107 6,033 5,998 2,942.53

while outside of the with-statement:

print(df)

yields

      A     B     C            D
0  2658  2828  4540  8961.765260
1  9506  2734  9805  2221.864779
2  3765  4152  4583  2011.823701
3  5244  5395  7485  8656.075610
4  9107  6033  5998  2942.530551