Remove characters before and including _ in Python 2.7

user2343368 · May 6, 2013

The following code produces nicely readable output.

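import textwrap

# botslib is assumed to be imported elsewhere in this script.
def add_line_remove_special(ta_from,endstatus,*args,**kwargs):
    ta_to = ta_from.copyta(status=endstatus)
    infile = botslib.opendata(ta_from.filename,'r')
    tofile = botslib.opendata(str(ta_to.idta),'wb')
    try:
        # Read the first line and re-wrap it into lines of at most 640 characters
        start = infile.readline()
        lines = "\r\n".join(textwrap.wrap(start, 640))
        tofile.write(lines)
    finally:
        # Close both files even if reading or writing fails
        infile.close()
        tofile.close()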

This is the output. Now I would like to remove all characters up to and including the _ on each line:

Ichg_UNBUNOA3                                   14                2090100000015                      14                1304221445000001
MSG_BGM380                                         610809                             9  NA
MSG_DTM13720130422                           102
Grp1_RFFON test EDI
Grp2_NADBY 2090100000015                         9
Grp2_NADIV 2090100000015                         9
Grp2_NADDP 2090100000015                         9
Grp7_CUX2  EUR4
Grp8_PAT22                                                                                              5  3  D   30
Grp25_LIN1        02090100000022                     EN
Grp25_QTY47               5
Grp25_QTY12               5
Grp26_MOA203             15.00
Grp28_PRIINV        3000.00           1000PCE
Grp33_TAX7  VAT                                                                                 21.00                              S
Grp25_LIN2        02090100000039                     EN
Grp25_QTY47              10
Grp25_QTY12              10
Grp26_MOA203            350.00
Grp28_PRIINV       35000.00           1000PCE
Grp33_TAX7  VAT                                                                                 21.00                              S

How can I do this?

Answer

Martijn Pieters · May 6, 2013

To get all the text on a line after an underscore, split once on the first _ character and take the last element of the result:

line.split('_', 1)[-1]

This also works for lines that contain no underscore: the split then returns a single-element list, so the last element is the whole line, unchanged.

Demo:

>>> 'Grp25_QTY47               5'.split('_', 1)[-1]
'QTY47               5'
>>> 'No underscore'.split('_', 1)[-1]
'No underscore'
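
To show why both the `1` (the maximum number of splits) and the `[-1]` index matter, here is a small sketch. The helper name `strip_prefix` and the second sample line are made up for illustration and are not part of the question's data:

```python
def strip_prefix(line):
    """Return the text after the first underscore, or the whole line if there is none."""
    # A maxsplit of 1 splits only on the first underscore, so later ones are kept.
    return line.split('_', 1)[-1]

samples = [
    'Grp2_NADBY 2090100000015',  # one underscore
    'Grp1_RFFON test_EDI',       # hypothetical line with a second underscore
    'No underscore',             # no underscore: [-1] is the whole line
]
results = [strip_prefix(s) for s in samples]
# results == ['NADBY 2090100000015', 'RFFON test_EDI', 'No underscore']
```

Indexing with `[1]` instead of `[-1]` would raise an `IndexError` for the last sample, because the split returns a one-element list there.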

Translating this to your code:

import textwrap

ta_to = ta_from.copyta(status=endstatus)
with botslib.opendata(ta_from.filename,'r') as infile:
    with botslib.opendata(str(ta_to.idta),'wb') as tofile:
        # Wrap the first input line into lines of at most 640 characters
        for line in textwrap.wrap(next(infile), 640):
            # Drop everything up to and including the first underscore
            line = line.split('_', 1)[-1]
            tofile.write(line + '\r\n')
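
To try the same loop without Bots, the sketch below replaces the `botslib.opendata` files with a plain string and returns the result instead of writing it. The function name `strip_records`, the sample input, and the width of 13 (standing in for 640) are assumptions for illustration only. Note that `textwrap.wrap` breaks lines at whitespace so that each is at most `width` characters long; it does not cut at fixed offsets.

```python
import textwrap

def strip_records(first_line, width=640):
    """Wrap first_line into records and drop each record's prefix up to the first '_'."""
    out = []
    for record in textwrap.wrap(first_line, width):
        # Keep only the text after the first underscore, then add the CRLF terminator.
        out.append(record.split('_', 1)[-1] + '\r\n')
    return ''.join(out)

# Hypothetical two-record input; width=13 stands in for the real 640.
sample = 'Grp25_QTY47 5 Grp25_QTY12 5'
output = strip_records(sample, width=13)
# output == 'QTY47 5\r\nQTY12 5\r\n'
```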