I have an EBCDIC-encoded mainframe file that I need to convert to ASCII. Which libraries or tools can I use to do that? I am most familiar with Python.
The file I received came with a copybook, which can be used to parse the file (part of it is below).
What do the types 'C', 'P' and 'B' mean? I'm guessing C = character, B = byte, P = packed number?
1:----------------------------------------------------------------------------------------------------------------------------------:
:LAYOUT NAME: B224E DATE: 02/20/14 PAGE 7 OF 14:
: ------- -------- --- ---:
:COBOL: PAN-NAME: NONE COPYLIB-NAME: RECB224E :
: -------------------- -------------------- :
:BAL : PAN-NAME: NONE COPYLIB-NAME: NONE :
:------------------------------------------------------------------------------:
:TYPE OF RECORD: EXTENDED SORT KEY AREA - SEGMENT "A" (OPTIONAL) :
:------------------------------------------------------------------------------:
:POSITION  : LENGTH : TYPE : DESCRIPTION                                       :
:----------:--------:------:---------------------------------------------------:
:          :        :      :                                                   :
:          :        :      :                                                   :
:          :        :      :                                                   :
:001 - 001 : 1      : C    : SEGMENT IDENTIFIER - "A"                          :
:          :        :      :                                                   :
:002 - 003 : 2      : P    : SEGMENT LENGTH                                    :
:          :        :      :                                                   :
:004 - ??? : ???    : C    : EXTENDED SORT KEY AREA                            :
:          :        :      :                                                   :
Take a look at the codecs module. According to the standard encodings table, cp500 is one of the EBCDIC code pages Python supports (cp037 is another common one, so check which code page your file actually uses). Something like the following should work:
import codecs

with open("EBCDIC.txt", "rb") as ebcdic:
    # Read the raw bytes, then decode them from the cp500 EBCDIC code page
    ascii_txt = codecs.decode(ebcdic.read(), "cp500")
print(ascii_txt)
As mpez0 noted in the comments, if you're using Python 3, you can condense the code to this:
with open("EBCDIC.txt", "rt", encoding="cp500") as ebcdic:
    print(ebcdic.read())
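If you want a converted file rather than printed output, the same idea can be extended to write the decoded text back out as ASCII. This is only a minimal Python 3 sketch: the function name `ebcdic_file_to_ascii` and the file names are made up for illustration, cp500 is assumed to be the right code page, and any character with no ASCII equivalent is replaced with `?`.

```python
def ebcdic_file_to_ascii(src_path, dst_path, codec="cp500"):
    """Decode src_path from an EBCDIC code page and rewrite it as ASCII text.

    The codec default (cp500) is an assumption; pass e.g. "cp037" if that
    is the code page the file was actually written in.
    """
    with open(src_path, "r", encoding=codec) as src, \
         open(dst_path, "w", encoding="ascii", errors="replace") as dst:
        # Copy in chunks so a large mainframe extract is never held in memory at once
        for chunk in iter(lambda: src.read(64 * 1024), ""):
            dst.write(chunk)


# Hypothetical usage:
# ebcdic_file_to_ascii("EBCDIC.txt", "ASCII.txt")
```

This treats the whole file as text, so it is only appropriate when every field is a character field.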
Not having an EBCDIC file handy, I can't test this, but it should be enough to get you started.
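One caveat: decoding with a text codec only makes sense for the character ('C') fields. Assuming 'P' in the layout means packed decimal (COMP-3), as the question guesses, those bytes have to be unpacked numerically instead of decoded. Below is a rough Python 3 sketch for the segment "A" layout from the question. The function names `unpack_packed_decimal` and `parse_segment_a` are hypothetical, the sample record is built in memory rather than taken from a real file, and because the layout does not say where the sort key area ends, the sketch simply decodes whatever bytes follow the length field.

```python
def unpack_packed_decimal(raw):
    """Unpack packed-decimal bytes: two digits per byte, sign in the last nibble."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)    # high nibble
        nibbles.append(byte & 0x0F)  # low nibble
    sign = nibbles.pop()             # the final nibble carries the sign
    value = 0
    for digit in nibbles:
        if digit > 9:
            raise ValueError("invalid digit nibble: %X" % digit)
        value = value * 10 + digit
    # 0xB and 0xD conventionally mark negative values; other sign nibbles are non-negative
    return -value if sign in (0x0B, 0x0D) else value


def parse_segment_a(record, codec="cp500"):
    """Split one segment "A" into (identifier, length, sort key text).

    Assumes 'C' fields are EBCDIC text in `codec` and the 2-byte 'P' field
    is packed decimal; both are guesses based on the layout in the question.
    """
    segment_id = record[0:1].decode(codec)               # position 001
    segment_length = unpack_packed_decimal(record[1:3])  # positions 002-003
    sort_key = record[3:].decode(codec)                  # position 004 onward
    return segment_id, segment_length, sort_key


# Made-up sample: "A", then 0x01 0x3C (packed +13), then a 10-character key
sample = "A".encode("cp500") + bytes([0x01, 0x3C]) + "SMITH JOHN".encode("cp500")
print(parse_segment_a(sample))
```

If 'B' turns out to mean a binary integer (also an assumption), `int.from_bytes(raw, "big")` would be the usual way to read it, since mainframe data is big-endian.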