I tried to create an array from a text file.
I saw earlier that numpy has a method, loadtxt, so I tried it, but it adds some junk characters before each row...
# my txt file
.--``--.
.--` `--.
| |
| |
`--. .--`
`--..--`
# my python v3.4 program
import numpy as np
f = open('tile', 'r')
a = np.loadtxt(f, dtype=str, delimiter='\n')
print(a)
# my print output
["b' .--``--. '"
"b'.--` `--.'"
"b'| |'"
"b'| |'"
"b'`--. .--`'"
"b' `--..--` '"]
What are these 'b' prefixes and double quotes, and where do they come from? I tried some solutions found on the internet, like opening the file with codecs, changing the dtype to 'S20' or 'S11', and a lot of other things that don't work... What I expect is an array of unicode strings that looks like this:
[[' .--``--. ']
['.--` `--.']
['| |']
['| |']
['`--. .--`']
[' `--..--` ']]
Info: I'm using Python 3.4 and numpy from the Debian stable repository.
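np.loadtxt and np.genfromtxt operate in byte mode; bytestrings are the default string type in Python 2. But Python 3 uses unicode, and marks bytestrings with this b prefix. With dtype=str, each bytestring appears to be converted with str(), which is why the b'...' text, quotes included, ends up inside a unicode string and displays with the outer double quotes.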
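To see where the b and the extra quotes come from without numpy, here is a minimal sketch in plain Python 3. The sample bytes value is just an illustration, not the exact contents of the file. Calling str() on a bytes object gives its repr, while decode() gives the actual text:

```python
# A bytes object, as a byte-mode reader would hand it back (sample value only)
raw = b"  .--``--.  "

# str() on bytes does NOT decode; it returns the repr, b'...' included
as_repr = str(raw)

# decode() is what actually converts bytes to a unicode str
as_text = raw.decode("ascii")

print(as_repr)
print(as_text)
```

So the goal is to decode the bytes, not to wrap them in str().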
I tried some variations in a Python 3 ipython session:
In [508]: np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[508]: b' .--``--.'
In [509]: np.loadtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[509]: "b' .--``--.'"
...
In [511]: np.genfromtxt('stack33655641.txt',dtype=str,delimiter='\n')[0]
Out[511]: '.--``--.'
In [512]: np.genfromtxt('stack33655641.txt',dtype=None,delimiter='\n')[0]
Out[512]: b'.--``--.'
In [513]: np.genfromtxt('stack33655641.txt',dtype=bytes,delimiter='\n')[0]
Out[513]: b'.--``--.'
genfromtxt with dtype=str gives the cleanest display, except that it strips blanks. I may have to use a converter to turn that off. These functions are meant to read csv data where (white)spaces are separators, not part of the data.
loadtxt and genfromtxt are overkill for simple text like this. A plain file read does nicely:
In [527]: with open('stack33655641.txt') as f:a=f.read()
In [528]: print(a)
.--``--.
.--` `--.
| |
| |
`--. .--`
`--..--`
In [530]: a=a.splitlines()
In [531]: a
Out[531]:
[' .--``--.',
'.--` `--.',
'| |',
'| |',
'`--. .--`',
' `--..--`']
(My text editor is set to strip trailing blanks, hence the ragged lines.)
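If an array is still wanted, the list from the plain read can be handed to np.array. This is only a sketch: read_lines_as_array is a hypothetical helper name, the temporary file stands in for the real tile file, and the reshape to one column is my guess at the layout the question asks for.

```python
import os
import tempfile

import numpy as np


def read_lines_as_array(path, encoding="utf-8"):
    """Hypothetical helper: read a text file into an (n_lines, 1) array of str."""
    with open(path, encoding=encoding) as f:
        lines = f.read().splitlines()  # keeps leading/trailing blanks
    # np.array on a list of str gives a unicode ('U') dtype sized to the longest line
    return np.array(lines).reshape(-1, 1)


# Stand-in for the real file: three sample rows written to a temporary file
sample = "  .--``--.  \n.--`    `--.\n|          |\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write(sample)

tile = read_lines_as_array(tmp.name)
os.remove(tmp.name)

print(tile.dtype.kind, tile.shape)
```

Nothing is split or stripped here, so the blanks survive, which is the part loadtxt and genfromtxt get in the way of.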
@DSM's suggestion:
In [556]: a=np.loadtxt('stack33655641.txt',dtype=bytes,delimiter='\n').astype(str)
In [557]: a
Out[557]:
array([' .--``--.', '.--` `--.', '| |',
'| |', '`--. .--`', ' `--..--`'],
dtype='<U16')
In [558]: a.tolist()
Out[558]:
[' .--``--.',
'.--` `--.',
'| |',
'| |',
'`--. .--`',
' `--..--`']