Despite the advice in these previous questions:
-9999 as missing value with numpy.genfromtxt()
Using genfromtxt to import csv data with missing values in numpy
I am still unable to process a text file whose last line ends with a missing value.
a.txt:
1 2 3
4 5 6
7 8
I've tried multiple combinations of the missing_values and filling_values options and cannot get this to work:
import numpy as np
sol = np.genfromtxt("a.txt",
                    dtype=float,
                    invalid_raise=False,
                    missing_values=None,
                    usemask=True,
                    filling_values=0.0)
print(sol)
What I would like to get is:
[[1.0 2.0 3.0]
[4.0 5.0 6.0]
[7.0 8.0 0.0]]
but instead I get:
/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py:1641: ConversionWarning: Some errors were detected !
Line #3 (got 2 columns instead of 3)
warnings.warn(errmsg, ConversionWarning)
[[1.0 2.0 3.0]
[4.0 5.0 6.0]]
Using pandas:
import pandas as pd
# 'data' holds the contents of a.txt; the raw string avoids an invalid-escape warning
df = pd.read_table('data', sep=r'\s+', header=None)
df.fillna(0, inplace=True)
print(df)
# 0 1 2
# 0 1 2 3
# 1 4 5 6
# 2 7 8 0
pandas.read_table replaces missing data with NaNs. You can replace those NaNs with some other value using df.fillna.
df is a pandas.DataFrame. You can access the underlying NumPy array with df.values:
print(df.values)
# [[ 1. 2. 3.]
# [ 4. 5. 6.]
# [ 7. 8. 0.]]
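If pandas is not an option, one alternative is to pad the short rows yourself before building the array. This is only a minimal sketch, not the approach the answer above uses: it assumes the fields are whitespace-separated numbers and that values are only ever missing at the end of a row. The helper name load_padded is hypothetical.

```python
import io


def load_padded(lines, fill=0.0):
    """Parse whitespace-separated floats, padding short rows on the right.

    `lines` is any iterable of strings (an open file works).
    Assumption: missing values only occur at the end of a row.
    """
    # Skip blank lines and convert every token to float.
    rows = [[float(tok) for tok in line.split()] for line in lines if line.strip()]
    if not rows:
        return []
    # Pad every row up to the width of the longest one.
    width = max(len(row) for row in rows)
    return [row + [fill] * (width - len(row)) for row in rows]


# In-memory stand-in for a.txt from the question.
sample = io.StringIO("1 2 3\n4 5 6\n7 8\n")
rows = load_padded(sample)
print(rows)
```

With a real file you would pass open("a.txt") instead of the StringIO object. Because every row then has the same length, np.array(rows) should give the 3x3 float array the question asks for.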