How to read a large text file in Python?

beginagain · Sep 4, 2013 · Viewed 13k times

I am using Enthought Canopy (a bundle of Python libraries such as NumPy, Pandas, etc.) for data analysis. I am trying to read a text file and create a DataFrame from it. The text file has 1180598 rows and 18 columns, and every column contains numbers. I wrote the following code to read the data and name the columns:

from pandas import DataFrame, read_csv
import matplotlib.pyplot as plt

import pandas as pd

print 'Pandas Version ' + pd.__version__
Pandas Version 0.12.0

location=r'C:\UMAIR\Directed Studies\US-101 Data\Main Data\US-101-Main-Data\vehicle-trajectory-data\0750am-0805am\tra.txt'

df=read_csv(location, names=['Vehicle ID','Frame ID','Total Frames','Global Time','Local X','Local Y','Global X','Global Y','Vehicle Length','Vehicle Width','Vehicle Class','Vehicle Velocity','Vehicle Acceleration','Lane Identification','Preceding Vehicle','Following Vehicle','Spacing','Headway'])

df
Out[41]: 
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1180598 entries, 0 to 1180597
Data columns (total 18 columns):
Vehicle ID              1180598  non-null values
Frame ID                0  non-null values
Total Frames            0  non-null values
Global Time             0  non-null values
Local X                 0  non-null values
Local Y                 0  non-null values
Global X                0  non-null values
Global Y                0  non-null values
Vehicle Length          0  non-null values
Vehicle Width           0  non-null values
Vehicle Class           0  non-null values
Vehicle Velocity        0  non-null values
Vehicle Acceleration    0  non-null values
Lane Identification     0  non-null values
Preceding Vehicle       0  non-null values
Following Vehicle       0  non-null values
Spacing                 0  non-null values
Headway                 0  non-null values
dtypes: float64(17), object(1) 

As you can see from Out[41], the file was read as if it had only one column: 'Vehicle ID' holds all 1180598 values and the other 17 columns are empty. What should I do to let pandas know that my file has 18 columns, so that it is read the way it is meant to be?

Answer

elyase · Sep 4, 2013

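Your file appears to be delimited by whitespace rather than commas, so read_csv treats each whole line as a single field. Passing delim_whitespace=True, with names set to your list of 18 column names, will import your dataset correctly: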

df = pd.read_csv(location, names=names, header=None, delim_whitespace=True)
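To see the difference without the original file, here is a minimal sketch that assumes the data is whitespace-delimited. The three sample rows are made up and merely stand in for tra.txt, and io.StringIO replaces the file path. It uses sep=r"\s+", which does the same job as delim_whitespace=True; newer pandas releases (2.2 and later) deprecate delim_whitespace in favor of it.

```python
import io

import pandas as pd

# The 18 column names from the question.
names = ['Vehicle ID', 'Frame ID', 'Total Frames', 'Global Time',
         'Local X', 'Local Y', 'Global X', 'Global Y',
         'Vehicle Length', 'Vehicle Width', 'Vehicle Class',
         'Vehicle Velocity', 'Vehicle Acceleration', 'Lane Identification',
         'Preceding Vehicle', 'Following Vehicle', 'Spacing', 'Headway']

# Made-up sample: 3 rows of 18 integers separated by two spaces.
# This is an assumption about the layout of tra.txt, not its real content.
rows = ["  ".join(str(r * 100 + c) for c in range(18)) for r in range(3)]
sample = "\n".join(rows) + "\n"

# Default separator is a comma: each line is read as one field, so only
# the first column is populated and the remaining 17 are all NaN.
wrong = pd.read_csv(io.StringIO(sample), names=names, header=None)

# Splitting on runs of whitespace yields all 18 columns.
right = pd.read_csv(io.StringIO(sample), names=names, header=None, sep=r"\s+")

print(wrong.notnull().sum())  # non-null count per column
print(right.head())
```

With the real file, replace io.StringIO(sample) with location. If the file turns out to use a different delimiter (a tab or semicolon, say), pass that to sep instead.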