I have 2 CSV files: 'Data' and 'Mapping':
Device_Name
, GDN
, Device_Type
, and Device_OS
. All four columns are populated.Device_Name
column populated and the other three columns blank. Device_Name
in the Data file, map its GDN
, Device_Type
, and Device_OS
value from the Mapping file.I know how to use dict when only 2 columns are present (1 is needed to be mapped) but I don't know how to accomplish this when 3 columns need to be mapped.
Following is the code using which I tried to accomplish mapping of Device_Type
:
x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
file_map = csv.reader(in_file1, delimiter=',')
for row in file_map:
typemap = [row[0],row[2]]
x.append(typemap)
with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
writer = csv.writer(out_file, delimiter=',')
for row in csv.reader(in_file2, delimiter=','):
try:
row[27] = x[row[11]]
except KeyError:
row[27] = ""
writer.writerow(row)
It returns Attribute Error
.
After some researching, I think I need to create a nested dict, but I don't have any idea how to do this.
A nested dict is a dictionary within a dictionary. A very simple thing.
>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d
{'dict1': {'innerkey': 'value'}}
You can also use a defaultdict
from the collections
package to facilitate creating nested dictionaries.
>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d # currently a defaultdict type
defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d) # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}
You can populate that however you want.
I would recommend in your code something like the following:
d = {} # can use defaultdict(dict) instead
for row in file_map:
# derive row key from something
# when using defaultdict, we can skip the next step creating a dictionary on row_key
d[row_key] = {}
for idx, col in enumerate(row):
d[row_key][idx] = col
According to your comment:
may be above code is confusing the question. My problem in nutshell: I have 2 files a.csv b.csv, a.csv has 4 columns i j k l, b.csv also has these columns. i is kind of key columns for these csvs'. j k l column is empty in a.csv but populated in b.csv. I want to map values of j k l columns using 'i` as key column from b.csv to a.csv file
My suggestion would be something like this (without using defaultdict):
a_file = "path/to/a.csv"
b_file = "path/to/b.csv"
# read from file a.csv
with open(a_file) as f:
# skip headers
f.next()
# get first colum as keys
keys = (line.split(',')[0] for line in f)
# create empty dictionary:
d = {}
# read from file b.csv
with open(b_file) as f:
# gather headers except first key header
headers = f.next().split(',')[1:]
# iterate lines
for line in f:
# gather the colums
cols = line.strip().split(',')
# check to make sure this key should be mapped.
if cols[0] not in keys:
continue
# add key to dict
d[cols[0]] = dict(
# inner keys are the header names, values are columns
(headers[idx], v) for idx, v in enumerate(cols[1:]))
Please note though, that for parsing csv files there is a csv module.