Reputation: 131
I am creating a program to ingest a series of text files from 2001Q1 through 2016Q1 based upon name qualifiers which indicate the schedule/report type. The qualifiers are referred to as keys (for lack of a better name)
keys=[' RI ','RCD','RCF','RCG','RCH','RCL','RCO','RCRII']
Given a path C:\files, I create a list of all eligible text files:
files=[]
for k in keys:
    for i in os.listdir(path):
        if os.path.isfile(os.path.join(path,i)) and k in i:
            files.append(i)
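Then I create a dictionary of DataFrames, keyed by file name (extension removed, spaces replaced with underscores):
df_dict = {
    file[:-4].replace(" ", "_"): pd.read_table(
        os.path.join(path, file),  # path+file drops the separator when path is C:\files
        header=[0, 1], index_col=0, error_bad_lines=False,
        dtype={'IDRSSD': object}, low_memory=False)
    for file in files
}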
The resulting dictionary maps each name to its DataFrame in a key-value arrangement, for example: {Schedule_RI_2001Q1: (Col1 Col2 ColN), Schedule_RCO_2001Q1: (Col1 Col2 ColN), Schedule_RI_2005Q2: (Col1 Col2 ColN)}
I need to create separate dictionaries from the main dictionary, one per report type. I came up with this script (I know it's amateur):
for key in keys:
    for k in df_dict.keys():
        for v in df_dict.values():
            if key in k:
                key.strip={k:v}
Whether I use key.strip or key.strip(), I receive an error message: "'str' object attribute 'strip' is read-only" or "can't assign to function call", respectively. Is there a better way to accomplish this task? The reason I created the aggregate dictionary is to do some data formatting first. Assistance in breaking out the dictionary would be greatly appreciated.
Upvotes: 0
Views: 395
Reputation: 2798
You can't assign a dictionary to key.strip or to key.strip(): the first is a read-only method of the string, and the second is a function call, which is not a valid assignment target. You can, however, create a separate dictionary and use the value returned by key.strip() as a key in it.
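To see where the two error messages come from, here is a minimal sketch that triggers both. The variable names are only illustrative, and the exact wording of the second message depends on the Python version.

```python
key = ' RI '

# Assigning to the method attribute fails at run time,
# because attributes of built-in str objects cannot be replaced.
try:
    key.strip = {}
except AttributeError as exc:
    attr_error = str(exc)

# Assigning to a call expression is rejected when the code is compiled,
# so compile() is used here to catch the SyntaxError without stopping the script.
try:
    compile("key.strip() = {}", "<demo>", "exec")
except SyntaxError as exc:
    syntax_error = exc.msg

print(attr_error)
print(syntax_error)
```

Neither statement ever creates a variable named after the stripped key, which is why a container dictionary is needed.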
This is a relatively safer method:
keys = ['a', 'b']
df_dict = { 'a_2010': 1, 'a_2007': 2, 'Schedule_b_2009Q1': 3 }
sub_dict = {}  # one nested dictionary per report type
for key in keys:
    sub_dict[key.strip()] = {}
    for k, v in df_dict.items():
        if key in k:
            sub_dict[key.strip()][k] = v
Output:
>>> sub_dict
{'a': {'a_2007': 2, 'a_2010': 1},
'b': {'Schedule_b_2009Q1': 3}}
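One thing to watch with the keys from the question: once spaces in the file names are replaced with underscores, a padded key such as ' RI ' is no longer a substring of 'Schedule_RI_2001Q1', while an unpadded 'RI' would also match 'RCRII'. A rough sketch of one way around this, assuming the names follow the Schedule_TYPE_PERIOD pattern shown in the question, is to compare whole underscore-separated tokens. The strings 'df1' to 'df4' are placeholders standing in for the DataFrames.

```python
from collections import defaultdict

keys = [' RI ', 'RCD', 'RCRII']
# Placeholder strings stand in for the DataFrames (assumption for the demo)
df_dict = {
    'Schedule_RI_2001Q1': 'df1',
    'Schedule_RCRII_2001Q1': 'df2',
    'Schedule_RI_2005Q2': 'df3',
    'Schedule_RCD_2001Q1': 'df4',
}

# Stripped report types we want to group by
wanted = {key.strip() for key in keys}

# defaultdict creates the inner dictionary the first time a type is seen
by_report = defaultdict(dict)
for name, frame in df_dict.items():
    for token in name.split('_'):
        if token in wanted:
            by_report[token][name] = frame

print(dict(by_report))
```

Each report type is then available as by_report['RI'], by_report['RCD'], and so on, and the RCRII file does not end up in the RI group.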
If the above seems unnecessarily complex, you can simply use locals() to solve this particular problem (but it's usually not good practice to use it everywhere):
keys = ['a', 'b', 'c']
df_dict = { 'a_2010': 1, 'a_2007': 2, 'Schedule_b_2009Q1': 3 }
for key in keys:
    locals()[key.strip()] = {}
    for k, v in df_dict.items():
        if key in k:
            locals()[key.strip()][k] = v
Output:
>>> a
{'a_2007': 2, 'a_2010': 1}
>>> b
{'Schedule_b_2009Q1': 3}
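A caveat worth illustrating: the locals() trick only behaves this way at module level (or in an interactive session), where locals() is the same dictionary as globals(). Inside a function, the Python documentation warns that changes to the dictionary returned by locals() may not affect the actual local variables. The sketch below, with hypothetical names report_a and build_inside_function, shows the name never becoming usable inside a function.

```python
def build_inside_function():
    # Writing into locals() inside a function does not create a real variable
    locals()['report_a'] = {'a_2010': 1}
    try:
        return report_a  # looked up as a global name, which does not exist
    except NameError:
        return None

result = build_inside_function()
print(result)
```

If the grouping code is ever moved into a function, the nested-dictionary version keeps working unchanged, which is one reason it is the safer choice.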
Upvotes: 1