Reputation: 211
The dictionary has the following key-value pairs:
{
'Target_Tab': 'employees',
' Target_Col': 'empp_id last_name first_name',
'Source_Col': 'emp_id l_name f_name',
'Source_Tab': 'employee'
}
I'm writing this dictionary into a CSV file and so far I've got this:
Source_Tab Source_Col Target_Tab Target_Col
employee emp_id last_name first_name employees empp_id l_name f_name
I want to write the Source _col and Target_col values in different rows. Se below is what I need:
Source_Tab Source_Col Target_Tab Target_Col
employee emp_id employees empp_id
last_name l_name
first_name f_name
My code is as follows:
import pandas as pd
d = [sdict]
d2 = []
col = ["Source_Table","Source_Columns","Target_Table","Target_Columns"]
for i in d:
temp = {}
for c in col:
if c in i:
temp[c] = i[c]
else:
temp[c] = ''
d2.append(temp)
df2 = pd.DataFrame(d2, columns=col)
df2.to_csv('test21.csv', index=False)
Upvotes: 2
Views: 324
Reputation: 862711
Use list comprehension with split
for list of Series
and join together by concat
, last replace missing values by DataFrame.fillna
and change order of columns by list col
:
d = {
'Target_Tab': 'employees',
'Target_Col': 'empp_id last_name first_name',
'Source_Col': 'emp_id l_name f_name',
'Source_Tab': 'employee'
}
col = ["Source_Tab","Source_Col","Target_Tab","Target_Col"]
df = pd.concat([pd.Series(v.split(), name=k) for k, v in d.items()], axis=1).fillna('')[col]
print (df)
Source_Tab Source_Col Target_Tab Target_Col
0 employee emp_id employees empp_id
1 l_name last_name
2 f_name first_name
Another solution:
col = ["Source_Tab","Source_Col","Target_Tab","Target_Col"]
df = pd.Series(d).str.split(expand=True).fillna('').reindex(col).T
print (df)
Source_Tab Source_Col Target_Tab Target_Col
0 employee emp_id employees empp_id
1 l_name last_name
2 f_name first_name
EDIT:
If need filter keys in source dictionary:
d = {
'Target_Tab': 'employees',
'Target_Col': 'empp_id last_name first_name',
'Source_Col': 'emp_id l_name f_name',
'Source_Tab': 'employee'
}
L = ['Source_Tab','Source_Col']
df = (pd.concat([pd.Series(v.split(), name=k) for k, v in d.items() if k in L], axis=1)
.fillna(''))
print (df)
Source_Col Source_Tab
0 emp_id employee
1 l_name
2 f_name
Upvotes: 2