Reputation: 707
This is a part of the table which I have:
type n_b
sp 2
sp 2
sp 3
avn 2
avn 4
avn 3
psp 1
psp 3
psp 5
...
Also I have a data set:
d = pd.Series({'sp':['98,00', '0,00', '68,00'], 'psp':['17,00', '7,60', '14,30'],
'avn':['15,00', '10,00', '4,30']})
I need to match the value from my data set in a new column "c_t" depending on the value in the column "type". That's what should be the result:
type n_b c_t
sp 2 98,00
sp 2 0,00
sp 3 68,00
avn 2 15,00
avn 4 10,00
avn 3 4,30
psp 1 17,00
psp 3 7,60
psp 5 14,30
...
My code looks like this:
d = pd.Series({'sp':['98,00', '0,00', '68,00'], 'psp':['17,00', '7,60', '14,30'],
'avn':['15,00', '10,00', '4,30']})
df['c_t'] = df['type'].map(d)
print (df)
But it does not work as I need it
type n_b c_t
sp 2 [98,00, 0,00, 68,00]
sp 2 [98,00, 0,00, 68,00]
sp 3 [98,00, 0,00, 68,00]
avn 2 [15,00, 10,00, 4,30]
avn 4 [15,00, 10,00, 4,30]
avn 3 [15,00, 10,00, 4,30]
psp 1 [17,00, 7,60, 14,30]
psp 3 [17,00, 7,60, 14,30]
psp 5 [17,00, 7,60, 14,30]
...
How can I fix this?
UPD: In fact, there is much more data in the file
d1 = pd.Series({'ds':['104,50', '19,00', '10,00', '30,00', '0,00', '0,00', '16,00', '21,50'],
'zkp':['33,00', '100,00', '16,00', '3,30', '9,00', '0,00', '0,00', '0,00', '4,80', '78,50'],
'dgv':['96,00', '0,00', '194,50', '61,00', '0,00', '10,00', '0,00', '28,00', '0,00', '0,00',
'11,00', '30,00', '0,00', '0,00', '0,00', '16,00', '78,50'], 'sp':['98,00', '0,00', '68,00'],
'psp':['17,00', '7,60', '14,30'],'avn':['15,00', '10,00', '4,30']})
And the table is huge:
type n_b Day_number
ds 2 1
ds 3 2
ds 1 3
ds 2 4
ds 1 5
ds 3 6
ds 2 7
ds 1 8
sp 2 1
sp 2 2
sp 1 3
avn 2 1
avn 4 2
avn 3 3
psp 1 1
psp 3 2
psp 5 3
sp 2 1
sp 2 2
sp 4 3
...
And all types(ds, zkp, dgv, sp, psp, avn) are in the file. The column "n_b" does not affect the column "c_t". In the column "Day_number" numbered days, if it helps.
And the result should be the following:
type n_b Day_number c_t
ds 2 1 104,50
ds 3 2 19,00
ds 1 3 10,00
ds 2 4 30,00
ds 1 5 0,00
ds 3 6 0,00
ds 2 7 16,00
ds 1 8 21,50
sp 2 1 98,00
sp 2 2 0,00
sp 1 3 68,00
avn 2 1 15,00
avn 4 2 10,00
avn 3 3 4,30
psp 1 1 17,00
psp 3 2 7,60
psp 5 3 14,30
sp 2 1 98,00
sp 2 2 0,00
sp 4 3 68,00
...
Upvotes: 2
Views: 213
Reputation: 862481
You can create DataFrame
from lists
first, then create helper columns g
with unique values per group with cumcount
and last join
to df
:
#get lenghts of lists
l = d.str.len()
#repeat index values by lengths
r = np.repeat(d.index.values, l)
#flattening lists
v = np.concatenate(d.values)
#DataFrame constructor
df1 = pd.DataFrame({'type':r, 'c_t':v})
print (df1)
c_t type
0 15,00 avn
1 10,00 avn
2 4,30 avn
3 17,00 psp
4 7,60 psp
5 14,30 psp
6 98,00 sp
7 0,00 sp
8 68,00 sp
df1['g'] = df1.groupby('type').cumcount()
df['g'] = df.groupby('type').cumcount()
df = df.join(df1.set_index(['type','g']), on=['type','g']).drop('g', axis=1)
print (df)
ype n_b c_t
0 sp 2 98,00
1 sp 2 0,00
2 sp 3 68,00
3 avn 2 15,00
4 avn 4 10,00
5 avn 3 4,30
6 psp 1 17,00
7 psp 3 7,60
8 psp 5 14,30
EDIT:
d1 = pd.Series({'ds':['104,50', '19,00', '10,00', '30,00', '0,00', '0,00', '16,00', '21,50'],
'zkp':['33,00', '100,00', '16,00', '3,30', '9,00', '0,00', '0,00', '0,00', '4,80', '78,50'],
'dgv':['96,00', '0,00', '194,50', '61,00', '0,00', '10,00', '0,00', '28,00', '0,00', '0,00',
'11,00', '30,00', '0,00', '0,00', '0,00', '16,00', '78,50'], 'sp':['98,00', '0,00', '68,00'],
'psp':['17,00', '7,60', '14,30'],'avn':['15,00', '10,00', '4,30']})
#get lenghts of lists
l = d1.str.len()
#repeat index values by lengths
r = np.repeat(d1.index.values, l)
#flattening lists
v = np.concatenate(d1.values)
#DataFrame constructor
df1 = pd.DataFrame({'type':r, 'c_t':v})
df1['g'] = df1.groupby('type').cumcount() + 1
df = df.join(df1.set_index(['type','g']), on=['type','Day_number'])
print (df)
type n_b Day_number c_t
0 ds 2 1 104,50
1 ds 3 2 19,00
2 ds 1 3 10,00
3 ds 2 4 30,00
4 ds 1 5 0,00
5 ds 3 6 0,00
6 ds 2 7 16,00
7 ds 1 8 21,50
8 sp 2 1 98,00
9 sp 2 2 0,00
10 sp 1 3 68,00
11 avn 2 1 15,00
12 avn 4 2 10,00
13 avn 3 3 4,30
14 psp 1 1 17,00
15 psp 3 2 7,60
16 psp 5 3 14,30
17 sp 2 1 98,00
18 sp 2 2 0,00
19 sp 4 3 68,00
Upvotes: 1
Reputation: 19947
It seems you're almost there. You've now got:
df
Out[758]:
type n_b c_t
0 sp 2 [98,00, 0,00, 68,00]
1 sp 2 [98,00, 0,00, 68,00]
2 sp 3 [98,00, 0,00, 68,00]
3 avn 2 [15,00, 10,00, 4,30]
4 avn 4 [15,00, 10,00, 4,30]
5 avn 3 [15,00, 10,00, 4,30]
6 psp 1 [17,00, 7,60, 14,30]
7 psp 3 [17,00, 7,60, 14,30]
8 psp 5 [17,00, 7,60, 14,30]
One more step to get you the desired output:
#use the row index%3 to select the element from the list under c_t column.
df.c_t=df.apply(lambda x: x.c_t[x.name%3],axis=1)
df
Out[761]:
type n_b c_t
0 sp 2 98,00
1 sp 2 0,00
2 sp 3 68,00
3 avn 2 15,00
4 avn 4 10,00
5 avn 3 4,30
6 psp 1 17,00
7 psp 3 7,60
8 psp 5 14,30
Upvotes: 1