Reputation: 164
Hello I have pandas dataframe
code LocationCod
0. 999010001-R-XQK33-4658-999- R-XQK33
1. 602444201-N-FKD2-000257999-999- N-FKD2
2. 602016601-N-TAS-777-999- N-TAS
3. 602010001-345-999-999- 345
I want to split the value in column Code based on Value in Column LocationCod which should be expand to new columns CategCode and MainCode.required table would looklike this.
code LocationCod CategCode MainCode
0. 999010001-R-XQK33-4658-999- R-XQK33 4658-999- 999010001
1. 602444201-N-FKD2-000257999-999- N-FKD2 000257999-999- 602444201
2. 602016601-N-TAS-777-999- N-TAS 777-999- 602016601
3. 602010001-345-999-999- 345 999-999- 602010001
Please help.!
Upvotes: 1
Views: 562
Reputation: 2819
you can try with:
df = pd.DataFrame({
'Code': ['999010001-R-XQK33-4658-999-', '602444201-N-FKD2-000257999-999-','602016601-N-TAS-777-999-','602010001-345-999-999- '],
'LocationCod': ['R-XQK33','N-FKD2','N-TAS',345]})
df['MainCode']=df['Code'].apply(lambda x:x.split('-')[0]+'-')
df['CategCode']=df['Code'].apply(lambda x:'-'.join(x.split('-')[-3:]))
print(df)
result:
Code LocationCod MainCode CategCode
0 999010001-R-XQK33-4658-999- R-XQK33 999010001- 4658-999-
1 602444201-N-FKD2-000257999-999- N-FKD2 602444201- 000257999-999-
2 602016601-N-TAS-777-999- N-TAS 602016601- 777-999-
3 602010001-345-999-999- 345 602010001- 999-999-
Upvotes: 0
Reputation: 34056
You can use df.apply
with a custom lambda
:
In [2270]: df[['MainCode', 'CategCode']] = pd.DataFrame(df.apply(lambda x: x['code'].split(x['LocationCod']), 1).tolist())
In [2271]: df
Out[2271]:
code LocationCod MainCode CategCode
0 999010001-R-XQK33-4658-999- R-XQK33 999010001- -4658-999-
1 602444201-N-FKD2-000257999-999- N-FKD2 602444201- -000257999-999-
2 602016601-N-TAS-777-999- N-TAS 602016601- -777-999-
3 602010001-345-999-999- 345 602010001- -999-999-
You can remove the -
from new columns to match your expected output like this:
In [2281]: df.MainCode = df.MainCode.str[:-1]
In [2280]: df.CategCode = df.CategCode.str[1:]
In [2282]: df
Out[2282]:
code LocationCod MainCode CategCode
0 999010001-R-XQK33-4658-999- R-XQK33 999010001 4658-999-
1 602444201-N-FKD2-000257999-999- N-FKD2 602444201 000257999-999-
2 602016601-N-TAS-777-999- N-TAS 602016601 777-999-
3 602010001-345-999-999- 345 602010001 999-999-
Upvotes: 1
Reputation: 1012
Try this:
df['CategCode']=[df.code.values[i].split(df.LocationCod.values[i])[-1][1:] for i in range(len(df))]
df['MainCode']=[df.code.values[i].split(df.LocationCod.values[i])[0][:-1] for i in range(len(df))]
Upvotes: 0