Reputation: 183
I have a quick question. I have the DataFrame below:
df
File_Path
0 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh
1 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh
2 /data/app_next_best_action/call_nba_as11.sh
3 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing
4 sh /data/processos/current/aplicacao/AAVR/ACN10/scr/exec_fim_grupo.sh ACN10_ARQ_1
I want to extract the path up to the 4th item of the tree structure in the File_Path column.
The output should look like this:
df
File_Path Parent_path
0 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
1 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
2 /data/app_next_best_action/call_nba_as11.sh /data/app_next_best_action/call_nba_as11.sh
3 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh /data/application/AAIN/aain-srv-motor-extracao-next/
4 sh /data/processos/current/aplicacao/AAVR/ACN10/scr/exec_fim_grupo.sh ACN10_ARQ_1 /data/processos/current/aplicacao/
At index 2 there is no 4th item, so it takes the last one, which is the file call_nba_as11.sh.
Also, at index 4 there is a "sh " at the beginning of the File_Path value, which I need to skip.
Could you guys help me?
Upvotes: 1
Views: 82
Reputation: 260335
You can use a regex with str.extract:
df['Parent_path'] = df['File_Path'].str.extract(r'^((?:/[^/]+){,4}/?)')
Output:
File_Path Parent_path
0 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
1 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
2 /data/app_next_best_action/call_nba_as11.sh /data/app_next_best_action/call_nba_as11.sh
3 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing /data/application/AAIN/aain-srv-motor-extracao-next/
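To see what each part of that pattern does, here is a minimal sketch that uses Python's standard re module instead of pandas. The pattern is the same one as above, only spread out with re.VERBOSE; the helper name parent_path and the three sample variables are just for illustration and are not part of the original code.

```python
import re

# Same pattern as above, written with re.VERBOSE so each piece can be commented.
PARENT_RE = re.compile(
    r"""
    ^                  # anchor at the start of the string
    (                  # the capture group that str.extract returns
      (?:/[^/]+){,4}   # up to 4 components, each a slash followed by non-slash characters
      /?               # optional slash right after the last captured component
    )
    """,
    re.VERBOSE,
)


def parent_path(path):
    """Hypothetical helper: return the captured prefix for one path string."""
    # Every part of the pattern is optional, so match() always succeeds.
    return PARENT_RE.match(path).group(1)


# Sample values copied from the question.
deep = "/data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing"
shallow = "/data/app_next_best_action/call_nba_as11.sh"
prefixed = "sh /data/processos/current/aplicacao/AAVR/ACN10/scr/exec_fim_grupo.sh ACN10_ARQ_1"

print(parent_path(deep))            # /data/application/AAIN/aain-srv-motor-extracao-next/
print(parent_path(shallow))         # /data/app_next_best_action/call_nba_as11.sh
print(repr(parent_path(prefixed)))  # ''
```

Because the pattern is anchored with ^ and expects a slash first, a value that starts with "sh " captures only an empty string here.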
Alternative, which also skips anything before the first slash (such as the leading "sh " at index 4):
df['Parent_path'] = df['File_Path'].str.extract(r'^[^/]*((?:/[^/]+){,4}/?)')
Output:
File_Path Parent_path
0 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_INVE_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
1 /data/application/AANX/aanx-dataeng-slas-sysyphus/scripts/s_shell/call_iws/call_PP_NEXT_RTBA_MAU_IND_EMPF_D.sh /data/application/AANX/aanx-dataeng-slas-sysyphus/
2 /data/app_next_best_action/call_nba_as11.sh /data/app_next_best_action/call_nba_as11.sh
3 /data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing /data/application/AAIN/aain-srv-motor-extracao-next/
4 sh /data/processos/current/aplicacao/AAVR/ACN10/scr/exec_fim_grupo.sh ACN10_ARQ_1 /data/processos/current/aplicacao/
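If a regex feels too opaque, roughly the same logic can probably be written with plain string methods. This is a different technique from the str.extract call above, not a rewrite of it, and the function name parent_path_split and its depth parameter are assumptions made for this sketch. Edge cases such as doubled slashes may behave differently from the regex.

```python
def parent_path_split(path, depth=4):
    """Hypothetical helper: keep at most `depth` path components.

    Anything before the first slash (for example a leading "sh ") is dropped.
    """
    start = path.find("/")
    if start == -1:
        # No slash at all, so there is nothing to extract.
        return ""
    # parts[0] is the empty string in front of the leading slash.
    parts = path[start:].split("/")
    head = "/".join(parts[: depth + 1])
    # Add a trailing slash only when deeper components were cut off.
    return head + "/" if len(parts) > depth + 1 else head


# Sample values copied from the question.
rows = [
    "/data/app_next_best_action/call_nba_as11.sh",
    "/data/application/AAIN/aain-srv-motor-extracao-next/iws/call_run_extract_default.sh cdlc_ing",
    "sh /data/processos/current/aplicacao/AAVR/ACN10/scr/exec_fim_grupo.sh ACN10_ARQ_1",
]
for row in rows:
    print(parent_path_split(row))
```

With pandas, a function like this could be applied per row with df['File_Path'].map(parent_path_split), although the vectorized str.extract version is shorter.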
Upvotes: 3