Reputation: 43179
I have a pandas dataframe with hundreds of columns of antibiotic names. Each specific antibiotic is coded in the dataframe as ending in E, T, or P to indicate empirical, treatment, or prophylactic regimens.
An example excerpt from the column list is:
['MeropenemP', 'MeropenemE', 'MeropenemT', DoripenemP', 'DoripenemE', 'DoripenemT', ImipenemP', 'ImipenemE', 'ImipenemT', 'BiapenemP', 'BiapenemE', 'BiapenemT', 'PanipenemP', 'PanipenemE', 'PanipenemT','PipTazP', 'PipTazE', 'PipTazT','PiperacillinP', 'PiperacillinE', 'PiperacillinT']
A small sample of data is located here:
It is simple enough for me to separate out columns any type into separate dataframes with a regex, e.g. to select all the empirically prescribed antibiotics columns I use:
E_cols = master.filter(axis=1, regex=('[a-z]+E$'))
Each column has a binary value (0,1) for prescription of each antibiotic regimen type per person (row).
Question: How would I go about summing the rows of all columns (1's) for each type of regimen type and generating a new column for each result in the dataframe e.g. total_emperical, total_prophylactic, total_treatment.
The reason I want to add to the existing dataframe is that I wish to filter on other values for each regimen type.
Upvotes: 2
Views: 3888