Akshat Kulshreshtha

Reputation: 47

How to separate multiple values in a row of a pandas DataFrame

I have an Excel file that I have already loaded into a pandas DataFrame, and I've done some analysis on it. The problem I'm facing is how to separate multiple values that are given in the same row. They are written in the form a) name1 ; b) name2

As a beginner in pandas, I can't work out the logic to pull apart the multiple values given in the column.

[Image: the dataset]

This is the dataset that I'm working on, and I'm unsure how to separate the multiple values that are given in the same row.

Upvotes: 0

Views: 676

Answers (1)

ChrisOram

Reputation: 1434

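You can use .str.split() with expand=True to split the column into two, and then .str.replace() with an anchored regular expression to remove the (a) and (b) labels. (.str.lstrip() looks tempting here, but it strips a set of characters rather than a prefix, so a value such as "(a) asthma" would lose its leading "a".)

>>> import pandas as pd
>>> df = pd.DataFrame({"Chronic medical conditions": ["(a) BP; (b) Diabetes", "(a) Diabetes; (b) high BP"]})
>>> df
  Chronic medical conditions
0       (a) BP; (b) Diabetes
1  (a) Diabetes; (b) high BP

>>> df = df["Chronic medical conditions"].str.split(';', expand=True)
>>> df.columns = ["a", "b"]  # rename columns as necessary
>>> df
              a              b
0        (a) BP   (b) Diabetes
1  (a) Diabetes    (b) high BP

>>> # remove only the leading label (and surrounding whitespace) from each column
>>> df["a"] = df["a"].str.replace(r"^\s*\(a\)\s*", "", regex=True)
>>> df["b"] = df["b"].str.replace(r"^\s*\(b\)\s*", "", regex=True)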
>>> df
           a         b
0         BP  Diabetes
1   Diabetes   high BP
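
If some rows can hold more than two labelled items, a long-format result (one row per item) may be easier to work with than fixed "a" and "b" columns. Here is a minimal sketch, assuming items are separated by semicolons and each starts with a single lowercase letter in parentheses. The helper name split_labelled and the third "(c) asthma" item are made up for illustration and are not from the question's data:

```python
import pandas as pd

# Hypothetical sample data; the second row has a third item for illustration.
df = pd.DataFrame(
    {
        "Chronic medical conditions": [
            "(a) BP; (b) Diabetes",
            "(a) Diabetes; (b) high BP; (c) asthma",
        ]
    }
)


def split_labelled(series: pd.Series) -> pd.Series:
    """Return one row per item with the leading "(x)" label removed."""
    # Split each cell into a list, then give every list element its own row.
    # explode() repeats the original index, so items stay tied to their row.
    items = series.str.split(";").explode()
    # Anchored regex: drop only a leading label such as "(a)" plus whitespace.
    cleaned = items.str.replace(r"^\s*\([a-z]\)\s*", "", regex=True)
    return cleaned.str.strip()


long_form = split_labelled(df["Chronic medical conditions"])
print(long_form)
```

Because the original index is repeated for each item, you can join long_form back to the other columns of the DataFrame on the index, or count conditions per row with long_form.groupby(level=0).size().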

Upvotes: 1
