scott martin

Reputation: 1293

Pandas - Extracting data from Series

I am trying to extract a set of data from a column that is of type pandas.core.series.Series.

I tried

df['col1'] = df['details'].astype(str).str.findall(r'name\=(.*?),') 

but the above returns nothing.

Below is what the data in column df['details'] looks like:

[{'id': 101, 'name': 'Name1', 'state': 'active', 'boardId': 101, 'goal': '', 'startDate': '2019-01-01T12:16:20.296Z', 'endDate': '2019-02-01T11:16:00.000Z'}]

I am trying to extract the value corresponding to the name field.

Expected output : Name1

Upvotes: 1

Views: 14555

Answers (3)

Skander HR

Reputation: 620

The structure in your series is a dictionary.

[{'id': 101, 'name': 'Name1', 'state': 'active', 'boardId': 101, 'goal': '', 'startDate': '2019-01-01T12:16:20.296Z', 'endDate': '2019-02-01T11:16:00.000Z'}]

You can just point to the element 'name' from that dict with the following command

df['details'][0]['name']
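
Note that the sample in the question is wrapped in square brackets, so each cell may actually be a list holding one dict. If that is the case (an assumption, since the question does not show how the column was built), one more [0] is needed to step into the list. A minimal sketch, where the second row is made up for illustration:

```python
import pandas as pd

# Assumed layout: every cell of 'details' is a list holding a single dict,
# matching the sample shown in the question. The second row is hypothetical.
df = pd.DataFrame({
    'details': [
        [{'id': 101, 'name': 'Name1', 'state': 'active'}],
        [{'id': 102, 'name': 'Name2', 'state': 'closed'}],
    ]
})

# Single value: row 0 -> first list element -> 'name' key
first_name = df['details'][0][0]['name']

# Whole column: apply the same lookup to every row
df['col1'] = df['details'].apply(lambda cell: cell[0]['name'])
print(df['col1'].tolist())
```

Using apply keeps the lookup explicit; if some cells can be empty lists, the lambda would need a guard.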

If the key name could be different, you can get the list of keys in the dictionary and apply your regex to that list to find your field's name.
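
As a rough sketch of that idea (the pattern and the `record` variable are illustrative, not taken from the question), you could filter the keys with `re` and then look up whichever keys match:

```python
import re

# Sample record copied from the question (shortened)
record = {'id': 101, 'name': 'Name1', 'state': 'active', 'boardId': 101}

# Hypothetical pattern: any key that contains "name", ignoring case
pattern = re.compile(r'name', re.IGNORECASE)

# Keep only the keys that match, then fetch their values
matching_keys = [key for key in record if pattern.search(key)]
values = [record[key] for key in matching_keys]
print(matching_keys, values)
```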

Hope this helps.

Upvotes: 1

Shankar Saran Singh

Reputation: 301

Try this simple example and adapt it to your needs.

import pandas as pd

df = pd.DataFrame([{'id': 101, 'name': 'Name1', 'state': 'active', 'boardId': 101, 'goal': '', 'startDate': '2019-01-01T12:16:20.296Z', 'endDate': '2019-02-01T11:16:00.000Z'}])
print(df['name'][0])

# Or, if the dict is stored inside the 'details' column itself:

df['details'][0]['name']

NOTE: this assumes, as you mentioned, that details is one column of your existing dataset.

Upvotes: 1

Marek Justyna

Reputation: 224

import pandas as pd
df = pd.DataFrame([{'id': 101, 'name': 'Name1', 'state': 'active', 'boardId': 101, 'goal': '', 'startDate': '2019-01-01T12:16:20.296Z', 'endDate': '2019-02-01T11:16:00.000Z'}])

#Name column
print(df.name)

#Find specific values in Series

mask = df.name.str.contains("Name") #Boolean mask: True for rows whose name contains "Name"

df[mask] # Returns all columns for rows whose name contains "Name"

df.name[mask] # Returns all values from column name that contain "Name"

Hope this example helps.

EDIT: Your data frame has a column 'details', which contains a dict {'id':101, ...}

>>> df['details']
0    {'id': 101, 'name': 'Name1', 'state': 'active'...

And you want to get value from field 'name', so just try:

>>> df['details'][0]['name']
'Name1'
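
If the cells are actually strings (for example after loading from a CSV, which is an assumption here), plain indexing will not work. The regex in the question also comes back empty because it looks for name=, while the string form contains 'name': 'Name1'. A small sketch of two ways around that, using a shortened copy of the sample row:

```python
import ast
import pandas as pd

# Assumption: the cell is a string representation of a list with one dict
raw = "[{'id': 101, 'name': 'Name1', 'state': 'active'}]"
df = pd.DataFrame({'details': [raw]})

# Option 1: parse the string back into Python objects, then index normally
parsed = df['details'].apply(ast.literal_eval)
names_parsed = parsed.apply(lambda cell: cell[0]['name'])

# Option 2: regex on the string, matching the "'name': '...'" form
names_regex = df['details'].str.extract(r"'name': '(.*?)'", expand=False)

print(names_parsed.tolist(), names_regex.tolist())
```

Parsing is usually the sturdier choice, since the regex depends on the exact quoting and spacing of the string.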

Upvotes: 1
