Reputation: 239

Pandas - Get last element after str.split()

I am using pandas, and my data looks like this:

FirstName LastName StudentID
FirstName2 LastName2 StudentID2

I then split each row on spaces using str.split(), so the data looks like this in the DataFrame:

[[FirstName, LastName, StudentID],
[FirstName2, LastName2, StudentID2]]

How can I take only the StudentID for every student and save it in a new column?

Upvotes: 21

Answers (7)

ggorlen

Reputation: 56875

For OP's particular case, another approach is using .extract() and capturing the desired chunk of the string, without creating an intermediate list:

>>> import pandas as pd # 2.0.3
>>> df = pd.DataFrame({"a": ["aaa bb cc", "xx y zzzz"]})
>>> df["a"].str.extract(r"(\S+)$")
      0
0    cc
1  zzzz

The regex (\S+)$ grabs one or more non-space characters at the end of the string. (\S+)\s*$ is useful if there is trailing whitespace.
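To illustrate the trailing-whitespace case, here is a minimal sketch. The sample strings are made up for illustration; the second one deliberately ends in two spaces.

```python
import pandas as pd

# Hypothetical sample data: the second string has trailing spaces.
df = pd.DataFrame({"a": ["aaa bb cc", "xx y zzzz  "]})

# (\S+)$ requires the last token to touch the end of the string,
# so the row with trailing spaces does not match and yields NaN.
strict = df["a"].str.extract(r"(\S+)$")[0]

# (\S+)\s*$ allows optional whitespace after the last token.
lenient = df["a"].str.extract(r"(\S+)\s*$")[0]

print(strict.tolist())
print(lenient.tolist())
```

With a single capture group, .str.extract() returns a one-column DataFrame by default, which is why the sketch selects column 0 to get a Series.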

This answer shows the intermediate list idea nicely, but you can use an index rather than .get():

>>> df["a"].str.split().str[-1]
0      cc
1    zzzz
Name: a, dtype: object

Upvotes: 1

pls256

Reputation: 96

I thought I would add this simple solution, which doesn't use lists or list comprehensions. It splits an existing column/series and stores the last item from the split in a new column/series in the DataFrame:

import pandas as pd

data = ['FirstName LastName StudentID',
'FirstName2 LastName2 StudentID2']

df = pd.DataFrame(data=data, columns=['text'])

df['id'] = df.text.str.split(" ").str.get(-1)

Output:

                              text          id
0     FirstName LastName StudentID   StudentID
1  FirstName2 LastName2 StudentID2  StudentID2
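If only the last token is needed, a possible variant is to split once from the right instead of splitting the whole string. This swaps in .str.rsplit() with n=1, which is not part of the answer above; treat it as a sketch on the same sample data rather than a recommendation.

```python
import pandas as pd

data = ["FirstName LastName StudentID",
        "FirstName2 LastName2 StudentID2"]
df = pd.DataFrame(data=data, columns=["text"])

# rsplit with n=1 splits once, starting from the right, so each row
# becomes a two-item list: [everything before, last token].
df["id"] = df["text"].str.rsplit(n=1).str.get(-1)

print(df["id"].tolist())
```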

Upvotes: 4

Sumeet Thorat

Reputation: 501

Try the below solution:

item["x"]["y"].split(' ')[-1]
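The expression above works on a single Python string rather than on a whole column. A minimal sketch of what it does, assuming a hypothetical nested dict named item whose keys "x" and "y" are placeholders taken from the snippet:

```python
# Hypothetical nested structure; "x" and "y" are placeholder keys.
item = {"x": {"y": "FirstName LastName StudentID"}}

# split(" ") breaks the string on single spaces; [-1] picks the last piece.
last = item["x"]["y"].split(" ")[-1]
print(last)

# split(" ") keeps empty strings for repeated or trailing spaces,
# whereas split() with no argument collapses runs of whitespace.
print("a  b ".split(" "))
print("a  b ".split())
```

If the input may contain trailing or repeated spaces, split() without an argument is probably the safer choice, since split(" ")[-1] would return an empty string in that case.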

Upvotes: 40

BENY

Reputation: 323226

Using the DataFrame constructor:

pd.DataFrame(df.text.str.split(' ').tolist()).iloc[:,-1]
Out[15]: 
0     StudentID
1    StudentID2
Name: 2, dtype: object

Upvotes: 0

Dani Mesejo

Reputation: 61910

You could do something like this:

import pandas as pd

data = ['FirstName LastName StudentID',
'FirstName2 LastName2 StudentID2']

df = pd.DataFrame(data=data, columns=['text'])

df['id'] = df.text.apply(lambda x: x.split()[-1])

print(df)

Output

                              text          id
0     FirstName LastName StudentID   StudentID
1  FirstName2 LastName2 StudentID2  StudentID2

Or, as an alternative:

df['id'] = [x.split()[-1] for x in df.text]
print(df)

Output

                              text          id
0     FirstName LastName StudentID   StudentID
1  FirstName2 LastName2 StudentID2  StudentID2
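One practical difference between these row-wise approaches and the vectorized .str accessor shows up when the column has missing values. A minimal sketch, assuming a hypothetical column that contains a None entry:

```python
import pandas as pd

# Hypothetical data with a missing entry in the middle.
df = pd.DataFrame({"text": ["FirstName LastName StudentID", None,
                            "FirstName2 LastName2 StudentID2"]})

# The .str accessor passes missing values through as missing.
df["id"] = df["text"].str.split().str[-1]

# A plain lambda calls .split() on the missing value itself,
# which raises AttributeError.
try:
    df["text"].apply(lambda x: x.split()[-1])
    row_wise_failed = False
except AttributeError:
    row_wise_failed = True

print(df["id"].tolist())
print(row_wise_failed)
```

If the data is known to be complete, the apply and list-comprehension versions work just as well.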

Upvotes: 1

finefoot

Reputation: 11224

Why not try a simple list comprehension?

students = [
    ["FirstName", "LastName", "StudentID"],
    ["FirstName2", "LastName2", "StudentID2"]
]

print([student[2] for student in students])

which will print

['StudentID', 'StudentID2']

Upvotes: -2

Tim

Reputation: 2843

Use a list comprehension to split each string and take the last element:

ids = [val.split()[-1] for val in your_strings]

Upvotes: 8
