Reputation: 11
I have a Pandas dataframe that has lists in some of the columns:
email A B
[email protected] [name1, name2] [thing1, thing2, thing3]
[email protected] [name] [thing1, thing2]
I only want to have the last element of each list in each row, like this:
email A B
[email protected] name2 thing3
[email protected] name thing2
Is there an easy way of doing it? I initially thought of something along the lines of
data['newcolumn'] = data['A'][Number of row][-1]
, but I'm a little lost on how to do the "number of row" part. Thanks!
Upvotes: 1
Views: 2981
Reputation: 11
This code may help you:
data['newA']=[data['A'][i][-1] for i in range(len(data))]
data['newB']=[data['B'][i][-1] for i in range(len(data))]
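Note that data['A'][i] looks rows up by index label, so the comprehension above relies on the default 0..n-1 index. A minimal sketch of a variant that iterates over the lists directly and so does not depend on the labels (the sample frame and its custom index are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data with a non-default index (10, 20)
data = pd.DataFrame(
    {
        'A': [['name1', 'name2'], ['name']],
        'B': [['thing1', 'thing2', 'thing3'], ['thing1', 'thing2']],
    },
    index=[10, 20],
)

# Iterate over the values themselves instead of looking rows up by label
data['newA'] = [lst[-1] for lst in data['A']]
data['newB'] = [lst[-1] for lst in data['B']]

print(data[['newA', 'newB']])
```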
Upvotes: 1
Reputation: 34046
You can simply use Series.str[-1]:
In [145]: df = pd.DataFrame({'email':['[email protected]', '[email protected]'], 'A':[['name1', 'name2'], ['name']], 'B':[['thing1', 'thing2', 'thing3'], ['thing1', 'thing2']]})
In [146]: df
Out[146]:
email A B
0 [email protected] [name1, name2] [thing1, thing2, thing3]
1 [email protected] [name] [thing1, thing2]
In [148]: df['A'] = df['A'].str[-1]
In [149]: df['B'] = df['B'].str[-1]
In [150]: df
Out[150]:
email A B
0 [email protected] name2 thing3
1 [email protected] name thing2
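If several columns hold lists, the same accessor can be applied in a loop. A small sketch, assuming the list columns are named A and B as in the question; the third row with an empty list is invented here to show that .str[-1] gives a missing value in that case rather than raising an error:

```python
import pandas as pd

# Sample data modelled on the question, plus one invented row with an empty list
df = pd.DataFrame({
    'A': [['name1', 'name2'], ['name'], []],
    'B': [['thing1', 'thing2', 'thing3'], ['thing1', 'thing2'], ['thing1']],
})

# Assumed names of the columns that contain lists
list_cols = ['A', 'B']
for col in list_cols:
    # .str[-1] takes the last element of each list; an empty list becomes NaN
    df[col] = df[col].str[-1]

print(df)
```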
Upvotes: 1
Reputation: 47
Here is my answer: create a simple function called getLastValue() that takes the list in each row and returns its last value. See below.
import pandas as pd
data = {
    'email' : ['[email protected] ', '[email protected]'],
    'A': [['name1','name2'], ['name']],
    'B': [['thing1', 'thing2', 'thing3'], ['thing1', 'thing2']]
}
def getLastValue(aList):
    return aList[-1]
df = pd.DataFrame(data)
df['A'] = df['A'].apply(getLastValue)
df['B'] = df['B'].apply(getLastValue)
print(df)
Upvotes: 1
Reputation: 3285
Assuming your dataframe is called df, you could do something like the following:
def return_last_element(row):
    # If the row of the given column is list or a tuple, get the last element
    if isinstance(row, (list, tuple)):
        return row[-1]
    # Otherwise just return the value
    else:
        return row
# Loop over all columns, and apply the function to each row of each column
for col in df.columns:
    df[col] = df[col].apply(return_last_element)
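One thing to watch for: indexing with [-1] raises an IndexError on an empty list. A hedged variant that returns None in that case might look like this (the helper name last_or_none and the sample data, including the empty list and the placeholder addresses, are made up for illustration):

```python
import pandas as pd

def last_or_none(value):
    # Hypothetical helper: last element of a non-empty list or tuple,
    # None for an empty one, and the value itself for anything else
    if isinstance(value, (list, tuple)):
        return value[-1] if value else None
    return value

# Invented sample data; the second row of A is an empty list
df = pd.DataFrame({
    'email': ['first@example.com', 'second@example.com'],
    'A': [['name1', 'name2'], []],
})

# Apply the helper to every cell of every column
for col in df.columns:
    df[col] = df[col].apply(last_or_none)

print(df)
```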
Upvotes: 1