JayBee
JayBee

Reputation: 143

Shifting certain rows to the left in pandas dataframe

I have a pandas database of some sports data. The columns are name, age, birth city, birth country, rookie, weight, and problem. The original data had birthcity as "City,State" for American players, so when I used a comma delimiter the result was two variables. So now all the American players are shifted over, and I needed to make a "Problem" variable to account for the excess.

How can I shift just the Americans over to the left across thousands of observations? Thanks!

What I have (please excuse the table formatting):

Name Age BirthCity BirthCountry Rookie Weight Problem

Frank 32 Seattle   WA           USA    N      200
Jake  24 Geneva    Switzerland   Y     210

Desired:

Name Age BirthCity BirthCountry Rookie Weight

Frank 32 Seattle   USA           N     200
Jake  24 Geneva    Switzerland   Y     210

Upvotes: 4

Views: 4766

Answers (2)

jpp
jpp

Reputation: 164623

One way is to first delete the 3rd (remember Python counts 0 first) column selectively, simultaneously adding an extra column NaN. Then delete the final Problem series.

# df, start with this dataframe
#
#     Name  Age BirthCity BirthCountry Rookie Weight  Problem
# 0  Frank   32   Seattle           WA    USA      N    200.0
# 1   Jake   24    Geneva  Switzerland      Y    210      NaN

def shifter(row):
    return np.hstack((np.delete(np.array(row), [3]), [np.nan]))

mask = df['Rookie'] == 'USA'
df.loc[mask, :] = df.loc[mask, :].apply(shifter, axis=1)

df = df.drop(['Problem'], axis=1)

#     Name  Age BirthCity BirthCountry Rookie Weight
# 0  Frank   32   Seattle          USA      N    200
# 1   Jake   24    Geneva  Switzerland      Y    210

Upvotes: 3

jezrael
jezrael

Reputation: 862511

Not so easy:

#get all rows by mask
mask = df['Rookie'] == 'USA'
c = ['BirthCountry','Rookie','Weight','Problem']
#shift columns, but necessary converting to strings
df.loc[mask, c] = df.loc[mask, c].astype(str).shift(-1, axis=1)
#converting column Weight to float and then int
df['Weight'] = df['Weight'].astype(float).astype(int)
#remove column Problem
df = df.drop('Problem', axis=1)
print (df)
    Name  Age BirthCity BirthCountry Rookie  Weight
0  Frank   32   Seattle          USA      N     200
1   Jake   24    Geneva  Switzerland      Y     210

Upvotes: 2

Related Questions