Mamed

Reputation: 772

How to create a column of strings, including the values from another column

df =

car

big.yellow
small.red
small.black

  1. I want to insert each row value into a fixed string (concatenating with +). Desired output:
vehicle = 'The vehicle is big.yellow mine'
vehicle = 'The vehicle is small.red mine' 
vehicle = 'The vehicle is small.black mine'
  2. I need to merge all these strings into one big string:
final_vehicle = 'The vehicle is big.yellow mine
                 The vehicle is small.red mine
                 The vehicle is small.black mine'

But the real data has 1000+ rows. How can I speed this up?

Upvotes: 1

Views: 95

Answers (2)

Trenton McKinney

Reputation: 62413

  1. A vectorized approach to create a string for each row value is df['veh'] = 'The vehicle is ' + df.car + ' mine'
  2. Combine the values into a single long string with one of the following:
    1. pandas.DataFrame.to_string as final = df.veh.to_string(index=False)
    2. str.join() as final = '\n'.join(df.veh.tolist())
import pandas as pd
import string  # for test data
import random  # for test data

# create test dataframe
random.seed(365)
df = pd.DataFrame({'car': [random.choice(string.ascii_lowercase) for _ in range(10000)]})

# display(df.head())
car
  v
  j
  w
  y
  e

# add the veh column as strings including the value from the car column
df['veh'] = 'The vehicle is ' + df.car + ' mine'

# display(df.head())
car                    veh
  v  The vehicle is v mine
  j  The vehicle is j mine
  w  The vehicle is w mine
  y  The vehicle is y mine
  e  The vehicle is e mine

# create a long string of all the values in veh
final = df.veh.to_string(index=False)

print(final)
The vehicle is v mine
The vehicle is j mine
The vehicle is w mine
The vehicle is y mine
The vehicle is e mine
...
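
The str.join() option listed above is not demonstrated in the code, so here is a minimal sketch of it. It assumes the three sample values from the question rather than the random test data. One reason to prefer it: to_string may pad values so they line up when the strings differ in length, whereas join leaves each string untouched.

```python
import pandas as pd

# sample values taken from the question (illustrative only)
df = pd.DataFrame({'car': ['big.yellow', 'small.red', 'small.black']})

# vectorized concatenation, one sentence per row
df['veh'] = 'The vehicle is ' + df['car'] + ' mine'

# join all rows into a single newline-separated string
final = '\n'.join(df['veh'].tolist())
print(final)
```

The join runs once over the whole column, so it should scale well past 1000 rows.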

Upvotes: 2

Milos Bijanic

Reputation: 133

This code should solve the problem:

import pandas as pd
df = pd.DataFrame(columns=['id', 'car'])
df['car'] = ['big.yellow', 'small.red', 'small.black']
df['id'] = [1,1,1]

# transform keeps the original index, so the result can be assigned back directly
df['new'] = df.groupby('id')['car'].transform(lambda x: ('The vehicle is ' + x + '\n').cumsum().str.strip())
df

Results:


   id  car          new
0  1   big.yellow   The vehicle is big.yellow
1  1   small.red    The vehicle is big.yellow\nThe vehicle is smal...
2  1   small.black  The vehicle is big.yellow\nThe vehicle is smal...

and final:

df['new'][len(df)-1]

is:

'The vehicle is big.yellow\nThe vehicle is small.red\nThe vehicle is small.black'
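
If only the final string per id is needed, the running cumsum could probably be skipped. The sketch below is an alternative, not the approach used above: it joins each group once with str.join. The names joined and final are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1],
                   'car': ['big.yellow', 'small.red', 'small.black']})

# build the sentences first, then join them once per id
sentences = 'The vehicle is ' + df['car']
joined = sentences.groupby(df['id']).agg('\n'.join)

# one combined string per id; here there is only id 1
final = joined.loc[1]
print(final)
```

This avoids building every intermediate prefix, which matters more as the number of rows per group grows.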

Upvotes: 0
