New User

Reputation: 149

Adding two columns in Python

I am trying to add two columns and create a new one. This new column should become the first column in the dataframe or the output csv file.

column_1 column_2
84       test
65       test

Output should be

column         column_1 column_2
trial_84_test   84      test
trial_65_test   65      test

I tried the methods below, but they did not work:

sum = str(data['column_1']) + data['column_2']

data['column']=data.apply(lambda x:'%s_%s_%s' % ('trial' + data['column_1'] + data['column_2']),axis=1)

Help is surely appreciated.

Upvotes: 3

Views: 4499

Answers (3)

Alexander

Reputation: 109626

Create sample data:

import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

Method 1: Use assign to create new column, and then reorder.

>>> df.assign(column=['trial_{}_{}'.format(*cols) for cols in df.values])[['column'] + df.columns.tolist()]
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 2: Create a new series and then concatenate.

>>> s = pd.Series(['trial_{}_{}'.format(*cols) for cols in df.values], index=df.index, name='column')
>>> pd.concat([s, df], axis=1)
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3: Insert the new values at the first index of the dataframe (i.e. column 0).

>>> df.insert(0, 'column', ['trial_{}_{}'.format(*cols) for cols in df.values])
>>> df
          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test

Method 3 (alternative way to create values for new column):

df.insert(0, 'column', df.astype(str).apply(lambda row: 'trial_' + '_'.join(row), axis=1))

By the way, sum is a built-in function (not a keyword), so using it as a variable name shadows the built-in. Pick a different name.
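
To illustrate the remark about sum, here is a small sketch of what goes wrong when the name is reused. The function name shadowed_sum_demo and the variable combined are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

def shadowed_sum_demo(frame):
    # Inside this function the name `sum` now refers to a Series,
    # so the built-in sum() is no longer reachable under that name.
    sum = frame['column_1'].astype(str) + frame['column_2']
    try:
        sum([1, 2, 3])          # tries to call the Series, not the built-in
    except TypeError as exc:
        return str(exc)
    return None

message = shadowed_sum_demo(df)

# A descriptive name avoids the clash altogether.
combined = df['column_1'].astype(str) + df['column_2']
```

Python accepts the assignment without complaint, which is why the problem tends to surface later, at the first call to sum().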

Upvotes: 4

BENY

Reputation: 323326

You can use insert:

df.insert(0, column='Columns', value='trial_' + df['column_1'].astype(str) + '_' + df['column_2'].astype(str))
df
Out[658]: 
         Columns  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test
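
The astype(str) call on the numeric column is what makes the concatenation work. A minimal sketch, assuming column_1 holds integers as in the sample data (the names failure and labels are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# Adding a str to the integer column directly is expected to fail,
# because there is no element-wise "str + int" operation.
try:
    'trial_' + df['column_1']
    failure = None
except TypeError as exc:
    failure = type(exc).__name__

# Casting to str first gives element-wise string concatenation.
labels = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']
```

column_2 already holds strings here, so casting it is optional. Casting anyway is a cheap safeguard if the column might contain other types.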

Upvotes: 0

jpp

Reputation: 164773

Do not use lambda for this, as it is just a thinly veiled loop. Here is a vectorised solution. Care needs to be taken to convert non-string values to str type.

df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

df = df.reindex(columns=sorted(df.columns))  # sort columns alphabetically (reindex_axis was removed in pandas 1.0)

Result:

          column  column_1 column_2
0  trial_84_test        84     test
1  trial_65_test        65     test
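
Note that the alphabetical sort puts the new column first only because 'column' happens to sort before 'column_1'. A sketch of a variant that spells out the column order and writes the CSV the question mentions follows. Writing to an in-memory buffer is an assumption made to keep the example self-contained; a file path works the same way.

```python
import io

import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# Vectorised concatenation, as in the answer above.
df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

# State the desired order explicitly instead of relying on sorting.
ordered = df[['column', 'column_1', 'column_2']]

# Write without the index so 'column' is the first field in the CSV.
buffer = io.StringIO()
ordered.to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()
```

With real data, replace the buffer with a path, e.g. ordered.to_csv('output.csv', index=False), where 'output.csv' is a placeholder name.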

Upvotes: 3
