Reputation: 149
I am trying to concatenate two columns into a new one. The new column should be the first column in the dataframe (or in the output CSV file).
column_1 column_2
84 test
65 test
Output should be
column column_1 column_2
trial_84_test 84 test
trial_65_test 65 test
I tried the methods below, but they did not work:
sum = str(data['column_1']) + data['column_2']
data['column']=data.apply(lambda x:'%s_%s_%s' % ('trial' + data['column_1'] + data['column_2']),axis=1)
Help is surely appreciated.
Upvotes: 3
Views: 4499
Reputation: 109626
Create sample data:
df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})
Method 1: Use assign to create new column, and then reorder.
>>> df.assign(column=['trial_{}_{}'.format(*cols) for cols in df.values])[['column'] + df.columns.tolist()]
column column_1 column_2
0 trial_84_test 84 test
1 trial_65_test 65 test
Method 2: Create a new series and then concatenate.
s = pd.Series(['trial_{}_{}'.format(*cols) for cols in df.values], index=df.index, name='column')
>>> pd.concat([s, df], axis=1)
column column_1 column_2
0 trial_84_test 84 test
1 trial_65_test 65 test
Method 3: Insert the new values at the first index of the dataframe (i.e. column 0).
df.insert(0, 'column', ['trial_{}_{}'.format(*cols) for cols in df.values])
>>> df
column column_1 column_2
0 trial_84_test 84 test
1 trial_65_test 65 test
Method 3 (alternative way to create values for new column):
df.insert(0, 'column', df.astype(str).apply(lambda row: 'trial_' + '_'.join(row), axis=1))
By the way, sum is a built-in function (not a keyword), so assigning to it shadows the built-in. It is best not to use it as a variable name.
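As a side note on why the first attempt in the question does not work: calling str() on a Series returns one string holding the printed form of the whole Series, whereas astype(str) converts each element. The snippet below is only a minimal sketch of that difference, using the sample data from this answer; the variable names whole, per_element and combined are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})

# str() on a Series gives ONE Python string: the printed representation
# of the entire Series (index, values and dtype line), not per-row strings.
whole = str(df['column_1'])
print(type(whole))

# astype(str) converts element by element and returns a Series,
# so it can be concatenated with other string columns row by row.
per_element = df['column_1'].astype(str)
combined = 'trial_' + per_element + '_' + df['column_2']
print(combined.tolist())
```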
Upvotes: 4
Reputation: 323326
You can use insert:
df.insert(0, column='Columns', value='trial_' + df['column_1'].astype(str) + '_' + df['column_2'].astype(str))
df
Out[658]:
Columns column_1 column_2
0 trial_84_test 84 test
1 trial_65_test 65 test
Upvotes: 0
Reputation: 164773
Do not use lambda for this, as it is just a thinly veiled loop. Here is a vectorised solution. Care needs to be taken to convert non-string values to str type.
df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']
df = df.reindex(sorted(df.columns), axis=1)  # sort columns alphabetically (reindex_axis was removed in newer pandas)
Result:
column column_1 column_2
0 trial_84_test 84 test
1 trial_65_test 65 test
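Sorting alphabetically happens to put 'column' first with these particular names, but it would not in general. A possible alternative, sketched here under the assumption that the other columns should keep their existing order, is to build the column list explicitly. Since the question also mentions a CSV file, the sketch writes to an in-memory buffer with to_csv; the names ordered, buffer and csv_text are illustrative only.

```python
import io
import pandas as pd

df = pd.DataFrame({'column_1': [84, 65], 'column_2': ['test', 'test']})
df['column'] = 'trial_' + df['column_1'].astype(str) + '_' + df['column_2']

# Put 'column' first and keep the remaining columns in their current order,
# without relying on the column names sorting conveniently.
ordered = ['column'] + [c for c in df.columns if c != 'column']
df = df[ordered]

# Write to an in-memory buffer; pass a file path instead to create a real CSV.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing index=False keeps the row index out of the file, so the new column really is the first field on each line.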
Upvotes: 3