user4829254

Pandas: How to join columns of a CSV that has no header?

I have CSV data like the following:

1,2,3,4
a,b,c,d

1,2,3,4 is not a CSV header; it is data.
All of the values are strings.
I want to join the columns at (list) indices 1 and 2 using Pandas.
I want to get a result like the following.
The resulting data should also be strings.

1,23,4
a,bc,d

In plain Python, the code looks like this:

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]
vals = lines[0]
s = vals[0] + ',' + (vals[1] + vals[2]) + ',' + vals[3] + '\n'
vals = lines[1]
s += vals[0] + ',' + (vals[1] + vals[2]) + ',' + vals[3] + '\n'
print(s)

How do you do this with Pandas?

Upvotes: 1

Views: 163

Answers (4)

InnocentBystander

Reputation: 711

Since OP specified pandas, here's a solution that may work.

Once the data is in pandas, e.g. loaded with pd.read_csv() using header=None, you can simply concatenate text (object) columns with +:

import pandas as pd

lines = [ ['1', '2', '3', '4'],
        ['a', 'b', 'c', 'd']]
df = pd.DataFrame(lines)

df[1] = df[1]+df[2]
df.drop(columns=2, inplace=True)
df
#    0   1  3
# 0  1  23  4
# 1  a  bc  d

This should give you what you want as a pandas DataFrame.
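
Since the question starts from CSV text rather than a list, here is a minimal sketch of the full round trip. It assumes the data can be read with pd.read_csv; the io.StringIO object is only a stand-in for a real file path, and dtype=str is used so that values like 2 and 3 are joined as text instead of being added as numbers.

```python
import io
import pandas as pd

# Stand-in for a real file; pass a path to read_csv instead in practice.
raw = io.StringIO("1,2,3,4\na,b,c,d\n")

# header=None: the first row is data, not column names.
# dtype=str: keep every value as a string so + concatenates.
df = pd.read_csv(raw, header=None, dtype=str)

# Join columns 1 and 2, then drop the now-redundant column 2.
df[1] = df[1] + df[2]
df = df.drop(columns=2)

# With no path given, to_csv returns the CSV text as a string.
result = df.to_csv(header=False, index=False)
print(result)
```

Passing header=False and index=False to to_csv keeps the output in the same headerless shape as the input.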

Upvotes: 0

Mayur Kr. Garg

Reputation: 236

You can do something like this.

import pandas as pd
lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)
df['new_col'] = df.iloc[:, 1] + df.iloc[:, 2]
print(df)

Output:

   0  1  2  3 new_col
0  1  2  3  4      23
1  a  b  c  d      bc

You can then drop the columns you don't want.
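
One possible way to do that drop by position, as a sketch that continues the example above. Selecting the final column order explicitly is an assumption about what the asker wants (the joined column in place of the two originals).

```python
import pandas as pd

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)
df['new_col'] = df.iloc[:, 1] + df.iloc[:, 2]

# Drop the two source columns by position rather than by label.
df = df.drop(columns=df.columns[[1, 2]])

# Put the joined column where the originals were.
df = df[[0, 'new_col', 3]]

rows = df.values.tolist()
print(rows)
```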

Upvotes: 0

Alex Kosh

Reputation: 2554

If you want to use pandas, you could create a new column and remove the old ones:

import pandas as pd

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

df = pd.DataFrame(lines)

# Create new column
df['new_col'] = df[1] + df[2]

print(df)
#    0  1  2  3 new_col
# 0  1  2  3  4      23
# 1  a  b  c  d      bc

# Remove old columns if needed
df.drop([1, 2], axis=1, inplace=True)

print(df)
#    0  3 new_col
# 0  1  4      23
# 1  a  d      bc

If you want columns to be in specific order, use something like this:

print(df[[0, 'new_col', 3]])
#    0 new_col  3
# 0  1      23  4
# 1  a      bc  d

That said, it would be better to store headers in the CSV file.
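
As a rough illustration of working with headers, the sketch below assigns column names while reading and writes them back out. The names first, second, third, fourth and second_third are hypothetical and not part of the original data, and io.StringIO stands in for a real file.

```python
import io
import pandas as pd

# Stand-in for a real headerless file.
raw = io.StringIO("1,2,3,4\na,b,c,d\n")

# Hypothetical column names, supplied because the file has none.
names = ["first", "second", "third", "fourth"]
df = pd.read_csv(raw, header=None, names=names, dtype=str)

# Join by name instead of by numeric position.
df["second_third"] = df["second"] + df["third"]
df = df[["first", "second_third", "fourth"]]

# to_csv writes the header row by default.
out = df.to_csv(index=False)
print(out)
```

With named columns, the join no longer depends on remembering which integer label refers to which field.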

Upvotes: 0

James

Reputation: 36691

You can loop over the rows with a for loop or a list comprehension.

lines = [
    ['1', '2', '3', '4'],
    ['a', 'b', 'c', 'd'],
]

vals = [','.join([w, f'{x}{y}', *z]) for w, x, y, *z in lines]
s = '\n'.join(vals)
print(s)

# prints:
# 1,23,4
# a,bc,d
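
If the rows come from actual CSV text rather than a list, the same unpacking pattern can be combined with the standard csv module. This is only a sketch: the io.StringIO objects stand in for real input and output files.

```python
import csv
import io

# Stand-ins for real files opened with open(..., newline='').
src = io.StringIO("1,2,3,4\na,b,c,d\n")
out = io.StringIO()

writer = csv.writer(out, lineterminator="\n")

# Unpack each row: first field, the two fields to join, and the rest.
for w, x, y, *z in csv.reader(src):
    writer.writerow([w, x + y, *z])

result = out.getvalue()
print(result)
```

Using csv.reader and csv.writer rather than manual splitting also handles quoted fields that contain commas.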

Upvotes: 0
