Alex T
Alex T

Reputation: 3754

Python read data from csv using space sep except first column

Hi im wondering if there is a way to read data from csv file using pandas read_csv that every entry is separated by space except the first column:

Alabama 400 300 200
New York 400 200 100
Missouri 400 200 50
District of Columbia 450 100 250

So there would be 4 columns, with the first being state.

Upvotes: 1

Views: 241

Answers (2)

Sumanth
Sumanth

Reputation: 507

You could also use regex in splitting the string.

import regex as re
r = re.compile("([\w\s]+)\s+([\d]+)\s+([\d]+)\s+([\d]+)")
data = r.match('New York 400 200 100')
split_list = [data.group(1),data.group(2),data.group(3),data.group(4)]

Output will be: ['New York', '400', '200', '100']

You can just create a final list of lists and write it into csv file.

Upvotes: 0

jezrael
jezrael

Reputation: 863291

Use read_csv with separator not in data like | and then str.rsplit with parameter n=3 for spliting by 3 whitespace from right side and expand=True for DataFrame:

import pandas as pd

temp=u"""Alabama 400 300 200
New York 400 200 100
Missouri 400 200 50
District of Columbia 450 100 250"""
#after testing replace 'pd.compat.StringIO(temp)' to 'filename.csv'
df = pd.read_csv(pd.compat.StringIO(temp), sep="|", names=['Data'])

print (df)
                               Data
0               Alabama 400 300 200
1              New York 400 200 100
2               Missouri 400 200 50
3  District of Columbia 450 100 250


df = df['Data'].str.rsplit(n=3, expand=True)
print (df)

                      0    1    2    3
0               Alabama  400  300  200
1              New York  400  200  100
2              Missouri  400  200   50
3  District of Columbia  450  100  250

Upvotes: 3

Related Questions