Pandas DataFrame from raw string

Question

I've got a string which looks like:

a1	b1	c1
a2	b2	c2
a3	b3	c3
...

Is there an efficient way to convert this kind of string into a pandas DataFrame? StringIO doesn't seem to be the right tool for this.

Thanks in advance!!

cs95 · Accepted Answer

StringIO works perfectly.

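import io
import pandas as pd

# Triple quotes let the literal span several lines; the fields are tab-separated.
string = '''a1	b1	c1
a2	b2	c2
a3	b3	c3'''

# sep=r'\s+' splits on any run of whitespace (delim_whitespace=True is deprecated in recent pandas).
pd.read_csv(io.StringIO(string), sep=r'\s+', header=None)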

    0   1   2
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3

You can also use pd.read_table or pd.read_fwf in the same manner:

pd.read_table(io.StringIO(string), header=None)

Or,

pd.read_fwf(io.StringIO(string), header=None)

    0   1   2
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3

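In these last two examples, the delimiter is left to the function's defaults: pd.read_table splits on tabs (sep='\t'), while pd.read_fwf infers fixed-width column boundaries and treats spaces and tabs as filler. Either way, your raw string must keep a consistent structure from row to row.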
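The choice of delimiter starts to matter once a field itself contains a space. Here is a minimal sketch of the difference, using a made-up sample string (raw and its contents are assumptions for illustration, not data from the question): splitting on tabs only keeps such a field intact, while splitting on any whitespace breaks it in two.

```python
import io
import pandas as pd

# Hypothetical sample: tab-separated rows whose first field contains a space.
raw = "a 1\tb1\tc1\na 2\tb2\tc2"

# Splitting on tabs only keeps "a 1" and "a 2" as single fields.
by_tab = pd.read_csv(io.StringIO(raw), sep="\t", header=None)

# Splitting on any run of whitespace also breaks the fields at the space.
by_ws = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)

print(by_tab.shape)  # (2, 3)
print(by_ws.shape)   # (2, 4)
```

If your fields can contain spaces, an explicit sep="\t" is probably the safer choice.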

Finally, you can also use a string splitting approach, splitting on newlines first, and then on whitespace (str.split with no argument splits on any run of whitespace, tabs included):

pd.DataFrame(list(map(str.split, string.splitlines())))

    0   1   2
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3
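A variation on the splitting approach, sketched under a couple of assumptions: the fields are separated strictly by tabs, blank lines should be skipped, and the column names a, b, c are placeholders invented for this example.

```python
import pandas as pd

# The sample data, written with explicit \t and \n escapes.
string = "a1\tb1\tc1\na2\tb2\tc2\na3\tb3\tc3"

# Split each non-empty line on tabs only, so spaces inside a field would survive.
rows = [line.split("\t") for line in string.splitlines() if line.strip()]

# The column names are placeholders chosen for this example.
df = pd.DataFrame(rows, columns=["a", "b", "c"])
print(df)
```

Note that every value stays a string here; unlike pd.read_csv, this approach does no type inference.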
