Pedro Cintra

Reputation: 177

Create Pandas DataFrame from space separated String

I have a string:

              C1     C2                       DATE     C4     C5         C6      C7
0            0.0    W04  2021-01-08 00:00:00+00:00      E    EUE         C1     157
1            0.0    W04  2021-01-08 00:00:00+00:00      E    AEU         C1     157
2            0.0    W04  2021-01-01 00:00:00+00:00      E   SADA         H1     747
3            0.0    W04  2021-01-04 00:00:00+00:00      E   SSEA         H1     747
4            0.0    W04  2021-01-05 00:00:00+00:00      E   GPEA         H1     747

It looks like a pandas DataFrame because it is the printed output of one. I need to convert it back into a pandas DataFrame.

I tried the following:

pd.read_csv(StringIO(string_file),sep=r"\s+")

but this misaligns the columns, because the single space between the date and the time splits the DATE column into two columns.

Upvotes: 9

Views: 3665

Answers (1)

Arthur D.

Reputation: 410

First, recreate the string:

s = """
              C1     C2                       DATE     C4     C5         C6      C7
0            0.0    W04  2021-01-08 00:00:00+00:00      E    EUE         C1     157
1            0.0    W04  2021-01-08 00:00:00+00:00      E    AEU         C1     157
2            0.0    W04  2021-01-01 00:00:00+00:00      E   SADA         H1     747
3            0.0    W04  2021-01-04 00:00:00+00:00      E   SSEA         H1     747
4            0.0    W04  2021-01-05 00:00:00+00:00      E   GPEA         H1     747
"""

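Now you can use pandas.read_csv to read the string from an in-memory buffer. Separating on two or more whitespace characters keeps the single space inside the DATE values intact:

from io import StringIO
import pandas as pd

# a regex separator longer than one character needs the Python engine
df = pd.read_csv(StringIO(s), sep=r"\s\s+", engine="python")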

From what I can tell, this results in exactly the DataFrame that you are looking for:

Screenshot of resulting DataFrame
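
To see why the separator matters, here is a minimal sketch that uses only the standard `re` module on one row copied from the question. The variable names are made up for illustration. It roughly mirrors how a regex separator splits each line, though it is not the parser pandas actually runs.

```python
import re

# One data row copied from the question.
row = "0            0.0    W04  2021-01-08 00:00:00+00:00      E    EUE         C1     157"

# Split on every run of whitespace: the date and the time become separate fields.
one_or_more = re.split(r"\s+", row)

# Split only on runs of two or more whitespace characters:
# the single space inside the timestamp is left alone.
two_or_more = re.split(r"\s\s+", row)

print(len(one_or_more))  # 9 fields
print(len(two_or_more))  # 8 fields: the index plus the 7 named columns
print(two_or_more[3])    # 2021-01-08 00:00:00+00:00
```

The header line has seven names while each data row yields eight fields, which is why the first field ends up as the index.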

You may also want to convert the DATE column to datetime values:

# the strings carry a +00:00 offset, so parse them as timezone-aware UTC
df['DATE'] = pd.to_datetime(df['DATE'], utc=True)
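
Putting the steps together, a short end-to-end sketch might look like the following. It uses only two of the rows from the question to stay compact. It assumes `engine="python"` for the regex separator and `pd.to_datetime(..., utc=True)` for the conversion, so adjust both to your pandas version and needs.

```python
from io import StringIO

import pandas as pd

# Two rows copied from the question's table, shortened for illustration.
sample = """
              C1     C2                       DATE     C4     C5         C6      C7
0            0.0    W04  2021-01-08 00:00:00+00:00      E    EUE         C1     157
1            0.0    W04  2021-01-08 00:00:00+00:00      E    AEU         C1     157
"""

# Split on runs of two or more whitespace characters so the timestamp stays whole.
df = pd.read_csv(StringIO(sample), sep=r"\s\s+", engine="python")

# The strings include a +00:00 offset; utc=True gives a timezone-aware column.
df["DATE"] = pd.to_datetime(df["DATE"], utc=True)

print(df.dtypes)
```

Checking `df.dtypes` afterwards is a quick way to confirm that DATE is no longer stored as plain strings.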

Upvotes: 6
