kolin

Reputation: 63

read_csv with regex

Example line from the file:

<pre>2019-08-15 00:00:06,430 0:0 - {"info":{"name":"LTD - PUBLIC"}}</pre>

What I tried:

<pre>pd.read_csv(filepath, sep=' - ', header=None, engine='python')</pre>

expected:

<pre>
date                           info
2019-08-15 00:00:06,430 0:0    {"info":{"name":"LTD - PUBLIC"}}
</pre>

Error message:

<pre>ParserError: Expected 2 fields in line 1, saw 3. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.</pre>

Upvotes: 1

Views: 486

Answers (1)

iamklaus

Reputation: 3770

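Use a regex separator with a lookahead, so the split only happens on the ' - ' that is immediately followed by '{'. The ' - ' inside "LTD - PUBLIC" is then left alone: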

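from io import StringIO
import pandas as pd

temp = StringIO("""2019-08-15 00:00:06,430 0:0 - {"info":{"name":"LTD - PUBLIC"}}
""")

# (?=\{) is a lookahead: match ' - ' only when the next character is '{'
df = pd.read_csv(temp, sep=r' - (?=\{)', engine='python', header=None)
df = df.rename({0: 'date', 1: 'info'}, axis=1)  # rename returns a new frame
print(df)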

Output

                          date                              info
0  2019-08-15 00:00:06,430 0:0  {"info":{"name":"LTD - PUBLIC"}}
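
To see why the plain separator fails and the lookahead version works, it can help to try both patterns on the raw line with Python's `re.split`. This is only a sketch of the splitting behaviour, not of what pandas does internally; the variable names are made up for illustration.

```python
import re

line = '2019-08-15 00:00:06,430 0:0 - {"info":{"name":"LTD - PUBLIC"}}'

# Plain ' - ' matches twice: once before the JSON and once inside "LTD - PUBLIC".
plain_fields = re.split(r' - ', line)

# With the lookahead, ' - ' only matches when the next character is '{'.
lookahead_fields = re.split(r' - (?=\{)', line)

print(len(plain_fields))      # 3
print(len(lookahead_fields))  # 2
```

The three fields from the plain pattern correspond to the "Expected 2 fields in line 1, saw 3" error in the question. Note that the lookahead approach assumes no ' - ' followed by '{' ever appears inside the JSON part.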

Upvotes: 3
