Arvind Kandaswamy

Reputation: 2231

python pandas special character as delimiter

I have a text file that uses a special character [˛] as its delimiter. I copied and pasted this character into the sep argument of my read_csv call, and I am getting the following error:

ParserWarning: Falling back to the 'python' engine because the 
separator encoded in utf-8 is > 1 char long, and the 'c' engine does 
not support such separators; you can avoid this warning by specifying 
engine='python'.
  """Entry point for launching an IPython kernel.

Any idea how to use a special character while reading a text file?

Upvotes: 2

Views: 4946

Answers (1)

jezrael

Reputation: 862511

This is only a warning, not an error, and removing it is easy: add engine='python' to the read_csv call.

Specifying the parser engine:

Under the hood pandas uses a fast and efficient parser implemented in C as well as a python implementation which is currently more feature-complete. Where possible pandas uses the C parser (specified as engine='c'), but may fall back to python if C-unsupported options are specified. Currently, C-unsupported options include:

  • sep other than a single character (e.g. regex separators)
  • skipfooter
  • sep=None with delim_whitespace=False

Specifying any of the above options will produce a ParserWarning unless the python engine is selected explicitly using engine='python'.
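The warning text says the separator "encoded in utf-8 is > 1 char long". A minimal sketch of what that seems to mean, assuming the delimiter is copied exactly as it appears in the question: the character is a single Unicode code point, but it takes more than one byte in UTF-8, which is presumably why the C parser rejects it. The variable names here are illustrative only.

```python
# The delimiter as copied from the question (assumption: pasted unchanged).
sep = "˛"

# One character as far as Python strings are concerned...
n_chars = len(sep)
# ...but more than one byte once encoded as UTF-8.
n_bytes = len(sep.encode("utf-8"))

print(f"code point: U+{ord(sep):04X}")
print(f"characters: {n_chars}, UTF-8 bytes: {n_bytes}")

# For comparison, an ASCII delimiter such as a comma is exactly one byte.
comma_bytes = len(",".encode("utf-8"))
print(f"comma UTF-8 bytes: {comma_bytes}")
```

Any delimiter outside the ASCII range will behave the same way, so engine='python' is the simplest choice for such files.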

import pandas as pd
from io import StringIO  # pandas.compat.StringIO is no longer available in current pandas

temp=u"""a˛b˛c
1˛3˛5
7˛8˛1
"""
# after testing, replace StringIO(temp) with the path to your file, e.g. 'filename.csv'
df = pd.read_csv(StringIO(temp), sep="˛", engine='python')
print(df)
   a  b  c
0  1  3  5
1  7  8  1
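The same call can be tried against a real file on disk. The following is a rough sketch, not a definitive recipe: it assumes the file is saved as UTF-8, writes a throwaway temporary file with the sample rows from above, and records warnings to check that no ParserWarning is emitted when engine='python' is given explicitly. The temporary path and variable names are made up for illustration.

```python
import os
import tempfile
import warnings

import pandas as pd

sep = "˛"
text = "a˛b˛c\n1˛3˛5\n7˛8˛1\n"

# Write the sample rows to a temporary UTF-8 file (a stand-in for your own file).
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", encoding="utf-8", delete=False
) as fh:
    fh.write(text)
    path = fh.name

try:
    # Record every warning raised while reading, so we can inspect them afterwards.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        df = pd.read_csv(path, sep=sep, engine="python", encoding="utf-8")
finally:
    os.remove(path)

# Keep only the parser fallback warnings, if any were raised.
parser_warnings = [
    w for w in caught if issubclass(w.category, pd.errors.ParserWarning)
]

print(df)
print("ParserWarning raised:", bool(parser_warnings))
```

If your file uses a different encoding, pass that name as the encoding argument instead of "utf-8".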

Upvotes: 4
