Julia

Reputation: 95

Solving error "delimiter must be a 1-character string" while writing a dataframe to a csv file

Using the question "Pandas writing dataframe to CSV file" as a model, I wrote the following code to write a CSV file:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep='\s+', header=True)

But it returns the following error:

TypeError: "delimiter" must be an 1-character string

I have looked up the documentation for this here http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html but I can't figure out what I am missing, or what that error means. I also tried using (sep='\s') in the code, but got the same error.

Upvotes: 6

Views: 33538

Answers (2)

Mohamed Ali JAMAOUI

Reputation: 14699

As mentioned in the issue discussion (here), this is not considered a pandas issue but rather a compatibility issue with Python's csv module on Python 2.x.

The workaround is to wrap the separator in str(...). For example, here is how you can reproduce the problem and then solve it:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=',')

This will raise the following error:

TypeError ....              
----> 1 df.to_csv(sep=',')
TypeError: "delimiter" must be an 1-character string

The following, however, produces the expected result:

from __future__ import unicode_literals
import pandas as pd 
df = pd.DataFrame([['a', 'A'], ['b', 'B']])
df.to_csv(sep=str(','))

Output:

',0,1\n0,a,A\n1,b,B\n'

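In your case, wrapping the separator in str(...) is not sufficient on its own: '\s+' is a regular expression three characters long, and to_csv accepts only a single-character separator. Pass one literal character instead, for example a space:

df.to_csv('/Users/Lab/Desktop/filteredwithheading.txt', sep=str(' '), header=True)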
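The one-character rule can be observed without pandas, since the error text comes from the csv module. The following is a minimal sketch, assuming Python 3 and only the standard library; the helper names try_delimiter and rows_to_text are made up for illustration and are not part of pandas or csv.

```python
import csv
import io


def try_delimiter(delimiter):
    """Return True if csv.writer accepts the delimiter, False on TypeError."""
    try:
        csv.writer(io.StringIO(), delimiter=delimiter)
    except TypeError:
        # Raised when the delimiter is not exactly one character long.
        return False
    return True


def rows_to_text(rows, delimiter):
    """Write rows into an in-memory buffer and return the resulting text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(try_delimiter(" "))     # a single space is accepted
    print(try_delimiter(r"\s+"))  # a three-character regex is rejected
    print(rows_to_text([["a", "A"], ["b", "B"]], " "))
```

A regex such as '\s+' is useful when reading whitespace-separated data back in, but when writing, one literal character has to be chosen.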

Upvotes: 5

Michelle Welcks

Reputation: 3914

Note that although the solution to this particular error was to use a single-character string instead of a regex, pandas also raises this error when from __future__ import unicode_literals is in effect, even with valid Unicode characters. As of 2015-11-16 (release 0.16.2), this is still a known bug in pandas:
"to_csv chokes if not passed sep as a string, even when encoding is set to unicode" #6035

For example, where df is a pandas DataFrame:

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep='\t', encoding='utf-8')

TypeError: "delimiter" must be an 1-character string

Using a byte literal for the separator (b'\t') resolves this in pandas 0.16.2, given the source encoding declared by -*- coding: utf-8 -*- (UTF-8 is the default in Python 3). I haven't tested with earlier versions or with 0.17.0.

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import pandas as pd

df.to_csv(pdb_seq_fp, sep=b'\t', encoding='utf-8')

(Note that with versions 0.13.0 - ???, it was necessary to use from pandas.compat import u; but by 0.16.2 the byte literal is the way to go.)
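On Python 3 the situation is reversed: str is already Unicode, and the csv module expects a text delimiter rather than bytes. As a rough sketch of how code that must accept either form could normalize the separator before use, here is a hypothetical helper, normalize_sep, which is not part of pandas. It assumes Python 3 and a UTF-8 default encoding.

```python
def normalize_sep(sep, encoding="utf-8"):
    """Hypothetical helper: return sep as a one-character text string.

    Accepts either bytes (e.g. b'\\t') or str (e.g. '\\t') and rejects
    anything that is not exactly one character long, such as a regex.
    """
    if isinstance(sep, bytes):
        # Decode byte literals so the csv module receives text.
        sep = sep.decode(encoding)
    if not isinstance(sep, str) or len(sep) != 1:
        raise ValueError(
            "separator must be exactly one character, got %r" % (sep,)
        )
    return sep


if __name__ == "__main__":
    print(repr(normalize_sep(b"\t")))
    print(repr(normalize_sep(",")))
```

The result could then be passed as the sep argument. A multi-character value such as '\s+' is rejected early, with a message that names the offending separator.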

Upvotes: 3
