Reputation: 1054
I'm working on some PySpark tasks.
I am using a parquet file with 3 columns as the source.
One of the tasks requires exporting my dataframe to a tab-delimited text file. I can write a delimited file with the following operation:
`df.write.csv("output_file")`
However, it exports a CSV file, not a text file. The only way I found to export a text file was to export a single column, but with that option I lose the delimiter handling. For example:
`df = df.select(concat_ws('\t', *df.columns).alias('data'))`
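Spelled out in full, my current workaround looks roughly like this (the output path is just a placeholder, and I cast the columns to string so non-string columns are handled):

```python
from pyspark.sql.functions import col, concat_ws

# join every column into one tab-separated string column
single_col = df.select(
    concat_ws("\t", *[col(c).cast("string") for c in df.columns]).alias("data")
)

# the text writer only accepts a single string column
single_col.write.text("output_file_txt")
```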
What is the closest way to export a tab-delimited text file, like I did for the CSV export? For example, in Scala this is very simple to do:
`df.map(row => row.mkString("\t")).write.text("")`
Is there an equivalent in Python?
Thanks!
Upvotes: 0
Views: 1498
Reputation: 2407
Your attempt with the `csv` method was almost correct; you only need to change the delimiter from the default (comma) to a tab:
`df.write.option("sep", "\t").csv("output_file")`
Note that CSV is actually a text format (you can view it with a text editor; it contains tabular data where rows are separated by new line characters, and fields are separated by commas). The tab-delimited variation of it is sometimes called TSV.
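For completeness, here is a minimal self-contained sketch (the column names and output paths are made up for illustration), including an RDD-based variant that mirrors your Scala `mkString` snippet:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("tsv-export").getOrCreate()

df = spark.createDataFrame(
    [(1, "alice", 3.5), (2, "bob", 4.0)],
    ["id", "name", "score"],
)

# tab-delimited "CSV" (i.e. TSV); each part file is plain text
df.write.option("sep", "\t").csv("output_file")

# equivalent of the Scala map/mkString approach via the RDD API
# (note: null values are rendered as the literal string "None" here)
df.rdd.map(lambda row: "\t".join(str(c) for c in row)) \
      .saveAsTextFile("output_file_rdd")
```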
Upvotes: 2