user1390150

Reputation: 3919

Convert xlsx to csv in Linux with command line

I'm looking for a way to convert xlsx files to csv files on Linux.

I don't want to use PHP, Perl, or anything like that, since I need to process several million lines and need something quick. I found a program in the Ubuntu repos called xls2csv, but it only converts xls (Office 2003) files, which is what I'm currently using. I need support for the newer Excel files.

Any ideas?

Upvotes: 391

Views: 387662

Answers (12)

andrewtweber

Reputation: 25569

If you already have a desktop environment then I'm sure Gnumeric or LibreOffice would work well, but on a headless server (e.g. any cloud-based environment), they require dozens of dependencies that you also need to install.

I found this Python alternative: xlsx2csv

pip install xlsx2csv
xlsx2csv file.xlsx > newfile.csv

It took two seconds to install and works like a charm.

If you have multiple sheets, you can export all at once, or one at a time:

xlsx2csv file.xlsx --all > all.csv
xlsx2csv file.xlsx --all -p '' > all-no-delimiter.csv
xlsx2csv file.xlsx -s 1 > sheet1.csv

The xlsx2csv author also links to several alternatives built in Bash, Python, Ruby, and Java.

Upvotes: 188

kaiya

Reputation: 312

You can use the getsheets.py script. Install its dependencies first:

pip3 install pandas xlrd openpyxl

Then call the script: python3 getsheets.py <file.xlsx>

Upvotes: 1

user8234870

Reputation:

You can use the libreoffice executable to convert your .xlsx files to CSV:

libreoffice --headless --convert-to csv ABC.xlsx

The --headless argument tells LibreOffice to run without a GUI.

Upvotes: 6

Topper Harley

Reputation: 12384

As others have said, the libreoffice executable can convert Excel files (.xls) to CSV. The problem for me was sheet selection.

This LibreOffice Python script does a fine job at converting a single sheet to CSV.

Usage is:

./libreconverter.py File.xls:"Sheet Name" output.csv

The only downside (on my end) is that --headless doesn't seem to work: a LibreOffice window shows up for a second and then quits.

That's OK with me; it's the only tool that does the job rapidly.

Upvotes: 3

Akavall

Reputation: 86306

If the .xlsx file has many sheets, the -s flag can be used to get the sheet you want. For example:

xlsx2csv "my_file.xlsx" -s 2 second_sheet.csv

second_sheet.csv would contain the data of the second sheet in my_file.xlsx.
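
To pull several sheets into separate files, the -s call above can be wrapped in a small loop. This is only a minimal sketch, not part of xlsx2csv: export_sheets is a hypothetical helper name, the number of sheets has to be supplied by hand, and it assumes xlsx2csv is on the PATH.

```shell
# export_sheets FILE COUNT (hypothetical helper, not part of xlsx2csv):
# write sheets 1..COUNT of FILE to <stem>_sheet<N>.csv next to FILE.
# COUNT must be supplied by the caller; this sketch does not detect it.
export_sheets() (
    file=$1
    count=$2
    stem=${file%.xlsx}          # strip the .xlsx suffix for the output name
    n=1
    while [ "$n" -le "$count" ]; do
        # Stop at the first sheet that xlsx2csv fails to export.
        xlsx2csv "$file" -s "$n" > "${stem}_sheet${n}.csv" || exit 1
        n=$((n + 1))
    done
)
```

With this, export_sheets my_file.xlsx 3 would write my_file_sheet1.csv through my_file_sheet3.csv.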

Upvotes: 10

Holger Brandl

Reputation: 11222

Use csvkit:

in2csv data.xlsx > data.csv

For details, check their excellent documentation.

Upvotes: 68

Holger Brandl

Reputation: 11222

Another option would be to use R via a small Bash wrapper for convenience:

# Print the first sheet of an .xlsx file as tab-separated text
xlsx2txt(){
echo '
require(xlsx)
write.table(read.xlsx2(commandArgs(TRUE)[1], 1), stdout(), quote=FALSE, row.names=FALSE, col.names=TRUE, sep="\t")
' | Rscript --vanilla - "$1" 2>/dev/null
}

xlsx2txt file.xlsx > file.txt

Upvotes: 15

neves

Reputation: 39433

In Bash, I used this LibreOffice command (the libreoffice executable) to convert all the .xlsx files in the current directory:

for i in *.xlsx; do libreoffice --headless --convert-to csv "$i"; done

Close all open LibreOffice instances before running it, or it will fail silently.

Because "$i" is quoted, the command handles spaces in filenames.

I tried it again some years later, and it didn't work. This question gives some tips, but the quickest solution was to run it as root (or with sudo libreoffice). It is not elegant, but it is quick.

On Windows, use the scalc.exe command instead.
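
A slightly more defensive version of the loop above, as a sketch only: convert_all is a hypothetical helper name, and it assumes libreoffice is on the PATH and accepts the --outdir option used in another answer here. It skips the case where no .xlsx files match, and it can only report failures that libreoffice signals through a non-zero exit status.

```shell
# convert_all SRC_DIR OUT_DIR (hypothetical helper):
# convert each .xlsx in SRC_DIR to CSV, writing the results into OUT_DIR.
convert_all() (
    src_dir=$1
    out_dir=$2
    status=0
    mkdir -p "$out_dir" || exit 1
    for f in "$src_dir"/*.xlsx; do
        [ -e "$f" ] || continue   # the glob matched nothing
        if ! libreoffice --headless --convert-to csv --outdir "$out_dir" "$f"; then
            echo "conversion failed: $f" >&2
            status=1              # remember the failure but keep going
        fi
    done
    exit "$status"
)
```

For example, convert_all . ./csv would convert the current directory's .xlsx files into ./csv.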

Upvotes: 51

jmcnamara

Reputation: 41644

The Gnumeric spreadsheet application comes with a command line utility called ssconvert that can convert between a variety of spreadsheet formats:

$ ssconvert Book1.xlsx newfile.csv

Using exporter Gnumeric_stf:stf_csv

$ cat newfile.csv

Foo,Bar,Baz
1,2,3
123.6,7.89,
2012/05/14,,
The,last,Line

To install on Ubuntu:

apt-get install gnumeric

To install on Mac:

brew install gnumeric

Upvotes: 354

Pascal-Louis Perez

Reputation: 181

Using the Gnumeric spreadsheet application, which comes with a command-line utility called ssconvert, is indeed super simple:

find . -name '*.xlsx' -exec ssconvert -T Gnumeric_stf:stf_csv {} \;

and you're done!
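
If you would rather name each output file explicitly, as in the ssconvert Book1.xlsx newfile.csv form from the other Gnumeric answer, something along these lines writes each CSV next to its source. It is a sketch under assumptions: convert_tree is a hypothetical helper name, and ssconvert is assumed to be on the PATH.

```shell
# convert_tree [ROOT] (hypothetical helper):
# convert every .xlsx under ROOT (default: .) to a .csv beside it.
convert_tree() {
    find "${1:-.}" -type f -name '*.xlsx' -exec sh -c '
        for f in "$@"; do
            # Same basename, .csv extension; stop this batch on failure.
            ssconvert "$f" "${f%.xlsx}.csv" || exit 1
        done
    ' sh {} +
}
```

Calling convert_tree with no argument starts from the current directory, like the find command above.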

Upvotes: 7

spiffytech

Reputation: 6632

You can do this with LibreOffice:

libreoffice --headless --convert-to csv "$filename" --outdir "$outdir"

For reasons not clear to me, you might need to run this with sudo. You can make LibreOffice work with sudo without requiring a password by adding this line to your sudoers file:

users ALL=(ALL) NOPASSWD: libreoffice

Upvotes: 186

Pavel Veller

Reputation: 6105

If you are OK with running Java from the command line, you can do it with Apache POI HSSF's Excel Extractor. It has a main method that is described as a command-line extractor. This one seems to just dump everything out. They also point to this example, which converts to CSV. You would have to compile it before you can run it, but it too has a main method, so you should not have to do much coding per se to make it work.

Another option that might fly, but will require some work on the other end, is to have your Excel files come to you as Excel XML Data or XML Spreadsheet, or whatever MS calls that format these days. It will open a whole new world of opportunities for you to slice and dice it the way you want.

Upvotes: 4
