Monday, February 20, 2012

xls2csv using python-uno

While several xls2csv converters exist, on a recent assignment none of them was able to convert a multi-sheet, 300+ MB Excel file into CSV format.

While searching for possible solutions, I found this (dated) blog entry on using python-uno. After assembling it into a single script and updating it for the changed LibreOffice command-line arguments, I made it available here. You can check the script using the sample .XLS file with three sheets.

The power of the solution lies in the LibreOffice+UNO (Universal Network Objects) platform it uses. While xls2csv looks like a trivial task, the platform automatically supports reading all spreadsheet types LibreOffice supports. This means that the script might as well be called xlsx2csv or ods2csv.
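To give a feel for the approach, here is a minimal sketch of a UNO-based conversion. It is not the script linked above. It assumes LibreOffice was started separately with something like `soffice --headless --accept="socket,host=localhost,port=2002;urp;"`. The port number, the helper names (`csv_filter_options`, `sheet_csv_path`, `convert`) and the output naming scheme are my own choices for illustration. The CSV export filter writes only the active sheet, so the sketch activates each sheet in turn and stores it to its own file.

```python
import os

# Name of LibreOffice Calc's CSV export filter.
CSV_FILTER = "Text - txt - csv (StarCalc)"


def csv_filter_options(delimiter=",", quote='"', charset=76):
    """Build the FilterOptions string: separator code, quote code,
    character set (76 is UTF-8), and the first line to export."""
    return "{},{},{},1".format(ord(delimiter), ord(quote), charset)


def sheet_csv_path(source_path, sheet_name):
    """Hypothetical naming scheme: book.xls + Sheet1 -> book-Sheet1.csv."""
    base, _ = os.path.splitext(source_path)
    return "{}-{}.csv".format(base, sheet_name)


def convert(source_path, port=2002):
    """Export every sheet of source_path to its own CSV file.
    Assumes a headless LibreOffice is already listening on `port`."""
    # Imported here so the helpers above work without python-uno installed.
    import uno
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        p = PropertyValue()
        p.Name = name
        p.Value = value
        return p

    # Connect to the running LibreOffice instance.
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        "uno:socket,host=localhost,port={};urp;"
        "StarOffice.ComponentContext".format(port))
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    # LibreOffice picks the import filter itself, so .xls, .xlsx and .ods
    # all go through the same call.
    url = uno.systemPathToFileUrl(os.path.abspath(source_path))
    doc = desktop.loadComponentFromURL(url, "_blank", 0, (prop("Hidden", True),))
    try:
        controller = doc.getCurrentController()
        sheets = doc.getSheets()
        for i in range(sheets.getCount()):
            sheet = sheets.getByIndex(i)
            controller.setActiveSheet(sheet)  # the CSV filter exports the active sheet
            out = sheet_csv_path(source_path, sheet.getName())
            doc.storeToURL(
                uno.systemPathToFileUrl(os.path.abspath(out)),
                (prop("FilterName", CSV_FILTER),
                 prop("FilterOptions", csv_filter_options())))
    finally:
        doc.close(True)
```

Calling `convert("book.xls")` would then be expected to leave one CSV file per sheet next to the input file. Sheet names containing characters that are not valid in file names would need extra handling.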

The script can become a starting point for someone trying to implement complex document-management automation scripts. Leave a comment if you do :) .


Silver said...

I have tested 4 different scripts and PHP classes to parse an XLS file of just 4 MB, and all of them were an epic fail. I mean that sometimes the script works fine, but sometimes it makes a mess by mixing up the row contents, or it just truncates the content, or it supports only a few XLS versions, or the CSV shows some fields and not others.

LibreOffice shows all of them just fine.

Thanks for your script! It is a huge help for me.

Daniel G Zylberberg

Arie Skliarouk said...

Glad to see someone actually using my scripts!

Silver said...

Hi there, I'm Silver again, with just one question. Do you know if there's a way to get the cell background color?

Thanks a lot!

Arie Skliarouk said...

No idea, but a quick Google search gave this:

david said...

The script works like a charm!!
This saved my day. Many thanks.
BTW: I modified it a bit to be Python 3 compliant. If somebody is interested...

Arie Skliarouk said...

My pleasure!
Yes, send me your Python 3 fixes and I will put them in.