Thursday, February 23, 2012

Playlist detection in Python

There are several popular playlist formats used for internet radio streaming: asx, m3u, smil, asf, pls... The playlist extension is not always visible in the URL that an IP radio station returns. Furthermore, the MIME type is not always correct. Sometimes one has no choice but to guess the playlist type by looking for the presence of certain markers in the content.

After fruitless searches on the internet, I wrote a Python library that detects the playlist type. You can download it here.

I am sure that the detection algorithm can be improved. Please write suggestions in the comments.
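
To illustrate what guessing by markers can look like, here is a minimal sketch. It is not the code of the library mentioned above; the function name guess_playlist_type and the choice of markers are my assumptions for illustration. In particular, treating a leading "[Reference]" line as asf and a bare list of URLs as m3u are heuristics, not rules.

```python
def guess_playlist_type(text):
    """Return a best-guess playlist type for the given text, or None.

    Hypothetical helper: the markers below are illustrative heuristics.
    """
    # Drop a byte order mark and leading whitespace, compare case-insensitively.
    head = text.lstrip("\ufeff \t\r\n").lower()

    if head.startswith("[playlist]"):
        return "pls"    # PLS files open with a [playlist] section
    if head.startswith("#extm3u"):
        return "m3u"    # extended M3U header
    if "<asx" in head:
        return "asx"    # ASX is XML with an <asx> root element
    if "<smil" in head:
        return "smil"   # SMIL is XML with a <smil> root element
    if head.startswith("[reference]"):
        return "asf"    # assumption: INI-style redirector with Ref1=... entries

    # Fallback: a plain M3U is just comments and URLs, one per line.
    lines = [line.strip() for line in head.splitlines() if line.strip()]
    if lines and all(line.startswith("#") or "://" in line for line in lines):
        return "m3u"
    return None
```

The order of the checks matters: the explicit headers are tested first, and the loose "list of URLs" rule only applies when nothing else matched.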

Easy creation and destruction of LXC vservers

One big advantage of using Amazon's EC2 service is the deceptive ease of creating new servers: it is just an API call or a mouse click away. The actual physical machine allocation, OS image copying, hostname, IP, DNS, firewall and routing setup are done transparently in the background. You are only required to set up the payment ;) .

Big enterprise virtualization technology providers already have a mechanism to create virtual machines with given parameters on demand. Such a system would likely be overkill for a handful of physical servers. I also doubt they support LXC, a highly effective pseudo-virtualization technology.

Thus I created a Python script, "lmachine", for our LXC servers. It copies an OS image into a selected "slot" that already has a pre-allocated IP number and server name. The script also modifies a couple of system files from the OS image accordingly.

The script can be downloaded here.

Preparing an LXC image

LXC images are easily created using the debootstrap utility. To make them ready for the lmachine script, replace the actual IP number with the placeholder string IPNUMBER and the actual server name with the placeholder string VSERVERNAME. Here is the list of files that need to be modified:
  • fstab
  • lxc.conf
  • root/etc/network/interfaces
  • root/etc/hosts
  • root/etc/hostname
  • root/etc/mailname
Make sure you pre-allocate IP numbers and ltestXX DNS domain names in advance. Update the lmachine script with the list of pre-allocated IP numbers.
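
For illustration, the placeholder substitution on a freshly copied image could look roughly like the sketch below. This is not the lmachine script itself; the function name fill_placeholders and its arguments are hypothetical, and I assume the listed files are plain text located relative to the slot directory.

```python
import os

# Files from the list above, relative to the slot directory.
TEMPLATE_FILES = [
    "fstab",
    "lxc.conf",
    "root/etc/network/interfaces",
    "root/etc/hosts",
    "root/etc/hostname",
    "root/etc/mailname",
]


def fill_placeholders(slot_dir, ip_number, server_name, files=TEMPLATE_FILES):
    """Replace IPNUMBER and VSERVERNAME in the image files under slot_dir.

    Hypothetical helper. Returns the relative paths of the files it changed.
    """
    changed = []
    for rel_path in files:
        path = os.path.join(slot_dir, rel_path)
        if not os.path.isfile(path):
            continue  # skip files that this image does not have
        with open(path) as f:
            content = f.read()
        updated = content.replace("IPNUMBER", ip_number)
        updated = updated.replace("VSERVERNAME", server_name)
        if updated != content:
            with open(path, "w") as f:
                f.write(updated)
            changed.append(rel_path)
    return changed
```

A call such as fill_placeholders("/var/lib/lxc/ltest01", "192.0.2.11", "ltest01") would then personalize one slot; the path and the address here are made-up examples.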

Monday, February 20, 2012

xls2csv using python-uno

While several xls2csv converters exist, on a recent assignment none of them were able to convert a multi-sheet, 300+MB Excel file into CSV format.

While searching for possible solutions, I found this (dated) blog entry on using python-uno. After assembling it into a single script and updating it for the changed LibreOffice arguments, I made it available here. You can check the script using the sample .XLS file with three sheets.

The power of the solution lies in the LibreOffice+UNO (Universal Network Objects) platform it uses. While xls2csv looks like a trivial task, the platform automatically supports reading all spreadsheet types LibreOffice supports. This means that the script might as well be called xlsx2csv or ods2csv.

The script can become a starting point for someone trying to implement complex document management automation scripts. Leave a comment if you do :) .