Reading OpenDocument office files from Python

20 January, 2006 - 19:58

The OpenDocument file format (also known as the "OASIS Open Document Format for Office Applications") is an open and free standard for office files. It's fairly easy to read OpenDocument files from Python. Basically, an OpenDocument file is just a zip archive with a different extension (".ods" for spreadsheets, ".odt" for text documents, ".odg" for graphics and so on). The archive mainly contains XML files, such as content.xml, settings.xml and styles.xml.
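
To see the zip structure for yourself, the archive members can be listed with zipfile. This is only a small sketch: the helper name list_members is made up for illustration, and the exact set of files will vary from document to document.

```python
import zipfile


def list_members(path):
    """Return the names of all files stored inside an OpenDocument archive."""
    # An OpenDocument file is an ordinary zip archive, so zipfile opens it
    # regardless of the .ods/.odt/.odg extension.
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling list_members("pelican.ods") on a real spreadsheet should return names such as content.xml and styles.xml, along with whatever else the office application stored.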

We need just two modules from Python's standard library to extract data from an OpenDocument file: zipfile for handling the zip compression and xml.parsers.expat (or another XML parser module) for parsing the XML. A minimal way to read a fictional spreadsheet file pelican.ods is to open the archive with zipfile, read content.xml from it and feed that to an expat parser.
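
One possible version of that approach is sketched below. It is not necessarily the code from the original post: the function name read_paragraphs is hypothetical, and it assumes that the interesting data lives in text:p elements of content.xml, which is where spreadsheet cells and text documents normally keep their visible text.

```python
import xml.parsers.expat
import zipfile

# Expat reports namespaced names as "<namespace URI><separator><local name>"
# when a namespace separator is given; this is the OpenDocument text:p element.
TEXT_P = "urn:oasis:names:tc:opendocument:xmlns:text:1.0 p"


def read_paragraphs(path):
    """Return the text of every top-level text:p element in content.xml."""
    with zipfile.ZipFile(path) as archive:
        content = archive.read("content.xml")

    paragraphs = []
    depth = 0  # greater than zero while inside a text:p element

    def start_element(name, attrs):
        nonlocal depth
        if name == TEXT_P:
            if depth == 0:
                paragraphs.append("")
            depth += 1

    def end_element(name):
        nonlocal depth
        if name == TEXT_P:
            depth -= 1

    def char_data(data):
        # Expat may deliver one text node in several pieces, so accumulate.
        if depth:
            paragraphs[-1] += data

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(content, True)
    return paragraphs
```

With a spreadsheet at hand, read_paragraphs("pelican.ods") returns a flat list of strings, one per paragraph. It ignores the table structure, so mapping values back to rows and columns would need additional handling of the table elements.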