• William James McEachran
  • About Us

Create Excel files from Legacy system with Libreoffice

Posted on April 24, 2020 by billmc

It’s well known that legacy systems (such as filePro as used here) can create CSV or HTML files that will open with Excel or other Spreadsheet programs. And as useful as CSV files are, there are occasions when a true XLS or XLSX file is preferable. You can create true Excel files from filePro data with some help from additional free/opensource software.

There are a number of methods which can be used to create true Excel files from legacy created output processes. This article explores using Libreoffice to translate legacy created HTML output into an XLS or XLSX file. I think Libreoffice’s file conversion utilities offers the most flexibility regarding input format accepted and output file formats generated.

The same methods used here can be used to translate filePro into other formats supported by Libreoffice —such as DOC/DOCX with minor modifications.

Depending on which platform you are working on, the installation requirements for Libreoffice differ. Please consult the Libreoffice web page for general installation instructions. If installing on Linux/Unix be aware that you may have to install a separate package for libreoffice headless to allow libreoffice to function from a shell script without a graphical display environment.

The first step naturally is to generate your data. I’m going to assume that you know some basic html and are conversant enough with filePro’s various methods of generating text output.

Taking a look at our resulting html file with a web browser.

With a border turned on so better display the tables so you can better evaluate libreoffices conversion of the file to XLS format.

Use Libreoffice to perform the conversion from html to xis

libreoffice --headless    --talc   --convert-to xls  sample.html

Use the ‘-outdir ‘ command to specify a folder, otherwise the file will be written to the current directory.

Use ‘libreoffice -headless -help’ for additional details and options.

One quirk — your source file (here, sample.html) must give the user read, write, and EXECUTE permissions or your conversion will fail. (This may be a quirk of my version at the time of this writing.)

It looks great …

…. but it doesn’t work great.

The problem — date formats are required!

The most obvious issue is that the Date fields are not formatted as dates — they are text! In many cases this may be desired — such as ETL data export creation — but if the data is intended for user manipulation this is an issue.

Attempting to perform date related operations on our date fields will fail as our fields are treated as text.

This will not make our users happy. Below is our simple spreadsheet modified to add 3 days to our enrollment date of 2008-03-01. The user would expect the formula to produce a date of 2008-03-04.

Note that we do not get the expected date result below! Why? Because you can’t perform date operations on a text field.

Using Libreoffice markup tags for date formatting

Fortunately Libreoffice has markup tags which can be included in your source html file which will produce XLS files with date formatting.

The SD tag!

Let’s add Libreoffice’s SD attribute to our data lines.  Change the following lines.

<td>Billy</td>
<td>Batson</td>
<td>Fawcett City</td> <td>2008-03-01</td>
<td>Billy</td> <td>Batson</td>
<td>Fawcett City</td>
<td sdval=”39508″  sdnum=”4105;4105;YYYY-MM-DD”>2008-03-01</td>

Continuing inserting the sdval and sdnum attribute and use libreoffice to convert the html to XLS again — we now produce a file with dates formatted!

Inserting Libreoffice special tags in filePro produced html produces spreadsheet with cells formatted for dates!

What have we done above?

  • “sdval” is the libreoffice tag indicating a formatted value is being passed
  •  “sdnum” defines the format string
  • The date within the <td></td> tags is actually irrelevant — it need not be supplied — the date is the assigned “sdval”

How is “sdval” calculated?

sdval is the number of days between your date and January 1 1900 inclusive.

Pseudo code:
[Your desired date] - [Date Jan. 1 1900] + 2

The best way to discover these tags is to create a spreadsheet in Libreoffice and save it as html. Examining the html code will give you most of the information you’ll need to implement them in your own code.  I have yet to find a comprehensive document on libreoffice tag codes with the Libreoffice special tags document only delivering some hints. [If you find good documentation on these tags please let me know].

After making the SDVAL / SDNUM changes to our data it looks like this

Producing a file that we can modify and use formulas requiring proper date formats in fields.

filepro html with libreoffice attributes-> libreoffice converted -> spreadsheet XLS/XLSX now handles date math!

Conversion trouble shooting

The source files have to be wide open permissions (777 ie. rwxrwxrwx) for the user. This seems very odd indeed and caused me no end of grief.

To determine what was failing I ran a ‘strace’ on the command and reviewed the log file.

strace    -o  -f   /tmp/logfile.log        [scalc  command]

Review the log file for errors … strings to search are “Permission Denied”, “Fail”, “[“

In my case the error the Permission Denied error pointed to the execute permission. With the change made my file converted flawlessly from html to xis.

Libreoffice and file conversions

Creating files for Excel need not be restricted to simple CSV’s.

Experiment with html tags and libreoffices’ convert option. Save some Libreoffice spreadsheets in html format and examine the source code generated.

Libreoffice provides a simple utility for converting your html data into XLS/XLSX and other formats.  Libreoffice is also useful for converting reports into DOC/DOCX and PDF formats. Libreoffice is a wonderful tool to
enhance your legacy systems output.

Posted in Uncategorized |
« Using Linux to convert legacy output to PDF format

Leave a comment Cancel reply

Your email address will not be published. Required fields are marked *

  • William James McEachran
  • About Us

Recent Posts

  • Create Excel files from Legacy system with Libreoffice
  • Using Linux to convert legacy output to PDF format
  • Codeigniter and Netbeans
© William J McEachran