• William James McEachran
  • About Us

Author Archives: billmc

Create Excel files from Legacy system with Libreoffice

Posted on April 24, 2020 by billmc

It’s well known that legacy systems (such as filePro as used here) can create CSV or HTML files that will open with Excel or other Spreadsheet programs. And as useful as CSV files are, there are occasions when a true XLS or XLSX file is preferable. You can create true Excel files from filePro data with some help from additional free/opensource software.

There are a number of methods which can be used to create true Excel files from legacy created output processes. This article explores using Libreoffice to translate legacy created HTML output into an XLS or XLSX file. I think Libreoffice’s file conversion utilities offers the most flexibility regarding input format accepted and output file formats generated.

The same methods used here can be used to translate filePro into other formats supported by Libreoffice —such as DOC/DOCX with minor modifications.

Depending on which platform you are working on, the installation requirements for Libreoffice differ. Please consult the Libreoffice web page for general installation instructions. If installing on Linux/Unix be aware that you may have to install a separate package for libreoffice headless to allow libreoffice to function from a shell script without a graphical display environment.

The first step naturally is to generate your data. I’m going to assume that you know some basic html and are conversant enough with filePro’s various methods of generating text output.

Taking a look at our resulting html file with a web browser.

With a border turned on so better display the tables so you can better evaluate libreoffices conversion of the file to XLS format.

Use Libreoffice to perform the conversion from html to xis

libreoffice --headless    --talc   --convert-to xls  sample.html

Use the ‘-outdir ‘ command to specify a folder, otherwise the file will be written to the current directory.

Use ‘libreoffice -headless -help’ for additional details and options.

One quirk — your source file (here, sample.html) must give the user read, write, and EXECUTE permissions or your conversion will fail. (This may be a quirk of my version at the time of this writing.)

It looks great …

…. but it doesn’t work great.

The problem — date formats are required!

The most obvious issue is that the Date fields are not formatted as dates — they are text! In many cases this may be desired — such as ETL data export creation — but if the data is intended for user manipulation this is an issue.

Attempting to perform date related operations on our date fields will fail as our fields are treated as text.

This will not make our users happy. Below is our simple spreadsheet modified to add 3 days to our enrollment date of 2008-03-01. The user would expect the formula to produce a date of 2008-03-04.

Note that we do not get the expected date result below! Why? Because you can’t perform date operations on a text field.

Using Libreoffice markup tags for date formatting

Fortunately Libreoffice has markup tags which can be included in your source html file which will produce XLS files with date formatting.

The SD tag!

Let’s add Libreoffice’s SD attribute to our data lines.  Change the following lines.

<td>Billy</td>
<td>Batson</td>
<td>Fawcett City</td> <td>2008-03-01</td>
<td>Billy</td> <td>Batson</td>
<td>Fawcett City</td>
<td sdval=”39508″  sdnum=”4105;4105;YYYY-MM-DD”>2008-03-01</td>

Continuing inserting the sdval and sdnum attribute and use libreoffice to convert the html to XLS again — we now produce a file with dates formatted!

Inserting Libreoffice special tags in filePro produced html produces spreadsheet with cells formatted for dates!

What have we done above?

  • “sdval” is the libreoffice tag indicating a formatted value is being passed
  •  “sdnum” defines the format string
  • The date within the <td></td> tags is actually irrelevant — it need not be supplied — the date is the assigned “sdval”

How is “sdval” calculated?

sdval is the number of days between your date and January 1 1900 inclusive.

Pseudo code:
[Your desired date] - [Date Jan. 1 1900] + 2

The best way to discover these tags is to create a spreadsheet in Libreoffice and save it as html. Examining the html code will give you most of the information you’ll need to implement them in your own code.  I have yet to find a comprehensive document on libreoffice tag codes with the Libreoffice special tags document only delivering some hints. [If you find good documentation on these tags please let me know].

After making the SDVAL / SDNUM changes to our data it looks like this

Producing a file that we can modify and use formulas requiring proper date formats in fields.

filepro html with libreoffice attributes-> libreoffice converted -> spreadsheet XLS/XLSX now handles date math!

Conversion trouble shooting

The source files have to be wide open permissions (777 ie. rwxrwxrwx) for the user. This seems very odd indeed and caused me no end of grief.

To determine what was failing I ran a ‘strace’ on the command and reviewed the log file.

strace    -o  -f   /tmp/logfile.log        [scalc  command]

Review the log file for errors … strings to search are “Permission Denied”, “Fail”, “[“

In my case the error the Permission Denied error pointed to the execute permission. With the change made my file converted flawlessly from html to xis.

Libreoffice and file conversions

Creating files for Excel need not be restricted to simple CSV’s.

Experiment with html tags and libreoffices’ convert option. Save some Libreoffice spreadsheets in html format and examine the source code generated.

Libreoffice provides a simple utility for converting your html data into XLS/XLSX and other formats.  Libreoffice is also useful for converting reports into DOC/DOCX and PDF formats. Libreoffice is a wonderful tool to
enhance your legacy systems output.

Posted in Uncategorized | Leave a comment |

Using Linux to convert legacy output to PDF format

Posted on April 22, 2020 by billmc

Get your legacy laserjet (pcl5) or postscript output converted to PDF format.

Legacy systems are still out there! Some are still generating reams of paper and this can be easily avoided with a variety of Linux tools to convert legacy output into PDF format.

This article discusses some simple linux tools that can be used to change print outputs into more useful PDF files.  Let’s begin with a quick look at some of simple Linux  tools that can be used to convert text/html to PDF format.

Converting to PDF format

Perhaps the easiest way to get your legacy print jobs into PDF format is to pipe your text (or laserjet coded) print file through one of the many of the linux utilities available in most distributions (or easily installed).  Let’s take a very quick look at a few of your options.

Postscript output and Ghostscript

Most Linux distributions pre-install Ghostscript (the binary is ‘gs’) or make it readily available via their package management systems.  If you prefer you can also install it yourself from source obtained from the ghostscript web site.

Ghostscript will accept postscript  input and will output to PDF format.  If you’ve been printing to postscript printers you’ll be able to preserve all formatting.

Most Linux printers pipe their output to a spooler; something like this:

cat samplePS.ps | gs  -dNOPAUSE   -sDEVICE=pdfwrite -sOutputFile=OutputFromPS.pdf -

or, if you’re dealing with an final file

gs  -dNOPAUSE   -sDEVICE=pdfwrite -sOutputFile=OutputFromPS.pdf samplePS.ps

PCL (laserjet) Output and GhostPCL (pcl6)

GhostPCL is developedt by the same project that publishes the familiar ghostscript.  The great feature of ghostpcl is that it accepts laserjet print codes.  This makes it simple to take your print jobs formatted for laserjet printers and convert them to PDF without losing the formatting (or graphics or other special print formatting).

Ghostpcl is available the ghostscript website.  The binary is ‘pcl6’.

Redefine your printer  piping the output through pcl6 rather that to a printer

cat samplePCL.prn | pcl6  -dNOPAUSE   -sDEVICE=pdfwrite \
-sOutputFile=OutputFromPCL.pdf -

or

pcl6  -dNOPAUSE   -sDEVICE=pdfwrite \
-OutputFile=OutputFromPCL.pdf  samplePCL.prn

PJL commands in the file seem to cause pcl6 some issues.  There is a command line flag for PJL commands.

Text Output and enscript and PS2PDF

In legacy systems you’ll no doubt encounter many print jobs that are plain text — not postscript or pcl (laserjet). Linux has a number of tools to convert text to PDF.

enscript -p - -r -f Courier7 -M Letter -B -L 60 \
-c sampleTEXT.txt \ | gs  -dNOPAUSE \
-sDEVICE=pdfwrite -OutputFile=OutputFromTXT.pdf -

Libreoffice to convert to PDF

libreoffice  --headless --convert-to pdf sampleDocForPDFArticle.odt

This creates a file ‘sampleDocForPDFArticle.pdf’ in the current directory.  The output directory can be specified on the command line.

Note that Libreoffice can convert to a number of other formats, including html.

In most Linux distributions the ‘headless’ version of Libreoffice is a separate installation from the more common version. If you find this command doesn’t work search for ‘libreoffice headless’ in your distribution’s repository.

Joining PDF files

gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
-sOutputFile=finished.pdffile1.pdf file2.pdf

gs -dNOPAUSE -sDEVICE=pdfwrite \
-sOUTPUTFILE=singleCombinedMultipagePdfFile.pdf \
-dBATCH  tmp_1,1.pdf tmp_1,2.pdf

See also: ‘pdftk’ which can join PDFs and much more, though I’ve found using ‘gs’ to be much more robust.

Splitting PDFs

gs -dBATCH -sOutputFile="$4" -dFirstPage=$2 \
-dLastPage=$3 -sDEVICE=pdfwrite "$1" >& /dev/null

gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER   -dFirstPage=22 \
-dLastPage=36  -sOutputFile=outfile_p22-p36.pdf 100p \
-inputfile.pdf

 

See also: ‘pdftk’ which can split PDFs (and much more), though I’ve found ‘gs’ to be much more robust.

Rotating PDF files

This can be useful if you find your PDF’s aren’t being oriented properly.

It can also rotate PDF’s made from PCL (by converting to postscript first).

The desired gs command argument ‘AutoRotatePages’ requires Postscript input.  The method is converts the original PDF to postscript and make the rotation.

pcl6 -dNOPAUSE -dBATCH  -sDEVICE=ps2write \
-sOutputFile=- original.pcl | gs -dNOPAUSE -dBATCH \
-dAutoRotatePages=/All -q -sDEVICE=pdfwrite   \
-sOutputFile=new_file.pdf

Also consider ‘pdftk’, available in most distributions, which can easily rotate PDF files.

Note that the pdftk method for rotating is faster.
pdftk original_wrong.pdf cat 1-endeast output new_orientation.pdf

I’ve come to prefer using ‘gs’ (and it’s sibling ‘pcl6’) over ‘pdftk’ as it’s likely to already be installed and it seems more reliable and produces a better PDF. Extract text from PDF

pdftotext -layout OutputFromTXT.pdf

Produces a file ‘OutputFromTXT.txt’

pdftotext -layout OutputFromTXT.pdf test.txt

Produces file text.txt in the current directory.

PDFTK

I’ve come to prefer using ‘gs’ (and it’s sibling ‘pcl6’) over ‘pdftk’ as it’s likely to already be installed and it seems more reliable and produces a better PDF.   

However, ‘pdftk’ can do many useful things not covered in this article.  ‘pdftk’ is worth considering if you need to:

  • manipulate PDF metadata,  
  • set file encryption passwords,
  • compress/uncompress,
  • attach files to your PDF (not all PDF readers will recognize the attached files),
  • include X/FDF Data, and
  • set ‘stamps’ on the background of your PDF.  

We’ll save ‘pdftk’ for a future article.

Posted in Uncategorized | Leave a comment |

Codeigniter and Netbeans

Posted on April 22, 2020 by billmc

This is a short article for configuring Netbeans for use with Codeigniter. It covers configuring the debugger and code completion.


These steps work for Netbeans 7.0 and Codeigniter 2.0.3. I suspect that they work fine with earlier version of both programs but have not been thoroughly tested.

I’m using XAMPP on Windows. These instructions may not apply to other AMP stacks or operating systems.

The procedure below is not my creation it has been pieced together taken from various web sources
including brettic.us, Tohan, Sturgeon. robsnotebook, leonardteo and various posts on the Codeigniter forum and other forums.

XDEBUG

The first prerequisite is the xdebug program.

Go to http://xdebug.org/. Follow the installation instructions carefully for your OS.


PHP

Modify PHP php.ini to use xdebug.

  • comment out all references to ZEND
  • Add the following section.

[XDebug]

zend_extension = C:\xampp\php\ext\php_xdebug-2.1.2-5.3-vc6.dll

xdebug.profiler_append = 0

xdebug.profiler_enable = 0

xdebug.profiler_enable_trigger = 0

xdebug.profiler_output_dir = "C:\xampp\tmp"

xdebug.profiler_output_name = "xdebug_profile.%R::%u"

xdebug.remote_enable = On

xdebug.remote_handler = "dbgp"

xdebug.remote_host = "localhost"

xdebug.remote_port = 9000

If you’re unsure what any of the above do please consult the xdebug documentation

  • Restart Apache.
  • Now test with a PHP page containing phpinfo();

Check that xdebug is running and that the various settings are now correctly set.

Browser Plug-in

Firefox

Download the Firefox plugin ‘Easy Xdebug 1.5’. Make sure it’s enabled.

Chrome

Install ‘xdebug helper’ from the ‘Chrome Store’

  • Simple test with simplest version of CI.
  • download a fresh CI.
  • Make the following changes to application/config of the default version

$config['uri_protocol'] = 'PATH_INFO';

$config['enable_query_strings'] = TRUE;

$config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-';

Check that it works

  • on the welcome page index function add a few lines setting and displaying a variable
  • set a break point on that line – run debug
  • hit the Netbeans debug button — the application should start.

You should see an indication that xdebug and netbeans are in communication.

Set Netbeans to NOT stop on the first line … this keeps things more or less to your code eliminating debugging through the various ‘system’ folder code (if they start at line 1 … most don’t but it helps a bit).

Alternative Setup We’ve got htaccess set to hide the index.php … so we have to compensate.

$config['permitted_uri_chars'] = "\'; //a-z0-9~%.:_\-";

$config['enable_query_strings'] = TRUE;

$config['uri_protocol'] = "AUTO";

$config['index_page'] = "index.php/";

$config['css'] = '../application/css/styles.css';

$config['images'] = '../application/images/';

$config['jss'] = '../application/jss/jquery.js';

$config['log_threshold'] = 0;

To test … put a breakpoint in controller login.php index. Then hit the debug button. You should now get all the debugger buttons activated. 

If you’ve set a break point the code will be stopped at the appropriate line. 

It might be possible to eliminate the $config[‘index_page’] by removing the contents of the ‘Index File’ in the project properties (I haven’t tried yet). 

CODE COMPLETION

  • In the netbeans add a file with the php extension (any file name will do … I’ve choosen netbeans_ci_code_completion.php
  • Cut and paste the following into this file:

<?php

/**

* @property CI_DB_active_record $db

* @property CI_DB_forge $dbforge

* @property CI_Benchmark $benchmark

* @property CI_Calendar $calendar

* @property CI_Cart $cart

* @property CI_Config $config

* @property CI_Controller $controller

* @property CI_Email $email

* @property CI_Encrypt $encrypt

* @property CI_Exceptions $exceptions

* @property CI_Form_validation $form_validation

* @property CI_Ftp $ftp

* @property CI_Hooks $hooks

* @property CI_Image_lib $image_lib

* @property CI_Input $input

* @property CI_Language $language

* @property CI_Loader $load

* @property CI_Log $log

* @property CI_Model $model

* @property CI_Output $output

* @property CI_Pagination $pagination

* @property CI_Parser $parser

* @property CI_Profiler $profiler

* @property CI_Router $router

* @property CI_Session $session

* @property CI_Sha1 $sha1

* @property CI_Table $table

* @property CI_Trackback $trackback

* @property CI_Typography $typography

* @property CI_Unit_test $unit_test

* @property CI_Upload $upload

* @property CI_URI $uri

* @property CI_User_agent $user_agent

* @property CI_Validation $validation

* @property CI_Xmlrpc $xmlrpc

* @property CI_Xmlrpcs $xmlrpcs

* @property CI_Zip $zip

*/

class CI_Controller {};

/**

* @property CI_DB_active_record $db

* @property CI_DB_forge $dbforge

* @property CI_Config $config

* @property CI_Loader $load

* @property CI_Session $session

*/

class CI_Model {};

?>

You will now have codeigniter code completion. 

Posted in Uncategorized | Leave a comment |
  • William James McEachran
  • About Us

Recent Posts

  • Create Excel files from Legacy system with Libreoffice
  • Using Linux to convert legacy output to PDF format
  • Codeigniter and Netbeans
© William J McEachran