Archive for the ‘developers’ Category

Honey I Shrunk the Installers

Monday, December 19th, 2011

The SOFA installers for Windows and Mac have shrunk substantially: from 43MB to 25MB for Windows and from a rather hefty 85MB to 36MB for Mac. They’ll be quicker to download, and the new installers also avoid possible conflicts with other Python packages on a system. It’s all self-contained. A final benefit is that the installation process itself has become much simpler, with far fewer steps. For those who are technically minded, it is thanks to pyinstaller and py2app (with some initial help from Gui2exe).

Making a better installer for SOFA using Pyinstaller

Friday, December 9th, 2011

As SOFA Statistics has gained more functionality it has grown in complexity – there are modules for reading Excel spreadsheets, connecting to Google Docs spreadsheets, displaying charts, displaying GUI widgets etc. Trying to make a single executable for Windows users was always going to be a challenge and would probably involve a lot of trial and error. So it proved.

But there was one technique I used to make the seemingly impossible task manageable. I made a single Python launch script which was responsible for importing all the main modules the executable would need to handle (e.g. matplotlib, MySQLdb etc). I identified the imports I would need by looking at each and every main module in SOFA and adding any external library module imports not already included.

The process of making an executable failed initially, so by variously commenting and uncommenting parts of the launch script I was able to isolate problem modules and fix them. To get PostgreSQL working, for example, I needed to add the following fix:

import os

try:
    # I needed to add the Postgres library directory to the PATH
    # variable in Windows. Apparently when Postgres is installed under Windows as a
    # service, this isn't done automatically (no need to) so that library isn't
    # available.
    # OK to hardwire to version available to my installer dev environment. The user experience
    # will depend on whether they have set the PATH properly.
    os.environ['PATH'] += ";C:\\Program Files\\PostgreSQL\\9.1\\bin"
    import pgdb
except ImportError, e:
    pass # rest of the handler is not shown in this excerpt
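
The comment-and-uncomment hunt for problem modules can also be partly automated. The following is only a rough sketch, not part of SOFA: the helper name find_failing_imports is made up for illustration, and it only reports modules that fail to import in the current Python environment, which is a useful first check before blaming the packaging step.

```python
import importlib


def find_failing_imports(module_names):
    """Try each import separately; return {module name: error text} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as e:  # a broken extension module can raise more than ImportError
            failures[name] = "%s: %s" % (type(e).__name__, e)
    return failures


# Example use with a hypothetical subset of the launch script's imports:
#     for name, err in sorted(find_failing_imports(["csv", "MySQLdb", "pgdb"]).items()):
#         print(name, err)
```

Importing the modules one at a time means a single failure does not hide the ones that come after it.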

Here is the full text of the launch script:

#! /usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import absolute_import
from __future__ import division # so 5/2 = 2.5 not 2 !
from __future__ import print_function

# remove import __future__ from dbe_sqlite
import cgi
import codecs
from collections import defaultdict
from collections import namedtuple
import copy
import csv
import datetime
import decimal
import gettext
import glob
import locale
import math
from operator import itemgetter
import os
import platform
import pprint
import random
import re
import shutil
import socket
import subprocess
import sys
import time
import traceback
from types import IntType, FloatType, ListType, TupleType, StringType
import warnings
import weakref
import webbrowser
import xml.etree.ElementTree as etree
import zipfile

# Even though not used here, pyinstaller won't know about these modules otherwise
# and will not have them when they are encountered in other modules

import MySQLdb as mysql
try:
    # I needed to add the Postgres library directory to the PATH
    # variable in Windows. Apparently when Postgres is installed under Windows as a
    # service, this isn't done automatically (no need to) so that library isn't
    # available.
    # OK to hardwire to version available to my installer dev environment. The user experience
    # will depend on whether they have set the PATH properly.
    os.environ['PATH'] += ";C:\\Program Files\\PostgreSQL\\9.1\\bin"
    import pgdb
except ImportError, e:
    pass # handler body omitted
import sqlite3 as sqlite # using sqlite3.dll from Python 2.7 so includes foreign key support

#import wxversion; wxversion.select("2.8") # Not needed when using executable.

if not hasattr(sys, 'frozen'):
    import wxversion
    wxversion.select('2.8')
import wx
import wx.lib.iewin as ie
import wx.gizmos
import wx.grid
import wx.html
try:
    from agw import hyperlink as hl
except ImportError: # if it's not there locally, try the wxPython lib.
    import wx.lib.agw.hyperlink as hl

# problem locating eggs folder - solution in
# change pyinstaller-1.5\support\
#if os.path.isdir(d):
#    for fn in os.listdir(d):
#        sys.path.append(os.path.join(d, fn))

import numpy as np
#if hasattr(sys, 'frozen') and sys.frozen:
#    import
#    sys.modules[''] = sys.modules[''] 

# if include matplotlib before sys.path, matplotlib.collections shadows collections and won't find namedtuple

# Currently problem with Path in environment MATPLOTLIBDATA not a directory
# Must put mpl-data folder in same folder as the executable is finally run from

import matplotlib
#import matplotlib.numerix as Numerix
#from matplotlib.axes import _process_plot_var_args
#from matplotlib.backend_bases import FigureCanvasBase
#from matplotlib.backends.backend_agg import FigureCanvasAgg, RendererAgg
#from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg
#from matplotlib.figure import Figure
#from matplotlib.font_manager import FontProperties
#from matplotlib.projections.polar import PolarAxes
#from matplotlib.transforms import Bbox

# connected to matplotlib
# don't exclude Tkinter, Tkconstants
import wxmpl
import pylab # must import after wxmpl so matplotlib.use() is always first
# don't import boomslang - trouble with import pylab in many cases, even import math.
# works fine if matplotlib baked into exe
#import boomslang

# no need to bake googleapi in as nothing installed as such. Just ensure not using stale pycs from Ubuntu system.
#import googleapi
# problem with import os etc if using below
#import googleapi.gdata.spreadsheet.service as gdata_spreadsheet_service
#import googleapi.gdata.spreadsheet as gdata_spreadsheet
#import as gdata_docs_service
#import googleapi.gdata.service as gdata_service

# no need to bake xlrd in as nothing installed as such. Just ensure not using stale pycs from Ubuntu system.
#import xlrd

import adodbapi
import pywintypes
import win32api
import win32con
import win32com
import win32com.client

import dao36_from_genpy # go to makepy/genpy and look in py files till found - taken and rename and relocate so can directly call

import import2run

The code for SOFA is cross-platform and I start the Windows packaging process by copying everything across from Ubuntu. It is important in such a case to wipe all pyc files so that platform-specific ones are created for Windows and included in the executable creation process.
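
Wiping the compiled files by hand is easy to forget, so a small script helps. This is a minimal sketch rather than the script actually used for SOFA; the function name wipe_pyc is invented here, and the build folder in the usage comment is the one mentioned in the spec file later in this post.

```python
import os


def wipe_pyc(root):
    """Delete every .pyc/.pyo file under root and return how many were removed."""
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fn in filenames:
            if fn.endswith((".pyc", ".pyo")):
                os.remove(os.path.join(dirpath, fn))
                removed += 1
    return removed


# Example use on the copied source tree before building:
#     wipe_pyc(r"C:\sofastats_build_exe\sofa.main")
```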

The final import statement is for import2run. This means that the executable doesn’t hardwire anything beyond the imports. As it happens I started by having import2run contain just the following line:


Later, once all the basic imports were working, I changed it to:

import start

to actually load SOFA. NB the executable created using the technique described here doesn’t replace all the SOFA modules with a single executable. Its purpose is to replace Python and all the extra libraries such as matplotlib. So the exe is expected to live in the main SOFA program folder (usually in C:\Program Files\sofastats) alongside the usual SOFA modules. If a user actually had Python 2.6 and all the libraries installed they could either use the exe or run SOFA directly themselves. It would have the same effect.

Getting matplotlib to work took a while and involved many false leads. In the end the solution was to copy the entire mpl-data folder (from somewhere like C:\Python26\Lib\site-packages\matplotlib) into the same folder as the sofastats.exe was going to end up.
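
The copy step can be scripted so it is not forgotten on the next build. This is a sketch under assumptions: copy_mpl_data is a hypothetical helper, not part of SOFA or matplotlib, and the paths in the usage comment are just the example locations mentioned in this post.

```python
import os
import shutil


def copy_mpl_data(mpl_package_dir, exe_dir):
    """Copy <mpl_package_dir>/mpl-data to <exe_dir>/mpl-data, replacing any old copy."""
    src = os.path.join(mpl_package_dir, "mpl-data")
    dest = os.path.join(exe_dir, "mpl-data")
    if not os.path.isdir(src):
        raise IOError("mpl-data folder not found at %s" % src)
    if os.path.isdir(dest):
        shutil.rmtree(dest)  # copytree will not overwrite an existing folder by default
    shutil.copytree(src, dest)
    return dest


# Example use (adjust both paths for your own build machine):
#     copy_mpl_data(r"C:\Python26\Lib\site-packages\matplotlib",
#                   r"C:\sofastats_build_exe\sofa.main")
```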

Some final things I learned about Pyinstaller. --onedir is the default and adds the coll = COLLECT(…) part of the spec file. If making manual changes, remember that if you want the onedir approach, don’t include a.binaries in the EXE(…) part, and exclude_binaries should be True. If, like me, you want a single executable file, don’t bother with coll = COLLECT(…), include a.binaries, and set exclude_binaries to False. And while testing set debug=True and console=True so you can see what is going wrong as you refine your spec file, script etc.

Although GUI2EXE is a wonderful program, some aspects may not be compatible with Pyinstaller 1.5.1, so I now build my spec file using Makespec with the --onefile argument. It works in its basic vanilla form for SOFA. You can export the spec file GUI2EXE makes and see the differences.

Here is the final spec file I used:

# -*- mode: python -*-
# used MAKESPEC 1.5.1 with --onefile option
# NB must include mpl-data folder under main sofastats level (i.e. sibling of dbe_plugins etc) for matplotlib to work
# manually set level=9 in PYZ params (inspired by how GUI2EXE did it)
# manually replaced name=os.path.join('dist', 'launch.exe'), with name='C:\\sofastats_build_exe\\sofa.main\\sofastats.exe',
# manually set debug=True, upx=False in EXE params
# manually set exclude_binaries=False in EXE params

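# any further Analysis arguments (e.g. pathex) are not shown
a = Analysis([os.path.join(HOMEPATH,'support\\'), os.path.join(HOMEPATH,'support\\'), 'C:\\sofastats_build_exe\\sofa.main\\'])
pyz = PYZ(a.pure, level=9)
# a.scripts, a.zipfiles and a.datas are the entries Makespec normally generates for --onefile
exe = EXE( pyz,
          a.scripts,
          a.binaries,
          a.zipfiles,
          a.datas,
          exclude_binaries=False,
          name='C:\\sofastats_build_exe\\sofa.main\\sofastats.exe',
          debug=True,
          upx=False,
          console=True )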

Before going live switch debug and console to False.

This post is largely specific to SOFA Statistics but hopefully it includes some tips which might save others a lot of fruitless struggle. If you have trouble, I found the pyinstaller mailing list people helpful.

0.9.3 adds clustered bar charts to Chi Square test

Monday, February 1st, 2010

0.9.3 has nice new graphical output for the Chi Square Test and a few other enhancements. At least as important, however, are all the bug fixes. These are the result of a new pre-release testing process.

Underlying the clustered bar charts is the boomslang library, which provides a simplified interface to common matplotlib charts. What a great idea, and what a great name for a Python library.

Summary of new features in version 0.9.3:

  • Chi Square output includes clustered bar charts to display proportions and frequencies for the two variables selected.
    [Figure: Chi Square output clustered bar charts]
  • Drop-downs default to the most recently used database and table. This recognises that most of the time you are using the same table as you used in the last analysis.
  • More helpful messages if trying to use variables with too many values for Chi Square.

Bug fixes:

  • Fix for Linux users with a 4-digit year date format.
  • Fixed encoding display issue for Windows users.
  • Miscellaneous fixes to the behaviour of the table design dialog. Numerous bugs were flushed out by more extensive user testing before release.
  • The Expand button is disabled if a report runs but not successfully (e.g. returns a warning).
  • The default database and table are saved correctly according to database engine (e.g. MySQL, MS Access etc). This ensures valid projects can always open.

Creating GUI tests

Monday, November 23rd, 2009

Over time, the goal is to extend the test coverage of SOFA Statistics. The GUI side of things needs to be included in this. Here is a link to some good resources:

Misc translation issues

Thursday, October 29th, 2009

The Galician translation is now complete.

e.g. #: dbe_plugins/
msgid "The SQLite details are incomplete"
msgstr "Os datos de SQLite están incompletos"

It is now time to enable SOFA Statistics to use multiple translations successfully. Here are some possible issues:

  • Overly-long strings: these can affect layout e.g. the buttons on the main form. There may be ways of abbreviating strings.
  • A locale not being installed on a computer. There is lots more to learn about this, but here is a Linux command for identifying what is on your system:
    locale -a
  • How to allow a user to select a locale (or to automatically use the locale of their computer).
  • Making sure everything works on Windows as well.
  • Getting the Galician po file approved by Launchpad (currently stuck in “Needs Review”)
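
For the locale selection issue, Python’s standard gettext module covers both cases. The following is a minimal sketch, not SOFA’s actual code: the get_translator helper, the "sofastats" domain name and the locale folder path are all assumptions made for illustration.

```python
import gettext


def get_translator(domain, localedir, lang=None):
    """Return a translation function for lang, falling back to the original strings."""
    # With languages=None gettext consults the LANGUAGE, LC_ALL, LC_MESSAGES and
    # LANG environment variables, i.e. it follows the computer's own settings.
    languages = [lang] if lang else None
    trans = gettext.translation(domain, localedir, languages=languages,
                                fallback=True)  # no .mo file found -> untranslated text
    return trans.gettext  # under Python 2, ugettext would return unicode strings


# Example use with a user-selected locale (domain and folder are hypothetical):
#     _ = get_translator("sofastats", "/path/to/locale", "gl_ES")
#     print(_("The SQLite details are incomplete"))
```

Because of fallback=True, a missing translation or a locale that is not installed leaves the English text in place rather than raising an error. Note that the environment variables consulted are often not set on Windows, so automatic detection there may need extra work.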

Testing will begin soon. Most importantly, the internationalisation of SOFA Statistics has begun in earnest :-).

Making beautiful output using SVG and JavaScript

Saturday, October 24th, 2009

The charting functionality of SOFA Statistics is not available yet but the technology required is coming together. At the current time the intention is to use the gRaphaelJS library to create the charts and wxWebKit (wxWebKit progress) to display them. The goal is to have beautiful output without using a proprietary technology such as Flash (which also has printing problems). The gRaphaelJS library is still only version 0.2 but progress has been rapid. Dmitry Baranovskiy is doing a great job.

Multi-language SOFA Statistics Begins

Saturday, October 24th, 2009

Launchpad offers great support for translating applications into different languages. And Python and wxPython have standard ways of supporting multiple languages. So it was always going to be achievable to make SOFA Statistics multilingual as long as people were willing to help with translation. First to raise their hand has been Indalecio Freiría Santos (see SOFA Statistics discussion thread) and the Galician version should be available first. If you are interested in adding translations please feel free to raise your hand in the discussion group at any time.

Vista and Win 7 Permissions in SOFA Statistics Installer

Tuesday, October 6th, 2009

Successfully installing an NSIS-created package requires some attention to the permissions of the person doing the installation onto their Windows machine.

A useful discussion of permission levels is here –

If a user does not install SOFA Statistics with the appropriate permissions they might receive an error message like:

Error opening file for writing:
C:\Program Files\….

This may occur even if there is no folder called Program Files e.g. if they are installing onto a Swedish version of Windows.

If it is necessary to check if a user is installing with administrator permissions, the following may be useful –

Installing missing dlls in Windows for SOFA Statistics

Tuesday, October 6th, 2009

Creating a Windows installation package that works on everything from XP Home Edition to Vista 64-bit Business Edition is manageable but not exactly trivial. Sometimes a single file can create a lot of issues e.g. msvcr71.dll. To ensure this file is available on the target computer it is not simply a matter of transferring the file in the same way that other files are transferred. The correct approach using NSIS is to run InstallLib.

The following item was helpful – The NSIS documentation of relevance is here –

The snippet of code used in the latest SOFA Statistics package for Windows is:

IfFileExists "$PROGRAMFILES\sofa\start.pyw" 0 new_installation
StrCpy $ALREADY_INSTALLED 1
new_installation:

!insertmacro InstallLib REGDLL $ALREADY_INSTALLED REBOOT_NOTPROTECTED "G:\3 SOFA dev\sofalibs\msvcr71.dll" $SYSDIR\msvcr71.dll $SYSDIR

0.8.6 supports PostgreSQL and has better output formatting

Monday, August 24th, 2009

New features:

  • Added support for PostgreSQL databases.
  • Each item of output now has a preceding display line and a description of its data source (database and table) and when it was created.
  • Improved layout of exported scripts.
  • Added unit tests for main statistical algorithms used.
  • Better handling of timestamp and autonumber fields in data entry/editing.

Bug fixes:

  • Fixed script export bug.

Additionally, the Windows package now installs a menu shortcut for uninstallation. It always should have, of course, but it is still an example of a little thing which makes newer versions of SOFA Statistics nicer to use. The idea is that, collectively, thousands of details like that will create a sense of polish. The Ubuntu 100 papercuts project is one inspiration.