Archive for June, 2009

Bazaar – Simple Yet Powerful

Sunday, June 28th, 2009

This project uses Bazaar for version control.  Although Bazaar is very powerful, it is also very easy to start using.  Here are some of my most commonly used commands:

bzr add – adds files to version control

bzr commit -m "Message in here about changes" – records the current changes as a new revision

bzr push – pushes the revision out to Launchpad

bzr ls -V – lists all versioned files (if any are missing just use add)

Candy, Community, Comfort, Credibility etc

Monday, June 22nd, 2009

I have just looked at a range of general/basic open source statistics programs.  Some had extensive lists of tests available.  Some had attractive output.  And some made it easy to edit or import data.  But I couldn’t help feeling your typical business analyst, school student, or medical/social sciences researcher with rusty statistics skills would feel quite daunted by the offerings I experimented with.  Which got me thinking about the different use cases for general purpose statistics/analysis/reporting applications.

So what should the focus be when designing SOFA Statistics and what messages should be communicated and to whom?

Here are some messages that could be made by a statistics/analysis/reporting application:

  • candy – beautiful output, attractive website, splashscreen, dialogs etc
  • comfort – easy interface, lots of help at appropriate level
  • communication – stats are well explained e.g. difference between mean and median
  • correctness – stats you can trust, transparent, verifiable, certified by experts
  • community – help is available for whatever level you are at (school homework, business results, advanced stats, integration with Office suites etc)
  • credibility – backed by a real company, going to be here for the long run, reference group has an impressive membership with good coverage, people have appropriate qualifications etc
  • continuity/compatibility – no need to abandon existing data to start getting benefits of new system.  Has special “Help for users of [popular stats program name here]” etc.
  • code – using the right software, the coolest programming tricks etc
  • cheap – no money and little time required to use
  • customisability – can be made to work and integrate with other systems, can automate processes

And lining these messages up with potential groups they might appeal to:

  • schoolkids – cheap, comfort, communication, community, and candy
  • teachers – cheap (students can use it), comfort, communication (educational), community (start sharing stats teaching resources while they are at it), correctness
  • university students (social sciences etc) – cheap, communication, community, correctness (so they’ll be allowed to use it)
  • university students (statistics – starting off) – same as social science students
  • statisticians (academic, professional) – correctness (paramount), credibility, continuity, customisability (can extend for special needs), cheap (they already have licenses for other products, plus they may want clients to do preliminary analyses using a free product they know themselves)
  • business analysts – continuity/compatibility (must work with Excel, Word, mainstream web browsers etc), candy (produce lots of reports that managers like to look at and show others), comfort, communication (may be very rusty on stats skills), credibility (a must-have for this group), customisability (want to be able to automate processes e.g. reports), community (where people show them how to automate things, tricks to get problems solved etc)
  • social science researchers – credible (so they can publish based on the data), candy, continuity/compatibility (so they can fall back on an established stats program if there is a problem or if SOFA can’t do something they need), comfort (may be good at social sciences but not computers/programming etc)
  • school administrators – cheap, customisable (for their curriculum)
  • business integrators – customisable, code (developers become more important and they care about code), cheap (so they can make their profit too), compatibility (with all the systems, input and output, they want to integrate with), credibility (can they make deals with you, will you be around in 5 years' time?)
  • geeks/developers/coders – code

Of course, having a message is one thing – delivering on it is another.  But it is important to have a clear sense of a project’s priorities and a clear message to take to different groups.

0.7.4 can import from Excel

Saturday, June 20th, 2009

SOFA Statistics has now reached the point where you can probably always get data into it.  The lowest common denominator is the CSV (comma separated values) file, but you can also use an Excel spreadsheet (which you could always make in Open Office if you don’t have MS Office), or a MySQL, MS Access, or SQLite database.

Here is the list of main changes:

  • Now able to import from Excel spreadsheets.
  • Importing can now be cancelled.
  • There is a progress bar while importing.
  • CSV importing gives the option of fix and continue if import has problems.
  • Bug fix – can now cope with CSV files with more columns.
  • Bug fix – now able to create projects even if default project selected.
  • Bug fix – can now select database files without file extensions e.g. SQLite databases.
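To illustrate the "fix and continue" idea for CSV files whose rows have more (or fewer) columns than the header, here is a rough sketch using the standard Python csv module.  It is not SOFA's actual import code: the function name import_rows, the fix_and_continue flag, and the choice to pad short rows with empty strings and truncate long ones are all assumptions made for the example.

```python
import csv
import io


def import_rows(source, fix_and_continue=True):
    """Read CSV rows, repairing rows whose width differs from the header.

    Hypothetical helper: padding and truncating is just one possible
    "fix"; a real importer might ask the user what to do instead.
    """
    reader = csv.reader(source)
    header = next(reader)
    width = len(header)
    rows, problem_lines = [], []
    for row in reader:
        if len(row) != width:
            if not fix_and_continue:
                raise ValueError(
                    "line %d has %d columns, expected %d"
                    % (reader.line_num, len(row), width)
                )
            # Remember where the problem was so it can be reported later
            problem_lines.append(reader.line_num)
            # Pad short rows with empty strings, drop surplus columns
            row = (row + [""] * width)[:width]
        rows.append(row)
    return header, rows, problem_lines


demo = io.StringIO("name,score\nAnn,5\nBob\nCy,7,extra\n")
header, rows, problem_lines = import_rows(demo)
```

Note that truncating a long row silently discards data, so reporting the problem lines back to the user (as problem_lines allows here) matters.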

The 0.8 series should be starting soon, with an emphasis on statistical tests like the t-test, Chi Square etc.  Once those are in place, I will start to more heavily promote SOFA Statistics.

SOFA Statistics 0.7.3 has CSV importing

Monday, June 15th, 2009

SOFA Statistics 0.7.3 has just been released.  I have:

  • Added ability to import from csv files.
  • Fixed misc data table editing bugs.
  • Additional error trapping in summary report tables.

Plus there have been numerous small improvements above and under the hood.

Adding ability to import from csv and spreadsheets etc

Tuesday, June 9th, 2009

SOFA Statistics is having new import functionality added.  The first target is csv format files (using the standard Python csv module underneath) followed by Excel spreadsheets.  The solution I have for Excel works even when MS Office has not been installed on a machine, but it only works on Windows.  Later on I will target SPSS data files and Open Office Calc spreadsheet files.
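For anyone curious what reading a csv file with the standard Python csv module looks like, here is a minimal sketch.  It is only an illustration, not the SOFA import code: the function name read_csv and the sample data are made up, and it assumes the first line of the file holds the field names.

```python
import csv
import io


def read_csv(source):
    """Return (field names, list of row dicts) from a file-like object.

    Assumes the first line is a header row; all values come back as
    strings, so any type detection would have to happen afterwards.
    """
    reader = csv.DictReader(source)
    rows = list(reader)
    return reader.fieldnames, rows


# In-memory text stands in for a file opened with open(path, newline="")
demo = io.StringIO("id,age,city\n1,34,Auckland\n2,27,Wellington\n")
fieldnames, rows = read_csv(demo)
```

The csv module takes care of quoting and embedded commas, which is the main reason to prefer it over splitting lines by hand.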

Video tutorial for making report tables with SOFA Statistics

Wednesday, June 3rd, 2009

The main SOFA Statistics website now has a videos page.  The first video shows how to make report tables based on your data.  Feedback suggests people find the video format helpful so the existing video will be improved and extended.  As functionality is added to SOFA, there will be other videos as well.  The video is hosted at, which I can highly recommend.

SOFA Statistics 0.7.2 released

Tuesday, June 2nd, 2009

Version 0.7.2 includes a lot of finishing touches for the Make Table functionality of SOFA Statistics.

  • Output is automatically saved into report file when make table runs a table
  • Displayed output informs user where it has also been saved to
  • OK and Cancel have been replaced by the more logical Close on the Data selection form
  • Exported scripts work from any folder, not just from program folder
  • Buttons that have no associated functionality inform user
  • Corrected button display bug when clicking Clear after having been in Raw tables
  • Copyright symbol appears correctly in Windows
  • Make table form displays all buttons when on smaller screen

The Windows installer is a little more polished as well.