Archive for July, 2009

0.8.4 adds new features, and lots more polish

Friday, July 31st, 2009

The latest version includes a lot more polish and has many rough edges removed.  There are also several important new features:

  • Users can explicitly set variables to Nominal, Ordinal, or Quantity.  These settings are used to limit the variables displayed in various tests to those which are of the appropriate type.
  • Pearson’s Chi Square now has a contingency table with both observed and expected values.
  • The one-way ANOVA and Kruskal Wallis H now provide more information in the output e.g. average rank per group.
  • Variables can be configured from a right click when running statistical tests (as is already possible when making report tables).
  • Statistical tests can be selected by double clicking.  They are also sorted alphabetically to make direct selection faster.
  • The notice that a set of values has been limited to the first 20 unique values is now shown in a new way that no longer interrupts the user while making selections.
  • Test data now includes a string variable (browser).
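To illustrate what the "expected" values in a chi square contingency table are, here is a minimal sketch in plain Python.  It is not taken from the SOFA Statistics source; the function names `expected_counts` and `chi_square` and the sample table are assumptions made up for this example.  Each expected count is the row total multiplied by the column total, divided by the grand total.

```python
def expected_counts(observed):
    """Return expected cell counts for a contingency table (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Return Pearson's chi square statistic for the observed table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of observed counts
observed = [[10, 20], [30, 40]]
print(expected_counts(observed))        # [[12.0, 18.0], [28.0, 42.0]]
print(round(chi_square(observed), 3))   # 0.794
```

Showing observed and expected counts side by side makes it easy to see which cells contribute most to the statistic.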

Bug fixes:

  • Numerical values appear in numerical rather than string order when configuring variable details.
  • No longer necessary to complete MS SQL Server details merely for having plugin installed.
  • Statistical output works even if variable is a string variable.
  • Minor problem with start screen positioning on dual monitors resolved (adequately).
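The first fix above concerns a common pitfall: values stored as text sort as "1", "10", "100", "9" rather than 1, 9, 10, 100.  The sketch below shows one way to handle it in Python.  It is an illustration only, not the actual SOFA Statistics code, and the helper name `sort_values` is hypothetical.

```python
def sort_values(values):
    """Sort numerically if every value parses as a number, else as strings."""
    try:
        return sorted(values, key=float)
    except (TypeError, ValueError):
        # At least one value is not numeric, so fall back to string order
        return sorted(values, key=str)


raw = ["10", "9", "100", "1"]
print(sorted(raw))        # ['1', '10', '100', '9']  (string order)
print(sort_values(raw))   # ['1', '9', '10', '100']  (numerical order)
```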

Please let me know what you think.  Is the project heading in the right direction from your point of view?

The next lot of development will focus on subsidiary charting (as opposed to charting for main output), e.g. charts for assessing normality of data before choosing the appropriate test.

0.8.3 supports MS SQL Server – also misc improvements and some important bug fixes

Saturday, July 18th, 2009

The latest release not only adds some important new functionality, it cleans up a lot of existing code and fixes a lot of bugs.

  • Supports direct connection to MS SQL Server databases.
  • Previous selections become the defaults when configuring statistical tests.
  • Labels are consistently updated in the statistical test dialogs if a new label file is selected.
  • Plus lots of important bug fixes and usability improvements.
  • Fixed installation bug affecting multi-user Windows installations.
  • Fixed bugs when connecting to MS Access depending on data type.
  • Fixed bug preventing tab traversal of Projects form.
  • Fixed bug when entering data in Windows.
  • Fixed bug when variable in independent tests had large number of unique values.
  • Fixed various bugs occurring when changing database or table selection.

Please report bugs – it’s good for the project

Saturday, July 18th, 2009

Bugs are never welcome, but the only thing worse than a bug is a bug you don’t know about and could easily fix.  Even worse, an unknown bug could put some people off using your software, which is not a good outcome for anyone.  So how do you report a bug in SOFA Statistics?  Fortunately, Launchpad (which is where the SOFA Statistics source code lives) makes bug reporting easy.  Just go to: https://launchpad.net/sofastatistics/+filebug/+login and register the bug.  I’ll do my best to fix it and keep everyone informed along the way.

Remember – reporting a bug is an act of kindness so please don’t hold back.  Your report could help many other users.

0.8.2 adds final tests

Friday, July 10th, 2009

Finally!  Version 0.8.2 is the first with all the core statistical tests functional.  This version:

  • added one-way ANOVA.
  • added Kruskal-Wallis H.
  • added Pearson’s Chi Square.
  • and fixed startup bug affecting Windows users on networked drives.

Of course, the best is yet to come.  You may have noticed the huge empty space in the dialog for configuring a test, and the disabled buttons for reporting level (“results only” through to “full explanation”).  I will be coming back to flesh out those areas.

The intention is that the user will be supplied with small visualisations of the actual data so they can see whether the test is appropriate or not (all explained in words as well, with Help on hand), e.g. a histogram of each sample so the shape of each distribution is visible at a glance.  Alongside this would be a small test, with its interpretation, which lets you know whether the main test is usable or not.  The user shouldn’t have to know about, or remember to use, tests of kurtosis or skew or equality of variance.  They should simply choose the test most likely to be appropriate and have SOFA Statistics explain whether it will work or not based on the actual data being analysed (along the way some of these ideas are bound to rub off, of course).  And when you click on the Help button next to the Normal/Not Normal buttons, SOFA Statistics should not only explain the concept (with a couple of simple images), but also enable you to visualise the data and run the appropriate tests to decide if it is Normal or not.
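As a rough idea of the kind of check that could sit behind a Normal/Not Normal hint, the sketch below computes skew and excess kurtosis from a sample using simple moment formulas in plain Python.  This is an assumption-laden illustration rather than how SOFA Statistics does or will do it: the function name `skew_and_kurtosis` is hypothetical, and any threshold for deciding what counts as "too skewed" is deliberately left out.

```python
def skew_and_kurtosis(sample):
    """Return (skew, excess kurtosis) using population moment formulas."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments of order 2, 3 and 4
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    if m2 == 0:
        raise ValueError("All values are identical; skew is undefined")
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a Normal distribution
    return skew, excess_kurtosis


# Hypothetical samples: one symmetric, one with a long right tail
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 2, 10]
print(skew_and_kurtosis(symmetric))
print(skew_and_kurtosis(right_tailed))
```

A skew near zero suggests a roughly symmetric distribution, while a clearly positive value points to a long right tail; a histogram of the same sample would show the same thing at a glance.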

0.8.1 adds 4 new tests inc Mann Whitney U and Pearson’s R

Saturday, July 4th, 2009

Release 0.8.1 of SOFA Statistics added 4 new tests:

  • the Mann Whitney U
  • the Wilcoxon Signed Ranks
  • Pearson’s Correlation
  • Spearman’s Correlation

At this stage only the raw results are presented but the intention is to let users choose the level of explanation they want in their output.  Downloads are available from:

http://www.sofastatistics.com/downloads.php

SOFA Statistics and R

Friday, July 3rd, 2009

Someone asked me recently about the difference between R and SOFA Statistics.  In short, SOFA is aiming for a very different niche.  One of the initial project slogans/messages is:  “SOFA won’t replace sophisticated statistics systems like R, but there is a good chance it will do what you need and do it well.”

Major points of difference as I see them (open for discussion):

Main users:

  • R: statisticians and experienced quantitative researchers.
  • SOFA: business analysts, secondary school statistics students and their teachers, social science students in the tertiary sector, experienced statisticians doing some quick exploration of data or wanting to create attractive output for a report or presentation, citizen activists wanting to use publicly available data to support their cause.

Main concerns:

  • R: statistical analysis – what are the very best tools available for understanding the data.
  • SOFA: ease of use, simplicity, beautiful output (aesthetics as a value in its own right, not just a means for the communication of information).

Scope of statistical tests:

  • R: everything and anything you might need.
  • SOFA: the main tests that most potential users of statistical analysis need.  Favouring thoroughness of support for user over breadth of tests available.  See the second screenshot here – http://www.sofastatistics.com/screenshots.php – for an idea of the philosophy being followed.

Of course, these are generalisations.  R is not uninterested in ease-of-use or aesthetics and SOFA Statistics is intended to be extensible with plugins to allow more sophisticated analysis.  But there is a difference in emphasis and there is room for both approaches as open source software increases its presence in the statistical analysis area.

0.8.0 includes t-tests and help choosing the appropriate statistical test

Wednesday, July 1st, 2009

Version 0.8.0 of SOFA Statistics has now been released.

  • SOFA Statistics now includes both the independent samples t-test and the paired samples t-test.
  • There is the option of assistance when selecting a statistical test.
  • Random quotations on statistics are shown when hovering over the Statistics button.
  • Plus there are minor layout and label changes to increase usability.

The statistics selection form is the centrepiece of the new 0.8 series, the goal of which is to implement all the required statistical tests.

Form for selecting appropriate statistical test

Download it on the downloads page – http://www.sofastatistics.com/downloads.php