SOFA Wins People’s Choice Award

November 9th, 2012

Great news! SOFA Statistics won the 2012 People’s Choice Award in the NZ Open Source Awards. Thanks to everyone who voted in support.

NZOSA Trophy

In addition to the trophy and framed certificate I was lucky to get a nice new Android tablet from Zareason (http://zareason.com/shop/zatab.html). Busy playing with that at the moment.

And SOFA was also a finalist for the Best Open Source Project award.

NZOSA Awards

So it was a great awards ceremony for the project.

Awards Ceremony Speech

Video of Open Source People’s Choice Award (Presented by ZaReason)

Video of finalists for Open Source Software Project (SOFA one of 3 finalists)

1.3.0 brings numerous improvements

November 4th, 2012

SOFA 1.3.0 has plenty of small but important additions:

  • Added Mode as an option to Row Stats report tables. Reports mode(s) with N of the mode value(s) e.g. mode weight 72.0, 76.0 (N=23)

    Modes available

  • Line and area charts can show major labels only as an option.
    All labels (the default)

    Major labels only

  • Pie charts now have the option of displaying count and percentage on the chart itself (previously these were only in the tooltips).
    Pie Chart Details option available

  • Histograms use consistent bins when charted by a second variable.
  • Better placement of y-axis title when labels are wide.
  • Pie charts keep consistent colours even if sorted by count rather than value or label.

    Consistent category colouring in Pie Charts

  • Better message when adding new reports if the required subfolder with javascript and background images is missing. The message is now only shown if there is a problem.
  • When trying to export report, SOFA checks for expected subfolder as well (otherwise dojo fails for any charts and export fails).
  • SOFA prevents attempt to export report if no report file (yet).
  • No longer displays View or Export Report buttons on Projects dialog.
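
The Mode option described above can be illustrated with a minimal sketch. This is not SOFA's actual code; the function name `modes_with_n` and the sample weights are assumptions made up for illustration. It reports every value that shares the highest count, together with that count (N), and ignores missing values.

```python
from collections import Counter


def modes_with_n(values):
    """Return (sorted list of modal values, N), ignoring None values."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return [], 0
    max_n = max(counts.values())
    # More than one value can share the highest count, so collect them all
    modes = sorted(val for val, n in counts.items() if n == max_n)
    return modes, max_n


weights = [72.0, 76.0, 72.0, 80.0, 76.0, None]  # hypothetical data
modes, n = modes_with_n(weights)
print("mode weight %s (N=%s)" % (", ".join(str(m) for m in modes), n))
```

With the made-up data above, 72.0 and 76.0 each occur twice, so both are reported as modes.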

And there have also been some important bug fixes making it worth upgrading:

  • Fixed bug in row stats where data should have explicitly filtered out None values.
  • Fixed bug in setting of min and max values for y-axis for boxplots when min is below 0.
  • Refactored code for running report in output module. Easier to understand and also made it easy to save copy of internal html output with absolute paths to images – very helpful when exporting images.
  • Built more robust value quoting e.g. for sql statements.
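
As a rough illustration of what value quoting for SQL statements involves, here is a minimal sketch. It is not SOFA's implementation; `quote_val` is a hypothetical helper. It relies on the standard SQL convention of doubling a single quote inside a string literal. Where the database driver supports parameterised queries, those are generally the safer choice.

```python
def quote_val(val):
    """Return val as text suitable for embedding in an SQL statement."""
    if val is None:
        return "NULL"
    if isinstance(val, bool):  # check bool before int (bool is an int subclass)
        return "1" if val else "0"
    if isinstance(val, (int, float)):
        return repr(val)
    # Strings: wrap in single quotes and double any embedded single quotes
    return "'" + str(val).replace("'", "''") + "'"


print("SELECT * FROM people WHERE surname = %s" % quote_val("O'Brien"))
```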

I hope you like it.

FLOSS for Science Interview

October 12th, 2012

I was lucky enough to get interviewed by FLOSS for Science. Check it out :-)

FLOSS for Science Interview

Vote for SOFA please

October 12th, 2012

SOFA has been nominated for The People’s Choice Award as part of the New Zealand Open Source Awards. I would really love it if as many people as possible voted for SOFA at The People’s Choice Award. Tell your friends; spread the word.

New Zealand Open Source Awards

Version 1.2.2 has XLSX importing and reportable normality analyses

September 20th, 2012
  • The latest SOFA, version 1.2.2, lets you import from Excel XLSX files (previously, Excel files had to be in the XLS format).

    XLSX Format Supported

  • Normality analyses can be included in reports, saved as output etc.

    Normal Curves

  • And there is support for CUBRID databases. CUBRID is an open source relational database highly optimized for Web Applications. That brings to 6 the total number of SQL-type databases that SOFA can directly link to.
    CUBRID Logo

A few bugs were also fixed:

  • Restored standard deviation option to row stats report tables.
  • Fixed bug which meant row % was appended multiple times to config col dialog (until session closed).
  • Restored PostgreSQL functionality by fixing faulty psycopg import statement.

1.2.1 enables export to spreadsheet and more

August 28th, 2012

The latest version of SOFA Statistics makes it easy (via a plug-in) to export to spreadsheet.

Export Data

SOFA also makes it easier to select multiple variables when making report tables.

Select multiple variables

Additionally, there have been some important bug fixes – mainly for bugs which snuck in during the major change to 1.2.0.

  • Fixed nasty bug breaking demo report tables. A casualty of the 1.2.0 change that displays titles independently of report tables so that wide titles no longer mean wide table cells.
  • Fixed bug with display of percent symbols in report tables. A missing “not” was the culprit – another casualty of the big refactoring for 1.2.0.
  • Fixed bug with Data List reports – wouldn’t update display after changing sort order of a variable.
  • Sort by value now works properly in Data List reports.
  • Can now handle excessively long values being used as categories in report tables or charts etc. Checks are also now made for excessively long category variable values.
  • If there are encoding problems, SOFA now tries to use the field encoding e.g. iso-8859-1.
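
The encoding fallback mentioned in the last item can be sketched as follows. This is only an illustration under assumptions, not SOFA's code: the function name `decode_with_fallback` and the choice of utf-8 as the first encoding to try are made up for the example.

```python
def decode_with_fallback(raw, encodings=("utf-8", "iso-8859-1")):
    """Try each encoding in turn; return (text, encoding that worked)."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # try the next candidate, e.g. the field encoding
    raise ValueError("Unable to decode with any of: %s" % ", ".join(encodings))


text, used = decode_with_fallback(b"caf\xe9")  # not valid utf-8
print(text, used)
```

Because iso-8859-1 maps every byte to a character, it never raises a decoding error, which makes it a reasonable last candidate in the list.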

Easy exporting of output to PDF and images

August 18th, 2012

SOFA creates HTML web reports ready to share, but sometimes you want to share a PDF or include graphs and tables in documents and slideshows. The ‘export output’ extension makes it easy. And the images can be exported in draft quality for speed or at a range of higher resolutions e.g. for publication.

Export output plug-in

The Export Output plug-in is available at get_extensions.php and is compatible with SOFA 1.2.0, which has just been released.

Version 1.2.0 also offers the following:

  • Scatterplots can have more than one series e.g. different coloured male and female dots on same plot.

    Multiseries scatterplot
  • Individual columns within a data display report table can be sorted. This is extremely useful for checking your raw data, looking for anomalies before analysis.

    Sortable Data List Reports
  • Report tables of summary statistics e.g. mean, median, sd, now show the measures across columns instead of down rows. This makes it much easier to compare results and notice differences.

    New Row Stats Layout
  • Numerous changes to layout and functioning to make SOFA easier to use. For example, the demonstration tables displayed when configuring report tables now use real data instead of random data if the data source is small enough to make updating fast. And the “Show Results” button becomes “Add to Report” when report tables are able to use live data for examples (and the “Add to report” checkbox is hidden). Add to report now defaults to False and remembers the last setting throughout the user session. And some standard GUI items have been repositioned to increase the logic of placement and speed of use.

    Layout Improvements
  • Added IQR (Inter-Quartile Range) to the Row Stats Table.
  • Can now do a boxplot of a single variable (not requiring a variable for it to be By).

    Single Variable Boxplots
  • Now save and show cropped matplotlib charts.
  • Scatterplots are a better size.
  • Excel importing copes with faulty dates better and tells the user where the problem is.
  • Filter dialog now checks if user wants to apply a filter when the settings haven’t been changed. Prevents inadvertent filtering to null.
  • Better font sizes in auxiliary clustered bar charts (matplotlib).
  • GUIs no longer update if the user selects the same database as is already selected.
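
For readers unfamiliar with the IQR added to the Row Stats Table, here is a minimal sketch of the calculation using Python's standard library (Python 3.8+). It is not SOFA's code, and quartile definitions vary between packages, so results may differ slightly from SOFA's; the "inclusive" method here is just one common choice.

```python
import statistics


def iqr(values):
    """Inter-Quartile Range: third quartile minus first quartile."""
    data = sorted(v for v in values if v is not None)  # drop missing values
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1


print(iqr([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```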

And there have been plenty of useful bug fixes too:

  • Fixed bug when more colours are required for auxiliary clustered barcharts than in the css.
  • Fixed bug re: linked images with non-English characters not displaying in GUI html window (still issue if user relying on IE6).
  • Fixed bug which made report tables widen if the title or subtitle was wider.
  • Fixed bug where removal of filter in the charting dialog would change third variable dropdown to the last value it had when the filter was applied even if it had subsequently been changed to the Nothing Selected value. The default values are now updated every time the dropdowns change.
  • Now a guaranteed consistency in series colours across charts even if a series is missing e.g. the middle one, from one of the charts.
  • Misc bug fixes for stats tests where no variability. Better user messages.
  • Independent variable tests cope with new lines in text values.
  • Histograms all get title case titles if no value label instead of upper case.
  • Fixed bug when filtering results in no group a and b values possible in indep2var analyses.
  • Fixed histogram width issue esp if wide numbers e.g. 12000 is wider than 12.
  • Fixed css display problem in webkit internal browsers.

This is an exciting release and hopefully the plug-in will be well received.

GUI performance nightmare if shrinking font of drop-down lists

July 18th, 2012

I wanted to shrink the font of elements of the SOFA GUI dialogs so I could squeeze more in or relocate items to more logical positions. Can’t be that hard, surely? I have since discovered that if a drop-down list (wxPython wx.Choice widget) has lots of items, e.g. 30+, it takes seconds for fresh items to be added to the widget if you are trying to use your own font selection (using SetFont()) on Linux. SetItems() takes a long time as, presumably, it sets the font for each individual item. And given I can’t control how many items will appear in drop-down lists or avoid having to repopulate lists (e.g. new data table selected so variable lists have to be updated), the option of shrinking fonts is not viable. Back to the drawing board.

[UPDATE] I came up with a workaround. Because there is no performance problem when items are included with the initial instantiation of dropdown widgets, all dropdowns are rebuilt each time they are changed. This means they have to be destroyed before being replaced, and the panel they are on must be hidden temporarily to avoid flicker on Windows, but it works. The fact that I was able to clean up some code in the process almost compensates for the considerable extra work :-)
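
The workaround can be sketched roughly as below. This is not the code used in SOFA; `rebuild_choice` and its `make_choice` parameter are hypothetical names, and the sketch assumes the drop-down sits in a sizer and that the custom font is applied straight after construction. The wxPython calls used (wx.Choice with a choices argument, Hide(), Show(), Layout(), Destroy(), SetFont() and the sizer's Replace()) are standard ones.

```python
def rebuild_choice(panel, sizer, old_choice, items, font=None, make_choice=None):
    """Swap old_choice for a freshly built drop-down that already holds items."""
    if make_choice is None:
        import wx  # imported lazily so the sketch can be read without wxPython

        def make_choice(parent, choices):
            # Items are supplied at instantiation rather than via SetItems()
            return wx.Choice(parent, choices=choices)

    panel.Hide()  # hide the panel while swapping to avoid flicker
    try:
        new_choice = make_choice(panel, list(items))
        if font is not None:
            new_choice.SetFont(font)
        sizer.Replace(old_choice, new_choice)  # put new widget in the old slot
        old_choice.Destroy()  # the old widget must be destroyed explicitly
        panel.Layout()
    finally:
        panel.Show()
    return new_choice
```

Any event bindings on the old widget would need to be re-attached to the new one, which is part of the extra work mentioned above.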

1.1.7 Enables Series of Clustered Bar Charts and Multi-Line Charts

July 6th, 2012

Now SOFA lets you make series of Clustered Bar Charts and multi-line Line Charts.
Series of clustered bar charts

Series of multi-line line charts

And when a data type mismatch is encountered during an import, SOFA now reports on the first faulty cell (row, cell value, and expected type) to make it easier to clean your data.

Mix of data types import message
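
The idea of reporting the first faulty cell can be illustrated with a minimal sketch. It is not SOFA's import code; the function names, the "numeric"/"text" type labels, and the treatment of empty cells as missing values are all assumptions for the example.

```python
def is_numeric(val):
    """True if val can be read as a number."""
    try:
        float(val)
    except (TypeError, ValueError):
        return False
    return True


def first_faulty_cell(rows, expected_types):
    """Return (row number, column index, value, expected type) or None."""
    for row_num, row in enumerate(rows, start=1):
        for col_idx, (val, expected) in enumerate(zip(row, expected_types)):
            if val in (None, ""):
                continue  # treat empty cells as missing rather than faulty
            if expected == "numeric" and not is_numeric(val):
                return row_num, col_idx, val, expected
    return None


rows = [["1", "a"], ["2", "b"], ["x", "c"]]  # hypothetical imported data
print(first_faulty_cell(rows, ["numeric", "text"]))
```

Stopping at the first problem keeps the message short and points the user at one concrete cell to fix.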

In addition, here are the other features added in this release:

  • Now able to run multiple single line charts. And multiple line charts have a consistent y-axis so they can serve as trellis charts.
  • SOFA now guesses whether an xls or ods spreadsheet has a header row or not when importing and sets the default buttons accordingly.
  • Tighter and more flexible spacing of x-axis title according to number of lines in x labels. Now show legend title before legend and only show legend if multiple series.
  • Smarter column resizing.
  • CSV importing identifies more accurately whether a file has a header when there is only one column (and thus no delimiter identified initially).
  • Adjusted chart sizings to be more rational and less magic.
  • Better width settings for dojo histogram output when lots of bins (prevents x-axis labels bunching up).
  • Row stats tables show “Not calc” instead of nan (not a number) when appropriate.
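
Guessing whether a spreadsheet has a header row can be done in many ways. The sketch below shows one simple heuristic and is not necessarily what SOFA does: it assumes a header row is all non-empty text while at least one later row contains a number. The function names are hypothetical.

```python
def looks_numeric(cell):
    """True if the cell can be read as a number."""
    try:
        float(cell)
    except (TypeError, ValueError):
        return False
    return True


def guess_has_header(rows):
    """Guess True if the first row looks like labels rather than data."""
    if len(rows) < 2:
        return False  # nothing to compare the first row against
    first, rest = rows[0], rows[1:]
    first_all_text = all(
        cell not in (None, "") and not looks_numeric(cell) for cell in first)
    rest_has_numbers = any(looks_numeric(cell) for row in rest for cell in row)
    return first_all_text and rest_has_numbers


print(guess_has_header([["age", "weight"], ["23", "72.0"]]))
```

A guess like this only needs to set a sensible default; the user can still override it in the import dialog.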

And there are some important bug fixes as well:

  • Fixed bug encountered when importing spaces in fields.
  • Fixed bug when missing values in some fields in a boxplot with multiple series.
  • Table design changes now cope with pre-existing sofa id index and demo values no longer updated on exit (inadvertently via on_show event).
  • Fixed bug in tooltip display of averages in general charts – no longer rounds to lower integer.
  • Charts of averages don’t show percentages in tooltips.
  • All matplotlib histograms, including those in stats display such as for one-way ANOVAs, now cope with sigma of 0.
  • Fixed bug calculating percentages where all values in a chart (of a series) are 0.
  • Fixed bug in showing rotate when changing to clustered bar charts.
  • Fixed bug in ods importing where display of sample values didn’t include repeated cells (only showing the first value).
  • SQLite now correctly identifies more types as numeric.

I hope you like it.

SOFA Celebrates 100,000th Download!

July 1st, 2012

SOFA Statistics had its 100,000th download today, which is a doubling in just over a year. And more features and user experience refinements are in the pipeline. So please spread the word. There is no advertising budget so we need you to blog, tweet, like, +1 etc. Thanks!

100,000 Download Milestone