Version 1.5.1 mops up a variety of bugs

May 19th, 2019

SOFA Statistics 1.5.0 was a big change from previous versions in terms of its underlying technologies. Unfortunately that meant a number of bugs made it past release testing. Every bug identified so far has been fixed in 1.5.1 so now is a good time to upgrade.

SOFA turns 10!

May 15th, 2019

SOFA Statistics was first released in May 2009. Since then SOFA has been downloaded nearly 300,000 times – which is about 10,000 times more than I ever expected :-). Over the same period many statistics packages have faded away and others have risen. Hopefully, SOFA Statistics has been a good option for many people and the recent replatforming onto Python 3 and wxPython 4 will enable SOFA to continue for many years to come.

Worked examples in 1.5.0

May 15th, 2019

Statistics can be mind bending and complex. It can make your brain hurt and your heart despair. But sometimes it is very simple and easy to follow. Four of the statistical tests provided by SOFA Statistics – namely

  • Mann-Whitney U
  • Wilcoxon’s Signed Ranks
  • Spearman’s Rho
  • Pearson’s Chi Square

– are reasonably easy to understand with worked examples. You could even say they are elegant in their simplicity. Check out the new worked examples feature of SOFA Statistics version 1.5.0.

The worked examples use your actual data so you should be able to follow the logic of the test step-by-step if you are interested in learning how it works. The goal is to demystify the tests and appreciate them better.
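
To give a flavour of how simple the logic can be, here is a minimal Python sketch of the Mann-Whitney U calculation. It is an illustration only and is not SOFA's own code or output format; the function names average_ranks and mann_whitney_u and the sample values are made up for the example. The idea is to rank all the values together, sum the ranks for each group, and subtract the smallest rank sum each group could possibly have.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of the 1-based positions in the run
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (u_a, u_b, u) where u is the smaller of the two U values."""
    combined = list(sample_a) + list(sample_b)
    ranks = average_ranks(combined)  # rank both groups together
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    # Subtract the minimum possible rank sum for a group of that size
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    return u_a, u_b, min(u_a, u_b)


if __name__ == "__main__":
    group_a = [3, 5, 8]  # hypothetical sample values
    group_b = [4, 9, 10, 12]
    u_a, u_b, u = mann_whitney_u(group_a, group_b)
    print(f"U_a={u_a}, U_b={u_b}, U={u}")
```

With these made-up samples the ranks for group A are 1, 3 and 4, so its rank sum is 8 and U_a = 8 - 6 = 2. Group B has a rank sum of 20, so U_b = 20 - 10 = 10. The two U values always add up to the product of the group sizes (here 3 x 4 = 12), which is a handy check when following a worked example by hand.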

Other changes in version 1.5.0 include the following:

  • You can now choose the number of decimal places to show in report tables and charts.
  • Can display counts or percentage separately on pie charts.
  • Better smoothed line displayed for line charts.
  • Easier to run generated scripts for testing (only need to set use_locally boolean).
  • Better visual separation of subtables.
  • Charts can show N.
  • Improvements to darker themes.
  • Add ability to define table to automatically open on startup (using open_on_start setting in projs/default.proj).
  • Ask users if they want to override existing project if adding new project with an already-used name.
  • Upgraded to newer GUI library (wxPython 4.0).
  • Added new safeguards and user feedback when problems converting output to images.
  • Dropped support for xls (xlsx is supported alongside tsv, csv, ods etc)
  • Dropped support for CUBRID (largely because drivers for newer versions of Python are not available)

There are also numerous important bug fixes:

  • Important bug fix for filters with OR conditionals.
  • No longer fails to copy to clipboard when chart names include slashes.
  • Scatterplots now cope with variable names including percentage symbols.
  • Fixed bug displaying empty values in Display Data report tables.
  • Fixed bug with Select All/Deselect All button not working correctly in all cases.
  • Fixed bug when repairing duplicate names in import.
  • Only give project override warning when a new file
  • Add sky.css to styles deployed

SOFA Statistics has received a major overhaul under the hood for version 1.5.0 (for example, the shift to Python 3.6/7 from 2.7). Inevitably there will be some issues but the intention will be to resolve those as quickly as possible. Enjoy!

Nearly ready to release 1.5.0

April 29th, 2019

Version 1.5.0 is nearly ready to release and not before time ;-). The last release was 1.4.6 in January 2016 but the time since then has not been wasted. Here are some of the changes ready to go:

  • SOFA will be able to display worked examples for the following statistical tests:
    • Mann-Whitney U
    • Wilcoxon’s Signed Ranks
    • Spearman’s Rho
    • Pearson’s Chi Square
  • It will be possible to choose the number of decimal places to show in report tables and charts
  • SOFA will be able to display counts or percentage separately on pie charts
  • A better smoothed line will be displayed for line charts
  • Better visual separation of subtables
  • Charts will be able to show N
  • Improvements to darker themes
  • Numerous bug fixes

Under the hood, SOFA has had some major changes:

  • Python 3.6+ (Linux) / 3.7 (Windows)
  • wxPython GUI toolkit is 4.0 (up from 2.8)

Sadly, the 1.5.0 release will not include a Mac package but later versions might do depending on practical considerations and offers of packaging help from Mac users. A deb package is already generated for Ubuntu / Debian. A Windows package is nearly ready – SOFA works on Windows 10 and all the dependencies have been baked into an executable ready for the final stages of packaging.

It is expected there will be a few minor bugs slipping through given the scale of the changes underneath but the plan is to quickly release 1.5.1 with these mopped up.

Great new SOFA teaching resource

January 28th, 2017

Thanks to George Self there is a great new teaching resource available for SOFA users. See https://goo.gl/4lpIaO. Here is George’s announcement repeated from the discussion group:

I teach an undergrad research methodology class and wrote a SOFA-based lab manual for that class that some of you may be interested in. You can find the manual and the data sets at https://goo.gl/4lpIaO.

The manual has ten chapters:

  1. Introduction (data types, normal distribution, kurtosis, skew, null hypothesis, downloading/installing SOFA, recoding data)
  2. Central Measures (mean, median, mode)
  3. Data Dispersion (range, quartiles, standard deviation)
  4. Visualizing Dispersion (box charts)
  5. Frequency Tables (frequency tables, crosstabs, complex crosstabs)
  6. Visualizing Frequency (histogram, bar chart, clustered bar chart, pie chart, line graph)
  7. Correlation (pearson’s r, spearman’s rho, significance, scatter plots)
  8. Regression
  9. Hypothesis Testing: Nonparametric Statistics (SOFA Statistics Wizard, Kruskal-Wallis H, Wilcoxon Signed Ranks, Mann-Whitney U)
  10. Hypothesis Testing: Parametric Statistics (ANOVA, t-test-Independent, t-test-Paired)

There are also two appendices, the first is a data dictionary for each of the data sets used and the second covers the various report generating features of SOFA.

The lab manual covers all of the functions and features of SOFA, but in the context of a lab where those functions are practiced rather than just described. The manual also includes a lot of information about how the various statistical measures are used (for example, the difference between correlation and causation). No math knowledge beyond simple high school algebra is assumed on the part of the student and each of the labs includes a “deliverable” activity so instructors can use this as part of a class.

I’ve printed this manual under Creative Commons-BY-ShareAlike so please feel free to use this in any way you want. Of course, I’m also happy to receive comments that could help me improve this manual in the future.

–George

Please give the resource a spin and provide George with any feedback that can improve/refine it. Once again, thanks George for making this available to the community 🙂

SOFA passes quarter-million downloads

January 27th, 2017

I was delighted when SOFA passed 30 downloads in 2009 and here we are in 2017, more than 250,000 downloads later – still can’t believe it :-). Sourceforge seems to have become confused about how many downloads there are, having “lost” a whole lot in the last couple of months, but I am pretty confident we really have crossed the quarter-million mark. BTW a major new version of SOFA is in the pipeline and will be released when I fix some Mac installer problems.

Installing SOFA on Ubuntu 16.04 and 16.10

January 27th, 2017

tl;dr

echo "deb http://archive.ubuntu.com/ubuntu wily main universe" | sudo tee /etc/apt/sources.list.d/wily-copies.list

sudo apt update
sudo apt install python-wxgtk2.8
sudo rm /etc/apt/sources.list.d/wily-copies.list
sudo apt update
Download latest deb from http://www.sofastatistics.com/downloads.php
cd ~/Downloads
sudo dpkg -i sofastats-1.4.6-1_all.deb

Details

Even though SOFA is developed on Ubuntu (16.10 at present) there was a problem installing SOFA onto 16.04 or 16.10. The root cause related to Ubuntu support for different versions of wxPython and I spent a lot of time trying different solutions. Fortunately there is a simple workaround that only requires about six terminal commands (see below). Obviously, having to run commands is not as good as a standard installation but it will have to do for now because the main alternatives aren’t currently viable. E.g. some parts of SOFA don’t seem to play nicely with the packaged versions of wxPython 3.0. Snap packaging holds some promise but that will have to wait for later depending on the next releases of Ubuntu.

Thanks to bbobbo for finding a general solution to wxPython 2.8 installation problems on Ubuntu 16 and relating it to the specific SOFA problem, and to Domenico Somma for bringing it to my attention via the SOFA forum. Here are the steps (Solution from SOFA (statistics) – python 2.8 request – unable installation):

1. Add needed repository and update package list

echo "deb http://archive.ubuntu.com/ubuntu wily main universe" | sudo tee /etc/apt/sources.list.d/wily-copies.list

sudo apt update

2. Install it
sudo apt install python-wxgtk2.8

3. Remove repository entry and update package list again
sudo rm /etc/apt/sources.list.d/wily-copies.list

sudo apt update

4. Install SOFA Statistics
Download latest deb from http://www.sofastatistics.com/downloads.php

cd ~/Downloads

sudo dpkg -i sofastats-1.4.6-1_all.deb

5. Success?
sofastats

Extra info – Warning from http://askubuntu.com/questions/789302/install-python-wxgtk2-8-on-ubuntu-16-04 – “Following this method on large scale can lead to unmet dependency hell. So keep in mind this is similar to PPA’s.” This comment also has more details about solving issues with broken packages.

Using SOFA on multiple machines with synced config and data

April 26th, 2016

One of the nice things about open source software like SOFA Statistics is that you can freely install it on as many machines as you like without licensing issues, complex validation etc. But how do you keep your content in sync across all the different devices? One approach is to keep the user sofastats folder in a synced drive.
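
One possible way to do this, sketched in Python below, is to move the folder into the synced drive and leave a symbolic link behind at the old location so SOFA still finds it where it expects. This is an assumption about how you might set things up rather than a SOFA feature: the helper name move_and_link and the example paths are hypothetical, the real location of your sofastats folder may differ, and creating symlinks on Windows can require extra privileges. Back up the folder before trying anything like this.

```python
import shutil
from pathlib import Path


def move_and_link(local_folder, synced_root):
    """Move local_folder into synced_root and leave a symlink at the old path.

    Returns the new location inside the synced drive.
    """
    local_folder = Path(local_folder)
    target = (Path(synced_root) / local_folder.name).resolve()
    if local_folder.is_symlink():
        # Already linked (e.g. on a second run), so there is nothing to do
        return local_folder.resolve()
    if target.exists():
        # Don't guess how to merge two copies of the folder
        raise FileExistsError(f"{target} already exists; merge by hand first")
    shutil.move(str(local_folder), str(target))
    local_folder.symlink_to(target, target_is_directory=True)
    return target


# Example call (hypothetical paths, adjust to your own machine):
# move_and_link(Path.home() / "sofastats", Path.home() / "Dropbox")
```

On a second machine, where the synced copy already exists, you would skip the move and only create the link; the sketch deliberately raises an error in that case instead of guessing how to merge two folders.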

What might be coming next

April 20th, 2016

Python 2 is reaching its End Of Life (EOL) in 2020 so sometime before then I will want to shift SOFA to Python 3. I much prefer Python 3 but the main thing will be the libraries SOFA relies on to operate – especially on Windows and Mac.

Speaking of Mac, I am finding it very time-consuming supporting the platform. The difficulty is not in getting SOFA’s core functionality to work but in the image processing libraries (esp. convert and gs). Along the way I have spent countless weekends compiling using homebrew etc. Slow, tricky, and often fruitless. And it is difficult to test. I only have access to a Snow Leopard machine (virtualised to allow revert to snapshot) and that is no longer very relevant to what people need for newer versions of OS X. Some very kind people have offered to help with testing (thanks!) but the problem seems to be the packaging steps. Maybe what I need to do is ask the people who volunteered if any of them are able to compile convert and gs for me on their machines. I can then just include those versions in my packages and hopefully everything works.

A final work item is to add a Fisher’s LSD test. A friend is helping with this.

1.4.6 Adds basic time series

January 1st, 2016

SOFA line charts and area charts now treat dates as dates in the x-axis which makes it easier to look at time series data.

Figure: New option added to interface

Figure: Time series selected – X-axis date aware

Figure: Time series not selected – X-axis not date-aware

Figure: Example time series chart

Additional improvements include:

  • Better error message when not enough values in group to run analysis e.g. ANOVA.
  • Better handling of precision in p-value results displayed.
  • Better handling of dates pre-1900.
  • Better messages to user about potentially excessive categories in charts.
  • Add support for float years as date values for time series.
  • Add support for specifying port connecting to postgresql.
  • Allows boxplots when fewer values to display.
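
To picture what "float years as date values" in the list above could mean, here is a small Python sketch that turns a value like 2015.5 into a calendar date. It is only an illustration under a stated assumption, namely that the fractional part is spread evenly over the days of that year; how SOFA itself interprets float years may differ, and the function name float_year_to_date is made up for the example.

```python
from datetime import date, timedelta


def float_year_to_date(year_value):
    """Convert a fractional year such as 2015.5 to a calendar date.

    Assumption: the fraction is spread evenly over the days of that year.
    """
    year = int(year_value)
    fraction = year_value - year
    start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - start).days  # 365 or 366
    return start + timedelta(days=int(fraction * days_in_year))


if __name__ == "__main__":
    for value in (2015.0, 2015.5, 2016.25):
        print(value, float_year_to_date(value).isoformat())
```

Because the number of days is looked up per year, the same fraction lands on slightly different dates in leap and non-leap years.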

And there were two other changes:

  • Removed broken google docs integration – just as easy to manually download and import normally.
  • Removed two pop-ups – no longer needed.

There are also a number of bug fixes:

  • No longer a missing legend in multiseries scatterplots just because the first scatterplot only had one series of data.
  • Fixed bug with saving database connection details when a number involved (port).
  • Fixed PostgreSQL bug when saving connection without password – now succeeds rather than failing silently.
  • Fixed MySQL bug with adding rows.
  • Fixed bug in Windows with checkboxes not enabling/disabling properly unless panels refreshed.