Archive for the ‘general’ Category

Easy exporting of output to PDF and images

Saturday, August 18th, 2012

SOFA creates HTML web reports ready to share, but sometimes you want to share a PDF or include graphs and tables in documents and slideshows. The ‘export output’ extension makes it easy. Images can be exported in draft quality for speed, or at a range of higher resolutions, e.g. for publication.

Export output plug-in

The Export Output plug-in is available at get_extensions.php and is compatible with SOFA 1.2.0, which has just been released.

Version 1.2.0 also offers the following:

  • Scatterplots can have more than one series e.g. different coloured male and female dots on same plot.

    Multiseries scatterplot
  • Individual columns within a data display report table can be sorted. This is extremely useful for checking your raw data, looking for anomalies before analysis.

    Sortable Data List Reports
  • Report tables of summary statistics e.g. mean, median, sd, now show the measures across columns instead of down rows. This makes it much easier to compare results and notice differences.

    New Row Stats Layout
  • Numerous changes to layout and functioning to make SOFA easier to use. For example, the demonstration tables displayed when configuring report tables now use real data instead of random data if the data source is small enough to make updating fast. The “Show Results” button becomes “Add to Report” when report tables are able to use live data for examples (and the “Add to report” checkbox is hidden). “Add to report” now defaults to False and remembers the last setting throughout the user session. Some standard GUI items have also been repositioned to make their placement more logical and faster to use.

    Layout Improvements
  • Added IQR (Inter-Quartile Range) to the Row Stats Table.
  • Can now do boxplots of a single variable (no By variable is required).

    Single Variable Boxplots
  • Now save and show cropped matplotlib charts.
  • Scatterplots are a better size.
  • Excel importing copes with faulty dates better and tells the user where the problem is.
  • Filter dialog now checks if user wants to apply a filter when the settings haven’t been changed. Prevents inadvertent filtering to null.
  • Better font sizes in auxiliary clustered bar charts (matplotlib).
  • GUIs no longer update if the user selects the same database as is already selected.
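
For readers unfamiliar with the IQR added to the Row Stats Table in the list above, here is a minimal Python sketch. It uses the standard library's statistics.quantiles, which is an assumption made for illustration and not necessarily how SOFA calculates quartiles (conventions differ between packages). The function name iqr and the sample ages are hypothetical.

```python
import statistics

def iqr(values):
    """Return (lower quartile, upper quartile, inter-quartile range)."""
    # n=4 splits the data into quarters and returns the three cut points.
    # method="inclusive" is one of several quartile conventions; other
    # packages may give slightly different results on small samples.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3, q3 - q1

ages = [21, 23, 25, 30, 34, 41, 58]  # illustrative data only
lower, upper, spread = iqr(ages)
print(f"lower={lower}, upper={upper}, IQR={spread}")
```

The IQR is less sensitive to extreme values than the standard deviation, which is why it sits well alongside the median in a summary table.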

And there have been plenty of useful bug fixes too:

  • Fixed bug when more colours are required for auxiliary clustered barcharts than in the css.
  • Fixed bug re: linked images with non-English characters not displaying in GUI html window (still issue if user relying on IE6).
  • Fixed bug which made report tables widen if the title or subtitle was wider.
  • Fixed bug where removal of filter in the charting dialog would change third variable dropdown to the last value it had when the filter was applied even if it had subsequently been changed to the Nothing Selected value. The default values are now updated every time the dropdowns change.
  • Series colours are now guaranteed to be consistent across charts even if a series, e.g. the middle one, is missing from one of the charts.
  • Misc bug fixes for stats tests where no variability. Better user messages.
  • Independent variable tests cope with new lines in text values.
  • Histograms all get title case titles if no value label instead of upper case.
  • Fixed bug when filtering leaves no possible group A and group B values in indep2var analyses.
  • Fixed histogram width issue esp if wide numbers e.g. 12000 is wider than 12.
  • Fixed css display problem in webkit internal browsers.

This is an exciting release and hopefully the plug-in will be well received.

1.1.7 Enables Series of Clustered Bar Charts and Multi-Line Charts

Friday, July 6th, 2012

Now SOFA lets you make series of Clustered Bar Charts and multi-line Line Charts.

Series of clustered bar charts

Series of multi-line line charts

And when a data type mismatch is encountered during an import, SOFA now reports on the first faulty cell (row, cell value, and expected type) to make it easier for you to clean your data.

Mix of data types import message
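
The kind of check described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SOFA's importer: the function find_first_faulty_cell, the dict-per-row layout, and the choice to skip empty cells are all hypothetical.

```python
def find_first_faulty_cell(rows, expected_types):
    """Return (row number, column, value, expected type name) for the first
    cell that cannot be converted, or None if every cell converts cleanly."""
    for row_num, row in enumerate(rows, start=1):
        for col, caster in expected_types.items():
            value = row[col]
            if value == "":
                # Assumption: empty cells count as missing, not as faulty.
                continue
            try:
                caster(value)
            except ValueError:
                return row_num, col, value, caster.__name__
    return None

rows = [
    {"age": "34", "weight": "70.5"},
    {"age": "forty", "weight": "82.0"},  # "forty" is not an integer
]
print(find_first_faulty_cell(rows, {"age": int, "weight": float}))
```

Stopping at the first problem keeps the message short; the user fixes that cell and re-imports to find any further ones.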

In addition, here are the other features added in this release:

  • Now able to run multiple single line charts. And multiple line charts have a consistent y-axis so they can serve as trellis charts.
  • SOFA now guesses whether an xls or ods spreadsheet has a header row or not when importing and sets the default buttons accordingly.
  • Tighter and more flexible spacing of x-axis title according to number of lines in x labels. Now show legend title before legend and only show legend if multiple series.
  • Smarter column resizing.
  • CSV importing identifies more accurately whether a file has a header when there is only one column (and thus no delimiter identified initially).
  • Adjusted chart sizings to be more rational and less magic.
  • Better width settings for dojo histogram output when lots of bins (prevents x-axis labels bunching up).
  • Row stats tables show “Not calc” instead of nan (not a number) when appropriate.

And there are some important bug fixes as well:

  • Fixed bug encountered when importing spaces in fields.
  • Fixed bug when missing values in some fields in a boxplot with multiple series.
  • Table design changes now cope with pre-existing sofa id index and demo values no longer updated on exit (inadvertently via on_show event).
  • Fixed bug in tooltip display of averages in general charts – no longer rounds to lower integer.
  • Charts of averages don’t show percentages in tooltips.
  • All matplotlib histograms, including those in stats display such as for one-way ANOVAs, now cope with sigma of 0.
  • Fixed bug calculating percentages where all values in a chart (of a series) are 0.
  • Fixed bug in showing rotate when changing to clustered bar charts.
  • Fixed bug in ods importing where display of sample values didn’t include repeated cells (only showing the first value).
  • SQLite now correctly identifies more types as numeric.

I hope you like it.

SOFA Celebrates 100,000th Download!

Sunday, July 1st, 2012

SOFA Statistics had its 100,000th download today, which is a doubling in just over a year. And more features and user experience refinements are in the pipeline. So please spread the word. There is no advertising budget so we need you to blog, tweet, like, +1 etc. Thanks!

100,000 Download Milestone

1.1.6 adds percentage charts and more

Friday, June 22nd, 2012

More steady improvements to SOFA:

  • Added ability to chart percentages as well as frequencies.
    Percentage charts

  • Chart series now all have consistent y-axes (trellis style).
    Same scale

And there have been some bug fixes:

  • Fixed ODS import bugs when encountering fields formatted as fractions, boolean, percentage etc.
  • Fixed bug stopping linked images external to the html e.g. generated by matplotlib, from displaying in the internal GUI if the report path was different from the default report path.
  • Fixed bug when running scatterplots and histograms with chart by (because of use of dd object which cannot be used headless).
  • Fixed bug in charts dialog where second variable drop down would sometimes be overly restrictive when doing a chart of average values.
  • Fixed bug where old pycs can interfere with Windows upgrades.

Enjoy!

More chart improvements in 1.1.5

Sunday, May 27th, 2012

Charts now have the option of rotated (vertical) x-axis labels. This can be useful for longer labels.

Rotated labels


Note: if you have upgraded SOFA, rotated labels may not work unless you update the sofastats_charts.js file in your local sofastats folder, e.g. C:\Users\username\sofastats\reports\sofastats_report_extras, with the sofastats_charts file for sofastats_report_extras.

Scatterplots now focus on the data better by starting axes just below the minimum x and y values of the data unless the value is close enough to 0 to make it worth using 0 anyway.

Scatterplot focus
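
The axis rule described above might look something like the following Python sketch. The function name axis_start and both parameters (zero_threshold and padding) are hypothetical, as are their default values; SOFA's actual cut-offs are not stated here.

```python
def axis_start(values, zero_threshold=0.25, padding=0.05):
    """Pick a starting value for an axis given the data plotted on it."""
    lo, hi = min(values), max(values)
    # If the smallest value is non-negative and already near zero relative
    # to the largest, starting at zero costs little space and reads better.
    if 0 <= lo <= zero_threshold * hi:
        return 0.0
    # Otherwise start slightly below the minimum so the points fill the plot.
    span = (hi - lo) or 1.0  # fall back to 1.0 when all values are equal
    return lo - padding * span

print(axis_start([100, 110, 120]))  # starts just below 100
print(axis_start([2, 50, 100]))     # close enough to zero, so starts at 0
```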


And for Ubuntu users, a much nicer launcher icon :-). Actually, it’s a set of icons at different resolutions so that SOFA always looks good on the launcher.

More attractive launcher icon in Ubuntu


Other changes include:

  • Numeric values are right justified in data tables.
  • Kurtosis values in the normality test include the Fisher adjustment (subtracting 3).
  • Duplicated field names in imports are given unique suffixes and allowed (now that they are unique).
  • Excel importing now handles times without dates.
  • More date formats are accepted when importing data.
  • Better guidance on data preparation before importing data.
  • More robust handling of variable definition files if corrupted.
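
To illustrate the kurtosis adjustment mentioned in the list above: a normal distribution has a raw kurtosis of 3, so subtracting 3 gives a value where 0 means "as peaked as a normal distribution". The sketch below uses simple population moments and is an assumption for illustration; SOFA may apply a different estimator or bias correction, and the name excess_kurtosis is hypothetical.

```python
def excess_kurtosis(values):
    """Kurtosis from population moments, minus 3 (the Fisher adjustment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    # m4 / m2**2 is the raw kurtosis; subtracting 3 centres it on the
    # value expected for a normal distribution.
    return m4 / m2 ** 2 - 3

print(excess_kurtosis([1, 2, 3, 4, 5]))  # negative: flatter than normal
```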

Note: if upgrading on Linux, the two user folders (sofastats and sofastats_recovery) may be shifted from inside your home folder to a better location e.g. “/home/username/Documents” if free desktop standards are supported. After upgrading you may wish to manually replace the contents of the new folders with the contents of the old ones.

Bug Fixes

  • Fixed small bug stopping column labels displaying in data table view.
  • Fixed bug in recode operation which would wipe the table if any errors at all were encountered trying to turn the user recode config into SQLite update clauses.
  • Fixed bug in getting structured data e.g. for line charts, where a user names a field freq and thus has a conflict with my own freq field. Renamed the internal use field _sofa_freq to prevent collisions.
  • Creating user’s default proj file now copes with apostrophes etc in user path e.g. /Users/Jim’s/etc.
  • The project dialog now displays the default report and css details saved with it from previous occasions.
  • Project settings are only applied if the project is selected – they are not automatically triggered by changes when configuring a project.
  • Multi-line values entered into data cells e.g. variable label settings, automatically have the line breaks converted into spaces. Prevents errors in display of data e.g. in single line text boxes, and problems storing in python scripts (EOL error) etc.
  • Fixed bug where the first SQLite database in a project was assumed to be the default sofa database even though it might not be. Now possible to link to multiple default databases e.g. testing copies etc as long as simple naming convention followed.
  • SOFA now rolls back to the last good database connection if there is a failure.
  • Fixed strange bug where default database would lock if the user made a new table, then looked at its design, then tried to write to the database e.g. importing, editing data. Just refreshed cursor after updating demo table design and problem gone.

Further improvements in 1.1.4

Friday, February 24th, 2012

The latest version adds a range of improvements:

  • Added lower and upper quartiles to Row Stats report tables.

    Quartiles

  • Box plots now start y-axis from just below the minimum y value of the data unless the content is close enough to the bottom of the graph to make it worth using 0 anyway.
    Y axis adjusted automatically for box and whisker plots
  • Showing the percent sign in percent columns for report tables is now optional – which is good news for many dissertation students.

    Show (or hide) percentage symbols

  • SOFA now displays value labels sorted by the numerical version of numbers even if stored as text. So no more 1, 11, 2, 3 etc in cases where people have stored the number as a Text data type.
  • Added some more valid US date formats using dot dividers.
  • New help button for importing data.
  • New help button to advise on how to make use of flexible data filters.
  • English translations are handled better (no more messages about not having US English and using UK English instead etc).
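
The numeric sorting of text values mentioned in the list above can be sketched in Python as follows. This is one possible approach, not SOFA's code; the helper name numeric_sort_key is hypothetical, and placing non-numeric labels after numeric ones is an assumption.

```python
def numeric_sort_key(label):
    """Sort key that orders numeric-looking text by value, then other text."""
    try:
        # Numeric text sorts first, ordered by its numeric value.
        return (0, float(label), label)
    except ValueError:
        # Anything that is not a number sorts afterwards, alphabetically.
        return (1, 0.0, label)

values = ["1", "11", "2", "3"]
print(sorted(values))                        # plain text ordering
print(sorted(values, key=numeric_sort_key))  # ordering by numeric value
```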

Plus there are some useful bug fixes:

  • Fixed bug where getting observed values e.g. for chi square test, fell over when one field in pair had missing values while the other didn’t.
  • Fixed bug in calculation of upper and lower whiskers in box plots.
  • Single bar charts don’t show a bar title anymore – only needed if multichart.
  • Fixed bug which only changed variable definitions when the extra settings dialog was closed with OK and didn’t ever set it otherwise e.g. when changing the selected project.
  • Now copes with newer versions of matplotlib on Linux.
  • No longer stores empty strings as variable labels if user doesn’t enter a label.

Honey I Shrunk the Installers

Monday, December 19th, 2011

The SOFA installers for Windows and Mac have shrunk substantially – from 43MB to 25MB for Windows and from a rather hefty 85MB to 36MB for Mac. They’ll be quicker to download, and the new installers also avoid possible conflicts with other Python packages on a system. It’s all self-contained. A final benefit is that the installation process itself has become much simpler, with far fewer steps. For those who are technically minded, it is thanks to pyinstaller and py2app (with some initial help from Gui2exe).

Mainstream German Computer Magazine Reviews SOFA

Sunday, December 18th, 2011

SOFA has been reviewed and included in the software CD for a recent edition of Germany’s c’t magazine (c’t 2011 Issue 26 p.118). C’t (Magazin für Computertechnik) has a sold circulation of about 367,000 so it was wonderful to show up on their radar.

c't magazine cover

Better installation in non-English environments

Wednesday, November 23rd, 2011

Version 1.1.2 fixes a bug which affected people trying to install SOFA into many non-English environments. SOFA also has some changes which make it safe for SOFA to communicate progress in more detail while being run in Windows using the non-console version of Python. Overall, SOFA has become much more robust in recent versions.

Good news for Mac & Linux users – Excel importing added

Sunday, October 9th, 2011

SOFA Statistics 1.1.1 brings good news for Mac and Linux users. You can now import Excel xls files directly. This is no longer a Windows-only feature.

Here is the full list of changes:

  • Excel can be imported from Mac and Linux as well as Windows.
  • ODS importing now copes with single ‘divider’ columns – i.e. columns with no field name in the header.
  • CSV importing now autofills blank columns with field numbers such as Var018.
  • More informative if locale issues.
  • More informative if unable to connect to MySQL on Mac.
  • Changed standard deviation in report tables from population sd to sample sd.
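
The difference between the two standard deviations in the last item above is the divisor: the population sd divides the sum of squared deviations by n, while the sample sd divides by n - 1 and is therefore slightly larger. A minimal Python sketch using the standard library (the function name both_sds and the scores are illustrative only):

```python
import statistics

def both_sds(values):
    """Return (population sd, sample sd) for the same data."""
    # pstdev divides by n; stdev divides by n - 1.
    return statistics.pstdev(values), statistics.stdev(values)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data only
population_sd, sample_sd = both_sds(scores)
print(f"population sd = {population_sd:.3f}, sample sd = {sample_sd:.3f}")
```

The gap between the two shrinks as n grows, so the change matters most for small tables.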

There is one important set of bug fixes which allows more sophisticated extraction of cell values from ODS spreadsheets. SOFA now copes with formatted content of cells and other complex cases by handling subelements in the XML.