Shared Folders in Linux Guests

Set the shared folder to Auto-mount and Make Permanent.

Then add the main user in the VM to the vboxsf group (in addition to the groups it already belongs to, of course).

On the guest - e.g.:

sudo usermod -a -G vboxsf <username here>
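
To confirm the change took effect, a small check like the one below can be run on the guest. This is a minimal sketch, not part of the original notes: the helper name user_in_group is made up for illustration, it relies on the Unix-only grp module from the standard library, and it only looks at supplementary group membership (the kind usermod -a -G adds). The new membership normally applies only after the user logs out and back in.

```python
import grp


def user_in_group(username, groupname="vboxsf"):
    """Return True if username is listed as a member of groupname.

    Only supplementary membership is checked (what `usermod -a -G` sets).
    """
    try:
        members = grp.getgrnam(groupname).gr_mem
    except KeyError:  # the group does not exist on this machine
        return False
    return username in members
```

For example, user_in_group("g") would be expected to return True once the user g has been added to vboxsf.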

Troubleshooting


sofastats.exe - No hay disco (No disc)

No hay disco en la unidad. Inserte un disco en la unidad \Device\Harddisk3\DR3.

(No disk in the unit. Insert a disc in the unit device/etc)

Cancelar Reintentar Continuar (Cancel, retry, continue)


I am not sure what the problem is but here is a link that could be helpful: It seems to be related to drive letters and possibly USB drives. Does that seem to be the right track? This advice helped another user.

Jokes etc for What's Up


129. A statistics professor was describing sampling theory to his class, explaining how a sample can be studied and used to generalize to a population. One of the students in the back of the room kept shaking his head. “What's the matter?” asked the professor. “I don't believe it,” said the student, “why not study the whole population in the first place?” The professor continued explaining the ideas of random and representative samples. The student still shook his head. The professor launched into the mechanics of proportional stratified samples, randomized cluster sampling, the standard error of the mean, and the central limit theorem. The student remained unconvinced saying, “Too much theory, too risky, I couldn't trust just a few numbers in place of ALL of them.” Attempting a more practical example, the professor then explained the scientific rigor and meticulous sample selection of the Nielsen television ratings which are used to determine how multiple millions of advertising dollars are spent. The student remained unimpressed saying, “You mean that just a sample of a few thousand can tell us exactly what over 250 MILLION people are doing?” Finally, the professor, somewhat disgruntled with the scepticism, replied, “Well, the next time you go to the campus clinic and they want to do a blood test…tell them that's not good enough …tell them to TAKE IT ALL!!”

131. A new Ph.D statistician had just taken a position with the Bureau of Standards. One of his first tasks was to familiarize himself with the volumes of measurement standards for the vast array of objects in the world. He was immediately curious about his own profession and looked up “statistician.” Among the list of physical characteristics, he came across a shocking figure…The mean weight of all statisticians in the world is 3 POUNDS. He gasped in disbelief. He thought surely this was a typographical error and that the first two digits had been omitted. Then he squinted and noticed a small asterisk by this figure. He quickly directed his eyes to the bottom of the page. He sighed a breath of relief as the footnote boldly stated, “INCLUDES URN.”

148. Checking some questionnaires that had just been filled in, a census clerk was amazed to note that one of them contained figures 121 and 125 in the spaces for “Age of Mother, If Living” and “Age of Father, if Living.”

“Surely your parents can't be as old as this?” asked the incredulous clerk.

“Well no,” was the answer, “but they would be IF LIVING!”

160. It is 1941 and the Germans are bombing Moscow. Most people in Moscow flee to the underground bomb shelters at night, except for a famous Russian statistician who tells a friend that he is going to sleep in his own bed, saying that “There is only one of me, among five million other people in Moscow. What are the chances I'll get hit?”

He survives the first night, but the next evening he shows up at the shelter. His friend asks why he has changed his mind. “Well,” says the statistician, “there are five million people in this city, and one elephant in the Moscow Zoo. Last night, THEY GOT THE ELEPHANT!”


1. A statistics major was completely hung over the day of his final exam. It was a True/False test, so he decided to flip a coin for the answers. The stats professor watched the student the entire two hours as he was flipping the coin…writing the answer…flipping the coin…writing the answer. At the end of the two hours, everyone else had left the final except for the one student. The professor walks up to his desk and interrupts the student, saying: “Listen, I have seen that you did not study for this statistics test, you didn't even open the exam. If you are just flipping a coin for your answer, what is taking you so long?”

The student replies bitterly, as he is still flipping the coin: “Shhh! I am checking my answers!”

2. Three professors (a physicist, a chemist, and a statistician) are called in to see their dean. Just as they arrive the dean is called out of his office, leaving the three professors there. The professors see with alarm that there is a fire in the wastebasket.

The physicist says, “I know what to do! We must cool down the materials until their temperature is lower than the ignition temperature and then the fire will go out.”

The chemist says, “No! No! I know what to do! We must cut off the supply of oxygen so that the fire will go out due to lack of one of the reactants.”

While the physicist and chemist debate what course to take, they both are alarmed to see the statistician running around the room starting other fires. They both scream, “What are you doing?”

To which the statistician replies, “Trying to get an adequate sample size.”

Manually Uninstalling


  • remove local folders
    • C:\Documents and Settings\username\sofastats (C:\Users … in Vista and Win7)
  • remove program folder
    • e.g. delete C:\Program Files\sofastats
  • remove any shortcuts
    • desktop
    • sofastats folder under Start>All Programs (Programs in Vista and Win 7)
  • uninstall any associated packages (only possible for Python)
    • In sofalibs python-2.6.5.msi select “Remove Python”

Following Up Potential Clients

Hi X,

I hope you found X useful. Were you able to get reports working, or would you like some help? And if SOFA Statistics was not what you were looking for, do you have any suggestions for how the product could be improved to better meet your needs?

All the best,
Dr Grant Paton-Simpson
SOFA Statistics

Brainstorm software

  • At some point we could add a brainstorm page to our SOFA website so that people can suggest features and vote on them. I know Canonical has improved brainstorm software and I expect it will be open source. Have you ever used it? Very cool and user-friendly and it makes users feel appreciated and listened to. Not to mention the extremely valuable user feedback it gives. 7 noisy people may want advanced feature X but 5,163 ordinary users may want more practical change Y. It will help us respond to where the market is.
  • We could also explicitly mention that we are willing to discuss bounties etc with interested parties e.g. $5,000 for the addition of major functionality - perhaps that required by a business integrator making a killing off our product ;-).

Why SOFA vs Proprietary

Easier to start using - more guidance. Less to learn.

Market Positioning

  • Eliminate some features that are the norm
  • Add some new features that are not the norm
  • Lower the standard of some features below the norm
  • Raise the standard of some features above the norm

SPSS Customers

Is there anything I can learn from what SPSS customers are saying about SPSS? Possible issues from etc:

  • Sloooow startup
  • Slow processing: “I am finding that SPSS 16 isn't just slow on start up, it is MUCH MUCH slower processing too. I'm waiting more than a minute to get frequencies on data sets with 6,000 or so cases. With SPSS 14, my results were very fast. … I dread using the program and would love to have 14 back but we have a site license so I'm stuck.”
  • Outputs can become inaccessible (have to use a special legacy viewer etc)
  • SPSS does not “remember” the size and position of its windows from the previous session
  • licensing activation issues
  • Cross-case functions difficult - LAG is a bit limited etc.
  • Problems with complex conditional joins between datasets.

Paying upstream

  • If we want to pay upstream projects we should do so on the basis that we ask them to prioritise certain features etc that we need. That is what we buy from them. If we do this right, we can get something valuable for our money and the IRD will be happy.

Extensions - keeping it manageable

Both authors reflect on how CRAN is having so many packages (extensions to R core). While the diversity is wonderful, the scalability in the user’s ability to handle the variety is limited. From a user’s perspective it is very hard to find/follow/manage all the innovative R extensions out there. One hope for improvement in this front is the project “Crantastic”, which I hope will get (much) more attention and expansion.

Future of Open Source Survey and R

OpenOffice Libraries

NB I am interested in reading from simple ODS files (rows and columns of data, no formulae etc), not OpenOffice Calc per se. Gnumeric is also important.

Platform Options (alternatives to wxPython)

In Python: a.) use Eric4 IDE as the core engine and layer your UI / algorithms on top b.) use Orange, data mining tool with a great UI

In Java: c.) Eclipse's RCP (Rich Client Platform) Many, many great programs use the Eclipse RCP “engine” to build great apps on top. A few examples: and

All approaches above will allow you to get out of the “infrastructure” business (use theirs), while focusing on your true value-add: the algorithms and user experience. Just a suggestion.

Jose C. Lacal Vice President, Advanced Technology Data Stream Content Solutions, LLC 5000 College Ave College Park, MD 20742 +1 (561) 523-9056

Lines of Code

  • Copy sofa.main into SOFA/storage/sloc
  • Delete non-code or external code folders e.g. googleapi, boomslang, tests etc
  • sloccount /home/g/projects/SOFA/storage/sloc/sofa.main
  • or sloccount --details /home/g/projects/SOFA/storage/sloc/sofa.main
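
The copy-and-prune steps above could be scripted along these lines. This is only a sketch under stated assumptions: the function name make_sloc_copy and its skip parameter are hypothetical, and the skipped folder names are just the examples given in the list. It uses shutil.copytree with shutil.ignore_patterns from the standard library, so anything named in skip is left out at every level of the tree.

```python
import os
import shutil


def make_sloc_copy(src, dest, skip=("googleapi", "boomslang", "tests")):
    """Copy src to dest, leaving out folders that should not be counted."""
    if os.path.exists(dest):
        shutil.rmtree(dest)  # start from a clean copy each time
    # ignore_patterns matches names at every depth of the source tree
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns(*skip))
    return dest
```

sloccount can then be pointed at the returned folder as in the commands above.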

Google API

  • use latest source files gdata download
  • only bring across the folders needed
  • add any missing __init__ modules
  • create any imports needed e.g. from ..googleapi.gdata import spam as eggs


  • NB Win, Deb, and Mac
  • Add folders and files
  • Add to licence (Apache 2)


import gettext
import os
import platform
import sys
import wx

import my_globals as mg

test_lang = False

print("About to get path")
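path = sys.path[0]
if isinstance(path, bytes):  # Python 2 gives a byte string; Python 3 already gives text
    path = path.decode(sys.getfilesystemencoding())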
print("About to get langdir")
langdir = os.path.join(path,u'locale')
print("About to get langid")

class MsgFrameTest(wx.Frame):
    def __init__(self, msg):
        # plain string: _() is not available until mytrans.install() has run
        wx.Frame.__init__(self, None, title="SOFA Test Message")

class MsgAppTest(wx.App):

    def __init__(self, msg):
        self.msg = msg
        wx.App.__init__(self, redirect=False, filename=None)

    def OnInit(self):
        msgframe = MsgFrameTest(self.msg)
        return True

def show_msg(msg):
    msgapptest = MsgAppTest(msg)
    del msgapptest

langid = wx.LANGUAGE_GALICIAN if test_lang else wx.LANGUAGE_DEFAULT
# next line will only work if locale is installed on the computer
show_msg("About to get mylocale")
mylocale = wx.Locale(langid) #, wx.LOCALE_LOAD_DEFAULT)
show_msg("About to get canon_name")
canon_name = mylocale.GetCanonicalName() # e.g. en_NZ, gl_ES etc
# want main title to be right size but some langs too long for that
show_msg("About to get main_font_size (albeit not added to self ;-)))")
main_font_size = 20 if canon_name.startswith('en_') else 16
show_msg("About to get mytrans")
mytrans = gettext.translation(u"sofa", langdir,
                              languages=[canon_name], fallback=True)
show_msg("About to install mytrans")
mytrans.install()  # makes _() available as a builtin
if platform.system() == u"Linux":
    try:
        # to get some language settings to display properly:
        os.environ['LANG'] = u"%s.UTF-8" % canon_name
    except (ValueError, KeyError):
        pass  # carry on with the existing LANG setting
msgapptest = MsgAppTest("About to set geometry etc")
mg.MAX_WIDTH = wx.Display().GetGeometry()[2]
mg.MAX_HEIGHT = wx.Display().GetGeometry()[3]
mg.HORIZ_OFFSET = 0 if mg.MAX_WIDTH < 1224 else 200
del msgapptest

Multiple version issues

* Coping with multiple versions of wxPython -

Technologies used and why

Why Python?

  • beautiful language to work with :-)
  • cross platform
  • strong in scientific/mathematical/academic communities
  • easy to learn so that users can string exported scripts together when automating report building
  • a good choice for attracting developers
  • a first-class citizen on Linux, so learning it is useful beyond unlocking SOFA's more advanced functionality

Why wxPython?

  • native look and feel across platforms
  • an active community
  • a mature code base

Why wxWebKit?

  • cross platform
  • capable of sophisticated tabular and graphical output

sofa_report_extras vs all in one report subfolder

  • Too much space if all JS kept with each report. Major bloat.
  • Images folder currently only exists if there are PNG images (although this could be overcome easily enough)

Cross Platform Development

Making data changes using underlying SQL database


One option is to install the free and open source SQLite Data Browser. It is quite lightweight and simple. You can manipulate data there very easily with your own data queries. The SQLite database that SOFA stores its data in is called sofa_db and can be found in:

  • C:\Documents and Settings\username\sofastats\_internal (Windows XP)
  • C:\Users\username\sofastats\_internal (Windows Vista/7)
  • /home/username/sofastats/_internal (Ubuntu/Linux Mint)
  • /Users/username/sofastats/_internal (Mac OS X)

To open that database, sofa_db, within SQLite Data Browser:

File > Open Database
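
If a GUI is not available, the same file can also be inspected with Python's built-in sqlite3 module. The sketch below is an illustration only: the helper names default_sofa_db_path and list_tables are made up, and the path layout is simply the sofastats/_internal/sofa_db location described above. It lists table names by querying sqlite_master and does not assume anything about what SOFA stores in them.

```python
import os
import sqlite3


def default_sofa_db_path(home=None):
    """Return the sofa_db path under a home folder (layout as described above)."""
    home = home or os.path.expanduser("~")
    return os.path.join(home, "sofastats", "_internal", "sofa_db")


def list_tables(db_path):
    """Return the names of the tables stored in a SQLite database file."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
        return [row[0] for row in cur.fetchall()]
    finally:
        con.close()


if __name__ == "__main__":
    db_path = default_sofa_db_path()
    if os.path.exists(db_path):  # connect() would otherwise create an empty file
        print(list_tables(db_path))
    else:
        print("No database found at %s" % db_path)
```

It is probably wise to work on a copy of sofa_db, and to close SOFA first, before changing any data this way.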


This is how Bachoura et al (2012) cited SOFA:

Data were analyzed using Statistics Open For All (SOFA) 1.1.5 software (Paton-Simpson & Associates Ltd, New Zealand).

proj/misc.txt · Last modified: 2014/06/29 20:36 by admin