Which statistical package?

My research stretches across clinical psychology, behavioural neuroscience and neuropsychology (and now has a smidge of epidemiology thrown in). Researchers in these fields use different statistical approaches and, reflecting this, different packages. SPSS is common, although the neuroscience contingent (especially the younger researchers) tends to use R. I’m a long-time, disgruntled SPSS user. Disgruntled because it’s often clunky and slow, it’s a pain to produce attractive graphs, and it’s also expensive. (It’s free through most universities, but if I wanted to use it on my own I would have to pay a fairly costly yearly subscription fee.)

So I’ve been teaching myself R, which has a rather steep learning curve but is probably the best long-term option. In the meantime, I’ve also trialled two other free, easily available packages: PSPP and JASP. (I’ve also dipped a toe into Stata, which seems like a pretty good – paid – program, but I haven’t used it enough to talk about it at length.)

R is definitely the most powerful and flexible of these, and the graphs it produces are very pretty. It’s open-source, and it has a large international user base. New packages for R are always being developed, and support is usually available (although I find that those who ask questions are similar in skill level to me, i.e. novices, whereas those providing answers are…not, which often makes the answers hard to understand). I use RStudio, which provides a nicer environment through which to use R. That being said, it’s not a point-and-click interface at all, so to use R you have to write syntax/code. There are lots of free web-based tutorials on using R. One of these, which looks good for new users, is provided by The Analysis Factor.

PSPP is an open-source take on SPSS that looks and feels very similar, so it’s a great choice for someone who needs relatively basic stats but doesn’t want to pay for SPSS. Pros: it will read SPSS data and syntax files, and its syntax is almost identical, so if you’re familiar with SPSS it’s an easy transition. Even if you just need to get some data from SPSS files into something else, this will work. You can also edit data in it, i.e. create new variables, edit values within your variables, etc. Cons: it doesn’t have the full functionality of SPSS (e.g. GLM does not support continuous variables/covariates), and graphs are *very* basic. From memory it’s also a bit fiddly to install, but good instructions are available and it’s definitely worth it for a free “SPSS lite” program.

JASP is a good little program. Pros: it is very clean and simple to use. It reads both CSV and SPSS (*.sav) data files, and it does some surprisingly funky stuff, like Bayesian analyses (not that I know much about Bayesian statistics at present). While it only has a limited number of analyses available, it does these well, and it produces nice graphs very easily. Cons: it is all point-and-click, i.e. you can’t use syntax. You also can’t use it to edit data at all. It lacks some of the more advanced options available in SPSS and R, such as bootstrapping.

For now, until I get better at R and can switch to using it full-time, I am using bits and pieces from all these packages (…as I procrastinate by writing this blog post instead of working on my second paper…).


Back to Stats

Tools of the trade: coloured whiteboard markers, eraser, Casio calculator of a vintage that makes me feel old, tissues, and mints because talking for hours requires minty fresh sustenance. Plus accidental e-reader.

It’s the first week of tutorials for the undergrads, and the first day of tutoring for me. I started doing university tutoring two years ago, not having done any kind of teaching before, and (mostly) loved it, so here I am, back again, doing it alongside research and other work.

I’ve tutored various 1st, 2nd and 3rd year units, but most of the time I stick with Statistics. Why Stats? Quite a few of the students I teach openly admit they’re scared of Stats. So I give them a bit of a spiel at the start of the semester. Stats is important, most obviously if you’re running your own research, because it lets you make sense of your data and see how your hypotheses fared. But even if you don’t go on to run your own experiments, in whatever area of science or health science you end up in, you’ll be able to critically evaluate journal articles, for example about different treatments, and make up your own mind about the results*. And even if you don’t stay in science, if you get Stats you will find people who want to be your friends, because so many people are scared of Stats**. Stats is also relevant to lots of other areas, like marketing and politics.

Riveting stuff 😉

But I do think the above is true, and the reason I generally choose to tutor Stats over other areas is that I want to make it a bit less scary for the students, and hopefully get some of them interested in Stats. (There are also other, more selfish reasons, like keeping it fresh in my mind for my own research needs, and the marking being more objective and straightforward than in other subjects. And professionally selfish reasons too, like increasing the Stats literacy of the future Psychology workforce.)


* What I don’t tell them is that it takes a long time to really be able to engage critically with a paper’s results section, and a fair bit of knowledge: not only of statistics, but also of research methods in general, and often of the particular area of research.

** You might prefer people to befriend you based on your stellar personality and sparkling wit, but as a fellow Stats enthusiast I’m certain you possess both of these attributes in spades.