Replicate My Work!

Scientific work requires transparency. There is no mad genius in his/her lonely tower working for years on end on some great invention. While it may be true that professors have little time for anything but their research, they communicate their findings (along with their methods). Science is a social enterprise. Primed by Gary King’s essay “Replication, Replication” (1995) and lectures by Rainer Schnell, I arrived at the conclusion that a scientific workflow must be a reproducible workflow. I do think that making replication material broadly available is a good thing for everyone involved.

Replication materials for my recent publications can now be found online. Maintaining a reproducible workflow is hard work but rewarding. Looking back, I could have improved a lot of things (without changing the results, mind you). Putting the material online felt a bit awkward at first. Soon enough it felt even more awkward to have waited so long to do it. I wish I could share material (and also raw data) for more of my older publications, but privacy laws, work contracts, and fellow psychologists who are highly skeptical of these ideas keep me from doing so.

Hopefully, the present material is just the beginning. Sadly, most psychologists do not share their materials publicly, so I had to figure out most of it on my own. I decided against third-party repositories because some focus solely on data sets whereas others are somewhat difficult to handle. So I wrote the HTML by hand, hoping that a plain format allows for longevity. Let me know if you have any suggestions for improvements.

Practical tips for statisticians (part 7)

A couple of days ago I got hold of the book The Workflow of Data Analysis Using Stata by J. Scott Long. I haven’t yet delved into it. But I’m already loving and condemning it. Loving it, because it covers an integral part of scientific data analysis, filling a void left by both the literature and the courses taught at university. Condemning it, because I had wanted to write a book on the same topic (how to ensure your data analysis is documented well, i.e., replicable) over the next few years. It wouldn’t have been the same book; in fact, it would have been vastly different, possibly much worse.

It’s too early for me to review the book in a conclusive manner. Still, the content looks very promising and I think it’s telling that Long focuses on Stata as the software of choice. This is going to be fun!