When is a Measure of Oxytocin No Such Thing?
A few days ago, I promised I would demonstrate how you can use Denny Borsboom and colleagues’ concept of validity to evaluate whether a scientific device is a valid tool for scientific measurement. To review, Borsboom and colleagues argued that we can claim that a scientific device D provides valid measurement of an invisible substance, force, or trait (which I will represent as T) when two conditions are met:
(1) T must exist;
(2) T must cause physical changes to D that can be read off as measurements of T.
Once you accept this definition of validity, evaluating whether a scientific device is actually a scientific measure becomes simple (not necessarily easy, but simple), and even fun. You need to concern yourself with finding answers to exactly two questions. First, does this thing that researchers call T even exist? In other words, is T what philosophers of science would call a “natural kind”? Second, if T does exist, are we justified in believing that the natural kind we have named T causes physical changes to the Device that can then be read off as measurements of T?
In this post, I’ll use this approach to think through the validity of a biological assay technique that is often used in hopes of measuring oxytocin in human body fluids such as blood plasma, serum, or saliva. I’ve written a bit on this blog about research in humans on the social causes and effects of oxytocin (for example, here, here, and here). My colleagues and I see signs that a lot of the enthusiasm for this research is being driven by wishful thinking about whether the devices that are being called oxytocin assays are actually valid measures. To gain purchase on this particular validity problem, Borsboom and colleagues’ concept of validity tells you everything you should want to know: First, you will want to know whether there is a natural kind in the world that corresponds to the concept that we have decided to call oxytocin. Second, you will want to know whether that natural kind that we are calling oxytocin is responsible for physical changes in the Device. That’s all you need to care about.
The consequence of accepting this simple but strict definition is liberating. Among other things, you can brush aside validity arguments that rest on claims that individual differences in measured levels of oxytocin are correlated (for instance) with self-ratings of social support, or scores on a measure of empathic accuracy, or how many Facebook friends people have. Sure, all of those correlations might fit with somebody’s theory of oxytocin, but validity arguments that rest on correlational claims like that are so 20th-century.
All you need to concern yourself with is (a) whether oxytocin exists; and (b) whether oxytocin is causally responsible for the scores that your device produces. You can quickly satisfy yourself that (a) is true: Sir Henry Dale extracted oxytocin from the human pituitary gland in 1909. The biochemist Vincent du Vigneaud identified its molecular structure in 1953. So, all that’s left to confirm is (b).
And how do we confirm (b)? Through experimental research, not correlational research. Quite simply, the question we want an answer to is this: Does the nine-amino-acid substance that we have come to call oxytocin exert causal effects on the physical states of a particular device that we can read off as measurements of that substance? If so, when you add known quantities of oxytocin to a container (or to an animal) that has zero oxytocin in it, the device should then undergo physical changes in proportion to the amount of oxytocin that you added. From those changes, it should be possible to work backwards and solve for the amount of oxytocin that was added in the first place. If you can’t do that (and, as a few of us have been arguing, with some of the most popular approaches to assaying oxytocin, you can’t), then you should doubt the validity of that particular assay. Indeed, it might be more accurate to view such a device as a very expensive random number generator.
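The add-a-known-amount-and-work-backwards logic can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the argument above: the device's response is treated as linear (real assays often use nonlinear standard curves), every number is made up for the example, and the names `fit_line` and `back_calculate` are hypothetical helpers, not part of any assay kit or library.

```python
from statistics import mean


def fit_line(added, readings):
    """Least-squares fit of readings = intercept + slope * added; also returns r^2."""
    mx, my = mean(added), mean(readings)
    sxx = sum((x - mx) ** 2 for x in added)
    syy = sum((y - my) ** 2 for y in readings)
    sxy = sum((x - mx) * (y - my) for x, y in zip(added, readings))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


def back_calculate(reading, slope, intercept):
    """Work backwards from a device reading to the amount that was added."""
    return (reading - intercept) / slope


# Hypothetical, made-up numbers for illustration only (not real assay data).
added = [0.0, 10.0, 20.0, 40.0, 80.0]          # known amounts spiked in
responsive = [2.0, 12.0, 22.0, 42.0, 82.0]     # device tracks what was added
unresponsive = [30.0, 28.0, 31.0, 29.0, 30.0]  # device ignores what was added

for label, readings in [("responsive", responsive), ("unresponsive", unresponsive)]:
    slope, intercept, r2 = fit_line(added, readings)
    print(f"{label}: slope={slope:.3f}, intercept={intercept:.2f}, r^2={r2:.3f}")
```

For the made-up responsive device, the fitted line lets you recover an added amount from a new reading with `back_calculate`. For the unresponsive one, the slope is close to zero and the fit explains almost none of the variation in the readings, so any amount you solve for says nothing about what was actually added.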
The oxytocin assays I am referring to here fail Borsboom’s validity test because they fail on criterion (b): Changes in physical states of those assays cannot be read off veridically as changes in oxytocin. Other measures can fail the Borsboom test for a more interesting reason: The trait that they supposedly measure doesn’t exist in the first place.
I’ll look at that scenario in my next post.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2004). The concept of validity. Psychological Review, 111, 1061-1071.
Postscript: Denny Borsboom tells me that the reason that their paper is not visible to Google Scholar is that the citations for that paper are getting merged with another one of their papers. What a drag.