to each their own addiction

An only slightly fictionalized story, for my long-suffering wife.

“It’s happening again,” I tell my wife from the couch. “I’m having that soul-crushing experience again.”

“Too much work?” she asks, expecting the answer to be yes, since no matter what quantity of work I’m actually burdened with at any given moment, the way I describe it to other people when they ask is always “too much.”

“No,” I say. “Work is fine right now.”

“Had a paper rejected?”

“Pfft, no,” I say. “Like that ever happens to me!” (I don’t tell her it’s happened to me twice in the past week.)

“Then what?”

“The blog posts,” I tell her, motioning to my laptop screen. “There’s just too many of them in my Reader. I can’t keep up! I’m drowning in RSS feeds!”

My wife has learned not to believe anything I say, ever; we’ve lived together long enough that her modal response to my complaints is an arched eyebrow. So I flip my laptop around and point at the gigantic bolded text in the corner that says All Items (118). Emotionally gigantic, I mean; physically, I think it’s only like 12 point font.

“One hundred and eighteen blog posts!” I yell at absolutely no one. “I’m going to be here all night!”

“That’s because you live here,” she helpfully points out.

I’m not sure exactly when I became enslaved by my blog feeds. I know it was sometime after Carl Zimmer‘s amazing post about the man-eating fireflies of Sri Lanka, and sometime before the Neuroskeptic self-published his momentous report introducing three entirely new mental health diagnoses. But that’s as much as I can tell you; the rest is lost in a haze of rapid-scrolling text, retweeted links, and never-ending comment threads. There’s no alarm bell that sounds out loud to indicate that you’ve stomped all over the line that separates occasional indulgence from outright “I can quit any time, honest!” abuse. No one shows up at your door, hands you a bucket of Skittles, and says, “congratulations! You’re hooked on feeds!”

The thought of all those unread posts piling up causes me to hyperventilate. My wife, who sits unperturbed in her chair as 1,000+ unread articles pile up in her Reader, stares at me with a mixture of bemusement and horror.

“Let’s go for a walk,” she suggests, making a completely transparent effort to distract me from my immense problems.

Going for a walk is, of course, completely out of the question; I still have 118 blog posts to read before I can do anything else. So I read all 118 posts, which turns out not to take all night, but more like 15 minutes (I have a very loose definition of reading; it’s closer to what other people call ‘seeing’). By the time I’ve done that, the internet has written another 8 new articles, so now I feel compelled to read those too. So I do that, and then I hit refresh again, and lo and behold, there are 2 MORE articles. So I grudgingly read those as well, and then I quickly shut my laptop so that no new blog posts can sneak up on me while I’m off hanging out in Microsoft Word pretending to do work.

Screw this, I think after a few seconds, and run to find my wife.

“Come on, let’s go for that walk,” I say, running as fast as I can towards my sandals.

“What’s the big rush,” she asks. “I want to go walking, not jogging; I already went to the gym today.”

“No choice,” I say. “We have to get back before the posts pile up again.”

“What?”

“I said, I have a lot of work to do.”

So we go out walking, and it’s nice and all that; the temperature is probably around 70 degrees; it’s cool and dry and the sun’s just going down; the ice cream carts are out in force on the Pearl Street mall; the jugglers juggle and the fire eaters eat fire and give themselves cancer; a little kid falls down and skins his knee but gets up and laughs like it didn’t even hurt, which it probably didn’t, because everyone knows children under seven years of age don’t have a central nervous system and can’t feel pain. It’s a really nice walk, and I’m happy we’re on it, but the whole time I keep thinking, How many dozens of posts has PZ Myers put up while I’ve been gone? Are Razib Khan and Ed Yong posting their link dumps as I think this? And what’s the over-under on the number of posts in my ‘cog blogs’ folder?

She sees me doing all this of course, and she’s not happy about it. So she lets me know it.

“I’m not happy about this,” she says.

When we get back, we each head back to our respective computer screens. I’m relieved to note that the internet’s only made 11 more deliveries, which I promptly review and discharge. I star two posts for later re-consideration and let the rest disappear into the ether of spent words. Then I open up a manuscript I’ve been working on for a while and pretend to do some real work for a couple of hours. With periodic edutainment breaks, of course.

Around 11:30 pm I decide to close up shop for the night. No one really blogs after about 9 pm, which is fortunate, or I’d never get any sleep. It’s also the reason I avoid subscribing to European blogs if I can help it. Europeans have no respect for Mountain Time.

“Are you coming to bed,” I ask my wife.

“Not yet,” she says, looking guilty and avoiding eye contact.

“Why not? You have work to do?”

“Nope, no work.”

“Cooking? Are you making a fancy meal for dinner tomorrow?”

“No, it’s your turn to cook tomorrow,” she says, knowing full well that my idea of cooking consists of a take-out menu and telephone.

“Then what?”

She opens her mouth, but nothing comes out. The words are all jammed tightly in between her vocal cords.

Then I see it, poking out on the couch from under a pillow: green cover, 9 by 6 inches, 300 pages long. It’s that damn book!

“You’re reading Pride and Prejudice again,” I say. It’s an observation, not a question.

“No I’m not.”

“Yes you are. You’re reading that damn book again. I know it. I can see it. It’s right there.” I point at it, just so that there can’t possibly be any ambiguity about which book I’m talking about.

She gazes around innocently, looking at everything but the book.

“What is that, like the fourteenth time this year you’ve read it?”

“Twelfth,” she says, looking guilty. “But really, go to bed without me; I might be up for a while still. I have another fifty pages or so I need to finish before I can go to sleep. I just have to find out if Elizabeth Bennet and Mr. Darcy end up together.”

I look at her mournfully, quietly shut my laptop’s lid, and bid the both of them–wife and laptop–good night. My wife grudgingly nods, but doesn’t look away from Jane Austen’s pages. My RSS feeds don’t say anything either.

“Yes,” I mumble to no one in particular, as I slowly climb up the stairs and head for my toothbrush.

“Yes, they do end up together.”

you can’t make this stuff up (but Freud could)

Out of idle curiosity, I just spent a few minutes looking up the origin of the phrase “the narcissism of small differences.” Turns out it’s one of Freud’s many contributions to our lexicon, and originates in his 1917 article The Taboo of Virginity:

Crawley, in terms that are hardly distinguishable from those employed by psychoanalysis, sets forth how each individual is separated from the others by a “taboo of personal isolation” and that it is precisely the little dissimilarities in persons who are otherwise alike that arouse feelings of strangeness and enmity between them. It would be tempting to follow up this idea and trace back to this “narcissism of small differences” the antagonism which in all human relations we see successfully combating feelings of fellowship and the commandment of love towards all men.

…so there’s that question answered. As Freud goes, this is positively lucid prose; for context, the very next sentence is: “Psychoanalysis believes that, in pointing out the castration complex and its influence on the estimation in which women are held, it has discovered one of the chief factors underlying the narcissistic rejection of women by men that is so liberally mingled with disdain.”

And then there are lots of other little gems in the same article, like this one:

We know, however, that the first act of intercourse is by no means always followed by this behaviour; very often the experience merely signifies a disappointment to the woman, who remains cold and unsatisfied; usually it takes some time and frequent repetition of the sexual act before satisfaction in it for her too sets in.

Freud justifiably gets a lot of credit for revolutionizing the study of the mind, but it’s worth remembering that he also did a lot of cocaine.

of postdocs and publishing models: two opportunities of (possible) interest

I don’t usually use this blog to advertise things (so please don’t send me requests to publicize your third cousin’s upcoming bar mitzvah), but I think these two opportunities are pretty cool. They also happen to be completely unrelated, but I’m too lazy to write two separate posts, so…

Opportunity 1: We’re hiring!

Well, not me personally, but a guy I know. My current postdoc advisor, Tor Wager, is looking to hire up to 4 postdocs in the next few months to work on various NIH-funded projects related to the neural substrates of pain and emotion. You would get to play with fun things like fMRI scanners, thermal stimulators, and machine learning techniques. Oh, and snow, because we’re located in Boulder, Colorado. So we have. A lot. Of snow.

Anyway, Tor is great to work with, the lab is full of amazing people and great resources, and Boulder is a fantastic place to live, so if you have (or expect to soon have) a PhD in affective/cognitive neuroscience or related field and a background in pain/emotion research and/or fMRI analysis and/or machine learning and/or psychophysiology, you should consider applying! See this flyer for more details. And no, I’m not being paid to say this.

Opportunity 2: Design the new science!

That’s a cryptic way of saying that there’s a forthcoming special issue of Frontiers in Computational Neuroscience that’s going to focus on “Visions for Open Evaluation of Scientific Papers by Post-Publication Peer Review.” As far as I can tell, that basically means that if you’re like every other scientist, and think there’s more to scientific evaluation than the number of publications and citations one has, you now have an opportunity to design a perfect evaluation system of your very own–meaning, of course, the system in which you end up at or near the very top.

In all seriousness though, this seems like a really great idea, and I think it’s the kind of thing that could actually have a very large impact on how we’re all doing–or at least communicating–science 10 or 20 years from now. The special issue will be edited by Niko Kriegeskorte, whose excellent ideas about scientific publishing I’ve previously blogged about, and Diana Deca. Send them your best ideas! And then, if it’s not too much trouble, put my name on your paper. You know, as a finder’s fee. Abstracts are due January 15th.

The psychology of parapsychology, or why good researchers publishing good articles in good journals can still get it totally wrong

Unless you’ve been pleasantly napping under a rock for the last couple of months, there’s a good chance you’ve heard about a forthcoming article in the Journal of Personality and Social Psychology (JPSP) purporting to provide strong evidence for the existence of some ESP-like phenomenon. (If you’ve been napping, see here, here, here, here, here, or this comprehensive list). In the article–appropriately titled Feeling the Future–Daryl Bem reports the results of 9 (yes, 9!) separate experiments that catch ordinary college students doing things they’re not supposed to be able to do–things like detecting the on-screen location of erotic images that haven’t actually been presented yet, or being primed by stimuli that won’t be displayed until after a response has already been made.

As you might expect, Bem’s article’s causing quite a stir in the scientific community. The controversy isn’t over whether or not ESP exists, mind you; scientists haven’t lost their collective senses, and most of us still take it as self-evident that college students just can’t peer into the future and determine where as-yet-unrevealed porn is going to soon be hidden (as handy as that ability might be). The real question on many people’s minds is: what went wrong? If there’s obviously no such thing as ESP, how could a leading social psychologist publish an article containing a seemingly huge amount of evidence in favor of ESP in the leading social psychology journal, after being peer reviewed by four other psychologists? Or, to put it in more colloquial terms–what the fuck?

What the fuck?

Many critiques of Bem’s article have tried to dismiss it by searching for the smoking gun–the single critical methodological flaw that dooms the paper. For instance, one critique that’s been making the rounds, by Wagenmakers et al, argues that Bem should have done a Bayesian analysis, and that his failure to adjust his findings for the infinitesimally low prior probability of ESP (essentially, the strength of subjective belief against ESP) means that the evidence for ESP is vastly overestimated. I think these types of argument have a kernel of truth, but also suffer from some problems (for the record, I don’t really agree with the Wagenmakers critique, for reasons Andrew Gelman has articulated here). Having read the paper pretty closely twice, I really don’t think there’s any single overwhelming flaw in Bem’s paper (actually, in many ways, it’s a nice paper). Instead, there are a lot of little problems that collectively add up to produce a conclusion you just can’t really trust. Below is a decidedly non-exhaustive list of some of these problems. I’ll warn you now that, unless you care about methodological minutiae, you’ll probably find this very boring reading. But that’s kind of the point: attending to this stuff is so boring that we tend not to do it, with potentially serious consequences. Anyway:

  • Bem reports 9 different studies, which sounds (and is!) impressive. But a noteworthy feature of these studies is that they have grossly uneven sample sizes, ranging all the way from N = 50 to N = 200, in blocks of 50. As far as I can tell, no justification for these differences is provided anywhere in the article, which raises red flags, because the most common explanation for differing sample sizes–especially on this order of magnitude–is data peeking. That is, what often happens is that researchers periodically peek at their data, and halt data collection as soon as they obtain a statistically significant result. This may seem like a harmless little foible, but as I’ve discussed elsewhere, it’s actually a very bad thing, as it can substantially inflate Type I error rates (i.e., false positives). To his credit, Bem was at least being systematic about his data peeking, since his sample sizes always increase in increments of 50. But even in steps of 50, false positives can be grossly inflated. For instance, for a one-sample t-test, a researcher who peeks at her data in increments of 50 subjects and terminates data collection when a significant result is obtained (or N = 200, if no such result is obtained) can expect an actual Type I error rate of about 13%–nearly 3 times the nominal rate of 5%! (A quick simulation illustrating this appears after this list.)
  • There’s some reason to think that the 9 experiments Bem reports weren’t necessarily designed as such; rather, they appear to have been ‘lumped’ or ‘split’ post hoc based on the results. For instance, Experiment 2 had 150 subjects, but the experimental design for the first 100 differed from the final 50 in several respects. They were minor respects, to be sure (e.g., pictures were presented randomly in one study, but in a fixed sequence in the other), but were still comparable in scope to those that differentiated Experiment 8 from Experiment 9 (which had the same sample size splits of 100 and 50, but were presented as two separate experiments). There’s no obvious reason why a researcher would plan to run 150 subjects up front, then decide to change the design after 100 subjects, and still call it the same study. A more plausible explanation is that Experiment 2 was actually supposed to be two separate experiments (a successful first experiment with N = 100 followed by an intended replication with N = 50) that were collapsed into one large study when the second experiment failed–preserving the statistically significant result in the full sample. Needless to say, this kind of lumping and splitting is liable to additionally inflate the false positive rate.
  • Most of Bem’s experiments allow for multiple plausible hypotheses, and it’s rarely clear why Bem would have chosen, up front, the hypotheses he presents in the paper. For instance, in Experiment 1, Bem finds that college students are able to predict the future location of erotic images that haven’t yet been presented (essentially a form of precognition), yet show no ability to predict the location of negative, positive, or romantic pictures. Bem’s explanation for this selective result is that “… such anticipation would be evolutionarily advantageous for reproduction and survival if the organism could act instrumentally to approach erotic stimuli …”. But this seems kind of silly on several levels. For one thing, it’s really hard to imagine that there’s an adaptive benefit to keeping an eye out for potential mates, but not for other potential positive signals (represented by non-erotic positive images). For another, it’s not like we’re talking about actual people or events here; we’re talking about digital images on an LCD. What Bem is effectively saying is that, somehow, someway, our ancestors evolved the extrasensory capacity to read digital bits from the future–but only pornographic ones. Not very compelling, and one could easily have come up with a similar explanation in the event that any of the other picture categories had selectively produced statistically significant results. Of course, if you get to test 4 or 5 different categories at p < .05, and pretend that you called it ahead of time, your false positive rate isn’t really 5%–it’s closer to 20%.
  • I say p < .05, but really, it’s more like p < .1, because the vast majority of tests Bem reports use one-tailed tests–effectively instantaneously doubling the false positive rate. There’s a long-standing debate in the literature, going back at least 60 years, as to whether it’s ever appropriate to use one-tailed tests, but even proponents of one-tailed tests will concede that you should only use them if you really truly have a directional hypothesis in mind before you look at your data. That seems exceedingly unlikely in this case, at least for many of the hypotheses Bem reports testing.
  • Nearly all of Bem’s statistically significant p values are very close to the critical threshold of .05. That’s usually a marker of selection bias, particularly given the aforementioned unevenness of sample sizes. When experiments are conducted in a principled way (i.e., with minimal selection bias or peeking), researchers will often get very low p values, since it’s very difficult to know up front exactly how large effect sizes will be. But in Bem’s 9 experiments, he almost invariably collects just enough subjects to detect a statistically significant effect. There are really only two explanations for that: either Bem is (consciously or unconsciously) deciding what his hypotheses are based on which results attain significance (which is not good), or he’s actually a master of ESP himself, and is able to peer into the future and identify the critical sample size he’ll need in each experiment (which is great, but unlikely).
  • Some of the correlational effects Bem reports–e.g., that people with high stimulus seeking scores are better at ESP–appear to be based on measures constructed post hoc. For instance, Bem uses a non-standard, two-item measure of boredom susceptibility, with no real justification provided for this unusual item selection, and no reporting of results for the presumably many other items and questionnaires that were administered alongside these items (except to parenthetically note that some measures produced non-significant results and hence weren’t reported). Again, the ability to select from among different questionnaires–and to construct custom questionnaires from different combinations of items–can easily inflate Type I error.
  • It’s not entirely clear how many studies Bem ran. In the Discussion section, he notes that he could “identify three sets of findings omitted from this report so far that should be mentioned lest they continue to languish in the file drawer”, but it’s not clear from the description that follows exactly how many studies these “three sets of findings” comprised (or how many ‘pilot’ experiments were involved). What we’d really like to know is the exact number of (a) experiments and (b) subjects Bem ran, without qualification, and including all putative pilot sessions.
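
To make the data-peeking numbers above concrete, here’s a minimal simulation sketch in Python (using numpy and scipy; nothing here comes from Bem’s paper, and the one-sample t-test setup is purely illustrative):

```python
# Simulate "data peeking": run a one-sample t-test after every 50 subjects
# (up to 200) and stop as soon as p < .05. The null hypothesis is true here,
# so every "significant" result is a false positive. The nominal rate per
# test is 5%, but the stopping rule pushes the overall rate to roughly 13%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims = 20_000
peeks = [50, 100, 150, 200]          # sample sizes at which the data are checked
false_positives = 0

for _ in range(n_sims):
    data = rng.standard_normal(200)  # true effect size is exactly zero
    for n in peeks:
        if stats.ttest_1samp(data[:n], 0).pvalue < .05:
            false_positives += 1     # "significant" result: stop collecting and report
            break

print(f"Empirical Type I error rate: {false_positives / n_sims:.3f}")  # ~0.13
```

The same arithmetic applies to the picture-category point above: testing 4 independent categories at p < .05 and reporting whichever one comes out significant gives a family-wise false positive rate of roughly 1 − .95⁴ ≈ 19%, in line with the “closer to 20%” figure.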

It’s important to note that none of these concerns is really terrible individually. Sure, it’s bad to peek at your data, but data peeking alone probably isn’t going to produce 9 different false positives. Nor is using one-tailed tests, or constructing measures on the fly, etc. But when you combine data peeking, liberal thresholds, study recombination, flexible hypotheses, and selective measures, you have a perfect recipe for spurious results. And the fact that there are 9 different studies isn’t any guard against false positives when fudging is at work; if anything, it may make it easier to produce a seemingly consistent story, because reviewers and readers have a natural tendency to relax the standards for each individual experiment. So when Bem argues that “…across all nine experiments, Stouffer’s z = 6.66, p = 1.34 × 10⁻¹¹,” the claim that the cumulative p value is 1.34 × 10⁻¹¹ is close to meaningless. Combining p values that way would only be appropriate under the assumption that Bem conducted exactly 9 tests, and without any influence of selection bias. But that’s clearly not the case here.
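
The mechanics of a Stouffer’s z combination are simple enough to show in a few lines. Here’s a sketch in Python (the p values below are made up for illustration, not Bem’s actual numbers): each p value is converted to a z score, the z scores are summed and divided by √k, and the result is converted back to a p value. Nine individually unremarkable results can combine to an astronomically small p, but only if those nine tests are the whole story.

```python
# Stouffer's method for combining k p values: convert each to a z score, sum,
# and divide by sqrt(k). The combined p is only meaningful if these k tests
# are *all* of the tests that were actually run; selecting which studies or
# measures make it into the combination invalidates the result.
import numpy as np
from scipy import stats

def stouffer_z(p_values):
    z = stats.norm.isf(np.asarray(p_values))      # one-tailed p -> z score
    z_combined = z.sum() / np.sqrt(len(p_values))
    return z_combined, stats.norm.sf(z_combined)  # combined z and its p value

# Nine hypothetical p values, each hovering just under .05 (NOT Bem's numbers).
p_vals = [.04, .03, .045, .02, .04, .03, .045, .04, .03]
z_comb, p_comb = stouffer_z(p_vals)
print(f"combined z = {z_comb:.2f}, combined p = {p_comb:.2g}")
```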

What would it take to make the results more convincing?

Admittedly, there are quite a few assumptions involved in the above analysis. I don’t know for a fact that Bem was peeking at his data; that just seems like a reasonable assumption given that no justification was provided anywhere for the use of uneven samples. It’s conceivable that Bem had perfectly good, totally principled, reasons for conducting the experiments exactly as he did. But if that’s the case, defusing these criticisms should be simple enough. All it would take for Bem to make me (and presumably many other people) feel much more comfortable with the results is an affirmation of the following statements:

  • That the sample sizes of the different experiments were determined a priori, and not based on data snooping;
  • That the distinction between pilot studies and ‘real’ studies was clearly defined up front–i.e., there weren’t any studies that started out as pilots but eventually ended up in the paper, or studies that were supposed to end up in the paper but that were disqualified as pilots based on the (lack of) results;
  • That there was a clear one-to-one mapping between intended studies and reported studies; i.e., Bem didn’t ‘lump’ together two different studies in cases where one produced no effect, or split one study into two in cases where different subsets of the data both showed an effect;
  • That the predictions reported in the paper were truly made a priori, and not on the basis of the results (e.g., that the hypothesis that sexually arousing stimuli would be the only ones to show an effect was actually written down in one of Bem’s notebooks somewhere);
  • That the various transformations applied to the RT and memory performance measures in some experiments weren’t selected only after inspecting the raw, untransformed values and failing to identify significant results;
  • That the individual differences measures reported in the paper were selected a priori and not based on post-hoc inspection of the full pattern of correlations across studies;
  • That Bem didn’t run dozens of other statistical tests that failed to produce statistically significant results and hence weren’t reported in the paper.

Endorsing this list of statements (or perhaps a somewhat more complete version, as there are other concerns I didn’t mention here) would be sufficient to cast Bem’s results in an entirely new light, and I’d go so far as to say that I’d even be willing to suspend judgment on his conclusions pending additional data (which would be a big deal for me, since I don’t have a shred of a belief in ESP). But I confess that I’m not holding my breath, if only because I imagine that Bem would have already addressed these concerns in his paper if there were indeed principled justifications for the design choices in question.

It isn’t a bad paper

If you’ve read this far (why??), this might seem like a pretty damning review, and you might be thinking, boy, this is really a terrible paper. But I don’t think that’s true at all. In many ways, I think Bem’s actually been relatively careful. The thing to remember is that this type of fudging isn’t unusual; to the contrary, it’s rampant–everyone does it. And that’s because it’s very difficult, and often outright impossible, to avoid. The reality is that scientists are human, and like all humans, have a deep-seated tendency to work to confirm what they already believe. In Bem’s case, there are all sorts of reasons why someone who’s been working for the better part of a decade to demonstrate the existence of psychic phenomena isn’t necessarily the most objective judge of the relevant evidence. I don’t say that to impugn Bem’s motives in any way; I think the same is true of virtually all scientists–including myself. I’m pretty sure that if someone went over my own work with a fine-toothed comb, as I’ve gone over Bem’s above, they’d identify similar problems. Put differently, I don’t doubt that, despite my best efforts, I’ve reported some findings that aren’t true, because I wasn’t as careful as a completely disinterested observer would have been. That’s not to condone fudging, of course, but simply to recognize that it’s an inevitable reality in science, and it isn’t fair to hold Bem to a higher standard than we’d hold anyone else.

If you set aside the controversial nature of Bem’s research, and evaluate the quality of his paper purely on methodological grounds, I don’t think it’s any worse than the average paper published in JPSP, and actually probably better. For all of the concerns I raised above, there are many things Bem is careful to do that many other researchers don’t. For instance, he clearly makes at least a partial effort to avoid data peeking by collecting samples in increments of 50 subjects (I suspect he simply underestimated the degree to which Type I error rates can be inflated by peeking, even with steps that large); he corrects for multiple comparisons in many places (though not in some places where it matters); and he devotes an entire section of the discussion to considering the possibility that he might be inadvertently capitalizing on chance by falling prey to certain biases. Most studies–including most of those published in JPSP, the premier social psychology journal–don’t do any of these things, even though the underlying problems are just as applicable. So while you can confidently conclude that Bem’s article is wrong, I don’t think it’s fair to say that it’s a bad article–at least, not by the standards that currently hold in much of psychology.

Should the study have been published?

Interestingly, much of the scientific debate surrounding Bem’s article has actually had very little to do with the veracity of the reported findings, because the vast majority of scientists take it for granted that ESP is bunk. Much of the debate centers instead on whether the article should have ever been published in a journal as prestigious as JPSP (or any other peer-reviewed journal, for that matter). For the most part, I think the answer is yes. I don’t think it’s the place of editors and reviewers to reject a paper based solely on the desirability of its conclusions; if we take the scientific method–and the process of peer review–seriously, that commits us to occasionally (or even frequently) publishing work that we believe time will eventually prove wrong. The metrics I think reviewers should (and do) use are whether (a) the paper is as good as most of the papers that get published in the journal in question, and (b) the methods used live up to the standards of the field. I think that’s true in this case, so I don’t fault the editorial decision. Of course, it sucks to see something published that’s virtually certain to be false… but that’s the price we pay for doing science. As long as they play by the rules, we have to engage with even patently ridiculous views, because sometimes (though very rarely) it later turns out that those views weren’t so ridiculous after all.

That said, believing that it’s appropriate to publish Bem’s article given current publishing standards doesn’t preclude us from questioning those standards themselves. On a pretty basic level, the idea that Bem’s article might be par for the course, quality-wise, yet still be completely and utterly wrong, should surely raise some uncomfortable questions about whether psychology journals are getting the balance between scientific novelty and methodological rigor right. I think that’s a complicated issue, and I’m not going to try to tackle it here, though I will say that personally I do think that more stringent standards would be a good thing for psychology, on the whole. (It’s worth pointing out that the problem of (arguably) lax standards is hardly unique to psychology; as John Ioannidis has famously pointed out, most published findings in the biomedical sciences are false.)

Conclusion

The controversy surrounding the Bem paper is fascinating for many reasons, but it’s arguably most instructive in underscoring the central tension in scientific publishing between rapid discovery and innovation on the one hand, and methodological rigor and cautiousness on the other. Both values are important, but it’s important to recognize the tradeoff that pursuing either one implies. Many of the people who are now complaining that JPSP should never have published Bem’s article seem to overlook the fact that they’ve probably benefited themselves from the prevalence of the same relaxed standards (note that by ‘relaxed’ I don’t mean to suggest that journals like JPSP are non-selective about what they publish, just that methodological rigor is only one among many selection criteria–and often not the most important one). Conversely, maintaining editorial standards that would have precluded Bem’s article from being published would almost certainly also make it much more difficult to publish most other, much less controversial, findings. A world in which fewer spurious results are published is a world in which fewer studies are published, period. You can reasonably debate whether that would be a good or bad thing, but you can’t have it both ways. It’s wishful thinking to imagine that reviewers could somehow grow a magic truth-o-meter that applies lax standards to veridical findings and stringent ones to false positives.

From a bird’s eye view, there’s something undeniably strange about the idea that a well-respected, relatively careful researcher could publish an above-average article in a top psychology journal, yet have virtually everyone instantly recognize that the reported findings are totally, irredeemably false. You could read that as a sign that something’s gone horribly wrong somewhere in the machine; that the reviewers and editors of academic journals have fallen down and can’t get up, or that there’s something deeply flawed about the way scientists–or at least psychologists–practice their trade. But I think that’s wrong. I think we can look at it much more optimistically. We can actually see it as a testament to the success and self-corrective nature of the scientific enterprise that we actually allow articles that virtually nobody agrees with to get published. And that’s because, as scientists, we take seriously the possibility, however vanishingly small, that we might be wrong about even our strongest beliefs. Most of us don’t really believe that Cornell undergraduates have a sixth sense for future porn… but if they did, wouldn’t you want to know about it?

Bem, D. J. (2011). Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect. Journal of Personality and Social Psychology, 100(3), 407–425.