better tools for mining the scientific literature

Freethinker’s Asylum has a great post reviewing a number of tools designed to help researchers mine the scientific literature–an increasingly daunting task. The impetus for the post is this article in the latest issue of Nature (note: restricted access), but the FA post discusses a lot of tools that the Nature article doesn’t, and focuses in particular on websites that are currently active and publicly accessible, rather than on proprietary tools currently under development in dark basement labs and warehouses. I hadn’t seen most of these before, but am looking forward to trying them out–e.g., pubget:

When you create an account, pubget signs in to your institution and allows you to search the subscribed resources. When you find a reference you want, just click the pdf icon and there it is. No clicking through to content provider websites. You can tag references as “keepers” to come back to them later, or search for the newest articles from a particular journal.

Sounds pretty handy…

Many of the other sites–as well as most of those discussed in the Nature article–focus on data and literature mining in specific fields, e.g., PubGene and PubAnatomy. These services, which allow you to use specific keywords or topics (e.g., specific genes) to constrain literature searches, aren’t very useful to me personally. But it’s worth pointing out that there are some emerging services that fill much the same niche in the world of cognitive neuroscience that I’m more familiar with. The one that currently looks most promising, in my opinion, is the Cognitive Atlas project led by Russ Poldrack, which is “a collaborative knowledge building project that aims to develop a knowledge base (or ontology) that characterizes the state of current thought in cognitive science. … The Cognitive Atlas aims to capture knowledge from users with expertise in psychology, cognitive science, and neuroscience.”

The Cognitive Atlas is officially still in beta, and you need to have a background in cognitive neuroscience in order to sign up to contribute. But there’s already some content you can navigate, and the site, despite being in the early stages of development, is already pretty impressive. In the interest of full disclosure, as well as shameless plugging, I should note that Russ will be giving a talk about the Cognitive Atlas project as part of a symposium I’m chairing at CNS in Montreal this year. So if you want to learn more about it, stop by! Meantime, check out the Freethinker’s Asylum post for links to all sorts of other interesting tools…

one possible future of scientific publishing

Like many (most?) scientists, I’ve often wondered what a world without Elsevier would look like. Not just Elsevier, mind you; really, the entire current structure of academic publishing, which revolves around a gatekeeper model where decisions about what gets published where are concentrated in the hands of a very few people (typically, an editor and two or three reviewers). The way scientists publish papers really hasn’t kept up with the pace of technology; the tools we have these days allow us, in theory, to build systems that support the immediate and open publication of scientific findings, which could then be publicly reviewed, collaboratively filtered, and quantitatively evaluated using all sorts of metrics that just aren’t available in a closed system.

One particularly compelling vision is articulated by Niko Kriegeskorte, who presents a beautiful schematic of one potential approach to the future of academic publishing. I’m a big fan of Niko’s work (see e.g., this, this, or this)–almost everything he publishes is great, and his articles consistently feature absolutely stunning figures–and these ideas are no exception. The central motif, which I’m wholly sympathetic to, is to eliminate gatekeepers and promote open review and evaluation. Instead of a secret cabal (okay, a small group) of other researchers (and potential competitors) making behind-the-scenes decisions about whether to accept or reject your paper, you’d publish your findings online in a centralized repository as soon as you felt it was ready for prime time. At that point, the broader community of researchers would set about evaluating, rating, and commenting on your work. Crucially, all of the reviews would also be made public (either in signed or anonymous form), so that other researchers could evaluate not only the work itself, but also the responses to it. Reviews would therefore count as a form of publication, and one can then imagine all sorts of sophisticated metrics that could take into account not only the reception of one’s publications, but also the quality and nature of the reviews themselves, the quality of one’s own ratings of others’ work, and so on. Schematically, it looks like this:

review review review!
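To make the proposal a bit more concrete, here’s a toy sketch of the kind of data model such a system implies. To be clear, this is my own guess at a minimal structure, not Niko’s actual proposal; the class names and the naive scoring rule are purely illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Review:
    reviewer: str      # a real name or a persistent pseudonym; public either way
    signed: bool       # anonymity is optional, but the review itself is visible
    rating: float      # e.g., an overall 0-10 evaluation
    text: str

@dataclass
class Paper:
    authors: list[str]
    title: str
    reviews: list[Review] = field(default_factory=list)

    def score(self) -> float:
        """Naive aggregate rating; a real system would presumably weight
        reviews by the reviewers' own track records, recency, and so on."""
        if not self.reviews:
            return 0.0
        return sum(r.rating for r in self.reviews) / len(self.reviews)

# Because reviews are themselves public documents, a researcher's record is the
# union of the papers they've written and the reviews they've contributed, and
# both can feed into whatever evaluation metrics the community dreams up.
```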

Anyway, that’s just a cursory overview; Niko’s clearly put a lot of thought into developing a publishing architecture that overcomes the drawbacks of the current system while providing researchers with an incentive to participate (the sociological obstacles are arguably greater than the technical ones in this case). Well, at least in theory. Admittedly, it’s always easier to design a complex system on paper than to actually build it and make it work. But you have to start somewhere, and this seems like a pretty good place.

what do turtles, sea slugs, religion, and TED all have in common?

…absolutely nothing, actually, except that they’re all mentioned in this post. I’m feeling lazy (read: very busy) this week, so instead of writing a long and boring diatribe about clowns, ROIs, or personality measures, I’ll just link to a few interesting pieces elsewhere:

Razib of Gene Expression has an interesting post on the rapid secularization of America, and the relation of religious affiliation to political party identification. You wouldn’t know it from the increasing political clout of the religious right, but Americans are substantially more likely to report having no religious affiliation today than they were 20 years ago. I mean a lot more likely. In Vermont, over a third of the population now reports having no religion. Here’s an idea, Vermont: want to generate more tourism? I present your new slogan: Vermont, America’s Europe.

Sea slugs are awesome. If you doubt this, consider Exhibit A: a sea slug found off the East Coast that lives off photosynthesis:

The slugs look just like a leaf, green and about three centimetres long, and are found off the east coast of North America from Nova Scotia to Florida.

They acquire the ability to photosynthesize by eating algae and incorporating the plants’ tiny chlorophyll-containing structures, called chloroplasts, into their own cells.

You can’t make this stuff up! It’s a slug! That eats algae! And then turns into a leaf!

I’m a big fan of TED, and there’s a great interview with its curator, Chris Anderson, conducted by reddit. Reddit interviews are usually pretty good (see, e.g., Barney Frank and Christopher Hitchens); who knew the internet had the makings of a great journalist?!?

Ok, now for the turtles. According to PalMD, they cause salmonella. So much so that the CDC banned the sale of turtles under 4 inches in length in 1975. Apparently children just loved to smooch those cute little turtles. And the turtles, being evil, loved to give children a cute little case of salmonella. Result: ban small turtles and prevent 200,000 infections. Next up: frog-banning and salami-banning! Both are currently also suspected of causing salmonella outbreaks. Is there any species those bacteria can’t corrupt?

sea slug or leaf?

elsewhere on the internets…

The good people over at OKCupid, the best dating site on Earth (their words, not mine! I’m happily married!), just released a new slew of data on their OKTrends blog. Apparently men like women with smiley, flirty profile photos, and women like dismissive, unsmiling men. It’s pretty neat stuff, and definitely worth a read. Mating rituals aside, though, what I really like to think about whenever I see a new OKTrends post is how many people I’d be willing to kill to get my hands on their data.

Genetic Future covers the emergence of Counsyl, a new player in the field of personal genomics. Unlike existing outfits like 23andme and deCODEme.com, Counsyl focuses on rare Mendelian disorders, with an eye to helping prospective parents evaluate their genetic liabilities. What’s really interesting about Counsyl is its business model; if you have health insurance provided by Aetna or Blue Cross, you could potentially get a free test. Of course, the catch is that Aetna or Blue Cross get access to your results. In theory, this shouldn’t matter, since health insurers can’t use genetic information as grounds for discrimination. But then, on paper, employers can’t use race, gender, or sexual orientation as grounds for discrimination either, and yet we know it’s easier to get hired if your name is John than Jamal. That said, I’d probably go ahead and take Aetna up on its generous offer, except that my wife and I have no plans for kids, and the Counsyl test looks like it stays away from the garden-variety SNPs the other services cover…

The UK has banned the export of dowsing rods. In 2010! This would be kind of funny if not for the fact that dozens if not hundreds of Iraqis have probably died horrible deaths as a result of the Iraqi police force trying to detect roadside bombs using magic. [via Why Evolution is True].

Over at Freakonomics, regular contributor Ryan Hagen interviews psychologist, magician, and author Richard Wiseman, who just published a new empirically-based self-help book (can such a thing exist?). I haven’t read the book, but the interview is pretty good. Favorite quote:

What would I want to do? I quite like the idea of the random giving of animals. There’s a study where they took two groups of people and randomly gave people in one group a dog. But I’d quite like to replicate that with a much wider range of animals — including those that should be in zoos. I like the idea of signing up for a study, and you get home and find you’ve got to look after a wolf…

On a professional note, Professor in Training has a really great two-part series (1, 2) on what new tenure-track faculty need to know before starting the job. I’ve placed both posts inside Google Reader’s golden-starred vault, and fully expect to come back to them next Fall when I’m on the job market. Which means if you’re reading this and you’re thinking of hiring me, be warned: I will demand that a life-size bobble-head doll of Hans Eysenck be installed in my office, and thanks to PiT, I do now have the awesome negotiating powers needed to make it happen.

why do we sing up?

While singing loudly to myself in the car the other day (windows safely rolled up, of course–I don’t want no funny looks from pedestrians), I noticed that the first few notes of the vocal melody of most songs seem to go up rather than down. That is to say, the first pitch change in most songs seems to be from a lower note to a higher note; there don’t seem to be very many that show the opposite pattern. It actually took me a while to find a song that goes down at the beginning (Elliott Smith’s Angeles–incidentally also my favorite song–which hangs on B for the first few bars before dropping to A); the first eight or nine I tried all went up. After carefully inspecting millions, er, thousands, er, hundreds, okay, several more songs on my drive home, I established that only around 10% of vocal melodies dip down initially (95% confidence interval = 0 – 80%); the rest all go up.

When I got home I did a slightly more systematic but still totally ascientific analysis. I write songs occasionally, so I went through them all, and found that only three or four go down; the rest all go up. But that could just be me. So then I went through a few of my favorite albums (not a random sample–I picked ones I knew well enough to rehearse mentally) and found the same pattern. I don’t know if this is a function of the genre of music I listen to (which I’d charitably describe as wuss-rock) or a general feature of most music, but it seems odd. Not having any musical training or talent, I’m not really sure why that would be, but I’d like to know. Does it have anything to do with the fact that most common chord progressions go up the scale initially? Is there some biological reason we find ascending notes more pleasant at the beginning of a melody? Is it a function of our speech production system? Does Broca’s Area just like going up more than going down? Is it just an arbitrary matter of convention, instilled in songwriters everywhere by all the upward-bound music that came before? And is the initial rise specific to English, or does it happen when people sing in other languages as well? Could it be something about the emotional connotation of rises versus drops? Do drops seem too depressing to kick off a song with? Are evil clowns behind it all? Or am I just imagining the whole thing, and there isn’t actually any bias toward initial upness?
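For anyone who feels like checking this properly, the tally I have in mind is trivial to code up. Here’s a minimal Python sketch that assumes each vocal melody has already been reduced to a list of MIDI note numbers (the three example melodies are made up; real data would have to come from transcriptions or a symbolic-music corpus):

```python
from typing import Optional

def first_move(melody: list[int]) -> Optional[str]:
    """Return 'up' or 'down' for the first pitch change in a melody
    (a list of MIDI note numbers), or None if the pitch never changes."""
    for prev, curr in zip(melody, melody[1:]):
        if curr != prev:
            return "up" if curr > prev else "down"
    return None

# Made-up examples: song_a repeats a note and then rises, song_b falls
# immediately, song_c rises immediately.
melodies = {
    "song_a": [60, 60, 62, 64, 62],
    "song_b": [67, 65, 64, 65, 67],
    "song_c": [64, 67, 69, 67, 64],
}

tally = {"up": 0, "down": 0}
for name, notes in melodies.items():
    direction = first_move(notes)
    if direction:
        tally[direction] += 1

print(tally)  # {'up': 2, 'down': 1}
```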

Can we get an update on this? Are there any musicians/musicologists in the house?

how to measure 200 personality scales in 200 items

One of the frustrating things about personality research–for both researchers and participants–is that personality is usually measured using self-report questionnaires, and filling out self-report questionnaires can take a very long time. It doesn’t have to take a very long time, mind you; some questionnaires are very short, like the widely-used Ten-Item Personality Inventory (TIPI), which might take you a whole 2 minutes to fill out on a bad day. So you can measure personality quickly if you have to. But more often than not, researchers want to reliably measure a broad range of different personality traits, and that typically requires administering one or more long-ish questionnaires. For example, in my studies, I often give participants a battery of measures to fill out that includes some combination of the NEO-PI-R, EPQ-R, BIS/BAS scales, UPPS, GRAPES, BDI, TMAS, STAI, and a number of others. That’s a large set of acronyms, and yet it’s just a small fraction of what’s out there; every personality psychologist has his or her own set of favorite measures, and at personality conferences, duels-to-the-death often break out over silly things like whether measure X is better than measure Y, or whether measures A and B can be used interchangeably when no one’s looking. Personality measurement is a pretty intense sport.

The trouble with the way we usually measure personality is that it’s wildly inefficient, for two reasons. One is that many measures are much longer than they need to be. It’s not uncommon to see measures that score each personality trait using a dozen or more different items. In theory, the benefit of this type of redundancy is that you get a more reliable measure, because the error terms associated with individual items tend to cancel out. For example, if you want to know if I’m a depressive kind of guy, you shouldn’t just ask me, “hey, are you depressed?”, because lots of random factors could influence my answer to that one question. Instead, you should ask me a bunch of different questions, like: “hey, are you depressed?” and “why so glum, chum?”, and “does somebody need a hug?”. Adding up responses from multiple items is generally going to give you a more reliable measure. But in practice, it turns out that you typically don’t need more than a handful of items to measure most traits reliably. When people develop “short forms” of measures, the abbreviated scales often have just 4 – 5 items per trait, usually with relatively little loss of reliability and validity. So the fact that most of the measures we use have so many items on them is sort of a waste of both researchers’ and participants’ time.
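If you want to see that aggregation logic in action, here’s a minimal simulation in Python (the item counts and noise level are arbitrary assumptions, not calibrated to any real inventory): a handful of noisy items recovers the underlying trait nearly as well as a dozen.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 10_000

# Each simulated person has one "true" trait level (e.g., depressiveness).
true_trait = rng.normal(size=n_people)

def scale_score(n_items, noise_sd=1.5):
    """Simulate a scale: each item = true trait + item-specific noise;
    the scale score is just the mean of the items."""
    items = true_trait[:, None] + rng.normal(scale=noise_sd, size=(n_people, n_items))
    return items.mean(axis=1)

for n_items in (1, 5, 12):
    r = np.corrcoef(true_trait, scale_score(n_items))[0, 1]
    print(f"{n_items:2d} item(s): correlation with true trait = {r:.2f}")

# Typical output: ~.55 for 1 item, ~.83 for 5 items, ~.92 for 12 -- most of the
# reliability gain comes from the first few items, which is why short forms work.
```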

The other reason personality measurement is inefficient is that most researchers recognize that different personality measures tend to measure related aspects of personality, and yet we persist in administering a whole bunch of questionnaires with similar content to our participants. If you’ve ever participated in a psychology experiment that involved filling out personality questionnaires, there’s a good chance you’ve wondered whether you’re just filling out the same questionnaire over and over. Well you are–kind of. Because the space of personality variation is limited (people can only differ from one another in so many ways), and because many personality constructs have complex interrelationships with one another, personality measures usually end up asking similarly-worded questions. So for example, one measure might give you Extraversion and Agreeableness scores whereas another gives you Dominance and Affiliation scores. But then it turns out that the former pair of dimensions can be “rotated” into the latter two; it’s just a matter of how you partition (or label) the variance. So really, when a researcher gives his or her participants a dozen measures to fill out, that’s not because anyone thinks that there are really a dozen completely different sets of traits to measure; it’s more because we recognize that each instrument gives you a slightly different take on personality, and we tend to think that having multiple potential viewpoints is generally a good thing.
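The “rotation” point is easy to make concrete. Here’s a toy Python sketch; the 45-degree angle and the Dominance/Affiliation labels are illustrative assumptions on my part, not any published scoring scheme:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated standardized scores on two Big Five-style dimensions.
extraversion = rng.normal(size=1000)
agreeableness = rng.normal(size=1000)
scores = np.column_stack([extraversion, agreeableness])

# Rotate the same two axes by 45 degrees: roughly, high-E/low-A now reads as
# "dominant" and high-E/high-A as "affiliative" (hypothetical labels).
theta = np.deg2rad(45)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
dominance, affiliation = (scores @ rotation.T).T

# Nothing is gained or lost: rotating back recovers the original scores exactly,
# because either pair of axes spans the same two-dimensional space.
recovered = np.column_stack([dominance, affiliation]) @ rotation
print(np.allclose(recovered, scores))  # True
```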

Inefficient personality measurement isn’t inevitable; as I’ve already alluded to above, a number of researchers have developed abbreviated versions of common inventories that capture most of the same variance as much longer instruments. Probably the best-known example is the aforementioned TIPI, developed by Sam Gosling and colleagues, which gives you a workable index of people’s relative standing on the so-called Big Five dimensions of personality. But there are relatively few such abbreviated measures. And to the best of my knowledge, the ones that do exist are all focused on abbreviating a single personality measure. That’s unfortunate, because if you believe that most personality inventories have a substantial amount of overlap, it follows that you should be able to recapture scores on multiple different personality inventories using just one set of (non-redundant) items.

That’s exactly what I try to demonstrate in a paper to be published in the Journal of Research in Personality. The article’s entitled “The abbreviation of personality: How to measure 200 personality scales in 200 items”, which is a pretty accurate, if admittedly somewhat grandiose, description of the contents. The basic goal of the paper is two-fold. First, I develop an automated method for abbreviating personality inventories (or really, any kind of measure with multiple items and/or dimensions). The idea here is to reduce the time and effort required to generate shorter versions of existing measures, which should hopefully encourage more researchers to create such short forms. The approach I develop relies heavily on genetic algorithms, which are tools for programmatically obtaining high-quality solutions to high-dimensional problems using simple evolutionary principles. I won’t go into the details (read the paper if you want them!), but I think it works quite well. In the first two studies reported in the paper (data for which were very generously provided by Sam Gosling and Lew Goldberg, respectively), I show that you can reduce the length of existing measures (using the Big Five Inventory and the NEO-PI-R as two examples) quite dramatically with minimal loss of validity. It only takes a few minutes to generate the abbreviated measures, so in theory, it should be possible to build up a database of abbreviated versions of many different measures. I’ve started to put together a site that might eventually serve that purpose (shortermeasures.com), but it’s still in the preliminary stages of development, and may or may not get off the ground.
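For readers who just want the flavor of the approach, here’s a stripped-down sketch in Python. It is not the implementation from the paper (which has more moving parts; read the paper for those), just the general recipe: encode each candidate short form as a binary mask over items, score it by how well the retained items reproduce the original scale scores (minus a penalty for length), and evolve the population of masks by selection, crossover, and mutation. All the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(mask, items, scale_scores, length_penalty=0.01):
    """How well do the retained items reproduce every original scale score?
    Mean correlation (via OLS prediction) across scales, minus a cost per item."""
    keep = mask.astype(bool)
    if not keep.any():
        return -np.inf
    X = np.column_stack([np.ones(items.shape[0]), items[:, keep]])
    rs = []
    for y in scale_scores.T:
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rs.append(np.corrcoef(X @ beta, y)[0, 1])
    return float(np.mean(rs)) - length_penalty * keep.sum()

def abbreviate(items, scale_scores, pop_size=100, n_gens=200, p_mutate=0.02):
    """Evolve a binary item mask. `items` is subjects x items; `scale_scores`
    is subjects x scales (scores from the full-length instrument)."""
    n_items = items.shape[1]
    pop = (rng.random((pop_size, n_items)) < 0.2).astype(int)  # start with ~20% of items
    for _ in range(n_gens):
        scores = np.array([fitness(ind, items, scale_scores) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]      # keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_items)                      # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flips = rng.random(n_items) < p_mutate              # random mutation
            child[flips] = 1 - child[flips]
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    best = max(pop, key=lambda ind: fitness(ind, items, scale_scores))
    return best.astype(bool)
```

In practice you’d also want to validate the winning mask on held-out respondents, since a subset selected to maximize fit in one sample will capitalize on chance to some extent.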

The other main goal of the paper is to show that the same general approach can be applied to simultaneously abbreviate more than one measure. To make the strongest case I could think of, I took 8 different broadband personality inventories (“broadband” here just means they each measure a relatively large number of personality traits) that collectively comprise 203 different personality scales and 2,091 different items. Using the same genetic algorithm-based approach, I then reduced these 8 measures down to a single inventory that contains only 181 items (hence the title of the paper). I named the inventory the AMBI (Analog to Multiple Broadband Inventories), and it’s now freely available for use (items and scoring keys are provided both in the paper and at shortermeasures.com). It’s certainly not perfect–it does a much better job capturing some scales than others–but if you have limited time available for personality measures, and still want a reasonably comprehensive survey of different traits, I think it does a really nice job. Certainly, I’d argue it’s better than having to administer many hundreds (if not thousands) of different items to achieve the same effect. So if you have about 15 – 20 minutes to spare in a study and want some personality data, please consider trying out the AMBI!


Yarkoni, T. (2010). The abbreviation of personality, or how to measure 200 personality scales with 200 items. Journal of Research in Personality. DOI: 10.1016/j.jrp.2010.01.002

on the limitations of psychiatry, or why bad drugs can be good too

The Neuroskeptic offers a scathing indictment of the notion, editorialized in Nature this week, that the next decade is going to revolutionize the understanding and treatment of psychiatric disorders:

The 2010s is not the decade for psychiatric disorders. Clinically, that decade was the 1950s. The 50s was when the first generation of psychiatric drugs were discovered – neuroleptics for psychosis (1952), MAOis (1952) and tricyclics (1957) for depression, and lithium for mania (1949, although it took a while to catch on).

Since then, there have been plenty of new drugs invented, but not a single one has proven more effective than those available in 1959. New antidepressants like Prozac are safer in overdose, and have milder side effects, than older ones. New “atypical” antipsychotics have different side effects to older ones. But they work no better. Compared to lithium, newer “mood stabilizers” probably aren’t even as good. (The only exception is clozapine, a powerful antipsychotic, but dangerous side-effects limit its use.)

Those are pretty strong claims–especially the assertion that not a single psychiatric drug has proven more effective than those available in 1959. Are they true? I’m not in a position to know for certain, having had only fleeting contacts here and there with psychiatric research. But I guess I’d be surprised if many basic researchers in psychiatry concurred with that assessment. (I’m sure many clinicians wouldn’t, but that wouldn’t be very surprising.) Still, even if you suppose that present-day drugs are no more effective than those available in 1959 on average (which may or may not be true), it doesn’t follow that there haven’t been major advances in psychiatric treatment. For one thing, the side effects of many modern drugs do tend to be less severe. The Neuroskeptic is right that atypical antipsychotics aren’t as side effect-free as was once hoped; but consider, in contrast, drugs like lamotrigine or valproate–anticonvulsants nowadays widely prescribed for bipolar disorder–which are undeniably less toxic than lithium (though also no more, and possibly less, effective). If you’re diagnosed with bipolar disorder in 2010, there’s still a good chance that you’ll eventually end up being prescribed lithium, but (in most cases) it’s unlikely that that’ll be the first line of treatment. And on the bright side, you could end up with a well-managed case of bipolar disorder that never requires you to take drugs with frequent and severe side effects–something that frankly wouldn’t have been an option for almost anyone in 1959.

That last point gets to what I think is the bigger reason for optimism: choice. Even if new drugs aren’t any better than old drugs on average, they’re probably going to work for different groups of people. One of the things that’s problematic about the way the results of clinical trials are typically interpreted is that if a new drug doesn’t outperform an old one, it’s often dismissed as unhelpful. The trouble with this worldview is that even if drug A helps 60% of people on average and drug B helps 54% of people on average (and the difference is statistically and clinically significant), it may well be that drug B helps people who don’t benefit from drug A. The unfortunate reality is that even relatively stable psychiatric patients usually take a while to find an effective treatment regimen; most patients try several treatments before settling on one that works. Simply by virtue of there being dozens more drugs available in 2009 than in 1959, it follows that psychiatric patients are much better off living today than fifty years ago. If an atypical antipsychotic controls your schizophrenia without causing motor symptoms or metabolic syndrome, you never have to try a typical antipsychotic; if valproate works well for your bipolar disorder, there’s no reason for you to ever go on lithium. These aren’t small advances; when you’re talking about millions of people who suffer from each of these disorders worldwide, the introduction of any drug that might help even just a fraction of patients who weren’t helped by older medication is a big deal, translating into huge improvements in quality of life and many tens of thousands of lives saved. That’s not to say we shouldn’t strive to develop drugs that are also better on average than the older treatments; it’s just that average superiority shouldn’t be the only (and perhaps not even the main) criterion we use to gauge progress.
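To put toy numbers on that point (the response rates are invented, and I’m assuming, unrealistically, that response to the two drugs is independent):

```python
# Invented response rates; assumes independence of response to the two drugs.
p_a = 0.60   # fraction of patients helped by drug A
p_b = 0.54   # fraction helped by drug B -- worse than A "on average"

a_only = p_a                               # everyone gets A and stops there
a_then_b = p_a + (1 - p_a) * p_b           # A non-responders go on to try B

print(f"Drug A alone:         {a_only:.0%} of patients helped")
print(f"A, then B if needed:  {a_then_b:.0%} of patients helped")
# ~60% vs ~82%: a drug that's "no better" (here, worse) on average still
# substantially expands the pool of patients who end up with something that works.
```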

Having said that, I do agree with the Neuroskeptic’s assessment as to why psychiatric research and treatment seems to proceed more slowly than research in other areas of neuroscience or medicine:

Why? That’s an excellent question. But if you ask me, and judging by the academic literature I’m not alone, the answer is: diagnosis. The weak link in psychiatry research is the diagnoses we are forced to use: “major depressive disorder”, “schizophrenia”, etc.

There are all sorts of methodological reasons why it’s not a great idea to use discrete diagnostic categories when studying (or developing treatments for) mental health disorders. But perhaps the biggest one is that, in cases where a disorder has multiple contributing factors (which is to say, virtually always), drawing a distinction between people with the disorder and those without it severely restricts the range of expression of various related phenotypes, and may even assign people with positive symptomatology to the wrong half of the divide simply because they don’t have some other (relatively) arbitrary symptoms.

For example, take bipolar disorder. If you classify the population into people with bipolar disorder and people without it, you’re doing two rather unfortunate things. One is that you’re lumping together a group of people who have only a partial overlap of symptomatology, and treating them as though they have identical status. One person’s disorder might be characterized by persistent severe depression punctuated by short-lived bouts of mania every few months; another person might cycle rapidly between a variety of moods multiple times per month, week, or even day. Assigning both people the same diagnosis in a clinical study is potentially problematic in that the two may have very different underlying organic disorders, which means you’re basically averaging over multiple discrete mechanisms in your analysis, resulting in a loss of both sensitivity and specificity.
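Here’s a quick simulation of what that averaging does. The two “subtypes,” their biomarkers, and the effect sizes are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000  # people per subtype

# Two hypothetical bipolar subtypes driven by different mechanisms: each shows
# a large abnormality on its own biomarker and none on the other subtype's.
subtype_1 = rng.normal([1.0, 0.0], 1.0, size=(n, 2))
subtype_2 = rng.normal([0.0, 1.0], 1.0, size=(n, 2))
controls = rng.normal(0.0, 1.0, size=(2 * n, 2))
patients = np.vstack([subtype_1, subtype_2])  # one lumped diagnostic category

def cohens_d(a, b):
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

d_subtype = cohens_d(subtype_1[:, 0], controls[:, 0])   # ~1.0
d_lumped = cohens_d(patients[:, 0], controls[:, 0])     # ~0.5
print(f"biomarker 1 effect: d = {d_subtype:.2f} in subtype 1, "
      f"d = {d_lumped:.2f} once subtypes are lumped together")
# Averaging over two distinct mechanisms roughly halves the observable effect
# for each one -- the sensitivity cost of a single catch-all diagnosis.
```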

The other problem, which I think is less widely appreciated, is that you’ll invariably have many “control” subjects who don’t receive the diagnosis but share many features with people who do. This problem is analogous to the injunction against using median splits: you almost never want to turn an interval-level variable into a dichotomous one if you don’t have to, because you lose a tremendous amount of information. When you contrast a sample of people with a bipolar diagnosis with a group of “healthy” controls, you’re inadvertently weakening your comparison by including in the control group people who would be best characterized as falling somewhere in between the extremes of pathological and healthy. For example, most of us probably know people who we would characterize as “functionally manic” (sometimes also known as “extraverts”)–that is, people who seem to reap the benefits of the stereotypical bipolar syndrome in the manic phase (high energy, confidence, and activity level) but have none of the downside of the depressive phase. And we certainly know people who seem to have trouble regulating their moods, and oscillate between periods of highs and lows–but perhaps just not to quite the extent necessary to obtain a DSM-IV diagnosis. We do ourselves a tremendous disservice if we call these people “controls”. Sure, they might be controls for some aspects of bipolar symptomatology (e.g., people who are consistently energetic serve as a good contrast to the dysphoria of the depressive phase); but in other respects, they may actually be closer to the prototypical patient than to most other people.
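And here’s the median-split problem in miniature (again with arbitrary made-up numbers): dichotomizing a continuous symptom dimension into “diagnosed” versus “control” throws away most of its variance and attenuates its correlation with anything else you measure.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# A continuous symptom dimension and an outcome correlated with it at r ~ .5.
symptoms = rng.normal(size=n)
outcome = 0.5 * symptoms + rng.normal(scale=np.sqrt(1 - 0.5**2), size=n)

# "Diagnosis": dichotomize the continuum at an arbitrary cutoff (top 10%).
diagnosed = (symptoms > np.quantile(symptoms, 0.90)).astype(float)

print(f"continuous symptoms vs outcome: r = {np.corrcoef(symptoms, outcome)[0, 1]:.2f}")
print(f"diagnosis (0/1) vs outcome:     r = {np.corrcoef(diagnosed, outcome)[0, 1]:.2f}")
# Roughly .50 vs .29: the binary label discards most of the information in the
# underlying continuum -- which is exactly what lumping near-threshold people
# in with "healthy" controls does.
```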

From a methodological standpoint, there’s no question we’d be much better off focusing on symptoms rather than classifications. If you want to understand the many different factors that contribute to bipolar disorder or schizophrenia, you shouldn’t start from the diagnosis and work backwards; you should start by asking what symptom constellations are associated with specific mechanisms. And those symptoms may well be present (to varying extents) both in people with and without the disorder in question. That’s precisely the motivation behind the current “endophenotype” movement, where the rationale is that you’re better off trying to figure out what biological and (eventually) behavioral changes a given genetic polymorphism is associated with, and then using that information to reshape taxonomies of mental health disorders, than trying to go directly from diagnosis to genetic mechanisms.

Of course, it’s easy to talk about the problems associated with the way psychiatric diagnoses are applied, and not so easy to fix them. Part of the problem is that, while researchers in the lab have the luxury of using large samples that are defined on the basis of symptomatology rather than classification (a luxury that, as the Neuroskeptic and others have astutely observed, many researchers fail to take advantage of), clinicians generally don’t. When you see a patient come in complaining of dysphoria and mood swings, it’s not particularly useful to say “you seem to be in the 96th percentile for negative affect, and have unusual trouble controlling your mood; let’s study this some more, mmmkay?” What you need is some systematic way of going from symptoms to treatment, and the DSM-IV offers a relatively straightforward (though wildly imperfect) way to do that. And then too, the reality is that most clinicians (at least, the ones I’ve talked to) don’t just rely on some algorithmic scheme for picking out drugs; they instead rely on a mix of professional guidelines, implicit theories, and (occasionally) scientific literature when deciding what types of symptom constellations have, in their experience, benefited more or less from specific drugs. The problem is that those decisions often fail to achieve their intended goal, and so you end up with a process of trial-and-error, where most patients might try half a dozen medications before they find one that works (if they’re lucky). But that only takes us back to why it’s actually a good thing that we have so many more medications in 2009 than in 1959, even if they’re not necessarily more effective individually. So, yes, psychiatric research has some major failings compared to other areas of biomedical research–though I do think that’s partly (though certainly not entirely) because the problems are harder. But I don’t think it’s fair to suggest we haven’t made any solid advances in the treatment or understanding of psychiatric disorders in the last half-century. We have; it’s just that we could do much better.

what’s the point of intro psych?

Sanjay Srivastava comments on an article in Inside Higher Ed about the limitations of traditional introductory science courses, which (according to the IHE article) focus too much on rote memorization of facts and too little on the big questions central to scientific understanding. The IHE article is somewhat predictable in its suggestion that students should be engaged with key scientific concepts at an earlier stage:

One approach to breaking out of this pattern, [Shirley Tilghman] said, is to create seminars in which first-year students dive right into science — without spending years memorizing facts. She described a seminar — “The Role of Asymmetry in Development” — that she led for Princeton freshmen in her pre-presidential days.

Beyond the idea of seminars, Tilghman also outlined a more transformative approach to teaching introductory science material. David Botstein, a professor at the university, has developed the Integrated Science Curriculum, a two-year course that exposes students to the ideas they need to take advanced courses in several science disciplines. Botstein created the course with other faculty members and they found that they value many of the same scientific ideas, so an integrated approach could work.

Sanjay points out an interesting issue in translating this type of approach to psychology:

Would this work in psychology? I honestly don’t know. One of the big challenges in learning psychology — which generally isn’t an issue for biology or physics or chemistry — is the curse of prior knowledge. Students come to the class with an entire lifetime’s worth of naive theories about human behavior. Intro students wouldn’t invent hypotheses out of nowhere — they’d almost certainly recapitulate cultural wisdom, introspective projections, stereotypes, etc. Maybe that would be a problem. Or maybe it would be a tremendous benefit — what better way to start off learning psychology than to have some of your preconceptions shattered by data that you’ve collected yourself?

Prior knowledge certainly does seem to play a huge role in the study of psychology; there are some worldviews that are flatly incompatible with certain areas of psychological inquiry. So when some students encounter certain ideas in psychology classes–even introductory ones–they’re forced either to change their views about the way the world works, or (perhaps more commonly?) to discount those areas of psychology and/or the discipline as a whole.

One example of this is the aversion many people have to a reductionist, materialist worldview. If you really can’t abide the idea that all of human experience ultimately derives from the machinations of dumb cells, with no ghost to be found anywhere in the machine, you’re probably not going to want to study the neural bases of romantic love. Similarly, if you can’t swallow the notion that our personalities appear to be shaped largely by our genes and random environmental influences–and show virtually no discernible influence of parental environment–you’re unlikely to want to be a behavioral geneticist when you grow up. More so than most other fields, psychology is full of ideas that turn our intuitions on their head. For many Intro Psych students who go on to study the mind professionally, that’s one of the things that makes the field so fascinating. But other students are probably turned off for the very same reason.

Taking a step back though, I think before you can evaluate how introductory classes ought to be taught, it’s important to ask what goal introductory courses are ultimately supposed to serve. Implicit in the views discussed in the IHE article is the idea that introductory science classes should basically serve as a jumping-off point for young scientists. The idea is that if you’re immersed in deep scientific ideas in your first year of university rather than your third or fourth, you’ll be that much better prepared for a career in science by the time you graduate. That’s certainly a valid view, but it’s far from the only one. Another perfectly legitimate view is that the primary purpose of an introductory science class isn’t really to serve the eventual practitioners of that science, who, after all, form a very small fraction of students in the class. Rather, it’s to provide a very large number of students with varying degrees of interest in science with a very cursory survey of the field. After all, the vast majority of students who sit through Intro Psych classes would never go on to careers in psychology no matter how the course was taught. You could mount a reasonable argument that exposing most students to “the ideas they need to take advanced courses in several science disciplines” would be a kind of academic malpractice, because most students who take intro science classes (or at least, intro psychology) probably have no real interest in taking advanced courses in the topic, and simply want to fill a distribution requirement or get a cursory overview of what the field is about.

The question of who intro classes should be designed for isn’t the only one that needs to be answered. Even if you feel quite certain that introductory science classes should always be taught with an eye to producing scientists, and you don’t care at all for the more populist idea of catering to the non-major masses, you still have to make other hard choices. For example, you need to decide whether you value breadth over depth, or information retention over enthusiasm for the course material. Say you’re determined to teach Intro Psych in such a way as to maximize the production of good psychologists. Do you pick just a few core topics that you think students will find most interesting, or most conducive to understanding key research concepts, and abandon those topics that turn people off? Such an approach might well encourage more students to take further psychology classes; but it does so at the risk of providing an unrepresentative view of the field, and failing to expose some students to ideas they might have benefited more from. Many Intro Psych students seem to really resent the lone week or two of the course when the lecturer covers neurons, action potentials and very basic neuroanatomy. For reasons that are quite inscrutable to me, many people just don’t like brainzzz. But I don’t think that common sentiment is sufficient grounds for cutting biology out of intro psychology entirely; you simply wouldn’t be getting an accurate picture of our current understanding of the mind without knowing at least something about the way the brain operates.

Of course, the trouble is that the way that people like me feel about the brain-related parts of intro psych is exactly the way other people feel about the social parts of intro psych, or the developmental parts, or the clown parts, and so on. Cut social psych out of intro psych so that you can focus on deep methodological issues in studying the brain, and you may well create students more likely to go on to a career in cognitive neuroscience. But you’re probably reducing the number of students who’ll end up going into social psychology. More generally, you’re turning Intro Psychology into Intro to Cognitive Neuroscience, which sort of defeats the point of it being an introductory course in the first place; after all, they’re called survey courses for a reason!

In an ideal world, we wouldn’t have to make these choices; we’d just make sure that all of our intro courses were always engaging and educational and promoted a deeper understanding of how to do science. But in the real world, it’s rarely possible to pull that off, and we’re typically forced to make trade-offs. You could probably promote student interest in psychology pretty easily by showing videos of agnosic patients all day long, but you’d be sacrificing breadth and depth of understanding. Conversely, you could maximize the amount of knowledge students retain from a class by hitting them over the head with information and testing them in every class, but then you shouldn’t be surprised if some students find the course unpleasant and lose interest in the subject matter. The balance between the depth, breadth, and entertainment value of introductory science classes is a delicate one, but it’s one that’s essential to consider before we can fairly evaluate different proposals as to how such classes ought to be structured.