Last week I’d offered to do a full write-up of how to read a scientific paper critically, and asked if there was any interest in such a topic. No one asked for this, but here you are anyway. I did the research, and I’ve got nothin’ else! Besides which, it’s a fascinating topic to me, and every time I delve deeper into it, I get happier about the career decisions that led me away from the publishing train. Because a lot of the problems in science today stem from the way scientists are evaluated: have you published anything recently? In a serious journal? No? OK, any journal will do. Did you get positive, real effects? No? No one cares that you proved a negative; go back and get me a positive. We want results, or you’re fired! Which, human nature being what it is, leads to… well, it’s not science, unless you’re talking about the study of human psychology when someone is backed into a corner and their livelihood threatened. In dire cases, when the scientist’s government gets involved, their life might be at stake. And that’s even without getting into citation padding, authorial padding (there’s an ongoing scandal in South Korea where researchers have been adding their children’s names to their papers to pad the children’s academic resumes), and duplication of results. Not replication, which is the gold standard, but using the same results in multiple papers, which is highly unethical and will, if caught, lead to a retraction of the paper. Enough of those, and you will lose your funding and your position and have to start over.
Which brings me to the blog that highlights so many of these train wrecks, Retraction Watch. The blog not only highlights papers that have been retracted from publication, but also offers great weekly roundups of links to various scandals in science. Like the recent case where an eagle-eyed scientist, who does this sort of thing as a kind of crusade for good science, spotted possibly fraudulent image manipulation in over 200 papers. Most of us here on the blog have limited journal access, if any, so we could well find ourselves in the position of reading a paper that has been retracted. There’s no shame in that – scientists who really ought to know better do it themselves all the time. A recent study on the fraudulent and shameful Wakefield paper on autism concluded that “Even authors who used terms such as ‘flawed’ or ‘false’ to describe the Wakefield paper didn’t always note the retracted status of the paper. My team felt that documenting the retraction carries a great amount of weight in demonstrating that the findings were fraudulent, and by missing out on this important piece of information, people may be under the perception that the work could be valid.”
But I could wander very far into the weeds, indeed, with this. It’s not that I don’t want you all to trust science. It’s just that in science, especially these days, you must read any paper with a critical eye. Looking at the small details in the margins can yield big clues, and that’s where I’ll try to focus. Looking at the design of a study is also important, and crucial to generating good data is using good controls. For example, machine learning is all the rage in science currently – allowing computers to crunch vastly more data than is humanly possible seems like a wonderful idea, but… “Machine learning algorithms readily exploit confounding variables and experimental artifacts instead of relevant patterns, leading to overoptimistic performance and poor model generalization.” The paper goes on to suggest that adversarial controls that anticipate the problems inherent in the algorithms can lead to better data. When you are talking about studies on humans, you want to look at study sizes – the larger the better – and at things like control groups, blinded studies (blinding is good!), and how the reporting was done. Self-reporting of symptoms is dubious at best. Asking people to keep track of what they ate (for instance) or their pain levels for weeks, let alone years, is a recipe for messy data and unreliable results. Which is part of the reason nutrition science is such a hot mess right now.
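For the curious, here’s roughly what one such control can look like in practice. This is a minimal sketch of the general idea (my own toy example on made-up data, not code from the paper): a “shuffled-label” negative control. If a pipeline still scores well after the labels have been scrambled, it is feeding on artifacts or leakage rather than real signal.

```python
# Minimal "shuffled-label" negative control sketch (illustrative only).
# Assumes NumPy and scikit-learn; the data here is pure synthetic noise.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))        # 200 samples, 50 noise features
y = rng.integers(0, 2, size=200)      # labels with no relationship to X

model = RandomForestClassifier(n_estimators=100, random_state=0)

real_score = cross_val_score(model, X, y, cv=5).mean()
shuffled_score = cross_val_score(model, X, rng.permutation(y), cv=5).mean()

print(f"accuracy on real labels:     {real_score:.2f}")
print(f"accuracy on shuffled labels: {shuffled_score:.2f}")
# Both should hover near 0.5 here. On a real dataset, a shuffled-label score
# well above chance is a red flag: the model is exploiting confounders,
# batch effects, or leaked information instead of the thing you care about.
```

If the real-label score is high but the shuffled-label score is also well above chance, the high score isn’t telling you what you think it is.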
When you are reading a paper, you will want to look at a few things right away: first, the journal the paper was published in. Some journals, as I mentioned last week, will publish anything if the authors are willing to pay a fee. Vanity publishing is no better in science than it is in fiction publishing, and the results are just as dubious. Look at the section (usually at the bottom of the first page) where any conflicts of interest are laid out. Having some mentioned here is not a bad thing – the real problem arises when the authors don’t disclose potential conflicts, which is invisible without a great deal more research on the reader’s part – but it is something to keep in mind when assessing any potential bias the scientists may have. They are human. There will be bias, but a properly designed study will still yield good data and results that should be reproducible. Sadly, there seems to be little to no inducement for the publication of results that reproduce, and reinforce, good scientific work. In fact, there seems to be a lot of antagonism toward it. Finally, look at the results themselves. This isn’t where you need to put on your science hat; it’s where you can put on your writing hat – what does the wording look like? Is it cautiously optimistic, straightforward, dry, and factual? Probably reliable. Is it hyperbolic and sensational, and does the word ‘cure’ appear? Probably unreliable. Science that makes for a good story rarely looks like it on the surface. It’s knowing the possible ramifications of a result that leads to the enthusiasm and excitement of potential, and as science fiction authors, that’s our job. We take the science and run with it, making it exciting and real and inspiring.
The same goes for articles discussing poll results or something similar. Good articles will tell you the details of the poll: date(s) conducted, number of respondents, and the margin of error (you don’t want anything over +/- 4%). If it’s a political poll, it should tell you what percent of respondents were from which party, that sort of thing. Look for over- and under-representation of groups (African-Americans make up ~13% of the population, so their numbers shouldn’t be much below or above that unless the poll is regarding a specific issue that primarily affects AAs, such as sickle cell anemia). Also, Cedar’s comments about hyperbolic conclusions hold here as well. Guarded conclusions in the case of social science articles should say “more likely,” “tends to,” etc. (And before anybody decides to go off on social science, I am a social scientist, and I know there is a lot of both good and bad work out there.)
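For reference, here’s a quick back-of-the-envelope sketch of where a figure like +/- 4% comes from (my own illustration, not part of the comment above): for a simple random sample, the worst-case 95% margin of error is roughly 1.96 * sqrt(0.25 / n), so you need about 600 respondents to get under 4%.

```python
# Rough 95% margin of error for a simple random sample (illustrative sketch).
# Worst case is at p = 0.5, which is why pollsters quote a single +/- figure.
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a poll with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (400, 600, 1000, 2000):
    print(f"n = {n:>4}: +/- {margin_of_error(n) * 100:.1f}%")
# n =  400: +/- 4.9%
# n =  600: +/- 4.0%
# n = 1000: +/- 3.1%
# n = 2000: +/- 2.2%
```

Note this assumes a simple random sample; real polls use weighting and clustering, which generally widen the effective margin rather than shrink it.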
It’s just that measuring people is hard, so doing it right is difficult and tedious. Which means that if you are a lay outsider to the field, it takes a lot of work to become familiar with the good work done within the field.
The work that fields A, B, and C borrow from D, without checking hard or carefully rechecking against later work, becomes the pop-academic view of how D works. If field E depends slightly on A, B, C, and D, and has a foundation in them that is only at the pop-academic level, E may be fairly invalid even if E is otherwise being worked on in a diligent, sane, and careful way.
A layman working carefully enough can get a somewhat reliable impression of when this is happening. This is not the same as being able to prove it, or able to fix it.
The good work within a social science tends to be most accessible to specialists practicing it. As for bad work promoted via organizational politics… Check out some of the engineering professional societies.
I remember a study where the writer was claiming that conservatives were more prone to believe conspiracy theories. Eyeball his data, and you realized the effect was entirely owing to two (2) people who claimed to be conservative and to believe every theory out there – that, and his small pool of conservatives.
He said that people who pointed this out were conspiracy theorists.
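To make the arithmetic behind that anecdote concrete, here’s a toy example with invented numbers (not the study’s data): two extreme respondents in a small subgroup are enough to manufacture a “group difference” out of nothing.

```python
# Toy illustration of two outliers driving a subgroup mean (invented numbers).
import statistics

# Hypothetical "conspiracy belief" scores on a 0-10 scale.
liberals = [2, 3, 2, 4, 3, 2, 3, 4, 2, 3, 3, 2, 4, 3, 2, 3, 2, 3, 4, 2]  # n = 20
conservatives = [2, 3, 3, 2, 4, 3, 2, 3]                                  # n = 8
outliers = [10, 10]                      # two "I believe everything" respondents

print(statistics.mean(liberals))                  # 2.8
print(statistics.mean(conservatives))             # 2.75 -- essentially no difference
print(statistics.mean(conservatives + outliers))  # 4.2  -- the entire "effect"
```

Drop those two rows and the headline finding evaporates, which is exactly the sort of thing eyeballing the data catches.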
I remember during my mental health nurse training we had to assess research papers. We were split into teams, and each team was given a part of the paper to read and then present. I caused a bit of a kerfuffle when, by pointing out what the results actually meant, I said that the paper we had read was not worth the paper it was written on.
My fellow students were appalled. The tutor, on the other hand, was pleased.
But the point is not how clever I was, but the fact that the majority of papers produce null results, and that a P value of 0.5 is very good.
Don’t forget to account for p-hacking, as well. There’s a strong argument being made against using p-values at all, and while I’m not sure there’s a better way, there certainly are ways to game the system.
Absolutely, and there’s a typo in my post above; a P value of 0.5 is NOT very good.
And P hacking, evil.
What people don’t tend to understand is that even a significant P value of say 0.005 only means there’s a chance that further research is worthwhile, not that what was found is right.
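A small simulation makes that point concrete. This is a toy example of my own, not anyone’s real analysis: run enough comparisons on pure noise and a few will cross p < 0.05 anyway; report only those, and you’ve “found” an effect, which is p-hacking in a nutshell.

```python
# Toy p-hacking demonstration: 20 t-tests on groups drawn from the SAME
# distribution. Expect roughly one "significant" result by chance alone.
# Assumes NumPy and SciPy are available.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests, n_per_group = 20, 30

significant = 0
for _ in range(n_tests):
    a = rng.normal(size=n_per_group)   # no real difference between a and b
    b = rng.normal(size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        significant += 1

print(f"{significant} of {n_tests} comparisons 'significant' with no real effect")
```

And that’s with honest, pre-planned tests; flexible stopping rules and outcome switching only make it worse.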
Depending on the specific field, there might also be a literature review included (“We’re doing X, because the following people have tried doing it a different way, and these were their results. We’re approaching X from this other direction, because . . .”) Once you are even vaguely familiar with the field, certain names and papers will stand out as “things that need to be considered.” If you don’t see any reference to those, it might be a sign of potential problems. Especially if the paper you are reading poses major challenges to the older work.
Note: not every sub-specialty takes that approach.
and if they won’t show anyone their data?
If a pine tree grows in Yamal, Siberia, and no one takes a core. . . *innocent kitty look*
None of the scientific papers I read for work involve “experiments”. They’re either observations, inferences from observations, or proposals of terminology refinements. (There are experiments in geology, but “first we created a solar nebula” is a bit difficult.) The scientific literature is a rolling feast of “I have a better idea,” and so I never completely assume there is a final answer without putting my nose on the rock itself.
Ah, but can you imagine the stories behind making an experimental solar nebula…?