Understanding science is not always as hard as you think (jgc.org)
37 points by yummyfajitas on March 9, 2010 | 23 comments



My favorite approach to reading science papers of all types is to assume the paper to be bogus, and then read it in order to find out why. If I am doing battle with the author every sentence of the way, I am forced into reading carefully and skeptically. If the author still has me convinced at the end, good on them.

I will say this approach is easier when you have had someone you trust teach you how to read scientific papers. In undergrad and in med school, professors would present us with two similar papers both published in top journals and ask, "Which one is wrong?" It was a highly valuable training exercise for me. The gist of the message that I got was this: (1) If you really need to understand the paper, ignore the discussion (and, sometimes, the results) until you have mastery of the methods and figures. (2) If you are really lost, read the discussion to see what the author thinks she is showing, then go back to the results to see if you agree.


Part of the problem is that you have to "Read Like Math" and not "Read Like A Novel." Papers aren't taken seriously unless they sound serious, and people like to read adventure stories. Anything else is just "too hard."

People don't know how to read scientific papers. They don't know how to find the story in the stilted and stultifying prose. They don't know that you can skim bits and return to them later, or read through once to get the sense and then go back for the details.

And it's got scary looking equations.


Understanding that you are allowed to do random access and skimming is crucial not only to enjoying papers but to reading technical books as well! Before college you are taught to read everything from cover to cover, and doing otherwise is viewed as laziness, maybe even cheating. It's not until later that they explicitly tell you to skim and only dig deeper when there's something relevant you don't understand.

It is also often pointed out that great academics aren't necessarily great writers, which I'm sure is the cause of a lot of the dryness in papers.

I'm also reasonably sure it has to do with journals enforcing overly strict formatting rules, to the point of blatantly refusing to publish for no reason other than being slightly over length. A hilarious tale of this is How to Publish a Scientific Comment in 123 Easy Steps: http://www.physics.gatech.edu/frog/How%20to%20Publish%20a%20...


Just the fact that it's a PDF is enough to make me think twice before clicking, never mind reading it.

There was a cool-sounding project called AcaWiki (http://acawiki.org) that wanted to crowdsource summaries of research papers. Doesn't look like it's exploded yet (not a single climate-related paper as far as I can see) but it's still a good idea.


I actually wish a lot of papers in my area (speaking as a grad student) had more novel-like writing and were less hung-up on seeming formal. I've been reading a bunch of logic papers lately, mainly to mine techniques and ideas for an application-oriented use of logic that my thesis is built on, and the papers parcel out concepts so slowly, and fail to explain any intuition about them, that it's pretty slow going. It's like 90% proofs that I don't care about (put them in an appendix) and 10% actual content I might want to read.


I agree with you, but that's what you have to do to get published. My first PhD thesis was rejected, although my viva saved me and I was given permission to re-write. The second version had mostly the same material, mostly the same proofs, but was written in a style that made it seem harder and more inaccessible. It appeared more impressive, so it made a better impression.

Clear, accurate and engaging writing is hard and undervalued.

My advice is: learn to live with it. After all, you don't really have a choice.


Yes and no. In my experience it is usually a mistake for scientists to try to write fancy prose like a novelist. I would much rather have a dull but comprehensible paper (all short, plain and simple sentences) than any kind of highly literate ambiguity, or indeed the over-formal pomposity you rightly complain about. A scientific paper should not be a work of entertainment or showing off, but an attempt to communicate an idea. Remember that a paper is written for several audiences, and the annoying crap that you don't care about might be of vital importance to someone else. Having said that, it is nice when a paper is well arranged so that you can find what you want, but too often you just have to develop your cherry-picking skills.


Understanding a single scientific paper is easier than many people think. Understanding something like the temperature trend from 1850, or the growth trend in artificially-bred corn over the last 100 years, is pretty easy, requiring little more than grade-school arithmetic.
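
As a rough illustration of how little arithmetic a trend needs, here is a minimal Python sketch of a least-squares slope. The `linear_trend` helper and the sample numbers are made up for this example; they are not real temperature data and not the method of any particular paper.

```python
def linear_trend(years, values):
    """Least-squares slope of values against years, using only + - * /."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    # Sum of products of deviations, divided by sum of squared deviations.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    denominator = sum((x - mean_x) ** 2 for x in years)
    return numerator / denominator


# Made-up toy series, NOT real measurements.
toy_years = [1850, 1900, 1950, 2000]
toy_values = [-0.4, -0.2, 0.0, 0.2]

slope_per_year = linear_trend(toy_years, toy_values)
print(f"trend: {slope_per_year * 10:.3f} units per decade")
```

The slope answers only "is there a trend in these numbers?". It says nothing about where the numbers came from or how they were adjusted.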

But in my experience, people don't want to merely understand the existence of that trend. They want to understand the broader ideas of climate science, evolution, medicine, etc. That means understanding where the data comes from, what sort of filters have been applied, what principles account for it, how those principles have been lab-tested and on what scale, etc.

It's that sort of understanding that's hard.


I didn't read the full paper, so I could be wrong but... I think it's MUCH harder to understand than he gives it credit for.

The equation on page 4 is enough to confuse people, especially when C_H is described as "any homogenisation adjustment that may have been applied to the reported temperature."

It doesn't appear to be doing any high level math, but lots of people don't remember that (x^y)(x^z) = x^(y+z), so it's going to be hard to get through a paper like this.
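
For anyone who wants to convince themselves of that exponent rule, a quick numerical check is enough. This is only a sketch; the function names are invented for the example and have nothing to do with the paper.

```python
import math


def product_of_powers(x, y, z):
    """Left-hand side of the identity: (x^y) * (x^z)."""
    return (x ** y) * (x ** z)


def power_of_sum(x, y, z):
    """Right-hand side of the identity: x^(y+z)."""
    return x ** (y + z)


# Integer case: 2^3 * 2^4 = 8 * 16 = 128, and 2^7 = 128.
print(product_of_powers(2, 3, 4), power_of_sum(2, 3, 4))

# Non-integer exponents agree up to floating-point rounding.
print(math.isclose(product_of_powers(1.5, 2.0, 0.5), power_of_sum(1.5, 2.0, 0.5)))
```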

That being said, the number of people who could read this and understand it does seem to be much higher than I thought before.


Okay, I'll bite. Is the raw data available, with no extrapolation or filling? Just a big series of: station, long/lat, date, max-temp, min-temp? Data that no human hand has touched since the original reading?

That would be a wonderful CSV file to play around with. It couldn't be that big of a file either.


Looked at the paper linked in TFA [1]. Said something about "HadCRUT" temperature data, so I Googled it. Third link down was [2], which has data downloads. Also checked Wikipedia, which linked to [3], which has that and many more datasets.

[1] http://hadobs.metoffice.com/crutem3/HadCRUT3_accepted.pdf

[2] http://www.cru.uea.ac.uk/cru/data/temperature/

[3] http://hadobs.metoffice.com/


Alright I've clicked on all those links, read the text, and downloaded the files.

I've got files for variations, I've got files for averages, I've got files I don't know what they are.

Now back to my question: anybody know where the unaltered data, by day and location, is? Direct link would be awesome.

This isn't too much to ask, right? I mean, if we've been keeping temperature records, surely there has to be the raw data somewhere in an easy-to-consume format? (I'm not trying to be cynical or sarcastic. For all I know there might be good reasons for such data not existing or I might have a case of the doofus here)


I'm no climatologist. I don't know a lot about how temperature is measured. But your question is based on several assumptions that I think need analysing.

1) What temperatures are we talking about? Surface air? Water?

2) Who's collected it? There are probably dozens of organizations that collect it. I doubt there's one single repository for it all.

3) How was it collected? Are the data comparable, especially historically? You can't simply ask the weatherman what the temperature was in Antarctica in 1874.

4) Even if you had universally comparable data for the past 150 years, you still have trouble. Temps in urban areas, for example, skew results because of heat trapping. Those factors need to be accounted for.

Science is complicated for the very simple reason that the natural world is complicated. There's no such thing as "pure" data.

I trust the science industry (for the most part) over time. The "over time" part is key. There are generational checks and balances in science. Young turks looking to make a name are attacking the holes in theories all the time. If something is faulty, we'll find out...from other scientists.

I don't need to trust individual scientists, just the process of science. Which I do.


Can we have part #17 and part #73 of the conversation some other time, please?

I just wanted to know where the measurement data was. That's it.


There was a thread on HN some time ago and I saved this link, which has some instructions on where to find the data and a description of the format.

ftp://ftp.ncdc.noaa.gov/pub/data/gsod/readme.txt

I wanted to start a little project to process it but never got around to it.

Hope this helps.


I've seen claims somewhere (maybe at "Watts Up with That") that no unmodified climate station data exists, that it's all been modified or corrected to adjust for something.

Is this data raw?


Well, it does exist, you just have to go back to the original sources, e.g. the meteorological organizations of the various countries, militaries, etc.

As far as I know, the position of the CRU is that they have no original unmodified raw data anymore. Don't know about NASA and NOAA.


This article claims that 3 out of the 4 datasets available, the ones from the CRU, NASA’s Goddard Institute for Space Studies (GISS) and the National Climatic Data Center Global Historical Climatology Network (NCDC GHCN), are not independent and that the latter two were felt by NASA to be inferior to the CRU's. The article doesn't have anything on the raw data, however.

http://pajamasmedia.com/blog/climategate-stunner-nasa-heads-...


The NOAA data is exactly what I was looking for.

Of course, it's only the U.S., and it's only for the past 80 years or so. But the dataset looks clean.


Is that really science? I could follow the same steps to balance my checkbook in order to determine spending periods.

If they'd created a small model of the Earth and applied what they know in order to determine the truth of their hypothesis, then yes, I would say they're doing science. Has the bar really dropped this low?


> subtract the average temperature from the observed temperature for the same month ... If it's getting hotter the anomalies will get bigger.

Based on the above, it should be noted that anomalies are signed, and an anomaly of +5 is bigger than an anomaly of -10, if I understand correctly.
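
To make the sign point concrete, here is a small Python sketch of the anomaly calculation as quoted. The baseline and observed values are invented for illustration, and `anomaly` is a hypothetical helper, not something from the article.

```python
def anomaly(observed, baseline_mean):
    """Signed anomaly: observed temperature minus the average for that month."""
    return observed - baseline_mean


# Invented long-term monthly averages and two invented observations (deg C).
baseline = {"Jan": 2.0, "Jul": 18.0}
observations = [("Jan", 7.0), ("Jul", 8.0)]

anomalies = [anomaly(temp, baseline[month]) for month, temp in observations]
print(anomalies)                # [5.0, -10.0]

# "Bigger" in the signed sense picks the warm anomaly...
print(max(anomalies))           # 5.0
# ...while "bigger" in magnitude would pick the cold one.
print(max(anomalies, key=abs))  # -10.0
```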


> subtract the average temperature from the observed temperature for the same month

How are we computing average? The average daily high? The average daily low? The average of the daily midpoint? The average at 9am? The average at whatever time the person got around to recording it? These things affect what one can see in the data and how it relates to reality.

My point is that even simple-sounding things aren't necessarily simple.
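
A short Python sketch shows how much the definition matters. The readings below are invented for one hypothetical day, and `daily_summaries` is a made-up helper; real station records may only contain some of these values.

```python
from statistics import mean

# Invented readings for a single day: hour of day -> temperature in deg C.
readings = {0: 4.0, 3: 2.0, 6: 3.0, 9: 7.0, 12: 14.0, 15: 16.0, 18: 11.0, 21: 6.0}


def daily_summaries(readings):
    """Several plausible meanings of 'the temperature' for one day."""
    temps = list(readings.values())
    high, low = max(temps), min(temps)
    return {
        "high": high,
        "low": low,
        "midpoint": (high + low) / 2,     # average of daily high and low
        "mean_of_readings": mean(temps),  # average of every reading taken
        "at_9am": readings[9],            # single fixed-time observation
    }


for name, value in daily_summaries(readings).items():
    print(f"{name}: {value}")
```

With these made-up numbers the midpoint is 9.0, the mean of all readings is 7.875, and the 9am value is 7.0, so three reasonable definitions already disagree by a couple of degrees.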


I'm finding this true as well. I was in graduate school for a year and got my ass kicked. I actually failed at least one course. I was lucky enough to swing into industry with a better job than most MAs or "ABDs" get so I didn't go back for a second year.

I kept myself in challenging work and averaged 10-15 hours per week studying CS concepts. Now those same subjects that kicked my ass seem not only comprehensible but beautiful. What was a poorly-motivated technical mess of symbols (to me) I can now relate to real concepts in CS, and I could play with them if I needed to. (Gödel numbering? Oh, that's Lisp encoded in the integers. Primitive recursion? It's a for-loop.)
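
The "primitive recursion is a for-loop" remark can be sketched in a few lines of Python. The scheme is h(0, x) = f(x) and h(n+1, x) = g(n, h(n, x), x). The helper name `primitive_recursion` and the three example functions are illustrative choices, not a standard library API.

```python
def primitive_recursion(f, g):
    """Build h with h(0, *xs) = f(*xs) and h(n+1, *xs) = g(n, h(n, *xs), *xs)."""
    def h(n, *xs):
        acc = f(*xs)              # base case
        for i in range(n):        # the recursion step, unrolled as a bounded loop
            acc = g(i, acc, *xs)
        return acc
    return h


# add(n, x) = x + n: start at x, apply successor n times.
add = primitive_recursion(lambda x: x, lambda i, acc, x: acc + 1)

# mul(n, x) = n * x: start at 0, add x to the accumulator n times.
mul = primitive_recursion(lambda x: 0, lambda i, acc, x: add(x, acc))

# factorial(n): start at 1, multiply by (i + 1) at step i.
factorial = primitive_recursion(lambda: 1, lambda i, acc: (i + 1) * acc)

print(add(3, 4), mul(3, 4), factorial(5))  # 7 12 120
```

The loop bound is fixed before the loop starts, which is exactly what separates primitive recursion from general (unbounded) recursion.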

So it seems like I'm "smarter" at 26 than at 22, contrary to stereotype. Because I have more experience to relate new concepts to, I find it much easier now to learn new concepts in mathematics and science, except for the fact that I have 1/4 as much time in which to do so.

The problem is that most people are taught subjects like algebra and calculus without much motivation and with no prior experience, so only natural ability and parental expectations can drive people to learn them. Calculus isn't actually hard, unless your algebra sucks. Algebra is only hard for so many people because it's poorly taught and often not well motivated, so only people who get a lot of encouragement (often because of natural ability and a developed inclination to solve puzzles) learn it.

The lesson I take away from this is that education (about sciences, but also literature and history) isn't something that should occur only for those who are too young to be economically useful; it ought to be an ongoing process.



