Nasa just made all its research available online for free (independent.co.uk)
776 points by signa11 on Aug 20, 2016 | 49 comments



I'm actually curious what is new within this. As it stands, NASA research pre-prints are available on the tech reports server (http://www.sti.nasa.gov/), typically right after the export control review process.

edit: maybe it's related to certain journals where there wasn't access on the STI server? Seems odd to me. For the 8 years that I've been at NASA, it's always been expected that my group's work had to be accessible.

edit2: not all rules are universal across the agency, so my experience may be too specific to Glenn Research Center/my division. In any case, the more open, the better.


NASA yanked a lot of material away from their NTRS (NASA technical reports server) some years ago. A lot of research papers that I would have needed in my research were gone. I can find some of that stuff via Sci-Hub but not everything.

At one point I was looking for a retracted paper and I even contacted my local (non-US) representative (the contact that was listed in NTRS) to try to get that paper, but they didn't even try to help me.

A lot of the retracted papers were fundamental research dating back to the space race era.

Glad to see more research being published but really, a lot of it was previously available but got yanked due to politics.


Interesting, I was not aware that they had done this!


After 9/11, when people were running around like decapitated chickens, a lot of useful government-funded (read: taxpayer-funded) documentation was pulled from sites like nist.gov. It has slowly been trickling back. This may be a similar case.


Overreaction is the calling card of not having learned from the past. In any case, would scraping sites like nist.gov and mirroring the information therein be considered illegal?


Works created by government employees as part of their official duties are in the public domain in the U.S. However, scraping might break some ToS and with that comes the possibility of the CFAA.


>I'm actually curious what is new within this. As it stands, NASA research pre-prints are available on the tech reports server

One big difference is that most journals will not allow papers to cite a pre-print paper. A lot of the pre-prints will have different names and differ slightly in content.


> Nasa announced it is making all its publicly funded research available online for free

This is the way things ought to be for all publicly funded research, not just NASA. Thank you, NASA, for leading the way.

The beginning of technological progress for mankind started with writing, the beginning of the industrial revolution started with the printing press. Having all the world's knowledge available on your desktop just a click away is the beginning of another exponential leap forward.


> Having all the world's knowledge available on your desktop just a click away is the beginning of another exponential leap forward.

Indeed. I had a fun example of that just last Friday. I volunteered for a new task on my job, rewriting someone's sloppy ad-hoc implementation of some diagram-drawing stuff in our application. I decided to practice the virtue of scholarship[0] and 5 minutes later, I had a printed copy of a paper from (who I assume are) leading experts on this particular topic, from which I've learned of approaches that were much better than ones I was thinking of.

Today we have an unprecedented opportunity - no matter where you live, no matter what question you have, as long as you know English, you have access not just to knowledge about it; you have access to the best knowledge humanity as a species currently has about it. You only have to use it. Want to learn about some topic? Don't pick up just any random tutorial, spend 10 minutes and get[1] the best book on the subject.

(Not to say that papers are perfect - I sometimes wish researchers would cut the bullshit out of publications, and focus more on presenting the methods instead of the results. I.e. it's fine and dandy that your algorithm is so good it can be used in realtime, but how about focusing more on explaining the details of the algorithm itself, so that I could actually use it?)

[0] - http://lesswrong.com/lw/3m3/the_neglected_virtue_of_scholars...

[1] - buy if you can, copy if you must. Personally, I support IP on books only insofar as it helps the authors get fairly compensated; all the publishing industry built around it is mostly a huge brake on that "another exponential leap forward" 'WalterBright mentioned.


I was born in the 70s, so I still remember middle school research papers that required no less than 4 sources cited, only one of which could be an encyclopedia. In any case, it still gives me pause to consider that I have a box with almost instantaneous access to the wealth of the world's information in my pocket.


I was born a few years after WWII. I used to have a library occupying many shelves of about 10 years of CACM, JACM, SIGPLAN, and other CS journals, along with file cabinets full of copies of research papers from other journals (like Software Practice & Experience and Acta Informatica). There simply was no other way to look things up: curious about Alpha-Beta pruning? You had to find Knuth's 1975 article on it, published in Artificial Intelligence.

Things are so different and so much better now. However, I do find that the way I can so quickly browse so many publications, blogs, wikipedia, and source on the internet means that I don't retain the information I do process as well as I did when I had to find a research paper in a university research library, read it, and take notes. In the pre-www days I had an almost photographic ability to remember the research papers I read; now, not so much. (Maybe it's age, my wife has to help me find my car keys!)


>However, I do find that the way I can so quickly browse so many publications, blogs, wikipedia, and source on the internet means that I don't retain the information I do process as well as I did when I had to find a research paper in a university research library, read it, and take notes.

Being in my 30s, I agree with the sentiment; I think it is the price of being able to explore more, and more quickly. If you want to retain something now, that is a separate process from the exploration stage, whereas before the two were more tightly coupled.


Link to the actual data [0].

Looks pretty neat actually. This seems to stem from an executive order by President Obama in 2013. Mobile browsing is okay but I'm excited to check out some of the APIs when I get back to my computer a bit later on. They're separated by category (Earth Science, Aerospace, etc).

Seems like a lot of the data is already queryable by their APIs and I assume there are data dumps and research papers available as well.

Very cool. There is a serious wealth of data and APIs available for tinkerers and builders these days, from Watson to NYC data to NASA!

[0] - http://www.nasa.gov/open/researchaccess/pubspace


The executive order itself is part of the data released on their data portal :) https://data.nasa.gov/external-dataset?datasetId=5abq-kcha


The Mars tsunami link is a fun read: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4872529/

> We conclude that, on early Mars, tsunamis played a major role in generating and resurfacing coastal terrains.

That's a fascinating thing to think about. Reminds me of the ocean planet they landed on in Interstellar. Although Mars had giant tsunamis caused by impacts instead of perpetual waves caused by the gravity of a black hole.


Great that they're finally getting around to doing this 3 years after the order was signed.

Shouldn't it be a law that any research done with public dollars should be made available to the public for free? Any costs associated with the publishing should be built into the funding.


The complication is that it isn't just the public who it is made available to, it's made available to the whole world. There isn't necessarily a problem with that but the reality is that sometimes there is.

Still a much better chance of information becoming available this way than if it were private research.


>Shouldn't it be a law that any research done with public dollars should be made available to the public for free? Any costs associated with the publishing should be built into the funding.

If you like. How much do you think that would cost, though? Typically journals charge about $2000 to make a paper open-access.

Now, given that extra research funding won't just magically appear, what we're basically saying is that less research should be funded, just so that it can be hypothetically read by the small minority of people who wish to read it and yet aren't connected to a research institution and can't be bothered going to their local university library to arrange access.

"Ah", you say, "but at least the university libraries would save money because they'd no longer have to pay for all those journal subscriptions". Well, except they would, because of the rest of the world, and also because of all the locally-funded research which was published before this hypothetical new law came into effect.

It doesn't necessarily follow that just because research is publicly funded, it should be publicly available. Many things that are publicly funded are not publicly available, which is why I can't take a ride in an Air Force plane whenever I feel like it. Publicly funded things are funded for specific purposes, which may or may not be served by making them publicly available.


Do you really think it costs $2000 to publish a PDF on the Internet these days?

I'm talking about putting the documents on S3 with a simple paginated frontend that would allow searching the name/synopsis.

You must work for the government?


igf isn't saying that the technical cost of publishing the PDF is $2000. They are saying that it is what journals charge for the right to do so while also publishing in them. And yes, many academics would rather use any funds they can to pay that than not publish in those journals, so unless there is some widely coordinated move to force the issue, funds will go there for now.

While grants can enforce open-access publications, forbidding publication in other venues seems a lot more contentious.


I guess I just don't understand this kind of attitude. If it was done with public funds, it should be made available to the public. We have the means and capability to do so now.

The point of research is to be useful. Even if it has no immediate application, it could form the basis of something new in the future. Not having easy access to this institutional knowledge decreases its value to the public, and putting it behind a paywall could prevent a non-standard performer or individual contributor from obtaining access and thereby benefiting.

I think the motivations in the academic world are very flawed. If I were an academic, my goal would be to provide my work to the largest audience I possibly could, so the likelihood of someone benefiting from it would be the highest. This is especially true for research that has no present application. It should be highly accessible and available to anyone if the public has paid for the academic to live, eat, and generally do the work.


They still charge for court records and opinions. Remember how Aaron Swartz was prosecuted for trying to take them for free?


Ok, so maybe I'm missing something, but what's new here? Is it just the portal?

I've been reading NASA technical papers and publications for years, and although most of the research I was reading was very focused in a specific field, everything I wanted to see was freely available. Even some of the publications linked in the article have been online for a decent amount of time.


I've read NASA tech reports for years too; seems like one of those things that's "always" been online, though I don't know when precisely it went online. It's been online long enough, at least, to have even accumulated some episodes of backlash. In 2013, it was forced offline for a few months, because Congresspeople claimed that it might include content The Chinese could use to reconstruct national-security-sensitive information, so demanded that reports undergo a security audit: http://fas.org/blogs/secrecy/2013/03/ntrs_dark/


Searching the web portal leads to a PMC database query with the filter "nasa funded". Here's a link to a query with just that filter, which returns all 863 articles:

http://www.ncbi.nlm.nih.gov/pmc/?term=%22nasa+funded%22%5BFi...
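If you'd rather script this than click through the portal, here is a rough sketch of building the same kind of search against NCBI's public E-utilities `esearch` endpoint. It only constructs the URL and does not fetch anything. The `Filter` field tag is my assumption (the link above is cut off after `[Fi`), and `pmc_search_url` is just a made-up helper name, so treat this as a starting point rather than the portal's actual query.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pmc_search_url(phrase, field="Filter", retmax=20):
    """Build an esearch URL for a quoted phrase restricted to one field.

    field="Filter" is an assumption: the link in the comment is truncated,
    so the real field tag may differ.
    """
    term = f'"{phrase}"[{field}]'
    params = {
        "db": "pmc",         # search PubMed Central
        "term": term,        # e.g. "nasa funded"[Filter]
        "retmax": retmax,    # how many IDs to return per request
        "retmode": "json",   # ask for a JSON response
    }
    # urlencode percent-encodes the quotes and brackets and turns spaces into '+'.
    return f"{ESEARCH}?{urlencode(params)}"


url = pmc_search_url("nasa funded")
print(url)
```

Fetching that URL should return a JSON document with a hit count and a list of PMC IDs, which you could then page through by adjusting `retmax` (and a start offset) if you want the full result set.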


Saying that "all" of NASA's research is being made available is a stretch given it's known they work on a lot of classified projects with the military, intelligence, etc.:

http://nsarchive.gwu.edu/NSAEBB/NSAEBB509/


This article is so misleading. NASA is requiring that the published research papers be public. The research discoveries are still owned by the universities that did the research - the Bayh-Dole Act is still in effect. https://en.wikipedia.org/wiki/Bayh%E2%80%93Dole_Act


Title should read "some" rather than "all"


Not NASA, but related: I really want the raw climate data and atmospheric models to be made open and publicly available. I don't understand how researchers can claim that climate change is the most serious problem facing our species while at the same time hiding what they are doing. I know that some data is available, but considering that our government funds most of the research, why haven't they put this stuff up on GitHub?

Is the secrecy really necessary in order to get tenure and win grants?


I very much doubt it's as "secret" as you think. Have you actually tried to get access to this data?


> hiding what they are doing

Would you back up this claim? There is too much rumor and propaganda around about climate science, so I feel we should take extra care to substantiate any claims.

I generally like the idea of scientific data being public, but the theory that there is a conspiracy of climate scientists (of which the claim that data is being 'hidden' is a building block) is tiresome and distracting from the very real, critical problems.


From the IPCC's Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, Chapter 9 Evaluation of Climate Models, pp. 749-750.

> With very few exceptions (Mauritsen et al., 2012; Hourdin et al., 2013) modelling centres do not routinely describe in detail how they tune their models. Therefore the complete list of observational constraints toward which a particular model is tuned is generally not available.

From the House of Commons Science and Technology report The disclosure of climate data from the Climatic Research Unit at the University of East Anglia paragraph 54:

> 54. It is not standard practice in climate science and many other fields to publish the raw data and the computer code in academic papers. We think that this is problematic because climate science is a matter of global importance and of public interest, and therefore the quality and transparency of the science should be irreproachable. We therefore consider that climate scientists should take steps to make available all the data used to generate their published work, including raw data; and it should also be made clear and referenced where data has been used but, because of commercial or national security reasons is not available. Scientists are also, under Freedom of Information laws and under the rules of normal scientific conduct, entitled to withhold data which is due to be published under the peer-review process. In addition, scientists should take steps to make available in full their methodological workings, including the computer codes. Data and methodological workings should be provided via the internet. There should be enough information published to allow verification.


Thanks for doing the legwork to find and provide that; while the first cite is a narrow case, the second looks great.

> It is not standard practice in climate science and many other fields ...

I think this statement supports the idea that the actions of climate scientists are not a conspiracy, but normal practice in scientific research. I've read similar things in many other contexts.

On principle, I think all scientific data (and papers) should be open and free. In practice, I would need to know far more about the costs, value, and implications before demanding it.

On a practical level, very, very few would have the knowledge, skill, resources (software, etc.), time, and motivation to make use of the data. Very few outside any particular field even read that field's papers, much less try to work with the data. Likely, anyone in the general public trying to do so would interfere with the 'signal' of discourse by creating only noise - the open data could hurt more than help. If it costs a lot to produce the data, it's very possible that most of that expense would be a waste. OTOH, sometimes that noise and cost is the price of democracy.


Yes, and to be realistic, the researchers themselves are in a very competitive environment. Having done many years of graduate work, I realize just how hard it is for doctoral and post-doc students to break into a good position at a research university, but that's just the beginning. There is serious competition after that for tenure and grants.

Who wants to spend time dealing with FOIA (Freedom of Information Act) requests from kooky skeptics while worrying that some other researcher will scoop your research? There are structural impediments in our university research environments to completely open science.

However, I'm hopeful because of the success that the software community has had with open software. Before Richard Stallman started the Free Software Foundation I would never have believed that such a transformation of the industry was possible.


What are you talking about? Most of this data is publicly available. You can download it and run your own analysis right now if you wish.


Imagine the entire skeptical climate blogosphere focusing its energy into p-hacking the raw climate dataset and models to prove their point. From the scientists' perspective, they wouldn't have a lot to gain by enabling this.


I'm pretty good at understanding technical and scientific publications, but this stuff drives me crazy:

- June, 2015, "Climate-change ‘hiatus’ disappears with new data" [1]

- Feb, 2016, "Global warming ‘hiatus’ debate flares up again" [2]

Both from the journal Nature!

[1] http://www.nature.com/news/climate-change-hiatus-disappears-...

[2] http://www.nature.com/news/global-warming-hiatus-debate-flar...


At last... I remember Richard Feynman complaining exactly about this!


Great! I glanced over it and I think the next step is to have them provide a For Dumbasses version of pretty much all of it...


Has anyone looked at the resource? Are there data sets and other similar things that could be used in novel ways?


following Space X as a top competitor


[dead]


Please don't do this here. We ban accounts that create sockpuppets to violate the guidelines with.


[flagged]


Why would you even say that considering how huge this is?


How huge is it? What's something that you can access now that you couldn't before?


Define "before". Is it a day ago, a month ago, a year ago?

NASA does research. It also publishes research. The latter is a subset of the former and we don't know how the subset compares to the superset.

I can access what it has published in 2016 which I couldn't access in 2015.

I don't know everything that an organization with a $20 billion budget this year and roughly 4% of the US GDP during the Cold War ever did and found, so I can't claim there's no difference between that and what it published.

My comment was because I assumed that NASA didn't publish all its research before. Considering the size of that knowledge base, is this assumption so stupid and negative to deserve to be downvoted that much?


> with a $20 billion budget this year and roughly 4% of the US GDP during the Cold War

Just a minor note that it wasn't 4% (of US spending, not of GDP) for the bulk of the Cold War. It looks like it was only above 3% of federal spending for 4 years:

https://www.theguardian.com/news/datablog/2010/feb/01/nasa-b...


I stand corrected. Thanks for taking the time!


Including about set design of the moon for the landing?

(For those who just threw a fit.. it's a joke)


Seen this video? BUZr0Wr0v-s



