Claude Shannon's master's thesis was on the application of Boolean algebra to circuits, effectively founding digital circuit design. That should have been enough for anyone, but not for Shannon. His later work on information theory has proven important in everything from evolution to quantum mechanics (particularly, relative quantum entropy) and perhaps even to future physics (https://www.youtube.com/watch?v=td1fz5NLjQs).
He also did early work in cryptanalysis and AI (the minimax chess algorithm and a learning robotic mouse). His student was Ivan Sutherland. More or less, whether it's networking, signal processing, compression, crypto, machine learning, or circuit design, basically anything to do with the digital age, you'll find Shannon did important foundational work there.
If I recall, Turing actually started to formalize some ideas in information theory, but stopped after meeting Shannon. Shannon showed Turing his information theory work, and Turing decided Shannon had already solved the question he was interested in.
It is a shame Shannon doesn't get the recognition he deserves. Computer science has many fathers, and as amazing as Turing was, nobody contributed more and nobody has a better claim to call himself the father of computer science than Shannon. Not Turing. Not Church. He outstrips them all, but nobody outside of the computer science world has heard of him.
Just two nights ago, on Amazon Prime streaming, I watched the very good show "BBC Order And Disorder Episode 2 - Information" hosted by the excellent Jim Al-Khalili. He profiles Alan Turing and says that Turing was only half the story and then introduces Claude Shannon—a name with which I was unfamiliar—and Bell Labs. Great stuff.
Great documentary. I can heartily recommend all of Al-Khalili's documentaries: "Atom", "The End of the Universe", "Shock and Awe: The Story of Electricity" and "Chemistry: A Volatile History" are all excellent. The last two are my favorites; while his documentaries on physics and quantum mechanics are good, they describe scientific phenomena that are pretty difficult to understand, let alone visualize, and they end up having to be a little over-simplified, or explained with unsatisfying visual analogies. The ones on chemistry and electricity are a lot more (no pun intended) grounded, and include some fun demonstrations.
Thank you. I discovered last night that some of the shows you mention are not on Amazon Prime. Only the more astronomical and physics-related ones are (Gravity; Order/Disorder; Everything/Nothing; Beginning/End of the Universe).
Shannon came up much more than Turing during undergrad for electrical engineering. You can't learn signal processing, information theory, discrete math, etc. without mentioning him. Perhaps he falls in the EE/ECE realm more than CS.
Actually, I think Shannon might be more known than Turing.
Pretty much anyone doing signal processing, which is part of the standard physics curriculum where I'm from, will be introduced to the Nyquist–Shannon theorem, while Church and Turing did foundational work in logic and computability but with far less practical application.
I am willing to grant you that a random person off the street is more likely to have heard of Turing than Shannon. But much more likely still to have heard of Stephen Hawking or Jane Goodall or James Watson.
Public awareness of scientists drops off very quickly. That’s just the way life works. It’s lucky enough if random people off the street have heard of small countries or political leaders of their own country. Most people don’t remember most of the major genocides of the past few decades.
Shannon is without question one of the 30 or 50 best-known (by the general public) scientists of the 20th century. It is ridiculous to pretend that nobody has heard of him. At the very least anyone with a STEM degree will have some idea.
But not everyone can capture the public imagination the way (say) Einstein did.
> It is ridiculous to pretend that nobody has heard of him.
> At the very least anyone with a STEM degree will have some idea.
You've berated the parent for saying - obviously figuratively - "nobody outside of the computer science world has heard of him", then conceded that having a STEM degree would make a significant difference to the likelihood of whether someone knows of him.
It's a benign thing to trigger such an outburst, especially given there's so little substantive difference between your positions.
For what it's worth, I'm a self-taught software developer of 15+ years' experience and I've consumed plenty of material about science and scientific history - but without any academic STEM study.
I hadn't heard of him - at least to the extent that I remember.
Whereas of course I know plenty about Turing.
And for what it's worth I know much more of Turing's life than Hawking's, Goodall's or Watson's.
The question of how many non-CS/STEM-qualified people have heard about him is an interesting one to explore, and it can do without the poisonous tone you've introduced.
There are thousands of important mathematicians, electrical engineers, computer scientists, etc. languishing in obscurity who don’t get the credit they deserve.
Shannon, as one of the best known and most celebrated scholars (in any field) of the 20th century, and by any reasonable standard a scientific superstar, is not one of them.
It’s like saying “Le Corbusier doesn’t get the credit he deserves. Nobody outside architecture has heard of him. Everyone knows Frank Lloyd Wright, but what about Le Corbusier?!” Or “Carl Jung doesn’t get the credit he deserves. Nobody outside of psychiatry has heard of him. Everyone knows Sigmund Freud, but what about Jung?!”
Pick whatever field you want, and I’m sure you can find a list of seminal figures who are well known to anyone with basic knowledge of the field (say, anyone who took an intro course in college) and familiar to anyone with broad cultural education, but not as recognizable to the man on the street as top athletes or rock stars. Claiming that these people are unrecognized or unheard of is absurd.
Your comparison to Freud and Jung refutes the very point you're trying to make.
I have approximately the same (quite high) level of lay-person interest in psychology as I do in science, and I know more about Jung than I do about Freud.
On Jung's Wikipedia page, the "In Popular Culture" section contains 19 items. Freud's page doesn't have such a section - though of course he is still very well known in the mainstream, but not materially more so than Jung.
Turing's "Portrayal" section on his Wikipedia page contains 14 items across theatre, literature, music and film. No comparable section exists for Shannon, and you couldn't create one that would come close to Turing's.
This is all that your parent was trying to say. Not that Shannon is unrecognised within his field, or merely not "as recognizable to the man on the street as top athletes or rock stars", but that he is less recognised in mainstream culture than fellow computer scientist Alan Turing.
Returning to your original comment:
> Yeah! Nobody has heard of that guy. His Mathematical Theory of Communication only has 112 thousand (!!!) google scholar citations, apparently the #4 most cited paper of all time in any field (#1–3, 5–9 are biochem/chem papers, and #10 is clinical psych).
> For comparison, Turing has 5 papers with 10–12k citations each.
The _entire point_ your parent was trying to make was that Shannon is vastly more credentialed and recognized within his field, yet little known in the mainstream.
Why get so worked up over a point on which there's basically no substantive disagreement?
It's probably the case that there are no scientists besides maybe Einstein (E = mc^2) and Newton (apple falling from the tree -> gravity) whom the general public would be able to identify by name and associate with one thing that very roughly approximates what they did.
Who knows though, given the public's enthusiasm for modern tech and its stars (Jobs, Musk, Gates, Bezos, etc.) maybe one day scientists' work might make a "Casey's Top 40"-like popular countdown if presented the right way!
I cannot overstate this point and upvote it enough.
The reason Claude Shannon is a legend has a lot to do with the fact that his ideas are not just correct and drawn from his multidisciplinary knowledge, but also expertly communicated. It is an incredibly readable paper, and it only truly requires a little bit of mathematics where he describes how to convert a state machine for Morse code into a matrix that lets you calculate how many bits per time unit can actually be transmitted in Morse code. Even that is not essential: he also works the value out a different way, and makes clear that you can simply take his word for whatever the number was. A lot of the arguments about fitting a stream of symbols into an encoding for noisy channels come with lovely diagrams that help to elucidate exactly what he's talking about.
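To make the Morse/matrix point concrete, here is a minimal sketch of the capacity calculation in the simplified case of the 1948 paper's Theorem 1, where any sequence of symbols is allowed (Shannon's actual telegraph example adds the constraint that two spaces can't be adjacent, which is what the state-transition matrix handles). The symbol durations below are hypothetical:

    import math

    def noiseless_capacity(durations):
        """Bits per unit time for a noiseless channel whose symbols have the
        given durations and may appear in any order. Shannon's Theorem 1:
        C = log2(X0), with X0 the largest real root of sum(X**-t) == 1."""
        f = lambda x: sum(x ** -t for t in durations) - 1.0
        lo, hi = 1.0 + 1e-12, 16.0      # f(lo) > 0 and f(hi) < 0 for these durations
        for _ in range(100):            # plain bisection on the decreasing f
            mid = 0.5 * (lo + hi)
            if f(mid) > 0:
                lo = mid
            else:
                hi = mid
        return math.log2(0.5 * (lo + hi))

    # Hypothetical two-symbol alphabet: "dot" takes 2 time units, "dash" takes 4
    print(noiseless_capacity([2, 4]))   # ~0.35 bits per unit time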
If you want to make that same sort of impact, it is not just important to be a great mind, but to spend a bunch of time practicing how you communicate that information.
It was said that had the transistor not been discovered at Bell Labs, it would have been developed somewhere else in the world within years. In fact, very similar work was taking place at Westinghouse in France under Mataré [0] and Heinrich Welker.
On the other hand, with Information Theory, Shannon was considered to have been ahead by decades.
The closest related work considered to come from an independent line of inquiry (taking place in the USSR), yet similarly fundamental, is Kolmogorov complexity (algorithmic entropy), introduced by Andrey Kolmogorov [1] in 1963.
While Shannon made the huge leap on his own, it should be noted that Harry Nyquist [2] (his colleague at Bell Labs) laid essential foundations through his Nyquist stability criterion and his studies of bandwidth. This came after harmonic analysis (the Fourier transform, etc.) appeared in the 1800s.
As I mentioned elsewhere, I recall reading Turing was starting to think about what we now refer to as information theory, but shortly thereafter he met Shannon. After hearing Shannon describe his work, Turing was so impressed he decided to leave information theory for Shannon and work elsewhere.
Exactly. Effective communication is what made Feynman a legend.
Sadly, many extremely smart and profound scientists are quite incapable of (or not interested in) conveying clearly their thoughts to general audiences.
It’s actually not that common, and expecting it is a common fallacy. We even have the phrase “the tailor's kids have the worst shoes”, indicating that experts and geniuses don't necessarily apply their own logic, because application is different from theory.
If this is you, you should find a partner a level or few below you in mathematical/scientific prowess and above in communication skills, who can learn your theories and popularize them.
Ironically Feynman's biggest contribution -- the path integral formalism -- was mainly regarded as incomprehensible until Dyson explained its relationship to the canonical formalism.
I don't think it's his biggest contribution to physics; how about QED?
The path integral formalism and the diagrams, albeit mathematically obscure at first, were quite clear from the intuitive viewpoint once he explained them.
The ones I know who are both capable of and interested in it do not usually have the time to do so, and if they do, it is because they became teachers who sell their output to students.
>If you want to make that same sort of impact, it is not just important to be a great mind, but to spend a bunch of time practicing how you communicate that information.
The most insidious cause of this is that getting a permanent position in a department still often depends on publishing in the top journals within that department's subject, which means explaining it only to the point that it is coherent to that subject's experts. Any further elucidation is sometimes seen as simplification or over-analysis, and to the detriment of getting an article accepted in subject-leading venues.
Or, to put it another way, putting more bits down the channel than are needed for comprehension by the editor is seen as unnecessary redundancy.
"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning..."
(emphasis in the original)
I find that second sentence humorous every time I read it... I wonder if he had seen the current state of internet discussion if he would have changed "frequently" to "occasionally".
> I wonder if he had seen the current state of internet discussion if he would have changed "frequently" to "occasionally"
Which is accurate. A lot of Internet communication is a tracker phoning home, TLS handshakes and other communication with content but no "meaning" per se. In Shannon's day, such industrial communications--over telegraph--were common enough to warrant mentioning but not prevalent enough to make up the bulk of human communication traffic.
Metadata is data. When Shannon says "meaning", he is including control channel information as well. He is distinguishing variable messages from copying constants, which is a different and easier challenge.
Meaning in this case essentially means context. You have a communications channel, it delivers to you four 32-bit integers, what do they represent? Positional coordinates? One pixel in RGBA? In CMYK? IPv4 port/address 4-tuple with zero-padded 16-bit ports? A single IPv6 address? Big endian or little endian?
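A tiny illustration of that point, with a made-up payload: the same 16 bytes decoded under different assumed contexts. Nothing in the bytes themselves tells you which reading was intended:

    import struct, ipaddress

    payload = bytes(range(16))              # an arbitrary 16-byte message

    print(struct.unpack(">4i", payload))    # four big-endian 32-bit integers
    print(struct.unpack("<4i", payload))    # the same bytes, little-endian
    print(struct.unpack(">4f", payload))    # four floats (say, an RGBA pixel)
    print(ipaddress.IPv6Address(payload))   # or a single IPv6 address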
The ingenuity and brilliance coming from Bell Labs during that era is absolutely astonishing. Transistors, information theory, satellite communication, UNIX/C, the list goes on. These ideas unquestionably laid the foundation for modern high-tech society.
If anyone is interested in learning more about Bell Labs and the folks who worked there, “The Idea Factory” by Jon Gertner is a fantastic book written on the subject. It’s not comprehensive but it’s a very inspiring read.
There has always been something very vexing about Bell Labs’ legacy though. They had everything they needed to start the personal computing revolution: engineers, scientists, equipment, a nationwide telephone network for god’s sake. What happened?
The Bell System was precluded from entering many markets (and forced to license its patents to anyone) because of the 1956 consent decree that permitted its ongoing monopoly in telecommunications.
I heartily second the recommendation for "The Idea Factory"! I'm currently reading this book and aside from the seriously impressive run of successes, the characters are really quite amusing at times too. E.g., Shannon built a desktop calculator that operated using Roman numerals only ("THROBAC") in order to amuse himself. And some of the whimsical creations were pretty impressive in their own right. E.g., his maze-solving mouse "Theseus" learned the maze layout on progressive runs through the maze by using relay-based logic.
I'll happily "third" the recommendation for The Idea Factory. It really is an inspiring read, and I expect many firms could find some actionable takeaways to apply within their own organizations, even today. Yes, Bell Labs had some specific circumstances that allowed them some luxuries that other firms may not have, but their success is not as simple as "they had a monopoly, herp derp".
As others have mentioned Xerox PARC, I also recommend the "PARC counterpart" to this book: "Dealers of Lightning: Xerox PARC and the Dawn of the Computer Age" by Michael A. Hiltzik. It was a beautiful summer read, full of amusing anecdotes and written in a laid-back style while still conveying lots of information about the environment at PARC.
Turing’s Cathedral was an interesting book on the computing-origins topic too, and chronologically it felt like it segued into The Idea Factory well.
Bell Labs continued doing foundational research unfettered by marketability for a long time after the conventional story has them buried and gone. (To pick a personal favourite, B. F. Logan's work on Click Modulation is a bizarro-world stumble through arcane mathematics that pops out somewhere unexpected and valuable in the field of signal processing.)
Unfortunately, a lot of this research was buried in microfiche in university basements as Bell Labs' legacy was traded from company to company without a viable distribution mechanism. As a result, although this work is now all available on-line (sadly not in open-access form), there are still more forgotten gems in there than many researchers realize.
Even though that brought the lab's demise, its then-younger staff and their mentees continued their work at other organizations, including many at Google.
Xerox PARC had early personal computers and developed GUI/mouse-driven software with email, a word processor, paint programs, and an IDE meant for kids to be able to learn and use. But it was some college dropouts in their garages that brought it to the masses.
One of the more surprising things I remember them researching (in their earlier years) was different wood to be used for telephone poles. IIRC they ran experiments for several years to see what type of wood lasted the longest in different weather conditions. All this to drive down costs in the long term.
A new Shannon biography, A Mind at Play: How Claude Shannon Invented the Information Age, may help reverse this legacy. Authors Jimmy Soni and Rob Goodman make a strong bid to expose Shannon’s work to a popular audience, balancing a chronological narrative, the “Eureka!” moments that sprang from his disciplined approach to solving puzzles, and his propensity for playfulness.
Another book with a lengthy section on Claude Shannon is James Gleick's The Information: A History, a Theory, a Flood [1]. A shorter but still nice explanation is on Brain Pickings [2].
Is he really unknown? I know most of us don’t do much signal processing these days but you can’t get far in studying compression without hearing about Shannon.
... about the guy who gets credit for figuring out BOTH how to make computers do math, AND how information can be encoded, transmitted and manipulated.
A Mind At Play is an excellent biography of a fascinating mathematician and visionary thinker. Ranges from early life through final years and also does a decent job of introducing his Mathematical Theory of Communication, the basis of modern information theory. A solid 4½ stars.
Shannon was a speaker at my commencement at CMU. He struck me as kind of a self-effacing, amusing sort of character. Maybe like a much quieter, less ego-driven version of Feynman.
Also: as much as people talk about information theory in various contexts, I doubt that many take the trouble to understand it better. Like thermodynamics, people want to take away an overly broad, folksy interpretation and apply it everywhere without stopping to think if it really applies in the way they claim.
I'm not sure what you mean here, so it may be worth elaborating. I think a lot of people understand the theory, and certainly in communication theory everyone understands the famous equation. I don't see anyone claiming it applies in places it doesn't.
>I don't see anyone claiming it applies in places it doesn't.
It applies in a lot of places, but probably in a more narrow way than people think. You can say "information theory applies here" and probably be right, but to be specific about what that means requires some work. Shannon says that himself in that piece.
Thermodynamics is an easier one to see, because people always make claims about something "because of the 2nd law of thermodynamics", something about everything always becoming more disordered. But that's too simplistic a view, and fails to consider what is a closed system and so on.
Ok his fundamental theorems aren't strictly binary. He wrote of discerning possible 'symbols' from noisy measurement where a symbol could have N possible values. The optimum discernment was at 2 values (1 and 0) but his math handled any base.
In fact it need not even be restricted to discrete signals.
I think he was uncomfortable with the fact that the continuous signal story was not as well fleshed out. He came to revisit that after about a decade in "Coding Theorems for a Discrete Source With a Fidelity Criterion". This did for lossy compression what his 1948 paper did for lossless.
It rarely happens that a paper that creates a field also resolves its most important questions. That was Shannon's standard, I guess. That he revisited the question perhaps speaks to his inner discomfort that the story was not complete.
Actually, if I recall correctly after only 5 years, your first sentence is correct: it doesn't matter how many symbols your alphabet has (i.e., the base system used in an application), because they can all be represented using binary. I don't recall anything about "optimum discernment", just that it is convenient to standardize proofs using a binary alphabet. For example, Huffman coding is proved using binary, but it easily applies to ternary systems as well.
Apparently the optimum information-theoretic alphabet size is e (2.718...) symbols, but it's a little awkward to work with a transcendental base. The pain of working with ternary circuits is why we use binary instead.
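A quick sketch of the radix-economy argument being assumed here: if hardware cost grows roughly like (radix) x (number of digits), the cost of representing N is about b * log_b(N), i.e. proportional to b / ln(b), which is minimized at b = e; among integers, base 3 narrowly beats base 2:

    import math

    # base-dependent factor of the cost b * log_b(N) = (b / ln b) * ln(N)
    for b in (2, 3, 4, math.e):
        print(b, b / math.log(b))
    # bases 2 and 4 tie at ~2.885, base 3 gives ~2.731, base e gives ~2.718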
"Anything can be communicated over a noisy channel without error" - Was something my Elec 201 professor would confidently proclaim before he would go into a lengthy aside about how amazing Claude Shannon is.
That was nearly 20 years ago.
Every day I appreciate that statement, and Claude Shannon, a little bit more.
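For anyone who wants to attach a number to that proclamation, here is a small sketch using the Shannon-Hartley capacity of a band-limited AWGN channel, C = B * log2(1 + S/N): any rate below C can, in principle, be driven to an arbitrarily low error probability with good enough coding. The channel parameters below are just an example:

    import math

    def shannon_hartley(bandwidth_hz, snr_linear):
        """Capacity of a band-limited AWGN channel, in bits per second."""
        return bandwidth_hz * math.log2(1.0 + snr_linear)

    # e.g. a 3 kHz telephone-style channel at 30 dB SNR (linear SNR = 1000)
    print(shannon_hartley(3000, 1000))      # ~29,900 bits per second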
He also showed the limits of the efficient market hypothesis in economics. Add in Einstein (information cannot travel faster than the speed of light), and Kolmogorov (random information cannot be compressed) to get 90% of why markets cannot be fully efficient.
It's been forever since I read it, but Fortune's Formula [1] is an entertaining read that goes into Shannon's investment methodology, alongside Edward Thorp [2].
There's a kind of paradox of Shannon entropy found in communications with double entendres, I think, in that they appear to increase both redundancy and entropy. I touched on that in this essay proposing what I call subtractive adversarial networks.
Not just at the Kindle store. The eBook is also on sale at Apple Books, Barnes & Noble, BAM!, Google Play, and Kobo. They are all $3.99, same as the Kindle sale price.
I read this shortly after the Internet History Podcast episode with the authors. It is a good, detailed but bite-sized book. Very inspiring and very well written.
Claude Shannon had an interest in AI and had a hobby of making electronic game machines, most of which survive to this day in the MIT Museum. A list with descriptions and photos is available here:
The odd thing about Shannon information is that a coin flip can have just as much information as text, yet the former would intuitively seem devoid of information and the latter information-rich. It is unclear what exactly the relationship is between Shannon information and what we intuitively consider information. Any ideas?
Shannon information deliberately only concerns syntactic information (information content). Other, more recent work, focuses on semantic information (information with meaning for a receiver).
> Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state.
Roughly speaking: The amount of computation or energy needed to perfectly reproduce a random source, such as a coin flip, is high, while the significance or meaning, for the average receiver, is low. Natural language text requires less computation to reproduce [1], but, for the average receiver, the significance is higher.
Hmm, but you could compress text to an equally random sequence. I.e. all minimal programs are by definition Kolmogorov random.
Also, what about crystalline forms, which are very orderly and require minimal computation to reproduce, but are equally insignificant for the average receiver?
> you could compress text to an equally random sequence
More or less correct. The key difference is that you could not compress a random coin flip sequence (and that a compressed text is meaningless until decompressed to original).
> all minimal programs are by definition Kolmogorov random
Compression provides an upper bound to K. Kolmogorov Randomness itself is not computable. AKA: You can't ever know if you have a minimal program.
The best approach that I've seen is a combination of Shannon information and Kolmogorov complexity. If an object has high Shannon information, then it is not crystalline. If it also has low Kolmogorov complexity then it is not random. This seems to characterize the sweet spot where meaningful information occurs. Kolmogorov called this quantity "randomness deficiency".
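A rough numerical illustration of that combination, using per-byte empirical entropy as the Shannon-style measure and zlib output size as a crude, upper-bound stand-in for Kolmogorov complexity (the three strings below are arbitrary examples):

    import math, os, zlib
    from collections import Counter

    def entropy_per_byte(s):
        """Empirical Shannon entropy of the byte distribution, in bits per byte."""
        n = len(s)
        return -sum(c / n * math.log2(c / n) for c in Counter(s).values())

    crystal = b"ab" * 95                    # orderly, repetitive
    text = (b"The fundamental problem of communication is that of reproducing "
            b"at one point either exactly or approximately a message selected "
            b"at another point. Frequently the messages have meaning.")
    noise = os.urandom(len(text))           # incompressible randomness

    for name, s in (("crystal", crystal), ("text", text), ("noise", noise)):
        print(name, round(entropy_per_byte(s), 2), "bits/byte,",
              len(zlib.compress(s, 9)), "compressed bytes from", len(s))

On this rough reading, the repeated string scores low on both measures, the random bytes score high on both, and the English sentences sit in between: nontrivial entropy, yet still clearly compressible.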
Rather than see the coin flip as a useless toss, one might imagine it as a binary decision. This or that?
With this in mind, you can see text as the output of an algorithm (brain) taking many such decisions. The information entropy contained in this text reveals the complexity (amount of distinct binary decisions) which needed to take place in the machine (brain) in order for the text to occur.
This is a controversial opinion, but: Claude Shannon is more noteworthy than Einstein. It’s simply the case that engineering science is not taken seriously by the scientific community, despite its overwhelming contributions to human development, namely digital signal processing and computer engineering.
At my uni they only had communications theory, which covered stuff like software defined radios. Information theory was a significant part of it though.
I would argue Shannon’s case is far from unique. For my money, Gauss was the greatest mathematical mind of all time, but does the general public know of him?