The beauty of Encarta and its printed predecessors is that they are unalterable snapshots of the knowledge and views of their time.
I inherited an encyclopedia from the 1930s and it is a constant source of surprise and wonder to me - both in what is different from today and in what I expect to be different but is not.
I think Encarta is the last of this generation.
Sure, I can download a snapshot from Wikipedia, and I did, and others did and still do as well. It's still not the same, because all our snapshots are different. Anything surprising in there could just be a fluke, an editing error, or the temporary state of an edit war.
A Wikipedia snapshot proves nothing. My 1930s encyclopedia, on the other hand, is a stable reference. If I doubted anything in my copy, I could easily find another antiquated copy and compare. The same goes for Encarta: there are so many copies out there that this particular snapshot of human knowledge will never die.
Funny, everything you describe as a positive I see as a negative.
The 11th, 12th, and 13th editions of the Encyclopedia Britannica were from 1911, 1922, and 1926.
So if, as a researcher, you were interested in the state of knowledge in 1916, you're out of luck (if you're limiting yourself to a particular encyclopedia, for example).
I don't know what value you see in "stability" when the world's knowledge changes on a second-to-second basis. It's equally arbitrary whether you freeze knowledge at 1911 or at any given point in time since Wikipedia began.
When you say "A Wikipedia snapshot proves nothing", I have no idea what that means. If you're worried about vandalism or quick edits, it's trivial enough to compare with earlier and later versions of the page to ensure you're looking at relatively stable text. Heck, most of that could even be automated if you wanted.
And while Wikipedia articles are full of errors, so too were the articles in every edition of Britannica. The difference is that Wikipedia errors tend to get fixed a lot quicker, while the Britannica errors just remained on the paper they were printed on.
I genuinely don't see what difference there is between the "stability" of the Britannica 1911 edition, and the "stability" of Wikipedia at some arbitrary timestamp. Both capture a similarly arbitrary moment in time -- Wikipedia just gives you so many more to choose from.
You should especially consider the role of a historian, or of someone who sets out to correct facts and figures that are in error on Wikipedia. Stable, traceable snapshots are precisely what they need: the ability to trace the lineage of an error and correct the associated facts and figures, the way you would chase dependencies in an Excel sheet. It's especially important now that LLMs can't cite their sources and Google has gone to shit at finding them; errors are quickly propagated across the web like coins through some kind of crypto mixer, and they become difficult to verify.
> Heck, most of that could even be automated if you wanted.
I've had the idea of building something like this. The concept being that you would select an article and a time interval, and be shown the "best/most stable" revision of the article within the given window. The tool could use any number of metrics for determining which revision is best, the most reasonable one I've managed to come up with is "highest number of views during a state where the article was not locked/available to edit".
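Here's a minimal sketch of what I mean, in Python -- entirely my own toy illustration, not a real tool. The MediaWiki revisions API and the Wikimedia Pageviews API endpoints are real, but the scoring heuristic (credit each day's views to whichever revision was live at the start of that day) is just one assumption, and I've skipped the protection-log check, pagination, and error handling:

    import requests

    WIKI_API = "https://en.wikipedia.org/w/api.php"
    PV_API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/"
              "per-article/en.wikipedia/all-access/user/{title}/daily/{start}/{end}")
    HEADERS = {"User-Agent": "stable-revision-sketch/0.1"}  # WMF APIs want a real UA

    def revisions_in_window(title, start, end):
        # All revisions between two ISO 8601 timestamps, oldest first.
        # Single request only; a real tool would paginate past 500 revisions.
        params = {"action": "query", "format": "json", "prop": "revisions",
                  "titles": title, "rvprop": "ids|timestamp",
                  "rvstart": start, "rvend": end, "rvdir": "newer", "rvlimit": "max"}
        pages = requests.get(WIKI_API, params=params, headers=HEADERS).json()["query"]["pages"]
        return next(iter(pages.values())).get("revisions", [])

    def daily_views(title, start_day, end_day):
        # Map of YYYYMMDD -> view count; the Pageviews API has data from 2015 on.
        url = PV_API.format(title=title.replace(" ", "_"), start=start_day, end=end_day)
        items = requests.get(url, headers=HEADERS).json().get("items", [])
        return {it["timestamp"][:8]: it["views"] for it in items}

    def most_viewed_revision(title, start, end):
        revs = revisions_in_window(title, start, end)
        if not revs:
            return None
        views = daily_views(title, start[:10].replace("-", ""), end[:10].replace("-", ""))
        score = {r["revid"]: 0 for r in revs}
        for day, count in views.items():
            day_ts = f"{day[:4]}-{day[4:6]}-{day[6:8]}T00:00:00Z"
            # Credit the day's views to the newest revision already live at 00:00.
            # Days before the first in-window revision are simply dropped.
            live = None
            for r in revs:  # oldest first; ISO timestamps compare lexicographically
                if r["timestamp"] <= day_ts:
                    live = r
            if live:
                score[live["revid"]] += count
        return max(revs, key=lambda r: score[r["revid"]])

    best = most_viewed_revision("Encarta", "2020-01-01T00:00:00Z", "2020-12-31T00:00:00Z")
    if best:
        print(f"https://en.wikipedia.org/w/index.php?oldid={best['revid']}")

The oldid permalink at the end is the closest thing Wikipedia already has to a citable frozen edition; everything about the scoring is just one way to do it.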
Yeah, the idea had never occurred to me until now, I'm glad to hear someone else has already thought of this.
I wasn't thinking so much of identifying any single best revision, but of using more of a diff-like tool to identify the text/changes that remained most stable over time -- where a brand-new edit doesn't count for much, but the longer it stays around as other edits are made, the more trustworthy it presumably is.
I think the biggest problem comes as articles get rearranged and expanded -- a section gets split into two or three, something gets moved from one section to a more appropriate one, and so forth. Or heck, sometimes entire articles get split into multiple ones, or vice-versa. I'm not aware of any diff-like tool/algorithm that handles these situations well, to accurately track how the same information gets moved when it's not just a simple case of insertion.
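That said, the simple insertion case is easy enough to prototype. Here's a toy sketch of the survival-weighting (my own illustration, word-level only, and it breaks down in exactly the split/move cases above):

    import difflib

    def token_ages(revisions):
        # revisions: plain-text revisions of one article, oldest first.
        # Returns (tokens, ages) for the newest revision, where ages[i] is
        # how many consecutive revisions token i has survived unchanged.
        tokens = revisions[0].split()
        ages = [0] * len(tokens)
        for text in revisions[1:]:
            new_tokens = text.split()
            new_ages = [0] * len(new_tokens)
            sm = difflib.SequenceMatcher(a=tokens, b=new_tokens, autojunk=False)
            for tag, i1, i2, j1, j2 in sm.get_opcodes():
                if tag == "equal":  # unchanged runs inherit and increment their age
                    for i, j in zip(range(i1, i2), range(j1, j2)):
                        new_ages[j] = ages[i] + 1
            tokens, ages = new_tokens, new_ages
        return tokens, ages

    revs = ["the cat sat", "the cat sat down", "a cat sat down quietly"]
    for tok, age in zip(*token_ages(revs)):
        print(tok, age)  # "cat" and "sat" score 2; the brand-new "quietly" scores 0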
Huh, so a bit like git blame? And then you would merge together the chunks/edits which are most stable? That sounds awesome!!
I suppose you can't really count on tracking the same text/markup as it gets shifted around when articles are split and modified in the ways you've described. Also, there's no such thing as a cross-article edit in MediaWiki terms, IIRC. Use vector embeddings? Throw an LLM at the problem? Rate editors on their familiarity with a given topic area (and track how that evolves over time)?
The idea of using edit information in addition to the raw text written by editors seems like it's extracting additional bits of information from human interactions.
I might have read some idea in a HN comment of training AI not just on code, but on how that code is edited in a git repo, or maybe I am just imagining it.
This is a bit off-topic, but how do you actually browse a Wikipedia snapshot? I tried turning a dump [1] into a browseable website hosted locally a few years back, and it seemed anything but trivial. Has any progress been made here?
Could you speak to the practical values of a stabilized reference encyclopedia?
If the study of history is a long-term, iterative process, where is this stability important (beyond the filtering of short-term noise, vandalism, political influence, etc)?
I can see several advantages:
- Having a "frozen" dataset, meaning that I can work on the same dataset as my colleague.
- Having all the topics frozen at the same date, representing global knowledge at one point: they knew X and Y, but Z was not advanced yet, so they could not do W. It is hard to realize the concomitance of knowledge bumps on the timescale in a continuously changing encyclopedia.
I think this is mistaking the costs of publishing for the stability of knowledge.
The knowledge was never stable, even back then. The encyclopedias could only afford a team of some fixed size to publish an edition every few years.
Wikipedia just scales that up to many editors doing real time edits. Arguably it's a more reflective representation of how organic knowledge transfer actually happens.
Citing any encyclopedia or other general reference text as a source is a warning sign, because it means that you haven't done any in-depth research, and/or you're trying to cite sources for information which is considered general knowledge in the field. Wikipedia isn't special in this regard; you're going to get marked down just as much if you cite the Encyclopedia Britannica or a dictionary.
It's a secondary or tertiary source that does a great job of collecting and summarizing primary literature (which can be cited if more rigor is preferred, at the cost of context). My college professors encouraged the use of both, maybe with the caveat of citing a particular Wikipedia revision rather than the article directly.
But seriously, these were the discussions we had two decades ago. If a professor today still has such an issue, they're very old-fashioned, and I'd probably walk out of that class, because who knows what else that professor is out of date on.
So many fields today are evolving so rapidly, I wouldn't trust any single expert on a topic. Better to have a living crowdsourced reference that collects many sources.
My parents bought me multiple editions of Encarta in my childhood and I absolutely loved it as we didn't have internet at that time. It was much more than an encyclopaedia; you could explore ancient sites and play educational games. It was great fun.
As great a free resource as Wikipedia is, each article's quality relies on a knowledgeable contributor to really make it worthwhile, and those are few and far between. Wikipedia is dry and lacks what Encarta had.
Should you want to dig deeper and see the view from the other side of the battle. (Posted with a distant taste of sour grapes in my mouth; aka, I played for the EB side.)
Interesting comment about how pre-Wikipedia encyclopedias were more engaging.
I felt this with Encarta, and with old printed volumes. They're written by someone, or told in a distinctive way, while Wikipedia feels more like a textbook reference.
The Stanford Encyclopedia of Philosophy[1], while restricted in scope, takes the approach of having experts write the articles, which are then revised and evolve with time. So a different approach is possible, but I think it would be hard for an encyclopedia with the breadth of Wikipedia to do it. Maybe there is space for more specialised encyclopedias that take the approach of articles having authors and revisions instead of wikis. But all in all, I think both approaches complement each other.
Wikipedia is amazing, all things considered. But it's true that Encarta was a more integrated experience, with plenty of licensed images and video.
Wiki is open source, so it's limited to freely licensed assets (Creative Commons and the like) only. It's also firmly an "Internet-first" encyclopedia, so its beauty comes from its SEO-friendly information architecture and internal link structure.
I'd bet it's easier for someone to find specific info on Wiki quickly, whereas Encarta would be better as a slow browsing experience.
The impact that Encarta had on Microsoft's dominance in the home computer market is understated. Encarta came about at a time when internet access was not yet commonplace and "multimedia" PCs were the next big thing. Amiga, Atari, and Apple had CD-ROM drives, but only the PC had Encarta. It was revolutionary for 90s teens with homework assignments. These days some might use the term "killer app" to describe it. Obviously there were many other factors that led to Wintel dominance, but Encarta certainly played a part.
Encarta was a big inspiration for my Conzept project (https://conze.pt) - a topic exploration system based on Wikipedia, Wikidata and other (pluggable) datasources.
My favourite trivia for Encarta is that it was built using a rich text view + a custom hypertext system, similar to the old Windows Help format before it all became HTML and IE views.
Not mentioned: Microsoft Instruments, a wonderful way to experience music snippets from instruments around the world, which was killed via absorption into Encarta.
(I never owned Windows, so I never played with Encarta; it's possible the same experience was achievable, but I like dedicated software to play with when there's no real advantage to integrating it into a larger ecosystem.)
We sometimes forget how fortunate we are to have so much information at our fingertips. Even the richest person in the world would have a hard time amassing enough books to compare to, say, Wikipedia.
As neither this article nor the Wikipedia one provide a table with version history, I did some digging and found this table[0].
It's interesting. E.g., until "Encarta 96", the application was "16-bit". I wonder if this means that 95 and earlier ran on a plain 8088-based 5150. This 95 article[1] claims otherwise (386SX).
To be fair, I would love a curated version of Wikipedia periodically exported as a self-contained reference application. I thought this would be a good use of the Internet Archive's capabilities too.
At Blekko we, of course, crawled all of Wikipedia. One of the more interesting aspects was how many dead links it had to references (which presumably at one time were not dead). Sometimes you could find those references in the wayback machine and sometimes they were just "poof" gone. This is the nature of the web, information isn't persistent. Hence I think periodically pulling it into static storage would be a good thing for longevity.
> One of the more interesting aspects was how many dead links it had to references (which presumably at one time were not dead). Sometimes you could find those references in the wayback machine and sometimes they were just "poof" gone.
This is largely a solved problem. There's a number of bots on Wikipedia and other WMF wikis which periodically trigger Internet Archive dumps for all external links which are used as references, and which can replace those references with links to the archive if a site goes offline.
Are any proprietary digital encyclopedias available nowadays? I love Wikipedia and am glad we have it for free, but I also know non-free encyclopedias can be much better in some specific ways.
I wonder if Wikipedia articles ever cite paid encyclopedias as sources? It's kind of a conundrum: many people who don't trust wikipedia would regard paid products as more "legitimate", but Britannica would presumably not want to be cited because that implicitly helps a free competitor.
Wikipedia started out as a copy of a public domain edition of EB.
Since then, lots of folks have edited and contributed content.
It was a pretty open secret that when Funk & Wagnalls (where Encarta got its content) was first starting, they would pay college students to write articles; the college students would crib from EB or some other similar source and then submit that.
We had Encarta in the library at my school when I was 12. I had previously only used a C64 and an Amiga, so it was my first stint with PCs. If you needed to research something for homework, say the solar system, you would kindly ask the librarian for an Encarta CD-ROM and she would allocate a 45-minute slot at the library's PC.
I remember the period when Encarta allowed user contributions on their website, though more moderated than Wikipedia. It's a shame that Microsoft gave up on Encarta too soon; it could have been a perfect integration with Bing AI, and not subject to the vandalism and notability wars that Wikipedia suffers from.
I interned on Encarta one summer shortly before the fall, and 11 years ago I wrote a Quora answer about why I think it failed, which amusingly still gets upvotes: https://qr.ae/pK2pGA
I came here to pour scorn on the claims of an attempted multimedia CD-ROM in 1985, but it turns out the first one was released in 1987! Red Book was mid-'80s. I could've sworn they were a mid-90s thing, but this really could have happened in 1985.
Is there any Wikipedia equivalent for kids, targeted at 10-year-olds? Wikipedia is too dense and drab for someone just excited or curious about a topic.
Indeed, as the article mentions - but the UI for that was pretty bad IMO...
The Software Toolworks Multimedia Encyclopedia was better IMO (at least in terms of UI and multimedia), and then from 1995 onwards, Encarta was generally better (again IMO)...
People want me to believe that Microsoft is "new," when people like Phil Spencer have been there since the start, and worked up to things like being the head of Xbox.