> Instead of the above syntax combining Markdown and JSON, I could use a Unicode private use area to define bytes that mean “line,” “circle” and so on.
That would not be a good idea because of contention, as you may wish to use those code points for their meaning in some icon font.
You’d be better off using HTML elements, which is the proper way of embedding such stuff in Markdown. (Look, even hijacking code blocks as an extension mechanism is a controversial technique.) And then SVG, or at least a subset of it, becomes an obviously excellent solution; the {"p2":{"x":141,"y":85},"mode":"line","p1":{"x":34,"y":44}} example becomes:
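The SVG snippet appears to have been dropped from this comment; reconstructed from the coordinates in the JSON (the width and height here are illustrative), it would be roughly:

```
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="120">
  <line x1="34" y1="44" x2="141" y2="85" stroke="black"/>
</svg>
```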
I assumed "Plain text with lines" meant SVG, didn't even notice it was JSON instead of a <path> at first. The SVG spec is huge, but the parts for lines and circles aren't.
I haven't thought much about the file format so far, just the experience of writing in it as if it's the "ground truth". We seldom open our text files in a hex editor, after all.
I actually suggested SVG out of laziness, er, efficiency in the first place; it saves you from having to write your own "spec" if there are good SVG libraries for Lua. (Just a thought, thanks for sharing your work.)
It kind of is, though. Markdown did not start out as a markup language, but as a tool for displaying what was then considered to be plain text which used certain popular style conventions. That is, people wrote "plain text" in markdown style for many years, on mailing lists, in newsgroups, and in doc files, long before the introduction of markdown.pl.
In mailing lists and discussion groups, they actually used * as a bold marker and _ for italics. Something I wish Markdown had used, instead of ** and *.
Those punctuation marks make sense and came from old-old school notations for typewritten manuscripts. I remember them from my Strunk and White /Elements of Style/ book from English 1A (sorta-old-timer here).
Microsoft Office apps still by default apply those marks for you. Quite annoying when you are trying to send an email with a file spec and it bolds things like *suffix.*
Edit: and HN does it as well :)
Isn't it the other way? Plain text is markdown but markdown isn't plain text?
You're leveraging some semi-readable things[1], injected into the document to carry special meaning, but normally written[2] text will render just fine in markdown.
1. I say semi-readable because BS like image and link tags really stretch the bounds of "as someone would naturally write it".
2. Normally written here is carrying a fair amount of weight but the "obscure" symbols used in markdown (that aren't just plain text conventions anyways) are extremely uncommon in normal communication.
In a sense, sure. Plaintext input is completely valid markdown though. You can take text from any plaintext editor and paste it into a markdown input and it will render as HTML or XHTML that very much resembles what you copied.
Then, if you want lines, you can embed SVG in it. You could also draw in a box, use a tool like ImageMagick to make a PNG or GIF, encode that as base64, and embed it into the document as HTML source.
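That base64 route can be sketched in a few lines (a hypothetical helper, assuming Python; the `drawing` name is a stand-in):

```python
import base64

def embed_png(png_bytes: bytes) -> str:
    """Wrap raw PNG bytes in an <img> tag with a base64 data: URI."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f'<img src="data:image/png;base64,{b64}" alt="drawing">'

# Any HTML-capable Markdown renderer will then show the image inline.
print(embed_png(b"\x89PNG"))  # a real file's bytes would go here
```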
What the author has done is basically invent their own JSON-based syntax for Pic, Fig, or SVG and their own comment/template tag format.
> You can take text from any plaintext editor and paste it into a markdown input and it will render as HTML or XHTML that very much resembles what you copied.
This is very, very, very far from the truth. People get tripped up by accidentally invoking markup and thereby destroying their text all the time. You ever see unintended italic text and missing asterisks here? I sure do. Or on sites that do Markdown, ever talked about a generic type Foo<T> and had the <T> disappear? Or talked about __init__.py and got 𝗶𝗻𝗶𝘁.py? Or taken text where you used a single newline character for paragraph breaks, and had everything end up as one paragraph?
> Plaintext input is completely valid markdown though.
Only inasmuch as there’s literally nothing that’s invalid Markdown. If you take text that you wrote as plain, unformatted text, Markdown will routinely destroy it.
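A toy sketch of why (a naive underscore-emphasis rule, not any real Markdown implementation) shows how a literal Python filename gets destroyed:

```python
import re

def naive_emphasis(text: str) -> str:
    """Apply a crude __bold__ rule of the kind Markdown-ish renderers use."""
    return re.sub(r"__(.+?)__", r"<strong>\1</strong>", text)

print(naive_emphasis("see __init__.py for details"))
# -> see <strong>init</strong>.py for details
```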
—⁂—
To address the chain of comments that brought us here:
• I spoke of Markdown because the author spoke of Markdown, and used Markdown syntax.
• Markdown is plain text? Depends on what you mean by plain text. Under stricter definitions, absolutely not. Under weaker definitions, only mostly, unless you want to say that HTML is plain text too, in which case sure.
• Plaintext is not Markdown? Certainly true. Most plain text won’t be too badly damaged by reinterpretation as Markdown, but a lot will.
Plain text, as in prose, won't get seriously mangled. Text with symbols, which I admit should fit the general idea of a "plaintext file", may be interpreted in ways that weren't intended when the document was edited as regular text.
Italicized, bold, and underlined HTML text does pretty much resemble the purposes for which those symbols (asterisks, underscores) are used in text. Even bulleted lists and numbered lists are handled in a pretty straightforward fashion.
Where you see problems is when you're putting in footnote notations, equations, source code, and such. There are ways in both HTML and markdown to keep pre-formatted text for those cases. What the author has accomplished is doing without a special notation for those, while also reinventing Pic, Fig, DOT, or SVG for simple images. And it's done by requiring a game engine rather than using any of those preexisting tools.
Pic and Eqn are literally from the 1980s and 1970s, respectively. Pikchr (a Pic enhancement) is designed to be embeddable into markdown rather than troff, and could pretty easily embed into plaintext-with-template-tags like this. DPic is Pic-compatible and can produce Postscript, PDF, SVG, or image formats. As a bonus to the author of lines.love, Pic-compatible tools can already take nested expressions in a slightly different format, and can also produce programmatically-defined shapes.
Plain text is a very powerful tool in education [1]. I use org-mode.
It's essential to have universal tools that all students can access
at almost zero cost, that work across platforms, are light enough to
work remotely over poor connections, don't go out of date, and are
quick to learn.
Unfortunately Microsoft keep bribing our admins to push Office365
cruft on the poor souls. I have to do a lot of work to repair the
damage that does and get them back to a stage where they can think
about ideas, not battle with a buggy, over-complex, insecure
application suite.
Why is it essential? I was in school a little over 5 years ago and no one received special training on Office. You were just expected to work it out, since it's considered obvious. Many students used Google Docs instead, I saw some use LibreOffice, and I used LaTeX. None of this was a problem. You just converted the file to PDF and sent it away. Even conversions to docx work fine, since your documents are simple.
None of this seemed very consequential to me. Your choice of document editor meant little more than your choice of shoes. It doesn’t impact you in any way going forward and is trivial to swap later.
What seems to be happening in the past 5 years is BigTech aggressively
pushing into education. Get them hooked young. A sixth form
(high-school) headmaster proudly told me how his school was not under
the thumb of Microsoft, because his was a "Google Academy". I have no
idea what that even means by the way - other than that money changed
hands in a back room deal.
Kids may not be told expressly "You must use Vendor X", but network
effects and non-interoperability will pressure them in subtle ways.
This is exacerbated when teachers ask students to submit course-work
via proprietary vendor platforms. A departure from open standards and
neutral tools, toward monopoly proprietary suppliers is obviously
unacceptable in education. But the rationale, as always, is economic,
and against a background of deskilling of educational ICT.
Wherever I get to choose tools and direct students I pick the most
universal, accessible, free and open source software I can.
This was Apple Computer's corporate strategy before the Macintosh was even introduced.
Something relevant might have happened five years ago, but it wasn't 'Big Tech ... pushing into education'; that strategy is older than most people posting on this site.
> that strategy is older than most people posting on this site
True. And let's not misunderstand that this was only "big tech". I
can remember (being older than most on this board) our BBC micros
being ubiquitous in schools circa 1982 under government digital
literacy edicts. That was a big win for Acorn at the time.
What has changed in recent years is full-spectrum domination of
vendor values in education. Hitherto, we've at least been able to
install the software we like, even if some vendors (Dell/Novell/Apple
etc) pushed for hardware dominance. Later they pushed for operating
system dominance. And the end-game seems to be pushing for full-stack
application and service domination of fully locked-in ecosystems.
That's too much power, and it's not appropriate for schools anywhere.
Not to mention that it's ultimately a disaster for competing
educational software companies.
I certainly do not want to see any child educated with Microsoft
values, any more than I want to see kids "nourished" to McDonald's
nutritional standards.
So yes, it's a very old strategy, and has been a very successful one.
Too successful, and has reached the point it needs firmly slapping
down.
"I certainly do not want to see any child educated with Microsoft values"
Not just the children, one thing I have seen as of late is indoctrination of developers in Arrange Act Assert (AAA) - I specifically remember it being pushed because people were indoctrinated in Given When Then - however ironically most people end up indoctrinated in AAA, which is hypocritical to original point...
Not just the children. One thing I have seen of late is the indoctrination of developers in Arrange-Act-Assert (AAA). I specifically remember it being pushed because people were indoctrinated in Given-When-Then; ironically, most people just end up indoctrinated in AAA instead, which is hypocritical given the original point... As an adult, constantly seek to push past (your) boundaries, because you are probably being fed some ill-informed BS about how "things should be". This holds true in a lot of other places as well: in tech (using the newest but not best-performing stack your boss sold out for), and in life (eating the same meal every day and suffering nutritional deficits or adverse health issues, e.g.).
Given-When-Then and Arrange-Act-Assert are different variants of the same idea: tests should show that a postcondition holds after an action occurs under certain preconditions.
It's the same thing as Hoare triples, really. It's sound computer science.
> Hitherto, we've at least been able to install the software we like
If you were very lucky and had an IT department not locking things down. Locking things down in this respect has been a key selling point for e.g. Microsoft for a very long time.
i remember this from my time at school, some 25 years ago now, and it was extremely unhelpful to education as well...
a lot of time spent explaining things like "how do i do something where i would normally right click?" or that the power button is this weird thing on the keyboard without the universal symbol for a power button on it... once they got some wintel machines from RM, another company that pushed at education, all of that went away and the lessons started becoming genuinely educational.
I see models with a power button, clearly labeled with the ISO standard symbol for a power button, and models with no power button, because it moved back to the case (and used the ISO standard symbol on it).
By the time I was 13 I had used mice on four different GUIs so I have trouble sympathizing with your difficulty. User interfaces aren't facts of nature, which is a great learning opportunity to have in school. But learning opportunities must be taken.
Apple didn't use the IEC power symbol on their keyboards until the iMac-- prior keyboards used a left-pointing triangle on the power button.
But most of these machines shipped in an era where there wasn't necessarily any standardization of how to turn the machine on in the PC-compatible world either-- the now-standard momentary electronic switch on the front of the case came with the ATX spec in 1995. Before that, you'd usually have a flip switch or latching power button somewhere on the case. (Macs were unusual for the era in having power on the keyboard from the ADB keyboard on.)
⏻: "IEC 60417-5009, the standby symbol (line partially within a broken circle), indicates a sleep mode or low power state." [0,1]
⏼: "IEC 60417-5010, the power on-off symbol (line within a circle), is used on buttons that switch a device between on and fully off states." [0,2]
Of note, the slightly different IEEE standard using these symbols was inactivated 2020-03-05.[3] Unicode lists both standards for the character definitions.[4]
Big tech has an advantage going into schools because schools LOVE end to end contracts that do everything.
No other vendors but Google and Microsoft are willing to fight over schools, because the margin is too low. Google and Microsoft however will, because they are vertical enough they can absorb the low margin.
They're willing to do this at low margin, zero margin, or potentially at a small loss. It's not just because they can absorb it. It's because mindshare is real. Having schoolchildren trained in your technology is a loss leader for the workplace contracts you're seeking from everyone from the corner store to the Fortune 500 companies.
The startup I work for in Scandinavia sells to education customers. As far as I understand, all schools here are either "Microsoft schools" or "Google schools".
And now all our kids are being tracked at school too. And in 20 years, their profiles will alter their "choices" of career because some HR will have access to it.
The problem is not using the tool; the problem is that the schools force it upon kids (because the teachers can't learn all the available tools -- which is understandable)
I don't see what's wrong with this. Microsoft and Google are paying to educate people on their products. There is nothing inherently evil about Office or Google Docs. There is no lock in contract, and it isn't that hard to switch later. And the schools/students benefit from a cheaper education than if they had to pay full price.
> Microsoft and Google are paying to educate people on their products.
That was 1990. Now Microsoft and Google are paying to educate people
on their values. That is different. Children having their
educational horizons limited, curated, monitored and nudged by a few
corporations is the problem here.
> There is no lock in contract
You've read them all? All of them?
> and it isn't that hard to switch later.
No this is plain wrong. Any IT person will tell you that migrating
from one vendor to another is a nightmare. Getting too deeply into a
relationship with one vendor is always a business mistake.
> And the schools/students benefit from a cheaper education than if
they had to pay full price.
Another way of looking at that is since the vendor sets the price on
an essentially zero-cost service, subsidising one sector is
anti-competitive behaviour.
But that is not the point. Students should get the best education
possible, not the cheapest. Corporate McEducations are why we're
falling behind against nations like China who are quite happy to use
our better products (Linux etc) to teach timeless principles instead
of transient products to their students.
It gets harder to change such habits the older you get. Not so much in terms of ability. They can change, but it's less likely they want to take the time to do so. They know this, that's why it's important to target the young demographic, because it primes the growth for decades to come. At least, in a probabilistic sense.
We are given AWS technologies inside specially-provisioned sandboxes to teach at the university, and the courses are marketed and badged as AWS. Students get free training, certification discounts and more.
No money changes hands, but AWS provide all the tech free, sponsor AWS awards for the students, come to jobs fairs... all the while hoping that X% of those compsci students will, in the fullness of time, come to be decision-makers who choose AWS technologies for their organisations rather than Microsoft or Google.
It's the Happy Meal principle, hard at work today.
> No money changes hands, but AWS provide all the tech free, sponsor AWS awards for the students, come to jobs fairs… all the while hoping that X% of those compsci students will, in the fullness of time, come to be decision-makers who choose AWS technologies for their organisations rather than Microsoft or Google.
They don’t need that, though that’s a bonus. A surplus of technologists skilled in AWS technologies means that it’s easier to hire if you are looking for AWS than it would otherwise be, so it tips the scales for the decision-makers well before any of the students become decision-makers.
My kids are in college now, so they went to high school recently. The school subscribed to Google Classroom, and students used Google Docs to work on assignments, often collaboratively. There was no systematic training, and no use of features beyond basic text editing.
Nobody uses Google Classroom after they finish school, and most people can quickly adapt to any text editor. They already use multiple text editors if they interact with any online software. There's no use for fancy page formatting of documents that will be read once and never printed.
If anything, Google and Microsoft tools have the advantage of being general purpose, even if they are closed source. This liberates students from "educational" software that is almost universally abysmal. Subjects like math were still being taught in relatively traditional ways, e.g., by pencil and paper.
My main gripe was that you couldn't put Python on the school issued Chromebooks.
I got mandatory training in Microsoft Office as part of the state curriculum for highschoolers. I don't think basic training really promotes lock-in with Office suites. (well, except for Excel, but that's a programming environment)
It's surprising how non-obvious office suites are, because I regularly see people in professional settings who don't know how to use them well enough. However, I don't really think training in any particular office suite is non-transferable, since the major skill is knowing what the ruler, styles, and various line/page breaks are actually there for. I had no trouble using LibreOffice or whatever, but people definitely don't seem to know the concepts of word processing well at all. This isn't their fault, as it does /kinda/ work even if you just keep slamming shit into the page until it looks right.
Forcing people to work it out on their own is just training people to put up with "I don't know, I got it to look right but if I put a space between these words my entire layout breaks 4 pages later." when they don't have to live like this.
I used LaTeX for years, and almost _everything_ was a problem. The moment I wanted to venture one bit beyond the obvious and typical, it's very crufty macro programming. The situation has improved somewhat w.r.t. packages supporting you in doing that, but still, "no problem" is certainly not how I'd describe it.
To be fair, converting pdf to any other format is not a trivial task. PDF has bigger interoperability problems than docx. I can collaborate with Word users using Emacs's org mode. I cannot do that with PDF.
The biggest issue with structured data is that our most powerful tools for tracking change over time only understand plain text, and even that they do fairly poorly. Any attempts to represent our programs as anything more than marked up text end up crashing against the rocks of Change Over Time concerns.
I don't know if there's a mathematical proof out there that says it's intractable, but most of us behave as if it is. But my suspicions lie elsewhere. I think the kind of intelligence that is comfortable working in this space is simply not attracted to the field in sufficient numbers, and so the Venn Diagram of people possessing the right skills and sufficient motivation is just too small.
In the late days of UML I realized that I was spending most of my time fixing path finding problems. I had a running joke with a coworker about hiring a bunch of game designers to fix this stuff. I don't know how much money you have to pay game AI designers to work enthusiastically on diagrams but I am sure I'd get sticker shock.
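The mismatch between line diffs and structured data can be sketched quickly (assuming Python; the JSON shape here is made up):

```python
import difflib
import json

# Reordering two keys changes nothing semantically, but a
# line-oriented diff reports several changed lines.
before = json.dumps({"mode": "line", "p1": {"x": 34, "y": 44}}, indent=2)
after = json.dumps({"p1": {"x": 34, "y": 44}, "mode": "line"}, indent=2)

diff = list(difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm=""))
print("\n".join(diff))
# The two documents are equal as data, yet the diff is noisy.
```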
The nice thing about Office is that at least the file format is an open standard and can be opened in myriad programs as is or converted to html, rtf, markdown etc.
> the file format is an open standard and can be opened in myriad programs as is or converted to html, rtf, markdown etc.
Have you seen that work well in practice? I haven't - if you want to reliably open Microsoft Office docs, the only answer is Microsoft Office. Just because it's open doesn't mean it's coherent, consistent, or possible to implement.
Open in the sense that the standard literally states it "will handle XYZ case as the case is handled in MS Office 2000". So to implement the standard one needs a copy of Office 2000 to see how it handles specific cases. That's black-box testing, as the Office source code is not available.
Since everybody talks about the format for figures, I feel the need to mention the .fig format [0] that was made exactly for that use case (human-editable textual description of simple line figures). It is also the native file format of the venerable Xfig figure editor [1].
And pic was slightly extended and modernized by the SQLite author(s?) to become https://pikchr.org, for use in SQLite docs. (All the SQL syntax diagrams are drawn with pikchr, IIUC.)
If you want to embed pikchr SVGs in your statically generated HTML content, you may find https://soupault.app/ useful.
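For flavor, a minimal pikchr source looks something like this (an untested sketch based on the pikchr documentation):

```
arrow; box "source"; arrow; circle "SVG"
```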
Thank you for the link to pikchr--anything by the SQLite folks is something to check out.
Not having a diamond object-class for decisions seems to be a large miss. [0] Also, unlike Mermaid [1], three downward arrows to boxes aren't spaced from one another; the boxes wind up overwriting one another. This gives pikchr a procedural/syntactical feel rather than an object/semantic one.
I made a few pikchr diagrams and liked the experience, though it would be good if there were a lossless WYSIWYG editor that could create and edit them for quick and dirty creation.
There was a Changelog podcast episode with Richard Hipp (sqlite author) where he threw out the great idea that it would be nice if there were renderers in popular markdown engines that could display inline pikchr blocks.
https://changelog.com/podcast/454
The fig format isn’t very human-editable. It has a few nice features that SVG doesn’t (e.g. concise pattern and arrow head styling, which have to be done separately and more explicitly in SVG, though on the other hand SVG gives you more flexibility), but its actual syntax is bluntly horrible. It’s mostly just a sequence of space-separated numbers, where some are enumerated values, some integer coordinates, some float coordinates, some integer sizes, some still other stuff; where some values change the syntax for subsequent lines; and there are two different sets of units intermingled, one defined in the document and the other in 80ths of an inch (… though it might just get thinned to 160ths of an inch, and is also defined as one screen pixel, whatever that means, given that it came from before the high-DPI era); and…
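For concreteness, the polyline record under discussion looks roughly like this (a hand-reconstruction; real files begin with a multi-line header, and the arrow-style and coordinate values here are illustrative):

```
2 1 0 3 -1 0 0 0 -1 0.000 0 0 -1 1 0 2
	1 1 1.00 60.00 120.00
	 300 300 5925 3150
```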
Just to explain the first line: 2: polyline or similar. 1: actually polyline. 0: solid line. 3: line is 3⁄80in thick. -1: default pen colour. 0: white fill (… but ignored because of “no fill” later). 0: depth (z-index). 0: unused. -1: no fill. 0.000: to do with dashes (not used because of “solid line” earlier). 0: mitre join. 0: butt cap. -1: radius (not used as this is an actual polyline). 1: forward arrow (so the next line must now define the arrow style). 0: no backward arrow. 2: two points (so on the third line it expects four numbers—x₁, y₁, x₂, y₂).
An inexact SVG paraphrase (most inexact because of the two different types of units, which SVG doesn’t fully or consistently have—the closest is vector-effect="non-scaling-stroke"; also I will note that a number of the properties are the defaults and would thus be likely to be omitted in real life):
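(The original snippet seems to have been lost; an approximate reconstruction, with illustrative coordinates and marker id:)

```
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 6000 3600">
  <path d="M 300,300 L 5925,3150" fill="none" stroke="black" stroke-width="3"
        stroke-linejoin="miter" stroke-linecap="butt" marker-end="url(#arrow)"/>
</svg>
```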
This is vastly more human-editable. I write SVG by hand regularly, and edit tool-assisted SVG by hand even more regularly. I could not do the same for Fig in a regular text editor—I’d want tooling that labelled every field and enumeration value.
> What if we pretend we can go back in time, and create a transparent, gracefully degrading plain-text format that is trivial to build viewers and editors for?
Actually, the idea behind SGML's SHORTREF feature is exactly that: you can define character sequences consisting of newlines, tabs, or any sequence of characters not already interpreted as markup delimiters, and have SGML replace those with e.g. a start- or end-element tag or whatever, in a context-dependent way, so you can cover most of markdown, or CSV, or JSON. And we have to go only as far back in time as 1986, when SGML was published as an ISO standard after a decade or so of reaching consensus. It's not exactly trivial, though, but not that hard either by today's standards.
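A sketch of what that can look like in a DTD (adapted from the classic SHORTREF example of mapping "*" to emphasis; an untested sketch, not taken from any particular spec example):

```
<!ELEMENT doc O O (p+)>
<!ELEMENT p   - O (#PCDATA|em)*>
<!ELEMENT em  - - (#PCDATA)>
<!ENTITY startem STARTTAG "em">
<!ENTITY endem   ENDTAG   "em">
<!-- inside p, "*" opens emphasis; inside em, "*" closes it -->
<!SHORTREF pmap  "*" startem>
<!SHORTREF emmap "*" endem>
<!USEMAP pmap p>
<!USEMAP emmap em>
```

With that in force, a *starred* word in the source parses as an em element, no asterisk tags required.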
Beyond SGML, what's really puzzling to me is that we keep on throwing mud against a wall to see what sticks, and every comp.sci. generation seems to reinvent everything from scratch without consideration of what came before. It's especially paradoxical considering SGML's legacy as a text format is directly in front of you as HTML, and materials on SGML and HTML are broadly accessible. Why invent text formats for the long haul if nobody bothers reading?
> Beyond SGML, what's really puzzling to me is that we keep on throwing mud against a wall to see what sticks, and every comp.sci. generation seems to reinvent everything from scratch without consideration of what came before.
I'm old enough to remember the height of SGML's popularity.
I used to work in the office next to someone who was writing an SGML parser. He was an incredibly skilled developer; he was also the editor of the spec for a major programming language. And he loathed SGML.
The SGML standard was a $99 doorstop of a book. Parsing SGML (and DTDs, and capability declarations) was basically impossible using standard parsing theory. Instead, you needed hundreds of hand-built parsing routines. And even when you parsed some SGML, you were left with the frustrating fact that there was no standard "document information set" (if I recall the name correctly) that told you what your parser should return. One of our favorite lunch-time rants was "Someone should define a stripped-down version of SGML that doesn't require a 2.3 MB C++ parsing library (because our users only have 8MB of RAM, and they're using 12 MB of that to run Netscape and an OS)."
Then one day, my boss walks in and says, "Hey, our corporate W3 representative just got a copy of this. You'll like it." And he dropped an early draft of XML on my desk. The spec was 30 pages long, and easy to implement.
Within a month, pretty much everyone who could abandon SGML had done so. A few legacy vendors continued supplying enterprise SGML users, but every other SGML developer pivoted to the web and XML. It was a chance to trade frustrating niche tools for a hot web market.
SGML occupied a weird middle space: It was not only programmer hostile, but its creator bragged about how it was programmer hostile, because users came first. Except SGML was still far too technical for any mainstream users, who normally wanted GUI tools.
SGML was not mysteriously forgotten. It was buried at a crossroads with a stake through its heart. And plenty of programmers who worked with SGML volunteered to shovel.
> SGML was not mysteriously forgotten. It was buried at a crossroads with a stake through its heart.
> SGML developer pivoted to the web and XML
Appreciate your response, but with respect, that's the kind of hearsay
criticism I was almost expecting. First, SGML hasn't gone anywhere. XML is just a proper subset of SGML, written by the same people who developed SGML and updated SGML to allow DTD-less markup aka XML. Next, XML just didn't make it to the web, which was its entire raison d'être, as in, chapter one, sentence one of the XML spec. It's true that XML has been used for almost everything and anything except its actual purpose (e.g. serialization of service payloads when those are consumed by programs rather than pasted into markup, or config file formats). Finally, SGML is the only game in town able to parse HTML based on an international standard. Don't we want a perspective to get us out of the dying web, or at least preserve what's worth preserving of the extant web? Or are you ok with monopolists having captured the web into craptastic browser tech? Now that's something overly complex and user-hostile, and where 12MB doesn't get you very far indeed.
SGML complexity is a myth. You can code a parser in about one man-year, with optional features (SHORTREF, LINK, web bindings) taking another half (source: have done it [1]).
I will concede that working off the SGML spec alone is ... laborious, and studying it is indeed a non-insignificant part of the work. The spec is only about the markup meta-language as entered by an author, and doesn't say anything about APIs etc. (except for ESIS, used for test-case suites and early Perl-based processors). Closest to a "document information set" would be HyTime "groves" and, later, DOM. Missing from the spec is a formal model for tag inference and for deciding ambiguousness of content models. That void was quickly filled by Brüggemann-Klein's "One-unambiguous grammars" paper ca. 1993, plus clarifications by James Clark (author of the SP SGML parser/toolbox [2]). XML Schema appeals to the exact same theory in its "Unique particle attribution" constraint, btw.
Now don't get me wrong: there is valid criticism for SGML, and maybe a need for a new subset and/or extension (and I might have ideas to share). Just criticism with a broad brush and false dichotomies doesn't get us anywhere.
> SGML complexity is a myth. You can code a parser in about one man-year, with optional features (SHORTREF, LINK, web bindings) taking another half (source: have done it [1]).
You’re waaaay off base if you think these figures make SGML complexity a myth.
https://bundlephobia.com/package/sgml@0.2.1-beta shows sgml.js to be about 400KB of minified code, and the bundle also contains sgml-ua.min.js which is 400KB but I think a different 400KB, and a simple line count shows sgml-ua.js and everything-else to each be around 18,000 lines excluding comments.
You can implement XML (https://www.w3.org/TR/xml/) minus doctypes in a couple of hours, no trouble. You can add doctypes (including remote fetching, assuming an HTTP library) and validation in less than the remainder of a man-day. Be generous about doing a bang-up job of it all, and you’ll still easily be done in under a week. (And in a language like JavaScript, it’d probably be a couple of thousand lines at the most, depending of course on how you choose to model it all; possibly under 500.)
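To put a rough number on that claim, a toy well-formedness checker for the element structure alone (no doctypes, entities, attribute validation, or namespaces; a sketch, not a conforming parser) fits in a couple dozen lines:

```python
import re

# Tokenize tags, comments, PIs, and text; then check that elements nest.
TOKEN = re.compile(r"<!--.*?-->|<\?.*?\?>|<[^>]+>|[^<]+", re.S)

def well_formed(doc: str) -> bool:
    stack = []
    for tok in TOKEN.findall(doc):
        if tok.startswith("<!--") or tok.startswith("<?") or not tok.startswith("<"):
            continue                     # skip comments, PIs, and text
        if tok.endswith("/>"):
            continue                     # empty-element tag
        name = re.match(r"</?\s*([^\s/>]+)", tok).group(1)
        if tok.startswith("</"):
            if not stack or stack.pop() != name:
                return False             # close tag with no matching open
        else:
            stack.append(name)
    return not stack                     # every element was closed

print(well_formed("<a><b/>text</a>"))   # True
print(well_formed("<a><b></a></b>"))    # False
```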
SGML is very complex compared with XML.
SGML has departed like Jehoram of old: to no one’s regret.
sgmljs.net also bundles a markdown parser, a DTD for W3C HTML 5.2 (a really long string), SGML declaration support, loads of diagnostic messages, a URL parser, SGML LINK with pipelining and templating, implementations for a couple of (HyTime general arch) formal system identifiers such as for date and CSV parsing/formatting, plus a couple of other things. It has essentially no external dependencies.
The XML subset of SGML is simpler (that's the entire point of it) but implementing DOCTYPEs with the subset of markup declarations supported by XML still will take you much more than "the remainder of a man-day" LOL. Yes unlike with SGML, you can use an off-the-shelf regexp lib to perform actual checking, but you still need to perform model admissibility/ambiguousness checking with halfway usable diagnostics, and require hundreds of test cases.
XML might be ok as a markup delivery format (that didn't make it on the web), but it absolutely doesn't begin to meet the bar of an authoring format. If it did, nobody would be using markdown, MediaWiki syntax, or other custom Wiki syntaxes, and your collection of helper apps, CMSs, template engines, and whatnot. All of which is included in SGML.
I suppose I should actually give more details. You probably already know most of this history.
I was basically still a kid, working for a vendor in 1997, and my job was to write a library for manipulating SGML parse trees. Today, it would be a DOM library, but the HTML DOM wasn't standardized until the fall of 1998. I think I remember hearing about HyTime, but I didn't have a copy of any of their related work. I basically had a copy of the SGML standard, and of James Clark's nsgmls parser. A very experienced and capable colleague was in charge of writing an SGML parser. Our target machines had roughly 8MB of RAM and maybe 80 MHz CPUs. Which meant they could parse SGML, but it took a noticeable amount of their resources. My frustration with implementing SGML was first-hand, although I was working on the easy bits.
SGML at the time was mostly an expensive enterprise niche, except for James Clark's excellent free tools. The most popular application I saw in the wild was probably DocBook? A lot of SGML vendors were playing up the connection between SGML and HTML in an effort to seem relevant, but their tools were notoriously expensive. None of the vendors were especially mainstream.
The arrival of XML really was a sea change. Judging from my colleagues who actually attended trade shows and who paid attention to the other players, the enterprise SGML vendors basically saw XML as an attempt to break out of their niche.
XML was almost immediately "mainstream", at least for people who wanted something other than HTML tag soup. Of course, as you point out, HTML ultimately turned away from XML. But the number of people who knew and cared about XML seemed to rapidly exceed the size of the SGML community.
> Finally, SGML is the only game in town able to parse HTML based on an international standard.
Is this actually true? Can any standards-compliant SGML parser actually parse arbitrary valid HTML 5 as SGML, and pass the test suites? https://www.w3.org/html/wg/wiki/Testing
> Also, since neither of the two authoring formats defined in this specification are applications of SGML, a validating SGML system cannot constitute a conformance checker either.
HTML 5 has two syntaxes. There's the "HTML" syntax, which is basically tag soup, but at least there's an actual parsing spec these days. There's also apparently still an XML syntax for HTML 5? But the HTML syntax is the one that's common on the web.
> Missing from the spec is a formal model for tag inference and for deciding ambiguousness of content models. That void was quickly filled by Brüggemann-Kleins "One-unambiguous grammars" paper ca 1993, plus clarifications by James Clark
As far as I can tell, this sort of complex and opaque documentation is one reason why promising standards get outcompeted by clear, simple 30-page specs. I've seen so many interesting technologies wind up as obscure footnotes in history because few companies could implement them at an acceptable cost. (Of course, today, a good open source implementation can play a much larger role.)
But to respond to your original remarks, I think this is one of the major reasons that "each comp sci generation" ignores so much of the past. The past was frequently frustrating. It sometimes had glaring drawbacks. Often it was expensive and proprietary. And sometimes the ideas were too far ahead of their time.
I fondly remember at least a dozen technologies that contained brilliant ideas, but which never succeeded in becoming popular. Usually the reasons why they failed (or at least remained obscure) are obvious in retrospect.
I've reported results of my SGML DTD for W3C HTML 5.2 vs existing test suites for HTML in [1] (the "TALK" slides). That's more than can be said for almost all HTML5 parsers, including those of browsers. You have to keep in mind that test case suites for HTML are hopelessly out of sync with the spec text; W3C's nu validator for HTML 5 doesn't even reference a version/commit (and what version should it reference anyway when WHATWG's "living standard" is just a stream of consciousness without publication dates and versions after all, written by Chrome devs?).
People (like you ;) have this idea that HTML is "tag soup" but it just isn't. It's almost all bog standard SGML, and explained in great detail in [2], obscured by WHATWG HTML having very generous recovery rules accepting almost any byte sequence as HTML.
Citing a previous comment I made when that came up about a year ago here:
HTML is basically SGML with a DTD declaring rules for tag omission/inference, empty elements, and attribute shortforms. This is what the majority of XML heads seem to struggle with, but which nevertheless is every bit as formal as XML is, XML being just a proper subset of SGML, though too small for parsing HTML.
Then HTML has a number of historical quirks including:
- the style and script elements have special rules for comments that would make older pre-CSS, pre-Javascript browsers just see markup comments and ignore those
- the URL syntax using & ampersand characters needs to be treated specially because & starts an entity reference in SGML's default concrete syntax
- the HTML 5 spec added a procedural parsing algorithm (in addition to the grammar-based presentation) that basically parses every byte sequence as HTML via fallback rules; for most intents and purposes, the language recognized by these rules, taken to the extreme, is not what's commonly understood as HTML
- WHATWG have added a number of element content rules on top of the HTML 4.01/HTML 5.0 baseline with ill-defined parsing rules (such as the content models for tables and description lists); the reason is precisely that, once Ian Hickson had distilled the HTML 4.01 DTD grammar rules into HTML 5's prose grammar presentation, WHATWG no longer used a formal basis for vocabulary extension
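On the ampersand point above: any serializer targeting the SGML/XML side has to escape bare & in attribute values. An illustrative one-liner with Python's stdlib:

```python
from xml.sax.saxutils import quoteattr

# A raw URL containing & would start an entity reference, so it must be
# written as &amp; inside an attribute value.
url = "https://example.com/search?q=sgml&lang=en"
escaped = quoteattr(url)
print(escaped)  # "https://example.com/search?q=sgml&amp;lang=en"
```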
XML isn’t hard to parse, sure, but you still have to do parsed entities and default attribute values for the (mandatory) “internal DTD subset”, don’t you? If you omit that, then sure, the rest[1] is basically in the ballpark of JSON. But for a full parser you still have to implement a full textual macro system, which means allocating given untrusted input, so now you’re basically doing a good chunk of a simple compiler frontend. (And the macro system can’t even do namespaces, because SGML.) Kind of annoying for a feature that basically nothing except MathML uses.
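For a concrete taste of what the internal subset obliges a parser to do: even Python's non-validating, expat-based ElementTree expands general entities declared there (illustrative sketch; default attribute values from ATTLIST declarations are handled along similar lines):

```python
import xml.etree.ElementTree as ET

# An entity declared in the internal DTD subset must be expanded by the
# parser wherever it is referenced in content.
doc = '<!DOCTYPE r [ <!ENTITY greet "hello"> ]><r>&greet; world</r>'
root = ET.fromstring(doc)
print(root.text)  # hello world
```

Note that the entity body is arbitrary replacement text, which is exactly the "textual macro system" complaint above.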
I'm tired of Markdown hacks to try to justify not using Microsoft Word or LibreOffice. Every kind of functionality you can imagine already exists in open formats with powerful programs designed to make it easy for you to compose rich documents. For the love of god, use them.
Yeah, those large documents don't work well with Git and other line-by-line diff systems. That doesn't mean line-by-line is better, it means we're too lazy to build a better version control system. Our tools should adapt to us, not the other way around!
> Our tools should adapt to us, not the other way around!
Agreed! I actually provided some concrete critique of Google docs in OP. Do my criticisms also apply to Microsoft Word and LibreOffice?
More broadly, I'd like to persuade you that your software choices have unanticipated secondary consequences for you. The software world today is not a tame, urban place where you can afford to select tools based just on the features they offer. It's a jungle out there. You need to take into account the entities providing the tools, their goals and incentives. It's not a choice between hacks and no hacks, there's hacks everywhere. And hacks aren't even the most important lens to view your situation through. What matters is the intention behind the hacks. Are they working for you?
The key design constraint for me lately is to depend as much as possible on software that is easy to build. Open source is necessary but not sufficient.
Back in 1973 Ivan Illich had some useful things to say about the relationship between a society and its tools: http://akkartik.name/illich.pdf
> I'm tired of Markdown hacks to try to justify not using Microsoft Word or LibreOffice. Every kind of functionality you can imagine already exists in open formats with powerful programs designed to make it easy for you to compose rich documents. For the love of god, use them.
> Yeah, those large documents don't work well with Git and other line-by-line diff systems. That doesn't mean line-by-line is better, it means we're too lazy to build a better version control system. Our tools should adapt to us, not the other way around!
I've used (and still use) all those tools. Our tools shouldn't adapt to Office applications, because Office applications can't even adapt to themselves! Anyone who has ever tried to copy/paste even between Microsoft applications knows the pain and headaches involved. There is no way you'd be able to offer clean diffs for all that garbage. The reason pure text editors are popular is because we've known this for a long time.
Almost every time I open up a Word document in LibreOffice there is something missing or wrong. That's because the Word document file does not have an open, standardized format. Having a standard format means that you can build on top of it: it enables developers to build add-ons like real-time collaboration and version control, instead of having them built into the app as in Word. A standard format also allows other document viewers, and conversion between different formats.

Microsoft have spent a lot of money buying or bullying competitors to Word, and they keep the format incompatible with other software, because the main selling advantage of Word is that it's compatible with Word itself, which has become an enterprise standard. Even Microsoft engineers think the Word format is shit. But it's kept like that for business reasons.

If you want quality tools you should choose an open format, then buy the best tools for that format. There are plenty of "what you see is what you get" editors for open formats. But many people prefer to edit human-readable source code directly in order to get precision, so the document looks exactly how they want it to look, and to use tools like Git or Mercurial to collaborate by sharing diffs. With an open standard you, the user, are in control, especially if the source code is human-readable. Not everyone wants to consume software like force-fed broiler chickens.
I think you're right. In my very limited-need case, it worked fine in the opposite direction: generating .docx files for customers who would not accept older Word format files, PDFs, and certainly not ODFs. But admittedly, this is not enough for most folks.
> Every kind of functionality you can imagine already exists in open formats with powerful programs designed to make it easy for you to compose rich documents.
Not only that, but this kind of thing has existed for ~30 years now. I remember MS Word 6.0 on Windows 3.1, 16 bits and 4MB of RAM, with large documents with significant styling, embedded Powerpoint and Excel documents, MS Draw vector art, etc.
I concur that the line-based approach most Git tooling uses is not good enough. At the very least, even for code, it should be integrated with Tree-Sitter etc. by now and be comparing changes at the AST level, not just the raw text; and when it comes to documents, we should get side-by-side diffs showing the rendered document.
But is there a market for this, besides just you and me?
I just use HTML. Personally I detest Markdown, I cannot remember how many asterisks is bold, how many is underline, how many is italic, in HTML it's just <b>, <u>, <i>, much more human-friendly.
And HTML allows inserting SVG or image data URLs in-line as plain text which isn't far off from what this is.
And if you write "polyglot" html (html written as syntactically correct xml) then you can use xpath or other tools to extract data from it (i.e. from tables).
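For example (a hedged sketch with made-up table contents): with well-formed "polyglot" markup, the stock XML toolbox really is enough to pull a table out:

```python
import xml.etree.ElementTree as ET

# Polyglot HTML is syntactically correct XML, so generic XML tooling works.
html = """<html><body>
<table>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
</body></html>"""

root = ET.fromstring(html)
# Equivalent in spirit to the XPath //tr/td
rows = [[td.text for td in tr.findall('td')] for tr in root.iter('tr')]
print(rows)  # [['alpha', '1'], ['beta', '2']]
```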
You’ve literally contradicted yourself between the two paragraphs. You’ve said that powerful tools exist, but for people who need or appreciate line-by-line diffing, they should adapt to the state of the tools as they exist today.
I would argue that we’ve been too lazy to create powerful programs with open formats that do work well with Git and other line-by-line diff systems. Version control as functionality is a solved problem (yes, it’s got lots of usability problems and should probably make it impossible for a user to lose work, but those are separate from this discussion). Creating a hack that does in fact adapt the text tools to us (defining us as, like the author, someone who wants to use plain text) is exactly what they’ve done.
Aren't these hacks the act of adapting our tools to us (at least in an experimental sense)?
Actually, I'm not clear on the context of your complaint. If it's about Word and LibreOffice not playing well with text-based diff typically used by version control systems, then maybe the problem is with the file formats used by Word and LibreOffice.
I don't write many large documents, but I'll use MS Office at work for such things. They don't seem to care about version control for presentations and documentation.
For personal notes, I use Obsidian and Fossil SCM. For blog posts (which I'd like to get back to doing) I used Markdown, Hugo, and GitHub.
> If it's about Word and LibreOffice not playing well with text-based diff typically used by version control systems, then maybe the problem is with the file formats used by Word and LibreOffice.
Text-based diff requires you to make a number of assumptions:
1. I'm writing a plain text file
2. The text file is broken up into multiple lines, each with distinct statements/expressions
Compare that to composing a multi-media file:
1. I'm writing a document with plain text, graphics, possibly video, possibly sound, together on one page
2. The text may not be distinct statements/expressions on different lines
3. The media may change incrementally or wholly
4. The layout on the page may change incrementally or wholly
Some multimedia document formats have a text form (ODF is zipped XML), but it still has different assumptions from the text-based diff. An intelligent program understands when the formatting has only changed slightly, without having line-by-line assumptions. Word & LibreOffice already have such version control/diffing built in. So the technology already exists, but we're not applying it in a way that makes it convenient to store these documents with our other versioned/diffed documents.
What we need is a way to combine different methods of diffing/versioning different kinds of files within one contextual view. The one view would allow you to view code diffs/Word diffs/image diffs/etc from one interface, and push/pull/share them to some kind of remote repository. Like Artifactory, but less terrible.
I mean, there are plenty of plain text document formats that don't suck. Asciidoctor is my favorite. But there's a lot of middle ground between markdown, which sucks, and word, which also sucks.
Perhaps someone ought to make an Ask HN about source control that works with prose. What can give the best diffs for soft-wrapped paragraphs using solely the default command?
The use of LOVE is the biggest limiter IMHO, simple stuff is missing because a game engine does not really cater to text editing in any meaningful way. The copy and paste keybindings are wrong on Mac for instance. Selection behaviour is a bit weird and non-standard. I am not even sure you can paste from an external text source etc. Everything clearly has had to be custom made.
I appreciate your comments, Tom. Particularly that you actually installed it and tried it out before coming to a conclusion.
I'm not sure what you mean by "external text source". Ctrl-v does read from the system clipboard regardless of who wrote to it. Did I understand you right?
I think what Tom might mean is that because Ctrl-v is not the standard system paste shortcut on Mac (it is Cmd-v), the editor is not inheriting default functionality here.
If the editor does in fact implement its own copy and paste, there is a possibility that it would not use the system clipboard (although it might, the NSPasteboard interface enables this).
Yes, you're right, I can get external copy and paste working, I just have to switch key bindings mid-flight. Still, my main point is that Lua and LÖVE conspire to make development of this kind of software difficult, IMHO. It's a suboptimal tool choice for the job; it's making the journey longer. The choice of using JSON over SVG is basically because you have no SVG facilities at hand. Please continue your good work on a better substrate; I mean this as constructively as possible.
It's very interesting that the article has resonated so well. I am fairly sure it means 99% of people have not tried it. People will not try it because the install process is painful, which is another issue with Lua/LÖVE.
The pronunciation of the letter v is one of the few cases of ambiguity in German orthography. The German language normally uses the letter "f" to indicate the sound /f/ (as used in the English word fight) and "w" to indicate the sound /v/ (as in victory). However, the letter "v" does occur in a large number of German words, where its pronunciation is /f/ in some words but /v/ in others.
A few years ago, I created Markdown Notes with the same goal in mind. Markdown with LaTeX formulas and graphs was a powerful tool.
However I eventually left university and stopped using it. Then I started using paper and pen to great effect. I can draw in seconds what I can type in minutes. Now I use Notability on an iPad Mini. Same thing but with better editing tools.
Frankly, I just like pencil notes a lot more than text. It just works better for free form thinking. Immediate and short-term convenience is more important than having my notes in text files.
I use something similar for my blog (I don't have a WYSIWYG editor for the widgets though, that's pretty neat): The posts are written in markdown, but then there's a few "widgets" that are basically markdown code blocks with a special "language" tag set that will render a React component with the text content as a input prop. I also use pandoc-flavored markdown (https://pandoc.org/) which already has many features that make formatting a bit more complex things easier (tables, latex math, ...)
[...] it will be much slower:
```barchart
title: Searching in 65 pdfs with 93 slides each
series: run time (seconds, lower is better)
data:
pdfgrep: 19.16
rga (first run): 2.95
rga (subsequent runs): 0.092
```
becomes a colored bar chart with three bars. So the file is both kinda readable in a plain text editor, in a browser with js disabled (with React server side rendering), and in the richest version with JS enabled.
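A sketch of what the parsing side of such a widget might look like (a hypothetical Python stand-in for the React prop handling; the two-section key/value layout is taken from the example above):

```python
# Split the fenced "barchart" body into top-level metadata and a
# label -> value series, the shape a chart component might consume.
body = """title: Searching in 65 pdfs with 93 slides each
series: run time (seconds, lower is better)
data:
pdfgrep: 19.16
rga (first run): 2.95
rga (subsequent runs): 0.092"""

meta, data = {}, {}
in_data = False
for line in body.splitlines():
    if line.strip() == 'data:':
        in_data = True
        continue
    key, _, value = line.partition(':')  # split at the first colon only
    if in_data:
        data[key.strip()] = float(value)
    else:
        meta[key.strip()] = value.strip()

print(meta['title'])
print(data)
```

Splitting at the first colon matters because labels like "rga (first run)" must survive intact.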
SVG is very complex. This format is very simple because it is not supposed to solve a very complex problem. That makes it easier to write other viewers and editors.
You can write simple svg but if you are interpreting all of svg it is not simple to write that code. Mozilla hasn’t successfully implemented it all properly for example.
That's true, but they're explicitly suggesting SVG path syntax which is very simple. The downside is it's also very limited, and might well prove too limited for the author going forward.
> Mozilla hasn’t successfully implemented it all properly for example.
interesting. I didn't know that. I tried googling to find out what parts are not implemented but couldn't find anything (admittedly not a long search).
A subset of SVG with just basic lines and shapes is fairly simple, and in fact is very similar to what OP is doing. It's only when you go into colours, fonts, animations, and stuff like that that things get progressively more complex.
But then it might easily become unclear exactly which subset of SVG is supported. I am personally leery of implementing subsets of other formats. I find it better to define entirely new formats, at least for simple things like this. To render this "lines" format, you could of course easily write a translator that turns it into SVG. It's mostly the other direction that I would be hesitant about.
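The translator direction is indeed a few lines. A minimal sketch (the record format is the JSON example from the article; the stroke attribute is my own assumption):

```python
import json

# Turn one of the editor's {"mode": "line", ...} records into an SVG element.
def to_svg_element(record):
    shape = json.loads(record)
    if shape["mode"] == "line":
        p1, p2 = shape["p1"], shape["p2"]
        return (f'<line x1="{p1["x"]}" y1="{p1["y"]}" '
                f'x2="{p2["x"]}" y2="{p2["y"]}" stroke="black"/>')
    raise ValueError(f'unsupported mode: {shape["mode"]}')

svg = to_svg_element('{"p2":{"x":141,"y":85},"mode":"line","p1":{"x":34,"y":44}}')
print(svg)  # <line x1="34" y1="44" x2="141" y2="85" stroke="black"/>
```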
Now you replaced a trivial file format that (from a quick glance at the code) needed about ~35 lines of easily readable and self-contained Lua code to parse with an external dependency that would be much larger and harder to follow, either having (at least) an XML parser as its own dependency or implementing its own XML parsing, as well as leaving you at the mercy of its developers. Also, unless you are using some highly popular library, you may end up with an abandoned dependency.
Examples of both are at [0] (C++ based parser, you'd also need to write some bindings for lua) and [1] (Lua based parser for a subset of the format, abandoned for almost a decade).
There are times when using an external dependency might be a good idea, but a text-based file format that describes lines and can be implemented in a few lines of code is not one.
Hmmm. There's actually a lot to it, and you don't want to end up in a situation where documents don't render properly unless you're using a really complete viewer.
Some people appreciate simplicity as a virtue - simplicity of implementation as well. To such people, using a third party library to conceal an incredibly complicated implementation is still in principle ugly. I don't know if that is what motivated the author here, but I certainly know lots of people who would cite that as their motivation.
In another thread[1] I've mentioned the SVG Tiny profile[2], which has been defined almost twenty years ago and is apparently already used in some cases where the full power of SVG is not required.
I've often just wanted a "simple SVG" subset with just basics, as you describe, omitting all the fancy stuff that tends to lead to surprises and incompatibilities. For example, as the format for a very simple icon editor.
Do you know if there is some formal way to 'declare' such a subset (something to do with XML/namespaces maybe?), or if there's any existing standard "simple SVG" format that someone has already defined? Obviously I could just informally stick to a subset by avoiding whatever features I don't want to use, but I think it would be useful to have a formally defined subset, and to be able to scan an SVG to detect if it's within the subset or not.
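I don't know of a formal declaration mechanism beyond the W3C profiles (SVG Tiny/Basic carry their own DOCTYPEs), but the scanning part is straightforward in practice. A sketch of a whitelist check (the ALLOWED set here is my own arbitrary choice, not any standard subset):

```python
import xml.etree.ElementTree as ET

SVG_NS = 'http://www.w3.org/2000/svg'
# Hypothetical "simple SVG" element whitelist; adjust to the subset you want.
ALLOWED = {'svg', 'g', 'line', 'circle', 'rect', 'path'}

def within_subset(svg_text):
    root = ET.fromstring(svg_text)
    # ElementTree prefixes tags with {namespace}; strip it before comparing.
    return all(el.tag.split('}')[-1] in ALLOWED for el in root.iter())

simple = '<svg xmlns="%s"><line x1="0" y1="0" x2="9" y2="9"/></svg>' % SVG_NS
fancy = '<svg xmlns="%s"><text>hi</text></svg>' % SVG_NS
print(within_subset(simple), within_subset(fancy))  # True False
```

A fuller checker would also whitelist attributes, since that's where most of SVG's complexity (styling, animation, scripting) hides.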
Brand Indicators for Message Identification, part anti-phishing feature, part marketing gimmick. Essentially favicons for email. The Wikipedia article does the best job of summarizing it without marketing industry fluff:
SVG in its entirety is of considerable complexity but its path data attribute syntax is fairly simple, well documented and optimized to be forgiving and compressed while still using nothing but a few ASCII letters, digits and plus and minus signs
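A sketch of how little machinery that path syntax needs (illustrative; real path data also allows exponents and some compressed forms this regex glosses over):

```python
import re

# Path data is single-letter commands followed by numeric arguments,
# separated by whitespace or commas.
CMDS = 'MmLlHhVvZzCcSsQqTtAa'

def tokenize_path(d):
    out = []
    for cmd, args in re.findall(r'([%s])([^%s]*)' % (CMDS, CMDS), d):
        nums = [float(n) for n in re.findall(r'-?\d*\.?\d+', args)]
        out.append((cmd, nums))
    return out

print(tokenize_path('M 34 44 L 141,85 Z'))
# [('M', [34.0, 44.0]), ('L', [141.0, 85.0]), ('Z', [])]
```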
I have, on occasion, made plain text files with simple diagrams in them using unicode box drawing characters. These copy the advantage of markdown of being readable in a plain text editor. They can also be written in a plain text editor, if a simple macro is written to enable drawing boxes and lines. A program could be written to convert such a diagram to html, to enable more elegant display.
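The macro mentioned above can be tiny. A sketch of the box-drawing half (Python; box-drawing characters from the U+2500 range):

```python
# Draw a labelled box using Unicode box-drawing characters,
# the kind of diagram readable in any plain text editor.
def box(label):
    bar = '─' * (len(label) + 2)
    return f'┌{bar}┐\n│ {label} │\n└{bar}┘'

print(box('parser'))
# ┌────────┐
# │ parser │
# └────────┘
```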
The original markdown spec of course included the ability to interleave html, but I always interpreted this as a compromise, a recognition that occasionally, features beyond what is otherwise supported might be necessary. The idea of writing markdown in a WYSIWYG editor, which outputs to a human unreadable format - not just human unreadable like html, but in fact nonsense to any human that reads it - encoded in a generic object format, as an extension to markdown - and to call the result plain text - is quite ridiculous to me. At some point we have jumped the shark, with respect to the purpose of markdown. There is now little distinction from existing document formats, like, say, rtf, which is "plain text" in the same sense as this is.
* * *
I had thought about calling my extended markdown, with box drawings and the like, sharkdown, for which I at one point even drew a cute logo. I got no farther than this.
I think once I read that Temple OS' "text editor" works kind of the same way - regular plain text but lets you incorporate things like 3d gradient meshes
This is a great idea. I would love for more plain text editors to natively support common text-to-diagram solutions like PlantUML etc. PlantUML notation is somewhat readable even without rendering it.
There's emacs for that use case. Org mode with HTML, SVG, ditaa and PlantUML is well suited for almost any use case I could think of.
Why reinvent the wheel again and again?
I found it significant for readability but have only seen it so far in GitHub gists. I use vim and sometimes Sublime Text, so a plugin could probably insert a faint line per row, similar to how it is done to indicate character length.
This is cool. I could see myself adding support for this in Tentacle Typer. It's a text editor with a similar just-save-plain-text philosophy; the main difference is that it has more tentacles. Giving a tentacle a paint brush for mid-text doodling would be fun.
i think i'd rather draw directly on my screen than use a whole new text editor for this. what if i'm typing in vim and want to just draw some vectors on the screen? you could use a vim plugin to associate the viewport of your pad in vim with the svg data.
Now we should build a converter for turning vector files from the ePaper device reMarkable.com into this file format via OCR + line drawings. Would be nice for sharing, archiving and further processing.
It is said that John McCarthy, the father of LISP, lamented the W3C's choice of SGML as the basis for HTML: « An environment where the markup, styling and scripting is all s-expression based would be nice. » S-expressions can be an answer to writing "plain text with lines", for instance:
http://lambdaway.free.fr/lambdawalks/?view=textanddraw
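To make the suggestion concrete, the article's line record could be spelled as an s-expression such as (line (34 44) (141 85)), and the reader for that is small. A sketch (the encoding itself is my own hypothetical, not the lambdaway one):

```python
import re

# Tiny s-expression reader: parentheses and whitespace-separated atoms;
# integer-looking atoms become ints, everything else stays a string.
def read(src):
    tokens = re.findall(r'\(|\)|[^\s()]+', src)
    def walk(it):
        out = []
        for tok in it:
            if tok == '(':
                out.append(walk(it))
            elif tok == ')':
                return out
            else:
                out.append(int(tok) if tok.lstrip('-').isdigit() else tok)
        return out
    return walk(iter(tokens))[0]

print(read('(line (34 44) (141 85))'))  # ['line', [34, 44], [141, 85]]
```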
This looks not too different than modern versions of the block-based blog editors in Wordpress or Ghost where you can alternate Markdown blocks with rich elements.
Ghost persists the data using the Mobiledoc format, which is JSON, not rendered HTML. This allows flexibility for display.
This quite nicely matches the use-case that causes me to go to a notepad - mixing text and little simple diagrams or sketches. I'm gonna have to give it a try :)
Reminds me of using pic (e.g. [0]) in the mid 80's to typeset the images in my thesis (which was written using vi). It was possible to programmatically generate complex images with repeating elements.
I assume the author may be referring to the projects at http://akkartik.name/code, notably Mu - though not 100% sure, doesn't seem to really extend to the "computer" part at a glance.
It would be great if this were available as a plugin for Visual Studio and Qt Creator. Store the drawing commands as special comments, and display them in real time in the code editor.
Instead of trying to hack drawings into plain text, why not add enough Unicode drawing symbols to create functional, basic line drawings using just Unicode? It seems doable:
Currently Unicode's box drawing range (U+2500 - U+257F) [0] has 128 symbols, but the resolution seems much too low and the angles and options too limited. Visually, you get one line segment per 'monospace' (specifically, each visual segment extends from the center of one monospace to the center of an adjoining one), and generally it's straight and only at the four 90 deg angles.
More useful is the Block Elements range (U+2580 - U+259F) [1]. I don't know the intent of it, but it effectively subdivides each character space into 4 cells that could be used like pixels. Due to Unicode's effectively unlimited capacity, we could provide something much more powerful:
Simply divide each character space into a grid, as with Block Elements (I think font designers already do this, but don't recall the details well), let's say 12x16 = 192 cells (a somewhat arbitrary number; if that ratio is wrong or there's an existing standard, it can be adjusted). We might need higher resolution: Think of each cell as a pixel: A text character usually is much smaller than 1 inch, but zoomed into 1 inch/character you would have only ~12 dpi. What resolution is needed for basic line drawing?
Displaying a line would require characters to overlay each other, which is built into Unicode and necessary for many languages, though I don't know if there's currently a limit to how many characters can be overlaid. We would need up to 192, or more if we adopt a higher resolution.
We could add lower-resolution symbols for manual drawing - the Block Elements range does something similar - otherwise, manual drawing would be tedious. A solution with 192 cells is designed to be used programmatically. With a little structure, we could organize the code points so that the developer can predict the location of each code point without looking them up, e.g., a predictable offset from the first codepoint.
Also, I wonder if drawing is within Unicode's current scope. The Box Drawing range says [0]: "All of these characters are intended for compatibility with character cell graphic sets in use prior to 1990.". That implies a disinterest in its use or development, though the Block Element range [1] includes no such disclaimer, and scopes and interests can change. Also, are there other Unicode ranges that cover graphics besides the Box Drawing range and Block Element range?
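The Block Elements range already demonstrates the 4-cells-per-character idea in miniature: its quadrant characters give a 2x2 "pixel" grid per cell. A sketch of rendering a tiny bitmap with them (the higher-resolution 192-cell scheme above would work the same way, just with more code points):

```python
# Quadrant block characters indexed by bits (upper-left, upper-right,
# lower-left, lower-right), so a 2x2 chunk of bitmap maps to one character.
QUAD = ' ▗▖▄▝▐▞▟▘▚▌▙▀▜▛█'

def render(bitmap):
    lines = []
    for y in range(0, len(bitmap), 2):
        row = bitmap[y]
        below = bitmap[y + 1] if y + 1 < len(bitmap) else [0] * len(row)
        chars = []
        for x in range(0, len(row), 2):
            ul, ur = row[x], row[x + 1] if x + 1 < len(row) else 0
            ll, lr = below[x], below[x + 1] if x + 1 < len(below) else 0
            chars.append(QUAD[ul * 8 + ur * 4 + ll * 2 + lr])
        lines.append(''.join(chars))
    return '\n'.join(lines)

# A 4x4 hollow square becomes a 2x2 character drawing.
print(render([[1, 1, 1, 1],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [1, 1, 1, 1]]))
# ▛▜
# ▙▟
```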
Hey @dang: Why would this comment start about 5 threads down the page, and then drop another few threads within 2 minutes? It's not the end of the world, but maybe there's a bug? It doesn't seem to fit the model of a spammy comment, at least not to my eye.
Edit: Oddly, my next comment started at the top, as comments usually do:
If everybody used things that are "already better", then nobody would ever start something new.
Even more, if everybody had to have the goal of arriving at "something better" at some time in the future, most projects wouldn't get started. It's super valid to do something for your own edification and learning purposes.
Another good motivation is to explore new ways to try to write the same (or even less functional) thing but in a better way.
They are open source, not closed source, that's the main reason I said "reinventing the wheel".
xournal++ supports typing text, drawing with stroke recognition, supports touch displays, stylus, mouse, equations and also, pdf files can be used as background.
I repeat: When a better "approach" is available, why go for an inferior approach. I am comparing the ways things are being done, not just comparing final products.
The very first line of the linked article gets at why xournal++ is not suitable for the author of this:
> I like plain text.
xournal++ uses XML. While I'm not likely to use this editor much, the use of plain text for all but the illustrations makes me at least curious. Xournal++ is totally uninteresting to me - if I need a document format I have plenty of choices with broader support. That's not a criticism. I'm sure it's great. But people have different needs.
I disagree with your assumption that this approach is better than the xournal++ approach.
Do you really believe this is a "better way" than xournal++? If it had been a better way than xournal++, I would not have commented about xournal++ in the first place.
That isn't my assumption. My assumption is they don't have to be "better than". If they feel they got something out of it, that is enough justification. What is wrong with you?
Welcome to Hacker News. The community is generally very open to new ideas, and loves to see personal projects, but they are not exactly "advertisements" and there is no reason to think of a project that is somewhat similar to your own as being "in competition". That kind of confrontational attitude is disfavored here.
Xournal++ looks like a very nice piece of software. Thank you for sharing; I had never seen it before. It is worth submitting it separately so others can see it too (perhaps not today, unless you want people to make the connection with your comments here -- I don't think it would go over well).