Nghttp3 1.0.0 – HTTP/3 library written in C (nghttp2.org)
145 points by neustradamus on Oct 22, 2023 | 116 comments



Very cool! And congrats on reaching version 1.0.0!

Other nice HTTP/3 libraries include H2O: https://h2o.examp1e.net as well as Facebook/Meta ProxyGen: https://github.com/facebook/proxygen


>Other nice HTTP/3 libraries include H2O:

But H2O hasn't been updated for years. 2.3 Beta 2 was released in 2019.


It's constantly updated https://github.com/h2o/h2o/commits/master

Also see: "switch to a new release model" https://github.com/h2o/h2o/issues/3230

"Every commit to the master branch is considered ready for production"


Oh, that is great. I really wish this was better known though. They could have at least made a version 2.3 or 3.0 release with that announcement.


There are constant updates; its author (Kazuho Oku) is working for Fastly.

H2O is absolutely battle-tested -- its h2 and h3 implementations perform amazingly well.


Funny, I saw this on HN while I was taking a break from testing Caddy, which has HTTP/3 support, even enabled by default. I suppose it's based on Go's implementation.

I'm using Apache at the moment and it's working great. HTTP/3 isn't even on the horizon for Apache, and that is the reason why I'm trying out Caddy/Nginx/OpenLiteSpeed in the first place.

Would nghttp3 eventually bring HTTP/3 to Apache?


> I'm using Apache at the moment and it's working great.

You know, Apache is pretty decent.

I'm actually surprised that they added ACME functionality with mod_md before Nginx, so now you can get Let's Encrypt certs without the need for certbot: https://httpd.apache.org/docs/2.4/mod/mod_md.html

If you turn off .htaccess the performance is okay as well (closer to how Nginx has nginx.conf), their mod_security performs pretty well and I'm not sure of that many alternatives to mod_auth_openidc elsewhere.

Of course, Nginx, Caddy and other web servers are really nice too!


Yeah, I've built up quite a lot of muscle memory having used Apache for about a decade too, and the configuration felt quite intuitive, but Caddy's configuration is such a breath of fresh air and I'm really enjoying it so far. More than Apache's, to be honest.


Actually, the Go stdlib doesn't support it yet. Here's the relevant issue, including some perspective from Caddy's lead developer:

https://github.com/golang/go/issues/32204


I’m a big fan of nginx. Incredibly fast and reliable. But it doesn’t support HTTP/3 out of the box using the distributions available from system package managers like apt.


There's always the fork some nginx devs just made

https://angie.software/


HTTP/3 is supported in Nginx 1.25, available from nginx.org's repo. I tried it a couple of days ago, and it worked great.

None of the Nginx builds available out there seemed to fit my use cases perfectly, so I resolved to customize it (https://gist.github.com/Ayesh/7c7a1bf08d37497aef8672644a7fd5...) to my exact use cases.

I gave up on Nginx because it was impossible, or at least incredibly difficult, to figure out how to append/modify HTTP headers. I wanted to send CSP headers if the reverse-proxied application did not set them on its own, and I wasted a whole fruitless day trying to do it.
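
For what it's worth, the "add the header only if the upstream didn't set it" logic is easy to state outside of Nginx. A minimal sketch as a Python WSGI middleware, not an Nginx configuration; the name `default_csp` and the policy string are made up for illustration:

```python
def default_csp(app, policy="default-src 'self'"):
    """Wrap a WSGI app; add a Content-Security-Policy header only if it is missing."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Header names are case-insensitive, so compare them in lowercase.
            names = {name.lower() for name, _ in headers}
            if "content-security-policy" not in names:
                headers = list(headers) + [("Content-Security-Policy", policy)]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped
```

A response that already carries its own CSP header passes through unchanged; anything else gets the default appended.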


Sorry for what's probably a dumb question, but why do some of these start with 'ng'? Like nginx, and now this?


ng is a common shortcut for "Next Generation" (ex: syslog-ng…)


Unless in Japan where it means "no good" hence :NG emoji:


For me it means 'not gonna'.


O-oh, thank you! :)


Or Star Trek TNG.


It's supposedly meant as an upgrade to nghttp2, which brought HTTP/2 and is used by e.g. Apache httpd, though I have no idea whether it's by the same authors, has API backward-compat, etc. At a glance the linked doc doesn't answer these and other questions either, such as license terms and/or a git link. Btw the Read The Docs menu seems broken on mobile (displays a fold-out menu with details when that should probably be hidden).


Engine.

https://en.wikipedia.org/wiki/Nginx

Nginx (pronounced "engine x"...


I believe it's "ngin" that makes it "engine"-x, not just "ng."


How do you pronounce "nghttp" then? "enge http"?


Next gen http


Angie('s) HTTP?


I heard that back in the day it was "Nighty".


Hence lighttpd being pronounced "lighty", I suppose.


Ah, that's a good point too! :) And glad to know that my pronunciation EN-JINX has a more plausible and informative alternative.


How does it compare to https://github.com/cloudflare/quiche?


Can this be turned into a C extension for Ruby, or will it be too messy?


Just skimmed over the API reference page; it looks usable from Ruby.


How about using a memory-safe HTTP/3 implementation which is also callable from C or any other language? https://github.com/cloudflare/quiche


Why does discussion always have to devolve into this? This isn't an advertisement asking brand-new greenfield projects to base their work on C libraries. This work is done because it's an update to nghttp2 to add support for HTTP/3. The reason you would use it is because you're already using nghttp2 and want to know about the added features. The list of tooling using nghttp2 is pretty extensive, including at bare minimum the Apache web server and curl, which collectively likely initiate at least a plurality of the Internet's HTTP traffic because of their age and ubiquity.

Why are these tools written in C? Because they're old. nghttp2 was released in 2009, before Go was released and before Rust was even a twinkle in its eventual creators' eyes. Adding comments like this any time a feature announcement to a C library is made comes across like you're saying nobody should ever update tools that are already in use. Should Linux just stop doing work until the kernel is rewritten from the ground up in Rust? Should all longstanding software take a five-year pause to do total from-scratch rewrites? Can you see why that is impractical? If you can, then you can see why libraries like nghttp2 exist and continue to add features.


The title of this post puts emphasis on "written in C", making me wonder when this would ever be a desirable feature, given that more secure implementations are available, and can be integrated into old C projects just as easily.

No need to rewrite everything from the ground up: https://github.com/cloudflare/quiche#curl


Just stop. It is not a requirement for you to wage holy wars against languages you dislike. In fact, as a form of "ideological battle," it's against the Guidelines.



quiche may be great - I haven't tried it - but it definitely isn't memory safe. It uses unsafe constructs liberally.


What does “liberally” mean here? Because without even looking at the code, I bet less than 5% of the code in quiche is unsafe. It’s probably about 2% or something. And remember, unsafe rust is still borrow checked and often run through Miri as well - which is crazy strict. Miri rejects a lot of correct programs.

There is no comparison with C, where trivial programs exhibit UB and memory unsafety all the time, and the standard library is covered in foot guns.

I really dislike the puritan wing of the rust community who see the unsafe keyword as some sort of blight to be excised. Unsafe is a necessary and important part of the language. It is needed for lots of reasons - like how you can’t make high performance container types in purely safe rust. Or to interact with code written in other languages.

Unsafe is not a devils mark. It’s a good and useful tool. Get over it.


It often suffices to find one remotely triggerable unsafe memory write in a huge executable to achieve RCE.

Memory safe would be 0% unsafe.


“n% unsafe” here refers to `unsafe` blocks that can’t be automatically guaranteed safe by the Rust compiler, not to actual concrete instances of memory unsafety.


Right. And even if you only work in “safe rust”, a massive part of the software stack you rely on still isn’t formally verified like that. LLVM and rustc aren’t formally verified - and they likely have bugs. The kernel is all “unsafe” code. And the rust standard library is full of unsafe code. And so on. Moving heaven and earth to excise the last 2% of unsafe code in random rust libraries just doesn’t make engineering sense. Especially when doing so comes with a significant performance cost.

If reliability is what you’re after, that time is almost certainly better spent adding more tests, fuzz testing and validating your code with Miri and friends.


That doesn't mean it's not memory safe - it means that those constructs need to be thoroughly scrutinized for memory safety, which is significantly better than the entire codebase needing to be thoroughly scrutinized for memory safety


By that reasoning C is also memory safe.


No? Your entire C codebase needs to be thoroughly scrutinized, whereas in Rust, you can build a small portion with unsafe code, thoroughly audit that, and build a sound abstraction on top.


Examples, tests and interfacing with C are almost all the instances of unsafe.


I assumed the FFI would be the unsafe bit, you can't interact with an unsafe language from a safe context - the mere act of dereferencing a pointer that originates from unsafeland is fraught with UB.


[flagged]


Most languages are memory safe. Literally every language with a GC is. C and C++ are pretty much the only 2 languages that are not memory safe by default.


Go is famously not memory safe. Not only can you trivially create data races, if you data race on an interface fat pointer then you can corrupt its vtable (entirely in "safe" code; no `unsafe.Pointer`).

It is of course better at memory safety than C and C++, but these issues do come up in real deployments.


Data races have nothing to do with memory safety. Please do not conflate these two. Java, C#, Go, etc. can all have data races, but they are all memory safe languages.


Golang permits safe code to use a data race to cause type confusion and violate memory safety though.


Data races aren’t memory unsafe. The vtable thing is, but at least that’s accidental.


Go is memory safe, data races are inherent to all languages besides Rust? As for the interface fat pointer, it's a non-issue in real-life code.

https://research.swtch.com/gorace


Good thing Go has a race detector!


and C++ doesn't?


"Only 2" is very short-sighted, there's also Assembler, Pascal, Ada, Objective-C...

One general comment: it's kind of funny to see a C library being bashed right away at V1.0. As far as I can tell, nghttp2 isn't even a CVE nest to begin with [1]. Some people must be desperate for "easy" targets.

[1] https://www.cvedetails.com/vendor/15772/Nghttp2.html


I’ve never written a line of Rust in my life. Though as a user of computers I have an interest in software being secure, especially an HTTP client which I may at some point be subject to. I’ve been a developer for a long enough time, worked with enough developers, and written enough C (not much), to have formed the strong belief that: C’s typical use is not memory-safe due to completely understandable human error, these footguns do not justify its use in most greenfield contexts, and any developer that denies the existence of these footguns has bought into a delusion that I can’t even begin to understand.

I don’t care if it’s Rust or something else. The fact that this completely legitimate and long-proven drawback of C is always met with “Rust fanboy!” comments speaks volumes, and not about ‘Rust people’. It’s incredibly disappointing that this is where we’re at with these sorts of discussions. Comments that essentially amount to “fake news!” and reject any premise of a shared reality.


It's hard for me to imagine using any new network-touching software written in C. We have safer options now...


Indeed. The written in C part of the headline is a red flag, rather than the advertisement it is probably intended to be. It basically translates to "a piece of software that must be thoroughly audited by highly specialized security experts every few months in order to be even remotely confident that it doesn't contain catastrophic vulnerabilities that allow arbitrary code execution by unprivileged remote attackers".


Given your extreme hyperbole, you probably aren't the target audience then.

For me, it translates to "a piece of software universally compatible with a vast amount of compilers and platforms that is probably trivial to compile even in 10 years".


Those are orthogonal concerns, and you're both right. But that absolutely wasn't hyperbole.


just don't. and I am waiting for your contributions to free and open source software, written to be correct, safe, and performant in Ada. It'd be much more welcome if you wrote the formal business requirements in SPARK.

Rust? why would want to write SAFE software in a language without formal spec? I question the morality of those people.


I too question why people make flame-baiting comments about topics they don't understand.


agree, always see that kind of replies in threads related to software written in C. what those people want, eh? always derailing thread like that. I guess it's RESF in action. makes me despise Rust, tbh.


I was making a remark about your comment.

And it's because most security vulnerabilities these days have to do with memory issues, which must be manually enforced in languages like C. In Rust you get most of that for free, often even more optimized than being hand-written.

Also, in my opinion, Rust is also just a nicer language to use than C - as much as I love C, I'm wildly more productive in Rust than in C/C++, and have a lot of peace of mind knowing I can more strenuously audit the behavior, security, and other aspects of the code with relative ease.

I suggest trying Rust, understanding it, and then making a judgement call. Perhaps the community is 'loud' about it for good reason.


same, I was making a remark about your comment.

also, in my opinion, and in that of industries with much more at stake, Ada is better.

i suggest you try Ada.


I'm using Rust now and it seems quite nice but it could use a spec and defined semantics for provenance or whatever. It's difficult to discuss memory safety if what is actually meant is "Whatever Rust does" while the Rust docs repeatedly admit they aren't sure what Rust is meant to do in relevant areas.


Not sure what you mean. Memory safety is one thing, provenance is another. Which spec are you missing?


Well, there's no spec for any part of the language, there's a bunch of non-normative documents about how rustc might happen to work. In the non-normative documents that speculate about what guarantees rustc might require of unsafe code in order to avoid undefined behavior and ensure memory safety, provenance certainly does come up.


> why would want to write SAFE software in a language without formal spec?

A formal spec just means someone took the time and effort to write one down.

A formal declaration that a language makes no attempt at ensuring memory safety is not safer than an informal declaration that a language is memory safe.


You seem to be implying that C now has a formal spec.


We have had safer alternatives for ages, Modula-2 was created in 1978, Ada in 1983, Mac OS used Object Pascal (1984),....

Your point stands though.


> We have safer options now

Not really. We just don't know the failure modes of the new hotness, and that makes us feel safer somehow. (Lol)


nghttp3 is not new software.

nghttp3 can be used to create qpack standalone binary which does not touch the network.

https://raw.githubusercontent.com/ngtcp2/nghttp3/main/exampl...


Not that it matters. OpenSSL was already 15 years old when Heartbleed was discovered, and the horrible code that caused it remained unnoticed for two years. The idea that software maturity and "enough eyes" can overcome the glaring problems from using memory-unsafe programming languages is a fantasy.


I believe the point is that this is http/3 support added to the nghttp2 library, which has been around for a long time, since before Rust's mainstream popularity started to build. While I haven't used it personally, I have heard good things about the library and it's good to see it continue.

Dismissing a new feature in an existing library seems quite excessive and possibly zealous. Or perhaps we should just stop using OSes entirely because they continue to release new features in memory unsafe languages.


What are you using instead of Linux?


I’d love to have something else to use.

Just because we sometimes have to make compromises for pragmatic reasons doesn’t invalidate the original point.


Their comment specifically mentions that they're talking about _new_ software.


ChromeOS, Android/Linux, iDevices, and Windows, where JavaScript, C++, Swift, Java/Kotlin, Rust, and .NET languages are the advised ones for new code, leaving C behind only for existing code.


> new network-touching software

I imagine their emphasis is on the new bit here.


"And yet you participate in society. Curious!"


Maybe it's secretly funded by the NSA.


While in reality they are coordinating with CISA to try and urgently move people away from C

https://www.cisa.gov/news-events/news/urgent-need-memory-saf...


>"It's hard for me to imagine "

Simple solution - don't


It's going to be sad when the browser oligarchs deprecate HTTP/1.1. The last web protocol that humans could understand. But, perhaps it will force people to go back to the drawing board and develop new protocols that aren't dependent on web browser tech. Maybe the state of the art will finally advance past this networked document reader.


I think http/2 will be the odd one out (and abandoned) longer term, with http/1.1 being retained for backwards compatibility, and http/3 being more common.


I don't think it's responsible to replace http/1.1 with either of these protocols. They both seem like complex beasts with poorly understood corner cases and little consideration for the abuse potential.


My feeling is they both suffer “version-2-isms”, ie having added many nice-to-haves that are complex. That said, I think QUIC is a much, much bigger step. We are literally adding congestion and flow control to user space, as well as packet-level routing. This has been done before, but mostly for bespoke purposes (like UDT - or with per-OS batching, kernel extensions or at least kernel tuning). Now, it’s supposed to be general purpose, across hardware, platforms and languages.

Personally I think either QUIC makes it into kernels, or it will have a loooong time ahead of it with language- and vectorized/batched IO in the OS (maybe even down to the NICs?) catching up. Even the more mature implementations struggle compared with TCP today, for things like high bandwidth on consumer hardware. Not to mention CPU overhead and the battery drain that comes with it. (At least from my own high-bandwidth experiments)

Yes, I know a large part of web is already http3. But remember that http is used outside of browsers and data centers. I don’t know enough to back any specific proposals, but to me it sounds a lot easier to fix the tcp handshake, open 2-3 conns for the HOL-blocking issue, than to rearchitect the entire stack (and add new features) under UDP. I’m saying this as someone who is still very bullish and excited about QUIC.


Check out the Gemini protocol!


Can humans understand the IP protocol? The TCP protocol?


> Can humans understand the IP protocol? The TCP protocol?

Yes, but you have to frame the exchange correctly or people will look at you strangely when you start addressing and acknowledging them.


RST


I mean, not over the wire, but an average developer can definitely grok either protocol pretty quickly. IP is connectionless and fairly simple. TCP requires a more complicated state machine, and has more extensions if you want to implement the entire thing. IPv6 makes both more complicated. But the actual functionality is limited, so there's not a lot of complexity outside of "Did they turn on option 1? Do a thing. Did they turn on option 2? Do another thing. Is option 2 on but option 3 off? Do another thing."

It's not like implementing transport encryption and multiplexing and congestion control and etc etc etc. I doubt many people in the world actually know how to implement QUIC. I implemented tcp/ip as a teenager when I was bored over summer break.


Oh sure, they're not even that complicated. I've been asked (and asked myself) about them in interviews.


They are pretty easy compared to the others mentioned


That's been happening. Gemini is an example.


I've been working on such a protocol. I can't wait to turn my back on browsers forever.


Say more? What kind of protocol do you mean, and what comes after browsers?


"turn my back on browsers" was perhaps hyperbole. I do want to make them irrelevant, but realistically I'm just targeting a subset of their domain. I'm still getting a handle on the idea, so pardon the lack of brevity. One day I'll be able to put it succinctly.

Working title: Semantic Paint.

---

The plan is to take what our web does poorly and do those things well. From that I have two main goals, the first of which is:

Permissionless Annotation -- I should be able to attach annotations to datasets (or subsets thereof) that I find in the wild without having write access to those datasets. Links are implemented as annotations (as are edits). Unlike our web, they are undirected and might link more than two things. Instead of a directed graph between documents we now have simplices which connect (sub)sequences of arbitrary bytes. These connections are typed (I'm calling these types "colors").

Have you ever played Mad Libs? It's a game which has partial sentences, like: "____ had a great time ____ing the ______". Fun is had by filling in the words before you know the sentence and then laughing about how silly the sentence is. In semantic paint, colors are like that: they're tuples with an associated partial sentence, each tuple element goes with a blank. At any one moment, your client will be configured to display (or act on) one or more "colors". A color is a list of tuples in this form.

So you might have a 3-color:

______ (code) is malicious, writing it to stdin of ______ (executable code) with ______ (parameters) will write a non-malicious copy to stdout.

This color would be used for annotating malicious javascript with enough metadata to fix it automatically. It functions as a link between three items. If you have any one of them, you can find the other two.
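
To make that concrete, here is a rough sketch of a color and an undirected index over its brushstrokes. The `Color` and `Canvas` classes and the `hash-of-...` placeholders are names I'm making up here; the real items would be content hashes rather than readable strings:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Color:
    """A link type: a partial sentence with one blank ({}) per tuple element."""
    template: str
    arity: int

    def render(self, items):
        return self.template.format(*items)


class Canvas:
    """Indexes brushstrokes by every item they touch, so links are undirected."""

    def __init__(self):
        self._by_item = {}

    def paint(self, color, *items):
        if len(items) != color.arity:
            raise ValueError(f"{color.arity} items expected, got {len(items)}")
        stroke = (color, items)
        for item in items:
            self._by_item.setdefault(item, []).append(stroke)
        return stroke

    def strokes_touching(self, item):
        return list(self._by_item.get(item, []))


FIXABLE = Color(
    "{} (code) is malicious, writing it to stdin of {} (executable code) "
    "with {} (parameters) will write a non-malicious copy to stdout.",
    arity=3,
)

canvas = Canvas()
stroke = canvas.paint(FIXABLE, "hash-of-script", "hash-of-fixer", "hash-of-params")
```

Looking up any one of the three placeholders with `canvas.strokes_touching(...)` returns the same stroke, which is the "if you have any one of them, you can find the other two" property.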

Here's the browser-killing part: at some point we stop annotating the malicious parts so that we can cleanse them, and instead we target the desirable parts so that we can make them more accessible. Embrace, enhance, extinguish.

Note that we're not talking about the filename or the server whence the malicious script came, we're talking about the data itself. Naming things is hard, so I want to see how far we can get without naming them at all. Instead, a user can just point at the thing without naming it, and apply paint---er, apply annotations--to the thing they're pointing at. It operates on fragments of data scraped from a screen, tee'd from a pipe, or OCR'd from a camera--not on files or other named abstractions.

The tuple values are pairs: a cryptographic hash, and a list of features that come out of a rolling hash (think rsync). The latter is used to re-anchor the tuple (brushstroke) even if the canvas is paginated differently or has other small differences. Fingers crossed that I can keep false positives down to a tolerable level.
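
A sketch of what such a pair could look like, under my own assumptions: a polynomial rolling hash over 8-byte windows, with the 16 smallest window hashes kept as the "features". The window size, feature count, and function names are illustrative choices, not a settled design:

```python
import hashlib

BASE = 257
MOD = (1 << 61) - 1  # a large prime modulus


def rolling_hashes(data: bytes, window: int = 8):
    """Yield a polynomial hash for every window-byte slice, updated in O(1) per step."""
    if len(data) < window:
        return
    top = pow(BASE, window - 1, MOD)  # weight of the byte that leaves the window
    h = 0
    for b in data[:window]:
        h = (h * BASE + b) % MOD
    yield h
    for i in range(window, len(data)):
        # Drop the oldest byte, shift everything up, append the newest byte.
        h = ((h - data[i - window] * top) * BASE + data[i]) % MOD
        yield h


def features(data: bytes, window: int = 8, k: int = 16) -> frozenset:
    """Keep the k smallest window hashes as a position-independent fingerprint."""
    return frozenset(sorted(set(rolling_hashes(data, window)))[:k])


def anchor(data: bytes):
    """The (cryptographic hash, fuzzy features) pair for one fragment of data."""
    return hashlib.sha256(data).hexdigest(), features(data)
```

Prepending a few bytes to a fragment changes its SHA-256 completely, but only the windows that overlap the new bytes produce new hashes, so most of the feature set survives and can be used to re-anchor the stroke.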

For instance, if you copy some code from stackoverflow into your project, and later I annotate that code while browsing stackoverflow, you should then see my annotations on your code as viewed in your IDE (that is, provided you have opted in to seeing my paint, and are running your IDE through a semantic paint client--software which will sit between you and the IDE. The first draft is shaping up to resemble tmux).

One could imagine similar functionality on a piece of paper you found blowing in the wind. Point your camera at it, extract the features, query... maybe there are annotations on that text which will tell you more about its origins. If Fermat had had this tech, he wouldn't have complained about the margin being too small, he'd just have linked the proof with a brushstroke.

Imagine also people with allergies leaving annotations on menus at restaurants: "they say this doesn't have gluten, but it totally does," that sort of thing.

In this sense you can think of it as a sort of distributed search algorithm, where either a cryptographic hash of content, or this list of fuzzy-hash features, is the search query. You'd just sort of leave it running as a filter over whatever data you're working with. I kind of imagine it like augmented reality... for data.

---

The second thing I want to do well is Partition Tolerance.

I want apps using this protocol to function without a persistent internet connection. They'll function slowly, but how up-to-date do you really need that blog post to be anyway? For most things, days or even weeks of latency is ok.

If you're in your car, stopped at a light, your device will be gossiping with others that are stopped at the same stoplight. Pedestrians in range may also end up participating. Delivery drivers put nodes on their vans, which silently gossip brushstrokes while the drivers deliver packages. Imagine a train full of people with gossiping devices... Sneakernet, on autopilot, even in a disaster or a protest.

Secure Scuttlebutt Protocol comes to mind here, but that's append-only. This is unordered. You just grab all of the brushstrokes you're interested in and provide strokes that your peers are interested in. Retention policies and algorithms for deciding what "interested in" means, are the domain of the app. Convergence will be hard to orchestrate, but that's no reason not to try (or maybe instead we diverge).
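
Roughly, one unordered exchange could look like the sketch below. The `Node` class, `gossip` function, and color names are hypothetical, and the "interested in" rule is reduced to plain set membership, where a real app would plug in its own policy:

```python
import hashlib


class Node:
    def __init__(self, interests):
        self.interests = set(interests)  # colors this node wants to receive
        self.strokes = {}                # content hash -> (color, payload)

    def add(self, color: str, payload: bytes) -> str:
        key = hashlib.sha256(color.encode() + b"\0" + payload).hexdigest()
        self.strokes[key] = (color, payload)
        return key

    def wanted_from(self, other):
        """Strokes the other node holds that match our interests and that we lack."""
        return {
            key: stroke
            for key, stroke in other.strokes.items()
            if stroke[0] in self.interests and key not in self.strokes
        }


def gossip(a: Node, b: Node) -> None:
    """One exchange: each side takes only the strokes it is interested in."""
    to_a, to_b = a.wanted_from(b), b.wanted_from(a)
    a.strokes.update(to_a)
    b.strokes.update(to_b)
```

Because strokes are keyed by content hash, the order of exchanges doesn't matter and repeating one is harmless.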

Peers come and go, but since everything is content addressed (cryptographically, or fuzzily), what really matters is whether those peers are interested in the same colors that you are. I know it sounds crazy ambitious, but if you don't have to protect the referential integrity of a globally consistent name, lots of problems go away.

The goal is to keep data nearest the things in the real world that it is relevant to. If you run across contradictory entries within a color, you can scrutinize by author (who do you trust more?) or by which peers gossiped it (which is more local?). I anticipate that handling trust explicitly like this (and focusing on data, not server names) will change the game re: misinformation.

One thing I like about this strategy is that you can synchronize this gossip (like cicadas synchronize their mating habits). I want to be ad-hoc wifi/bluetooth tolerant, but now imagine a node running in the cloud. Rather than leaving it on 24/7 so that you're ready to respond to a user at any moment, you can have your node on a 5 minute cycle: Sleep for 4:30, gossip for 0:30, repeat. That's paying ten cents instead of a dollar for server uptime. Yeah, users will have to tolerate data that's 5 minutes stale, but for most applications that's fine. If the data is relevant to them, it should make it to their node before they go to look for it.
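
The arithmetic behind "ten cents instead of a dollar", assuming (my assumption) that the cloud bill scales linearly with uptime:

```python
sleep_s = 4 * 60 + 30   # 4:30 asleep
gossip_s = 30           # 0:30 awake and gossiping
duty_cycle = gossip_s / (sleep_s + gossip_s)

always_on_cost = 1.00   # hypothetical price of running the node 24/7
cycled_cost = always_on_cost * duty_cycle

print(f"duty cycle: {duty_cycle:.0%}, cost: ${cycled_cost:.2f}")
```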

Another benefit is that you can enlist your node to someone else's cause without having to talk with them first. Much like how IPFS lets you "pin" data published by somebody else so that that data doesn't go away if their node goes offline, you could instruct your node to notice square pegs and square holes and publish annotations about having fitted the peg in the hole. This means that if you don't want to pay somebody in money, you can instead pay them in operational support:

> I don't want to pay you $5 / mo. Instead I've been hosting way more than my share of your service on this stack of hard disks for a month. Let that be my payment.

I think there are some things that capitalism and zero-sum games are doing poorly, that cooperation and reciprocity could do well. That idea is not fundamental to the protocol, but it is fundamental to why I want to build the protocol.

----

Whew, I could go for pages and pages, but that's my best shot at a sketch. Thanks for asking, sorry it's not elevator-pitch-grade.


I like your madlib and paint metaphors. IMO that's the most important part of UI/UX, what metaphor can people use to apply their world models to an application that can technically work in infinite ways.

The file/folder metaphor has had fantastic success but it leaves people constrained to hierarchical thinking when databases can actually represent arbitrary graphs, cycles and all. When prototyping my knowledge graph I wanted to build directly on existing filesystems because readdir is fast and cached by the OS, a lot of work done already. But I was stymied by the fact you can't hardlink a directory to multiple directories, because cycles aren't allowed, so I'm stuck with softlinks.

I also weighed the pros and cons of tuples (unnamed associations) and triples (subject verb object, verb being the name of the association) and decided there's utility in having both without too much added complexity - tuples are just triples with NULL for a verb.

Let me ask you this - architecturally, I'm still weighing building on top of unix filesystems, with git for history/multi-user collab sync VS sqlite and figuring out history/sync later, but take the abstract case of tables with columns: Does it make more sense to have only two tables - one for associating hashes/uuids with strings/buffers and the second table being "left id, right id" for links, or is it worth the added complexity to create a new table for each type - a specified set of named attributes each with specified type, such that when you want to pull up all the metadata on an mp3 file you already know which table to read, instead of doing dozens of reads on the simpler model and then verifying the type ex post facto... would love to know your thoughts on how to structure the graph on disk...
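
To make the two options concrete, here's a small sqlite sketch of both. The table and column names, and the `key=value` encoding of attributes, are placeholders I picked for illustration only:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Generic model: everything is a node, and links are just pairs of ids.
    CREATE TABLE node (id INTEGER PRIMARY KEY, value TEXT NOT NULL);
    CREATE TABLE link (left_id INTEGER REFERENCES node(id),
                       right_id INTEGER REFERENCES node(id));
    -- Per-type model: one table per type, with named and typed attributes.
    CREATE TABLE mp3 (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, seconds INTEGER);
""")


def add_node(value):
    return db.execute("INSERT INTO node (value) VALUES (?)", (value,)).lastrowid


song = add_node("song.mp3")
for attr in ("title=Example", "artist=Nobody", "seconds=180"):
    db.execute("INSERT INTO link VALUES (?, ?)", (song, add_node(attr)))
db.execute("INSERT INTO mp3 VALUES (?, 'Example', 'Nobody', 180)", (song,))

# Generic model: walk the links, then parse and type-check the values afterwards.
generic = dict(
    row[0].split("=", 1)
    for row in db.execute(
        "SELECT n.value FROM link l JOIN node n ON n.id = l.right_id "
        "WHERE l.left_id = ?",
        (song,),
    )
)

# Per-type model: one read from a table whose columns are known up front.
typed = db.execute(
    "SELECT title, artist, seconds FROM mp3 WHERE id = ?", (song,)
).fetchone()
```

In the generic model `seconds` comes back as a string that still has to be validated, which is the "verifying the type ex post facto" cost; the per-type table returns it as an integer but needs a schema for every type.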

fwiw the tagline for my project is "global media graph of associations, attributes, & annotations"


Thanks for the feedback. Working on this has been a lot of fun for me, but I'm also aware that it's too big to digest in a sentence or even a paragraph, so there are very few people I can talk about it with. From what you shared about yours it seems we're thinking in similar directions.

Aside:

I think it's a tragedy that bit torrent is associated with piracy because it's just a better way to move data around in general. In order to not have that happen to me, I'm seeking a first application that is unlikely to ruffle feathers, so I've been taking a bioinformatics class. As I get familiar with workflows in the genomics/proteomics world, I'm finding that this:

> The file/folder metaphor has had fantastic success but it leaves people constrained to hierarchical thinking

Is resoundingly true. We end up with filenames like: "ncrassa.H3K9me3.ChIPseq.subtracted.merged.bed" which is not a name so much as a directed graph, a recipe of how this file came to be. Coming up with names for the resultant files at each stage in the graph and keeping them straight is this chore which could be dispensed with if what you were looking at was the graph that we're all holding in our minds while we write these names.

Semantics Note:

> tuples are just triples with NULL for a verb.

The way I think of it, pairs are just triples with null for a verb. When I say tuples I mean things that can have arbitrarily many values. Anyhow, I think I know what you mean.

As for data representation, I don't really like triples in the subject, verb, object sense. It's very Semantic Web™, particularly it's the vision of the semantic web which Doctorow argues has failed here: https://people.well.com/user/doctorow/metacrap.htm

I agree with his critiques, and I would add this one:

> People struggle enough to do their jobs as it is, if yours is a system that asks them to do something more in the name of useful metadata, you will be ignored.

If you have write access to important data, you have way too much responsibility to way too many people. We need to split that responsibility up so that different personae can handle different things. If I'm the one that wishes that this data had annotations, let me be the one to annotate it, and don't make me seek approval first.

A consequence of this is that you now need an "according to whom" field on all of your triples. Maybe you don't trust that particular author. Fine, quadruples then? Maybe, but when I actually try to get a complex problem to lay out nicely in this way I end up with a sort of predicate soup--which is how I ended up with my mad libs approach: arbitrary n-tuples. In some sense I'm passing the buck and making it the app developer's problem (which is, of course, also me, but wearing a different hat).

I'm not enamored of reddit these days, but when it was new it was quite innovative to just let any-old user create any-old subreddit. Most of them fizzled out into obscurity, but occasionally a community had its shit together, and those thrived. That's a dynamic that I want to lean into. If the data sucks, don't find a different protocol, find a different community of data curators using the same protocol.

So that relates to your question because unless you're going to be in-the-loop as arbitrator of everything, people are going to disagree, and there will be a certain amount of churn as they align themselves with different sides of that disagreement.

I have to imagine that at some point somebody is going to want to express something new. To borrow from my world, a mad lib:

> ___ and ___ played in superbowl ____ and ____ won (according to ____)

I'm pretty sure that this quintuple can be normalized down into some set of triples which sort of reference each other when the proposition needs to be reconstructed, but when you decompose it like that you lose the feeling that it's a single thing--a thing that can be trusted or not by a user--a thing that can be gossiped between users--a thing that can be purged once it's no longer needed. It may be the case that having to reconstruct the more complex thing while performing these operations is actually more complex than just paying the up-front cost of having them be more complex things in the first place.
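
A rough sketch of that trade-off, in Python. The names here (SuperBowlClaim, to_triples, the predicate strings) are hypothetical and not taken from any existing project; the point is only that the single n-tuple can be trusted, gossiped, or purged as one value, while the triple form has to be regrouped by id first:

```python
from typing import NamedTuple


class SuperBowlClaim(NamedTuple):
    """The mad lib as one five-slot value."""
    team_a: str
    team_b: str
    game: str
    winner: str
    source: str  # the "according to ____" slot


def to_triples(claim, claim_id):
    """Normalize the quintuple into triples that all hang off one claim id."""
    return [
        (claim_id, "participant", claim.team_a),
        (claim_id, "participant", claim.team_b),
        (claim_id, "game", claim.game),
        (claim_id, "winner", claim.winner),
        (claim_id, "according_to", claim.source),
    ]


def from_triples(triples, claim_id):
    """Reassemble the proposition; needed before it can be handled as a unit."""
    mine = [t for t in triples if t[0] == claim_id]
    teams = [obj for (_, pred, obj) in mine if pred == "participant"]
    fields = {pred: obj for (_, pred, obj) in mine if pred != "participant"}
    return SuperBowlClaim(
        teams[0], teams[1], fields["game"], fields["winner"], fields["according_to"]
    )


def purge_triples(triples, claim_id):
    """Purging a claim means finding every triple that belongs to it."""
    return [t for t in triples if t[0] != claim_id]
```

With the n-tuple form, purging is just removing one element from a set; with triples, every operation on "the claim" starts with a scan or lookup like the ones above.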

So I guess I have to ask: Is it definitely more complex to have a separate table for each type? Or does that complexity just bite you at different times?

I once had to test assertions about a rather large database over a rather slow connection. I was very fortunate that I could strip off the foreign keys and just flat-out ignore tables that were not related to my testing (this required that I sync much less data). That's harder to do if the "real" things are expressed in terms of simpler things which now have to be teased apart.

My opinion is relevant to the "sqlite" side of your dilemma, I think. I'm personally a big fan of git and filesystems--they are among the most potent tools that I feel comfortable using. I also considered wrapping git to handle conflict resolution. But ultimately it's the road I didn't take, so I can't really weigh in there.

I avoided it for.... weird reasons:

Generally, a git repo is run by somebody with authority. They accept or reject PRs in a rather top-down way. That feels like the "consistency" side of the CAP theorem, and I have chosen "partition tolerance" instead. It's an idea I got from Unison (https://www.unison-lang.org/). If you don't do globally unique names (such as the URL for a git repo) you don't get fights over where the authoritative name points. Instead, each fork is created with equal validity and it's up to the users to decide which one to install.

It's kind of like back in the day when BTC forked. The optics of the scenario was that segregated witness won, and the losers of that fight went off and created BCH. But realistically it was just that there was first one protocol, and then there was a choice between two new protocols. The notion that one side of the conflict got to carry the old name is well... not as content-addressed as I'd have liked.
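
A toy illustration of what "content-addressed" means here, assuming SHA-256 as the hash and a plain in-memory dict as the store (both are my choices for the sketch, not a description of Unison or of any particular protocol):

```python
import hashlib


def content_id(data: bytes) -> str:
    """The identifier is derived from the bytes themselves, not assigned by anyone."""
    return hashlib.sha256(data).hexdigest()


class Store:
    """Minimal content-addressed store: forks coexist under their own hashes."""

    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self.blobs[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        return self.blobs[cid]
```

Two forks of the same thing get two different ids and neither one inherits "the" name, so the choice of which to install stays with the user.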


I'm interested too. What protocol are you working on, if you don't mind sharing?


not OP but of similar sentiment

I have some designs for a Plan9 kind of thing on top of ssh/git/torrent

Similar to the Holochain project, which is a distributed datastore with application layer and sums itself up on the homepage as "Think BitTorrent + Git + Cryptographic Signatures" but I don't enjoy holochain. It's one of the attempts to rewrite the stack tho, see also Urbit.

https://www.holochain.org/ https://urbit.org/


When I first ran across urbit I spent a few days trying to understand if they were building what I am building, because maybe I should just join them. When I discovered that artificial scarcity was involved, I felt confident that no, we are not building the same thing. But it did take quite a bit of research to come to that conclusion, much similarity.

I'll check out holochain, if just for contrast. I'm a bit leery about "chain" though, blockchains go to great lengths to achieve globally consistent outcomes, and I don't think people need that for most things. Even about money. Especially about money. If you're rich in here-dollars, and you go there, you shouldn't expect them to respect your here-dollars unless you earned them doing something that is also appreciated there. Liquidity comes from shared values, and different people often have different values, so like... drop the consistency criterion.

I know a few people with similar projects. We had put so much hope into what we thought the web could be, and we're pretty damn disappointed with what it is, so you've got a generation of nerds trying to build what we thought it could be.


I begrudgingly respected the imposed scarcity of urbit if only to fund development but yeah I was in the same boat, "oh good. someone already did it, I can just pay $15/month and relax" but unfortunately the peer discovery for installing apps is fragile AF

Holochain is unfortunately named since it's about as much of a blockchain as git is (merkle tree, no proof of work mechanism) but they obscured it willingly by trying to integrate ETH [0] as a way to pay peers for cloud capacity / capitalize on the crypto hype.

[0] https://holo.host/host/

edit: "Liquidity comes from shared values" this is an interesting axiom, I'll have to think about that. I've always held that the neat thing about money is it allows collaboration between groups with opposing values, because at least we have one motive in common


I realize that this is a hot take, but I think that

> money ... allows collaboration between groups with opposing values, because at least we have one motive in common

...only really holds if our common motive is that we both don't want to be killed by some Leviathan type thing (originally this was a Roman soldier who would kill you if you didn't trade Caesar-face coinage above its bullion value). Money today is just a proxy for the same old violence as before.

This places a cap on how legitimate money can be. The have-nots, should they ever manage to coordinate, would be better off turning their backs on it. When a rich nation shows up and starts throwing money around, you can be sure that the poorer nation is getting screwed. Probably there's some kind of violence hanging around ensuring that they don't act up. See the history of US policy in South America in the 70's for more on this sort of thing.

If we want to move past all of this yucky stuff I'm talking about, we need to make your acceptance of my money synonymous with me proving to you that I've done something that benefitted you specifically. As it is now, it's just proof that my activities somehow benefit a bank somewhere.


It's mostly notes and dreams, only a little bit of working code. I'm going to seem a little insane... but here: https://news.ycombinator.com/item?id=37972446 (or see cousin thread).


Hmmmm but is it time for a new domain?


[flagged]


No?


After looking around for ~5 minutes I still can't find out if this HTTP/3 lib will allow establishment of an HTTP/3 connection with a self signed cert. Or if it will allow non-encrypted connections. Does anyone know?

Given the lack of any function names addressing TLS/CAs/encryption/etc I'm going to assume it's like all the other HTTP/3 implementations and fails to fully implement the spec. It's funny. Everyone defends MS/Google's HTTP/3 based on QUIC by saying the lack of ability to use plain text or self signed certs is the fault of the HTTP/3 implementations. But all the HTTP/3 lib implementations say it's the fault of QUIC.


> I'm going to assume it's like all the other HTTP/3 implementations and fails to fully implement the spec... the lack of ability to use plain text or self signed certs

Really? Does trusting a self-signed cert as a root cert not work in most libraries? quiche at least lets you control the parameters of their TLS implementation (https://docs.quic.tech/quiche/struct.Config.html#method.with...), which lets you go as far as allowing you to manually verify certs yourself (https://docs.rs/boring/latest/boring/ssl/struct.SslContextBu...).


Where in the HTTP/3 spec does it allow for non-encrypted connections? HTTP/3 is a mapping of HTTP semantics to QUIC specifically, and QUIC mandates TLS 1.3 negotiation.

Also quiche supports self-signed certificates (I use it with them). In fact I'm not aware of any raw QUIC or HTTP/3 implementations that don't allow implementing your own certificate handling.


Considering you can't run nginx with http/2 unless you specify SSL, I'd say TLS/SSL is a requirement with this library.


> Given the lack of any function names addressing TLS/CAs/encryption/etc

Not only in its public API, but those are missing from its source code as well. I can't see where it could be doing any encryption—it doesn't even link to e.g. OpenSSL.


I don't think this is a free-standing implementation. The docs start with:

> nghttp3 is a thin HTTP/3 layer over an underlying QUIC stack.


Just create your own root CA and pin it in the clients.
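
As a rough illustration of the pinning idea, here is a sketch in Python. Note that it pins by certificate fingerprint rather than by installing a root CA, which is a related but different technique, and the function names are made up. It assumes you already have the peer certificate as DER bytes from whatever TLS or QUIC stack you use (with Python's ssl module over TCP, that would be getpeercert(binary_form=True)):

```python
import hashlib


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def is_pinned(der_cert: bytes, pins: set) -> bool:
    """Accept the peer only if its certificate matches a fingerprint shipped with the client."""
    return fingerprint(der_cert) in pins
```

This only works for clients you control, since the pins have to be distributed with them.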


How do I host a website visitable by random people I've never met on HTTP/3 without getting the continued/periodic approval of a third party corporation for a CA based TLS cert? I can't. That's bad. It's a very significant change from HTTP/1.1 (HTTP/2 is only used by corporations/institutions/non-human persons) in the past which allowed anyone to connect to anyone. It was truly a web.

There's absolutely no way I can get my self-made root cert in the trust store of the major browsers so that a random person I've never met can connect. HTTP/3 is designed for the use cases of non-human persons. For human persons it's a terrible protocol.



