For everyone pining for a jq with a different syntax: I've collected a bunch of links to alternatives; you might want to try some of them (some may be for different things than JSON):
My personal favorite solves the same problem but attacks it differently.
> Make JSON greppable!
> gron[1] transforms JSON into discrete assignments to make it easier to grep for what you want and see the absolute 'path' to it. It eases the exploration of APIs that return large blobs of JSON but have terrible documentation.
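For anyone who hasn't seen it, the workflow looks roughly like this (output reconstructed from memory, so treat it as approximate rather than verbatim gron output):

  $ echo '{"user":{"name":"alice","roles":["admin","dev"]}}' | gron
  json = {};
  json.user = {};
  json.user.name = "alice";
  json.user.roles = [];
  json.user.roles[0] = "admin";
  json.user.roles[1] = "dev";

Grep for the path you care about, and `gron --ungron` turns the surviving assignments back into JSON.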
Personally I'd prefer Fennel, which is built on Lua and thus a whole lot faster, especially in regard to startup time—but as I noted in a thread on Fennel, Lua's omission of a proper ‘null’ makes it awkward to handle exchange and transformation of data from third parties. And, since I'm likely to fiddle with the queries for some time, startup delay is less important here.
1) refuses to operate on stdin; requires a filename argument, which is so irritating.
2) doesn't accept values that jq accepts
% time jq -r '[expression]' < parcels | wc
365 1454 7978
jq -r < parcels 1.39s user 0.00s system 99% cpu 1.390 total
wc 0.00s user 0.00s system 0% cpu 1.390 total
% time ~/.yarn/bin/q '[expression]' parcels | wc
q: internal error, uncaught exception:
Yojson.Json_error("Line 56, bytes -1-32:\nJunk after end
of JSON value: '{\n \"OBJECTID\": 155303,\n \"BOOK\"'")
1 is easy to work around (handy tip incoming for any tools that _seem_ to not support stdin but actually do, as stdin is also available as a file in unix):
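i.e. hand the tool stdin's path as the filename argument, along these lines (sketched against the q invocation above):

  q '[expression]' /dev/stdin < parcels
  # or, on Linux, the equivalent procfs path:
  q '[expression]' /proc/self/fd/0 < parcels

The data still arrives over a pipe or redirection; /dev/stdin just satisfies the "must be a filename" requirement.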
Tools that accept filenames often expect you to give them a real file, as they’ll do things with it that may not be supported by the various “it’s a file descriptor pretending to be something on disk” solutions.
Hm, do you have any examples handy? It's not that I don't believe you, it's just that in all the years I've been using this, it has always worked. Granted, I'm only using it for reading data, not for saving stuff to /dev/stdin, which would obviously fail.
`file` on BSD/OSX has this notice for the option '-s':
> Normally, file only attempts to read and determine the type of argument files which stat(2) reports are ordinary files. This prevents problems, because reading special files may have peculiar consequences.
One example that comes to mind is /dev/urandom, which sucks random values out of the entropy pool (at least in Linux)—and the pool can be exhausted, or at least it could back in the day, not sure about now. Other possible cases are things in /proc (though unlikely), and particularly stuff like serial ports—where presumably reading could gobble data intended for some drivers or client software.
Not the best example here, but try to open a file descriptor in VSCode. I had a scenario where I wanted to diff a file on two different servers with a user-specified diff tool, and certain ones won't even operate in a read-only manner.
Untested, but I expect unzip will fail because it keeps metadata after the files, so needs to seek back. (Unless they detect the pipe and buffer everything instead)
fseek(3) etc., and using stat(2) to determine allocation sizes, spring to mind as things you might want to do on a normal file that will not work as expected on a character device.
I try to avoid the dummy files /dev/stdin, /dev/stdout and /dev/stderr, since I've been bitten when they're not available, or when I hit permission denied errors.
Two examples I can remember off the top of my head:
Yes, I don't understand how people end up with assertions that the filename is a required argument. At least we've got /dev/stdin or /proc/self/fd/0 as workarounds.
Much Unix software today is written by people who really don't appreciate or understand Unix. It leads to things like Homebrew (early on, at least) completely taking over and breaking /usr/local; to command-line utilities using a single dash (-) preceding both short and long option names (granted, --long-opts is a GNUism, but it's a well-established standard); commands that output color by default even when the output isn't a tty; etc.
It's not hard to fix things like this, but it exemplifies a lack of familiarity with the Unix command line. There are an enormous number of tools out there that only exist because people don't know how to chain together basic 1970s Unix text-processing tools in a pipeline.
Anybody around long enough to remember Unix from the beginning or even just the last 25 years... which is a tiny percentage of this site... should know that a unifying Unix or Unix "tradition" as noted in a follow-up comment is pretty much a myth. The tradition is whatever system you grew up on and whatever tribal biases you subscribe to; the only true Unix traditions are mostly trivialities like core shell syntax, a handful of commands, and a woefully underpowered API for modern purposes. And long option names are definitely not part of any tradition.
Myths like "everything is a file" or "everything is a file descriptor" are complete bollocks, mostly retconned recently with Linuxisms. Other than pipes, IPC on Unix systems did not involve files or file descriptors. The socket API dates to the early 80s, and even it couldn't follow along, with its weird ioctls.
Why are things put in /usr/local anyway? Why is /usr even a thing? There's a history there, but these days I don't see much of anything going into /usr/local on most Linux distributions.
It's also ironic to drag OS X into a discussion of Unix, because if there was one system to break with Unix tradition (for the best in some ways) -- no X11, launchd, a multifork FS, weird semantics to implement time machine, a completely non-POSIX low-level API, etc, that would be it.
All this shit has been reinvented multiple times, the user-mode API on Linux has had more churn than Windows -- which never subscribed to a tradition. There's no issue of lack of familiarity here, the original Unix system meant to run on a PDP-11 minicomputer only meets modern needs in an idealized fantasy-land. Meanwhile, worse is better has been chugging along for 50 years while people try to meet their needs.
> more churn than Windows -- which never subscribed to a tradition.
My understanding is that Windows has always had a very strong tradition of backwards compatibility. Even to the point of making prior bugs that vendors rely on still function the same way for them (i.e. detect if it's e.g. Photoshop requesting buggy API, serve them the buggy code path and everyone else the fixed one).
That's just as much a tradition as "we should implement this with file semantics because that's traditionally how our OS has exposed functionality".
There are X server implementations for Windows, Android, AmigaOS, Windows CE!!, etc... I don't think this is relevant.
> macOS has a POSIX layer.
So do many systems, again including Windows in varying forms through the years. I think the salient issue is that BSD UNIX and "tradition" are conflicting. The point of the original CMU Mach project was to replace the BSD monolith kernel.
Tangent: Homebrew itself doesn’t really choose to take over /usr/local; rather, it just accepts that there exists POSIX software that is way too hard for most machines to compile, and so must be distributed precompiled; and yet where that precompilation implies a burning-in of an installation prefix at build time, which therefore cannot be customized at install time. And so that software must be compiled to assume some installation prefix; and so Homebrew may as well also assume that installation prefix, so as to keep all the installed symlinks and their referents in the same place (i.e. on the same mountable volume.)
You have always been able to customize Homebrew to install at a custom prefix, e.g. ~/brew. It’s just that, when you do that, and then install one of the casks or bottles for “heavy” POSIX software like Calibre or TeX, that cask/bottle is going to pollute /usr/local with files anyway, but those files will be symlinks from /usr/local to the Homebrew cellar sitting in your home directory, which is ridiculous both in the multiuser usability sense, and in the traditional UNIX “what if a boot script you installed, relies on its daemon being available in /usr/local, which is symlinked to /home, but /home isn’t mounted yet, because it’s an NFS automount?” sense. (Which still applies/works in macOS, even if the Server.app interface for setting it up is gone!)
The real ridiculous thing, IMHO, is that Homebrew doesn’t install stuff into /usr, like a regular package manager. But due to macOS considering /usr part of its secure/immutable OS base-image, /usr is immutable when not in recovery mode.
I guess Homebrew could come up with its own cute little appellation — /usr/pkg or somesuch — but then you run into that other lovely little POSIXism where every application has its own way of calculating a PATH, such that you’d need to add that /usr/pkg directory to an unbounded number of little scripts here and there to make things truly work.
Or use `/opt` which is a POSIX-standard location. Every managed MacOSX laptop I've gotten from "Big Corp" has had `/usr/local` owned & managed by IT with permissions set to root meaning you're fighting Chef (or whatever your IT department prefers) if you use the default homebrew location.
But again, you'd have that problem whether you used Homebrew or not, as soon as you tried to (even manually!) install the official macOS binary distribution of TeX, or XQuartz, or PostGIS, or...
Homebrew just acknowledges that these external third-party binary distributions (casks) are going to make a mess of your /usr/local — because that's the prefix they've all settled on burning in at compile-time — and so Homebrew tries to at least make that mess into a managed mess.
And, if some other system is already managing /usr/local, but isn't expecting the results of these programs unpacking into there, it's going to be very upset and confused — again, regardless of whether or not you use Homebrew. So it'd be better for those other systems to just... not do that.
/usr/local isn't supposed to be managed. It's supposed to be the install prefix that's controlled by the local machine admin, rather than by the domain admin. Homebrew just happens to be a tool for automating local-admin installs of stuff.
> I guess Homebrew could come up with its own cute little appellation — /usr/pkg or somesuch —
/opt/homebrew would be a somewhat traditional place to put it.
> but then you run into that other lovely little POSIXism where every application has its own way of calculating a PATH, such that you’d need to add that /usr/pkg directory to an unbounded number of little scripts here and there to make things truly work.
What? You should be able to add it to the system PATH that's set for sessions and call it a day on a POSIX system. PATH is an environment variable and is inherited. If macOS is in the habit of overriding PATH in system scripts, I have to imagine that's because they completely screwed it up at some point in the past. Generally, you just add it to your user session variables in whatever way your system supports (.profile, etc.) if you want it for your user, or at a system level if you want it system wide (I could see maybe Apple making this hard).
The only times in over 20 years I've ever had to deal with PATH problems are when I ran stuff through cron, because it specifically clears the PATH. More recent systems just specify a default PATH in /etc/crontab for the traditional / and /usr bin and sbin dirs.
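For example, the stock header in many distributions' /etc/crontab looks something like this (the exact value varies by distro; shown as an illustration, not a spec):

  SHELL=/bin/sh
  PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin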
Maybe you're thinking of the shared library path loading? That should also be easily fixed.
> and yet where that precompilation implies a burning-in of an installation prefix at build time
Not necessarily. Plenty of software uses relative paths that work regardless of prefix. Off the top of my head, Node.js is distributed in this way.
> you’d need to add that /usr/pkg directory to an unbounded number of little scripts here and there to make things truly work.
How so? Are there that many scripts that entirely replace the PATH environment variable? In Linux, I just include my system wide path additions in /etc/profile which will be set for every login. For things like cron jobs or service scripts, which don't inherit the environment of a login shell, you will need to source the profile or use absolute paths, but that's about the only caveat I can think of.
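Concretely, all I mean is something like this appended to /etc/profile (using the hypothetical /usr/pkg prefix from upthread):

  # make the local package prefix visible to every login shell
  export PATH="$PATH:/usr/pkg/bin"

Everything launched from a login session inherits it; only cron jobs and service scripts need their own PATH or absolute paths.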
> You have always been able to customize Homebrew to install at a custom prefix, e.g. ~/brew. It’s just that, when you do that...
...and then try to build something entirely sensible like Postgres, but hours of fiddling with different XCode versions and compiler flags still lead to a dead end of errors, you're stuck because you're running an unsupported configuration.
I still don't understand how the PG bottles for Mojave can be built.
I would argue that Go's design as a whole is characterized by an attitude of ignoring established ideas for no other reason than that they think they know better.
Something being established is not a grand argument for its use. The reasons it got established are relevant, and if you feel the end result of said establishment is obtuse or inane, why would you use it?
That's not to say Go's decisions to toss some established practices are "wise" or "sagely", just that broad acceptance is not a criterion they seemed concerned with. Which is fine.
>they think they know better.
It's safe to say Rob Pike is not clueless or without experience in unix tooling. You should listen to some of his experiences and thoughts on designing Go [0]. I don't always agree with him, [but it's baseless to suggest he makes decisions on the grounds that they were his, not that they have merit.]
Edit to clarify: [He makes decisions on merit over authority]
Sure, there's nothing that says established practice is better. That is not, in my opinion, a good defense of Go which makes many baffling design decisions. Besides, an appeal to the authority of Rob Pike is surely not a valid defense if mine is not a valid criticism.
I'm (perhaps unfairly) uninterested in writing out all the details, but “they think they know better” is because I see Go as someone's attempt to update C to the modern world without considering the lessons of any of the languages developed in the meantime. And because of the weird dogmatic wars about generics, modules, and error handling.
Rob Pike explained it thusly in a 2013 Google Groups reply:
> Rob 'Commander' Pike
> Apr 2, 2013, 6:50:36 AM
> to rog, John Jeffery, golan...@googlegroups.com
> As the author of the flag package, I can explain. It's loosely based on Google's flag package, although greatly simplified (and I mean greatly). I wanted a single, straightforward syntax for flags, nothing more, nothing less.
This is a fair point, although ironically it's probably because Pike predates GNU and still has a problem with all the conventions those young upstarts eschewed. Conventions change, usually for the better. I think this is one the Go team got wrong, regardless of the reason.
"There are an enormous number of tools out there that only exist because people don't know how to chain together basic 1970s Unix text-processing tools in a pipeline."
Arguably that is why the original implementation of Perl was written. If I remember the story correctly, we can never know for sure whether, e.g., AWK would have sufficed, because the particular job the author wrote Perl for as a contractor was confidential.
Are people using jq most concerned about speed, or are they more concerned about syntax?
JSON suffers a problem from which line-oriented utilities generally have immunity: a large enough and deeply nested JSON structure will choke or crash a program that tries to read all the data into memory at once, or even in large chunks. The process becomes resource-constrained as the size of the data increases. There are no limits placed on the size or depth of JSON files.
I use sed and tr for most simple JSON files. It is possible to overflow the sed buffer but it rarely ever happens. sed is found everywhere and it's resource-friendly. Others might choose a program for speed or syntax but the issue of reliability is even more important to me. jq alone is not a reliable solution for any and all JSON. It can be overkill for simple JSON and resource-constrained for large, complex JSON.
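A rough sketch of the kind of thing I mean (the field name is made up, and this only holds together for flat JSON without commas or braces inside strings, which is exactly the "simple" case):

  tr ',{}' '\n' < simple.json | sed -n 's/.*"name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'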
> It leads to things like Homebrew (early on, at least) completely taking over and breaking /usr/local
Fully agree with you, but oh well, most if not everything is available on Macports anyway.
> There are an enormous number of tools out there that only exist because people don't know how to chain together basic 1970s Unix text-processing tools in a pipeline.
Speed. A specialized tool you need often beats manually wrangling the dozen or so Unix tools you need to replace it, plus many Good Options are only available on the GNU/Linux coreutils and don't work on Macs (sed -i, my most common annoyance) or busybox.
Another common thing you can do is accept a generic stream as input, but have some code that penetrates the abstraction a bit to see what kind of stream it is, and if it is a file or something, do something special with it to go even faster. This way, you start with something maximally useful up front, and easy to use, but you can optimize things based on details as you go.
That's how Go's static file web server works. It serves streams, but if you happen to io.Copy to that stream with something that is also an *os.File on Linux, it can use the sendfile call in the kernel instead. (A downside of making it so transparent is that if you wrap that stream with something you may not realize that you've wrecked the optimization because it no longer unwraps to an *os.File but whatever your wrapper is, but, well, nothing's perfect.)
This is not a simple redirection. cmd <(subcmd) is a bashism that redirects the output of command subcmd to something that looks like a regular file to command cmd. Command cmd receives a path at the place the <(subcmd) syntax is used. This differs from cmd < f, which redirects the contents of file f to cmd's input.
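For example (file names are placeholders):

  # each <(...) appears to diff as a path like /dev/fd/63
  diff <(sort a.txt) <(sort b.txt)

  # contrast with plain redirection, which feeds the file's contents to stdin
  sort < a.txt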
My most common jq usage is to copy and paste some json into quotes to make it easier to read. My second most common action is to chain curl and jq together. A replacement for jq that doesn’t use stdin is literally useless to me.
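i.e. the everyday pattern is just (URL and field are placeholders):

  curl -s https://api.example.com/items | jq -r '.[].name'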
That’s a cumbersome extra step for unclear benefit.
Yes, q is supposedly faster than jq. But it is exceedingly rare for me to ever have any performance problems with jq, especially since it’s essentially a one off utility I use occasionally, not as part of the hot loop of any workflow where performance matters.
Thanks for this. I've been planning a similar work for years and haven't gotten off my ass (too many other projects lol).
I definitely agree that reading from stdin is critical if I'll be able to use it. Don't take the criticism too hard though (especially the "author doesn't appreciate unix" stuff. Sometimes we can be such assholes to each other).
The incompatibility is apparently due to the fact that jq is happy with a concatenation of JSON objects and q is not. For example {"foo":1}{"foo":2} as opposed to [{"foo":1},{"foo":2}]
Yes, but the inputs to jq are not JSON, they are "a sequence of whitespace-separated JSON values which are passed through the provided filter one at a time" which is the relevant thing if you are going to try to replace jq.
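Which is easy to check (a minimal illustration):

  printf '{"foo":1}{"foo":2}' | jq '.foo'
  # 1
  # 2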
The comma operator means that the "filters" are duplicated, so instead of one JSON state you would pass two if there's one comma; and of course any number of commas is allowed.
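For instance, with a trivial input:

  echo '{"a":1,"b":2}' | jq '.a, .b'
  # 1
  # 2

Both filters receive the same input and their outputs are concatenated, so a replacement has to duplicate the current state once per comma.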
Are we sure it should get a single-letter 'q' binary name though? Docs seem to point that it's short for 'query-json'? Why not call it 'query-json' and let the user decide that as a shell alias or whatever. Even the ubiquitous 'ls' and 'cd' are two characters.
Appropriating Shepard Fairey's graphical style into a QAnon T-shirt and then wearing it to a Trump rally is an arresting masterwork of postmodernism. One wonders: is the wearer trolling the rally, or is he trolling himself? Perhaps we are all trolls. Perhaps we have all been trolled. HAND.
Given these are two of the least-used characters in English and basically never appear in this sequence, it would certainly help for all the reasons people are mentioning. It'd at least follow suit with like 'rg' for 'ripgrep'.
If so, and for anybody else having this wish, check out jql[0], I've created it exactly for this reason, to have the most common jq operations available in a more uniform and easier to use interface.
Give jet a try! Uses a lightweight query language over EDN. If you're familiar with Clojure, it'll be very natural to use and if you're not familiar with Clojure, the query language used is very easy to pickup :) https://github.com/borkdude/jet/blob/master/doc/query.md
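A minimal conversion looks something like this (flags and output format from memory, so double-check against the README):

  echo '{"a": 1, "b": [1, 2, 3]}' | jet --from json --to edn
  # {"a" 1, "b" [1 2 3]}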
xmlstarlet supports XPath 1, but the W3C did not stop there. They made XPath 2 featuring variables, lists and regular expressions, XPath 3 featuring higher-order functions, and finally XPath 3.1 featuring JSON support.
That's news to me. Very cool actually. It's not far from jq on speed, either. Looks like you are the maintainer of this tool, so thanks!
~/xidel % time ./xidel - -e '?SitusAddress!(.)' < ~/parcels | wc
**** Processing: stdin:/// ****
29066 149317 903704
./xidel - -e '?SitusAddress!(.)' < ~/parcels 2.55s user 0.18s system 99% cpu 2.733 total
wc 0.01s user 0.00s system 0% cpu 2.733 total
~/xidel % time jq -r '.SitusAddress' < ~/parcels | wc
29066 149317 903704
jq -r '.SitusAddress' < ~/parcels 0.95s user 0.00s system 99% cpu 0.958 total
wc 0.00s user 0.01s system 1% cpu 0.957 total
The oj command in https://github.com/ohler55/ojg uses JSONPath as the query and filter language. Maybe it is more in line with what you are looking for.
Hint: you can do live jq query preview for any jq-like command using fzf. It looks like this for jql, an alternative I've created (you can find it in a neighboring comment):
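The general shape relies on fzf's {q} placeholder re-running the preview command as you type; shown here with jq, since I don't want to misquote the exact jql invocation (the file name is a placeholder):

  echo '' | fzf --print-query --preview 'jq --color-output {q} data.json'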
> Aside from that, q doesn't have feature parity with jq, which is OK at this point, but jq contains a ton of functionality that query-json misses, and some of the jq operations aren't native but are built in with the runtime. In order to do a proper comparison, all of the above would need to be taken into consideration.
> The report shows that q is between 2x and 5x faster than jq in all operations tested and same speed (~1.1x) with huge files (> 100M).
While faster for some things... that's a pretty large set of caveats!
It would be good if someone added an explanation of why this new approach is faster: is it that OCaml is faster, that more efficient algorithms were used, etc.?
According to the "Purpose" section of the readme, it doesn't look like beating jq's speed was ever a goal. It was meant to be a learning exercise.
But if I had done something like that, and then serendipitously discovered that I was exceeding the original's performance, I certainly wouldn't be shy about it.
Also, this comes across as armchair criticism purely for the sake of armchair criticism. My own experience has been that, when I'm doing ETL that involves wrangling JSON, the "wrangling JSON" bit of it is almost always the bottleneck. So any improvement is more than welcome and deserves to be cheered. Even if it's an improvement on something that's already the current fastest way to do it.
Reimplementing a piece of software that is 12 years old, mimicking its UX and improving its performance and error messages, is more than welcome in my opinion. My purpose was to learn the OCaml stack for writing compilers, so I personally found that I "needed" a language that was already created.
As an outsider I get very confused by the Reason / Reason Native / OCaml / Bucklescript / Rescript?! ecosystem. What does it mean for it to be written in Reason Native/OCaml?
That means it produces a native binary (for example, a .exe file on windows platforms), so ultimately you're aiming to run the program in a terminal. This is the normal way for OCaml to operate.
In this case the author is using Reason as an alternative syntax to OCaml. Reason resembles javascript a little more, and some people find that nicer to work with. So the idea is that you write Reason code, then translate it into OCaml code using the Reason tools, and then ultimately you compile it down to a native binary.
If instead you want to write a web-app which runs in a web browser or node.js, then you'd need to compile it to Javascript, which is what bucklescript helps you do.
Where does Rescript come in? As explained above, Reason can be used for writing either native apps or javascript apps. However, it's hard to evolve the syntax of Reason in a way which satisfies both aims. So they've now split the work -- going forward, Reason will specialize on native, and Rescript will specialize on javascript apps. Their syntax is expected to diverge from each other, in order to support those aims as best as they can.
Thank you for the detailed answer! I check in on the status of the related projects from time to time and was often confused by the relationship between the components.
Right, the explanation of Reason - BuckleScript - OCaml is always nebulous.
I used Reason to compile to Native, so using OCaml's stdlib and OCaml's dependencies and compiling it with OCaml, but my source code is written in Reason syntax.
I remember one of the first times I tried installing Linux software in the wild. The bash script asked for your password, sent it to their server using curl, then returned the script with the password hard-coded into it, which ran itself with sudo, all over unencrypted HTTP. I was 17, but even then I stopped to think whether this was a good idea.
That is pretty amusing. I’ve seen some bootstrap scripts that pipe the curled output to the terminal for approval before executing it. That seems like an ergonomic alternative to curl | bash. It would be at least as useful as the terms of service warnings before you install something, anyway.
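By hand, the same idea is just splitting the fetch from the execution (the URL is a placeholder):

  curl -fsSL https://example.com/install.sh -o /tmp/install.sh
  less /tmp/install.sh   # read what you're about to run
  bash /tmp/install.sh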
I used to be a regular user of jq, but I was never parsing very large JSON. I now do what I used to do with jq in my browser's developer tools console. Map and filter are far more familiar than jq's syntax where I found myself referring to the documentation most of the time.
I'm sure other people have use cases where the browser wouldn't meet their needs, but for me, I find jq unnecessary.
When it got to the point when I needed a script, I just preferred Python. I can understand how some might prefer jq and a shell script, I just realized it wasn't worth it for my particular needs.
jq appears to have its own hand-written JSON parser and requires flex/bison. I suspect something about the hand-written parser is slow for large data sets.
I was somewhat surprised it didn't use an existing json parser library.
I'm doubly surprised that such a popular utility uses bison; generated parsers tend to be slower than handwritten parsers, and JSON isn't exactly the world's hardest language to parse.
I often have to pluck out attributes from streams of json records (1 json object per line) - often millions/billions.
jq is almost always the bottleneck in the pipeline at 100% CPU - so much so that we often add an fgrep to the left side of the pipeline to minimize the input to jq as much as possible.
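Roughly this shape, with made-up field names; the fgrep can't be exact, but it cheaply discards the vast majority of lines before jq does the real check:

  fgrep '"level":"error"' events.ndjson | jq -r 'select(.level == "error") | .id'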
It's very slow. It immediately stood out in our automated tests when we added a JSON protocol to our system and used jq to test some assertions.
Not OP, but I routinely call a specific HTTP API for millions of entities or pull down entire Kafka topics - all in JSON format. For various reasons those are the canonical sources of data and/or the most performant, so I end up ripping through GBs and GBs of JSON when troubleshooting/reporting on things.
I don't think it's quite 90GB, but I've processed Wikidata dumps in the same order of magnitude before (which are one JSON object per line) with jq, and it could've certainly been faster.
Yes. I was looking to embed it in a tool, but decided against it after looking at its implementation. It parses the expression with a stack and executes it directly, and its JSON parsing is much the same. I doubt the parsing would be close to competitive with RapidJSON, let alone simd-json. The conditionals and pointer chasing of such an implementation are stumbling blocks to performance.
The C code is clean enough as C code goes, but fairly monolithic. And it’s C, so it’s not noticeably slow until you start processing GB. But it would probably take a rewrite to improve its performance significantly.
Speed is not a concern for me. I am wondering if there is something better than `jq` in terms of syntax. Whenever I want to do something more than just prettify JSON output in the console or simply get a value by a specific field name, I have a problem; for me it is just difficult to remember jq syntax without looking into my history. I also keep links to examples like this one in my notes.
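For what it's worth, the handful of patterns I end up re-deriving every time look like this, using the fields from the parcels stream upthread (one JSON object per record):

  jq -r '.SitusAddress' < parcels                         # pull one field from every record
  jq -r 'select(.OBJECTID == 155303) | .BOOK' < parcels   # filter on one field, print another
  jq '{id: .OBJECTID, addr: .SitusAddress}' < parcels     # reshape into smaller objects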
In case anyone is interested in yet another alternative, I have this old, unpolished project: https://github.com/bauerca/jv
It is a JSON parser in C without heap allocations. The query language is piddly, but the tool can be useful for grabbing a single value from a very large JSON file. I don't have time for it, but someone could fork and make it a real deal.
I use jet all the time when I need to quickly examine a json snippet in Emacs. I would use <C-u M-|> (shell-command-on-region with a prefix) and execute jet to convert selected json part to EDN. That cuts out all the visual noise. EDN is much more concise, cleaner and easier to read. I'd use it even if I don't write Clojure.
Slightly out of context here, but I find the entire stack of bsb, bsb-native, ocaml and esy pretty cool. However, I just don't find enough resources, good tutorials, etc. via Google search. Is there a good set of beginner tutorials anyone can point to? Thanks in advance.
Documentation is a problem in the OCaml world and a problem with Reason Native as well. I found myself pretty lost at times; esy.sh should be an initial point of contact for most Reason-related stuff.
Menhir/sedlex and others present a pretty high accessibility barrier for newcomers.
One of the nice things about all of it is the Discord; it's friendly and always helpful.
Hope that helps, just let me know if there's anything specific!
This is cool, but I’m not sure it’s fair to claim it’s “faster” yet when it doesn’t do 95% of what jq does, particularly the command line options. If it’s still faster when it can match 80% of the functionality, then it might be a claim worth making.
How, though? I agree that jq's syntax isn't exactly the most straightforward, and it gets raised as a point of criticism anytime jq is mentioned, but its scripting language seems like a pretty good compromise between compactness and rich features.
Replacing that with, say, traditional command line flags would make it a lot less useful for me, I'd probably have to build much longer pipe-chains to do things that are relatively simple and readable jq snippets (if one knows the syntax.)
Using an established scripting language in its place would make it pretty much just python -c/ruby -e or whatever with some pre-loaded functions, but what's the point? You can always just write a quick python/ruby/whatever script, jq to me is an alternative for cases where a script feels unnecessary. It would also mean everything gets more verbose, so less of my jq transformations can be inlined without loss of readability.
Aligning it to more established languages would probably cause confusion as well in those cases where it doesn't match the reference language 1:1. Looks like javascript, writes like javascript, but only for a tiny subset of the language, etc.
Doing this only for a few function names or syntax constructs still results in a pretty unique and unusual language that will require people to reference the docs a lot, just now lots of existing scripts break.
Just because jq is very well established doesn't mean its APIs are well designed, or that we shouldn't improve on them because doing so will break existing scripts.
There are a lot of quirks in its usage, and people struggle to learn such a great tool, so in that area query-json will try to offer a better interface for users.
I'd love to hear some speculation - from the author or otherwise - as to why a fresh OCaml implementation would so dramatically outperform a mature C implementation.
There are a few good assumptions about why it's faster, but they are just speculations, since I didn't profile jq or query-json.
The feature that I think penalizes jq a lot is "def functions", the capacity to define any function so that it is available at run time.
This creates a few layers; one of the differences is the interpreter and the linker, which are responsible for gathering all the builtin functions and compiling them so they are ready to use at runtime.
The other pain point is the architecture of the operations on top of jq, since it's stack-based. In query-json it's a pipeline of recursive operations.
Aside from the code, the OCaml stack (Menhir) has proved to be really fast for creating these kinds of compilers.
I will dig more into performance and try to profile both tools in order to improve mine.
It's definitely popular but “only viable alternative” is a bit strong: that's only if you need compatibility with particular tools which support only one of the two formats. There's no reason why anyone who doesn't like those tools couldn't create a different syntax to scratch whatever particular itch they have.
It's embeddable and available as a library for all languages [0]. Everything else is pretty much nothing but a CLI tool, which further limits its adoption.
Well, there is XPath 3.1 if you want standards[1] but my point was simply that it depends on whether your question is “I need compatibility with existing jq scripts”, “I need an embeddable library I can integrate in other programs”, or “I want to process JSON for my own usage”.
For example, someone who works with a lot of Python might prefer something like https://github.com/kellyjonbrazil/jello to write comprehensions using the full capabilities of Python, especially since that would provide a direct path to using the final expressions in a Python program or even embedded in one of the environments where Python is used as a scripting language. Is that a viable alternative? The answer depends entirely on who's asking.
https://github.com/fiatjaf/awesome-jq
https://github.com/TomConlin/json2xpath
https://github.com/antonmedv/fx
https://github.com/fiatjaf/jiq
https://github.com/simeji/jid
https://github.com/jmespath/jp
https://github.com/cube2222/jql
https://jsonnet.org
https://github.com/borkdude/jet
https://github.com/jzelinskie/faq
https://github.com/dflemstr/rq
Personally I think that next time I might just fire up Hy and use its functional capabilities.