K Fragments (beyondloom.com)
42 points by lelf on March 19, 2020 | 35 comments


If this was interesting, check out Eugene McDonnell's "1000 K idioms" (in k2): https://web.archive.org/web/20071230205056/http://kx.com/tec...


So this:

> {.+(k;({x@&~_n~/:x}x@')'k:?,/!:'x)}

Turns a list of dictionaries into a dictionary of lists. It is clearly amazingly expressive: you can do so much in just half a line of code.

However, for someone not used to APL, this is utterly and totally unreadable. I can pick pretty much any ML derivative, C-style or lisp language and make heads or tails of most code, but with the above line I don't even know where to start. Might as well be hieroglyphs :)


Sorry. This page wasn't really written as an introduction; more a collection of ideas.

To break that example down for you, the outer {} indicate that this is an anonymous function. It is implicitly a function over a single argument named "x"; our list of dictionaries. Expressions read right to left, so the first semantic unit is:

    k:?,/!:'x
The variable "k" gets the distincts (?) of the concatenate-reduction (,/) of the keys (!:) of each (') dictionary:

      !:'x
    (,`a
     ,`b
     `a `b)

      ,/!:'x
    `a `b `a `b

      ?,/!:'x
    `a `b

The other embedded lambda "{x@&~_n~/:x}" is read as "x indexed (@) by where (&) of not (~) null (_n) match (~) each-right (/:) x"; it filters nulls from a list.

      _n~/:(22;;33;55)
    0 1 0 0

      ~_n~/:(22;;33;55)
    1 0 1 1

      &~_n~/:(22;;33;55)
    0 2 3

      {x@&~_n~/:x}(22;;33;55)
    22 33 55
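In Python terms, that filter is just a comprehension (a rough equivalent, with None playing the role of _n):

```python
def drop_nulls(xs):
    # x@&~_n~/:x -- keep only the elements that don't match null
    return [x for x in xs if x is not None]

print(drop_nulls([22, None, 33, 55]))
# [22, 33, 55]
```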
Then the overall construction "{x@&~_n~/:x}x@'/:k" can be understood as "reach into each dictionary with all the keys and filter them to non-null values". Now you have a list of lists of values in the order that the keys appeared.

Finally, ".+(k; ... )" joins our keyset with the values we found for each key. Make dictionary (.) from the transpose (+) of the pair of vectors (k;v).
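If it helps, here is a loose Python rendering of what the whole expression computes (dicts standing in for K dictionaries; key order follows first appearance, as with ?,/!:'x):

```python
def merge_to_dict_of_lists(dicts):
    # k:?,/!:'x -- distinct keys, in order of first appearance
    keys = list(dict.fromkeys(k for d in dicts for k in d))
    # index every dictionary with every key, dropping the nulls
    return {k: [d[k] for d in dicts if k in d] for k in keys}

print(merge_to_dict_of_lists([{"a": 1}, {"b": 2}, {"a": 3, "b": 4}]))
# {'a': [1, 3], 'b': [2, 4]}
```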

Does that help?


At a glance, I would think someone was programming using just regular expressions. Or maybe speech-to-text translation of line noise.


It looks way too long in terms of the number of distinct tokens.

  This is the TXR Lisp interactive listener of TXR 233.
  Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
  1> (defun merge-hashes (hlist)
       (let ((hout (hash)))
        (each ((h hlist))
          (dohash (k v h) (push v [hout k])))
       hout))
  merge-hashes
  2> (merge-hashes '(#H(() (a 1) (b 2) (c 3)) #H(() (b 5) (c 7) (d 8))))
  #H(() (c (7 3)) (b (5 2)) (a (1)) (d (8)))

Now here is that function fully code golfed with unnecessary spaces removed and all symbols one character long:

  (defun m(x)(let((o(hash)))(each((h x))(dohash(k v h)(push v[o k])))o))
That's down to 70 characters: only about double the K3 size.

If defun and the other built-ins were one character long, it would be down to 50 chars:

  (d m(x)(l((o(H)))(e((h x))(D(k v h)(p v[o k])))o))
Now we have a fair comparison where we have leveled the field, eliminating the difference due to whitespace elimination and token condensation.

Though still significantly longer by raw character count (50 versus 35), there is less clutter in it. Also, it defines the name m, whereas the {.+...} syntax needs a few more characters to define a named function, I think.

But wait; that's far from the shortest code necessary. What I wrote is a decently efficient way of doing it which iterates the input hashes imperatively and builds up the output. It can be done in other ways, like this:

  1> (defun merge-hashes (hlist)
       [group-reduce (hash) car (op cons (cdr @2) @1) [mappend hash-alist hlist]])
  merge-hashes
  2> (merge-hashes '(#H(() (a 1) (b 2) (c 3)) #H(() (b 5) (c 7) (d 8))))
  #H(() (d (8)) (c (7 3)) (b (5 2)) (a (1)))
We can obtain a flat "assoc list" of all the key value pairs from all the dictionaries by mappend-ing them through hash-alist. hash-alist retrieves an assoc list from a single dictionary, and we map over that, appending these together.

Then we can group-reduce the assoc list; group-reduce populates a hash by classifying elements from an input sequence into keys, which denote individual reduce accumulators. So all the a elements are subject to their own reduce job, as are the b elements and so forth. The reduce function is (op cons (cdr @2) @1): an anonymous function that takes the cdr (value element) of each pair (that pair coming in as argument 2), and conses it onto the accumulator (coming in as argument 1), returning the new accumulator. Since nil is the empty list, and fetching a nonexistent hash key yields nil, the accumulation can bootstrap implicitly from an empty hash.
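In Python terms, the same flatten-then-group-reduce shape looks like this (a loose sketch; defaultdict playing the role of the hash of accumulators):

```python
from collections import defaultdict

def merge_hashes(hlist):
    # like [mappend hash-alist hlist]: flatten all (key, value) pairs
    pairs = [(k, v) for h in hlist for k, v in h.items()]
    # like group-reduce: each key gets its own accumulator
    out = defaultdict(list)
    for k, v in pairs:
        out[k].insert(0, v)  # cons onto the front, like (cons (cdr @2) @1)
    return dict(out)

print(merge_hashes([{"a": 1, "b": 2, "c": 3}, {"b": 5, "c": 7, "d": 8}]))
# {'a': [1], 'b': [5, 2], 'c': [7, 3], 'd': [8]}
```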

Now if we were to remove all unnecessary whitespace and give a one-character name to everything, we now get:

  (d m(h)[R(H)c(o n(r @2)@1)[M L h]])
That shows there is a potential to get this down to 35 characters, with suitable function/operator and variable names.

Moreover, these 35 characters are even easier to parse because all the parentheses/brackets are there: they comprise 14 out of the 35 characters! And we still have 5 spaces. So 19 characters out of the 35 are punctuation having to do with shaping the syntax; 16 are semantic.

If you remember that R is group-reduce, you know that its arguments are (H) c (o ...) and [M ...]. There is no guesswork about what goes with what.

The above K expression is actually dense because it doesn't use very many characters for shaping the syntax tree. Most of the characters you see do something.

Languages like K leverage their terseness not just from compression of the input notation, but from the semantics of the operations. But they do not have a monopoly in semantics. Good semantics that enables terse programming can be found in languages that don't use terse notations at the token level.

The K3 example we have here is not leveraging good semantics; if that's the best that can be done, it suggests that k doesn't have a well-rounded library of operations for concisely manipulating dictionaries. (Could it be that the insistence on one-character names creates a pressure against that?)

It's better to have thousands of functions with descriptive names, and then be able to choose the best ones for expressing a given problem tersely, than to reach for one-letter naming as the primary means for achieving terseness, and then try to pull a one-size-fits-all library within that naming convention.


I think that k3 example does illustrate weaknesses of k3. At that time, dictionary types were very limited and inexpressive. Pretty much all you could do with them was index into them and destructure them.

In k5/k6, dictionaries had a more complete algebra. If you only wanted to take the union of a set of dictionaries (more common) it's just raze:

      ,/([a:1];[b:2];[a:3;c:4])
    [a:3;b:2;c:4]
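That union behaves like a left-to-right dict merge in Python (a sketch): later values win, but keys keep their first-seen order.

```python
def raze(dicts):
    # ,/ over dictionaries: later values win, key order = first appearance
    out = {}
    for d in dicts:
        out.update(d)
    return out

print(raze([{"a": 1}, {"b": 2}, {"a": 3, "c": 4}]))
# {'a': 3, 'b': 2, 'c': 4}
```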
The full problem could be expressed as

    {k!{x@&~^x}'x@'/:k:!,/x}
Note that there's no longer any clumsy transposition to assemble the dictionary; generally much less nesting as well. The syntax for creating symbol-keyed dictionaries is also much nicer.

Something that apparently I left on my workbench for oK was the "adverb form" of drop (_), which would let you express filter more directly:

    {k!(^:)_:'x@'/:k:!,/x}


I added some arguments to TXR Lisp's hash-uni (which merges two dictionaries) to be able to do this. It already has an optional join function to resolve values with clashing keys. Now it will take two additional functions to map the left and right values.

If we have a pair of dicts with values that we'd like to list, we can do [hash-uni h1 h2 append list list]. The values are turned into lists, and clashes resolved with append.

A list of hashes could be merged into lists with left-reduce, like this.

  [reduce-left (op hash-uni @1 @2 append : list) hashes #H()]
Our initial value is an empty hash #H(). The reduce kernel function is hash-uni. The values of every hash in the list are turned into one-element lists, and then accumulated, with append, onto the lists in the hash merged so far.
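In Python terms, the accumulation amounts to something like this (a sketch; hash_uni here is a hypothetical stand-in for TXR's function, taking a clash-resolving join function plus left and right value-mapping functions):

```python
from functools import reduce

def hash_uni(h1, h2, join, map1, map2):
    # merge two dicts: map each side's values, resolve clashes with join
    out = {k: map1(v) for k, v in h1.items()}
    for k, v in h2.items():
        w = map2(v)
        out[k] = join(out[k], w) if k in out else w
    return out

def merge(hashes):
    # left value (accumulator) is already lists: identity; right: wrap in a list
    return reduce(lambda acc, h: hash_uni(acc, h, lambda a, b: a + b,
                                          lambda v: v, lambda v: [v]),
                  hashes, {})

print(merge([{"a": 1, "b": 2}, {"b": 5, "c": 7}]))
# {'a': [1], 'b': [2, 5], 'c': [7]}
```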

It's more than vaguely monadic: the two functions are like "unit" and so on.


Yes, but your comparison isn't fair. How many distinct functions does TXR Lisp have? I guarantee you that k has fewer.

Moreover, if I can do something in a couple of characters, why do I need names? This trend towards huge stdlibs and intellisense and heavyweight IDEs is ridiculous. I can write k or q with notepad if I felt like it. The ability to understand the entire language and never have to reach for documentation is very powerful.

I've noticed you criticizing k and array languages in general in some other threads so I'm going to ask you a genuine question: do you have any experience with them? Not even professional experience, just like, spent a weekend programming something cool? Because before I tried them I was very dismissive and now I absolutely adore k.


> How many distinct functions does TXR Lisp have?

A few more than it had a month ago. When you can name things in any way you want, the sky is the limit. You have to watch the image size and startup times, of course. Nowadays you can cram a lot into something that is tiny by modern standards.

> Moreover, if I can do something in a couple of characters, why do I need names?

Firstly, those characters are names.

Secondly, you need bigger names than one character if you want more functions, so then you have a better choice of functions for writing programs that are smaller in terms of the number of operations they combine.

Note also that my second example:

     (defun merge-hashes (hlist)
       [group-reduce (hash) car (op cons (cdr @2) @1) [mappend hash-alist hlist]])
doesn't introduce any names, other than the one it is defining, merge-hashes and the local variable/parameter hlist.

Everything else is a built-in, resulting in "point-free" style.

If by "why do I need names" we mean locally introduced names for intermediate values, I completely agree. I don't need to introduce names if I can achieve the calculation just by a functional combination of operations.


I mean, I'm just saying that most functions are unnecessary. We don't need them, and they actually make things worse. For example, I just counted, and q has a total of 209 distinct functions in the language (and that's pretty large for some apl/k purists). And it's not like the language is lacking functionality, additional functions are simply not necessary. Compare that to really any other non-trivial language, and the difference is pretty stark. Like, look at Common Lisp, for example.


For those of you fluent in K, when you glance at these, is their function apparent, or do you have to sit down and diagram them out?


Varies. I think among these examples "is interleave?" and "limit run length" require some thinking, or a bit of playing in the REPL. Most of them are pretty easy to follow, though.


Can someone tell me what K is about? Looks interesting, is new to me.

I have been thinking about my use of Perl and shell scripts (I used to live in Bash); I am learning PowerShell this week.

My solutions tend to converge towards heavily-commented one-liners.

Then I read Wolfram's 12.1 release note; Mathematica started as a Lisp for maths. Stephen Wolfram writes these exasperating things of polished beauty that seem incomprehensible without pages of notes.


I would recommend getting started with q, instead of k. q is a lot easier to get started with and after you learn the basics (takes maybe a day or two), learning k is much easier. A good resource is: https://code.kx.com/q4m3/

k/q aren't actually very difficult to learn, you can memorize the entire language in a period of a day, maybe. The hard part is actually writing good code with it. I could spend the rest of my life programming in k and never be 1/10th as good as Arthur Whitney (the creator).


K is a vector language. There is indeed some overlap in ideas with Mathematica and Lisp, but it draws most heavily from APL. There are a handful of decent open-source or at least freely available interpreters online:

https://ngn.bitbucket.io/k.html


It's so weird having one k post a day, like clockwork.


I think people are getting excited by Shakti (or maybe it's coordinated?). I'm hoping array programming is going to break into the mainstream, it's remarkable that even many experienced programmers know nothing about array languages. And when exposed, most just dismiss them out of hand as 'unreadable' or 'line-noise.'


Array programming is already mainstream -- matlab, python/numpy, R are used daily. The first time one does it, it is very enlightening: "what do you mean I can operate on millions of entries without loops?"
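For anyone who hasn't seen that style, a minimal numpy sketch of "no explicit loops":

```python
import numpy as np

# one vectorized expression over a million entries, evaluated elementwise
x = np.arange(1_000_000)
y = 3 * x * x + 2 * x + 1
print(y[:3])  # y[:3] -> [1, 6, 17]
```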

The only thing which is unclear is why would one want to use unreadable line-noise languages for that.


> The only thing which is unclear is why would one want to use unreadable line-noise languages for that.

I find this attitude disrespectful and not very productive.

In all these threads, many of us try to explain why we find this way of writing software convenient, readable, and powerful. Then people who do not even consider giving it a try make comments like yours. This does not create a good atmosphere.

If you are not interested, move on to another story. If you really want to understand it, you will need to invest some time, and we will be glad to answer any question.


I like array programming, and I think that it is an extremely nice and convenient tool for the right task. The grandparent is right: "it's remarkable that even many experienced programmers know nothing about array languages". I wish more people learned about them!

And that is precisely why I find statements like "I'm hoping array programming is going to break into the mainstream" very annoying: it is not the 1970s anymore; you don't get to pretend your only options are APL or writing matrix multiplication by hand every time. There are entire university departments full of people who only know matlab. You cannot just pretend they don't exist.

I wish there was a modern tutorial that compares K to something like numpy or matlab.


> I wish there was a modern tutorial that compares K to something like numpy or matlab.

That's a very reasonable request. In my opinion, one of the problems with array languages is the lack of introductory material.

If you know a good tutorial that includes numpy and Matlab, I may try to add k and/or j. Any suggestion?


They are clustered probably because HN readers see one story, and then find and submit similar content that they encounter.

The same thing happens with gwern or rachelbythebay blog posts. If you see one, you will see more in the next week or so. But then it dies off.


Off topic, but does anyone know if Shakti is going to have a q like interface? I was under the impression that Whitney decided that having operators that can be both monadic and dyadic was a bad idea?


What is Shakti?


https://shakti.com/ It's the next evolution of kdb+/q.


You can get the binary releases here, https://anaconda.org/shaktidb/shakti/files


What limitations are there with those releases?


It's highly restrictive. If you like array languages, you should use J. It is a superior general-purpose language with GPL licensing.

https://www.jsoftware.com

vs.

https://shakti.com/license.php

Subject to Customer’s ongoing compliance with this Agreement, Shakti hereby grants to Customer a non-exclusive, non-commercial, non-transferable, non-sublicensable, revocable, limited license for a period of 30 days (the "Evaluation Period") to download, install and run the Software for Customer’s evaluation purposes and for internal, non-commercial use only.


Eh. The performance and the notation is important to me, and I haven't seen a lot of benchmarks (or any...) indicating that J is particularly fast. And from what I understand, even Iverson eventually decided that J was a step backward from his notation-first APL.

But mostly what I'm wondering is if those downloads are 32bit-only, or time-limited in the software, or memory limited, or if it's a winzip-esque honor system limitation. Not as interested in the license.


I don't know about shakti, but kdb+/q has two different free versions. The 32-bit version is unlimited and you can use it for commercial stuff. The only restriction is that you can't run it on the cloud (whatever that means). The 64bit version is non-commercial and you must have an active internet connection to use it. In practice, I haven't found the 32bit version to be very limited, since you can still do IPC very easily.


>>The 32-bit version is unlimited and you can use it for commercial stuff. The only restriction is that you can't run it on the cloud (whatever that means).

The software license agreement for the 32-bit version specifies non-commercial use, and also still has the software time-out clause where Kx "reserves the right to insert a time-out function into the 32 Bit Kdb+ Software that will cause it to halt after a specific period of time (e.g. two hours), requiring the software to be re-started."[1]

[1] https://kx.com/download/


You won't see any real performance comparisons because any benchmark related to kdb+ is completely restricted by its company. J without the database (Jd) is dual-licensed with GPL as an option, so you can compare the base language yourself for your application. But Jd is free for non-commercial use only.

Those shakti downloads seem to be fully functional.


Shakti has an arbitrary memory limit (like 4 GB or so).


Any idea if there's documentation anywhere?




