I'm not up to that, but I have a working S-Tree builder and a basic AVX2-based `lower-bound` for it (as described in the Algorithmica article linked in the post) up and running in SBCL. I haven't played with any of the fancier optimizations yet, much less done any benchmarking. I should put it up in a gist or on a paste site to share...
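
For anyone who hasn't seen the layout, here is a minimal scalar sketch in Python of an S-tree build and `lower-bound`, roughly following the scheme in the article. It is not the SBCL code and uses no SIMD: a plain per-node count stands in for the AVX2 comparison. The names `build_stree`, `lower_bound`, `B` and `INF` are made up for illustration.

```python
B = 16              # keys per node (assumed here; tune to the SIMD width and key size)
INF = float("inf")  # padding value for unused slots, larger than any real key


def build_stree(sorted_keys, b=B):
    """Lay out sorted keys as an implicit (b+1)-ary tree of b-key nodes."""
    n = len(sorted_keys)
    nblocks = (n + b - 1) // b
    tree = [[INF] * b for _ in range(nblocks)]
    pos = 0  # index of the next key to place, in sorted order

    def fill(k):
        nonlocal pos
        if k >= nblocks:
            return
        for i in range(b):
            fill(k * (b + 1) + i + 1)        # child i holds keys below slot i
            tree[k][i] = sorted_keys[pos] if pos < n else INF
            pos += 1
        fill(k * (b + 1) + b + 1)            # last child holds the largest keys

    fill(0)
    return tree


def lower_bound(tree, x, b=B):
    """Return the smallest key >= x, or INF if there is none."""
    k, result = 0, INF
    while k < len(tree):
        node = tree[k]
        # Rank of x within the node: number of keys strictly less than x.
        # A SIMD version would compute this with vector compares instead.
        i = sum(1 for key in node if key < x)
        if i < b:
            result = node[i]                 # best candidate seen so far
        k = k * (b + 1) + i + 1              # descend into child i
    return result
```

Calling `lower_bound(build_stree(keys), x)` on a sorted list `keys` should agree with a plain binary search; the interesting part is only the memory layout, which keeps each node's keys contiguous.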