CDuce: XML-oriented functional language (cduce.org)
81 points by mucholove on Nov 24, 2019 | 23 comments



I've actually used CDuce in production. It was certainly interesting, a kind of "XSLT/XPath done right". However it had major drawbacks such as being very difficult to deploy and not having any easy way to integrate with existing OCaml code (ocamlcduce was a camlp4 extension which tried to integrate the two but was tricky to get going). For example you can really easily and powerfully parse XML documents, but you can't save the result in a database or even to a flat file (even printing the result is hard).

I should say that it's no longer developed and no longer works with any currently available versions of OCaml.

Nowadays I use libxml2 from OCaml instead and just insert untyped XPath expression strings into my code, so that's a step backwards from the point of view of safety, but at least it works.
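
To make the safety trade-off above concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. It is not the OCaml/libxml2 code the comment refers to, and the document and element names are invented for illustration. The point is only that a path expression is an ordinary string, so a typo is not reported by anything; the query just matches nothing.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical document; the element names are made up for this example.
DOC = """
<library>
  <book><title>Real World OCaml</title></book>
  <book><title>Types and Programming Languages</title></book>
</library>
"""

root = ET.fromstring(DOC)


def titles(path: str) -> List[str]:
    """Run an untyped path expression and return the text of each match."""
    return [node.text or "" for node in root.findall(path)]


good = titles("./book/title")
# Typo in the path: "titel" instead of "title". No error is raised;
# the query silently returns an empty list.
bad = titles("./book/titel")

print(good)
print(bad)
```

A typed approach in the spirit of CDuce would aim to reject the second query at compile time, given a schema for the input.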

https://web.archive.org/web/20101125015934/http://merjis.com... http://ocsoap.forge.ocamlcore.org/


Lately I've been having some luck with using parser combinators to parse XML in OCaml. I just define a lazy list type that takes input from Xmlm's streaming decoder and a few parsers for xml elements (`data : string parser` and `el : string -> 'a Attr.parser -> 'b parser -> ('a * 'b) parser`), plus all the usual combinators. It's probably not as fast as using libxml2 but at least it's type safe.

EDIT: The Attr.parser above uses the same combinators but it operates on a simple list of Xmlm attributes using a function that takes an attribute name and returns the value of that attribute, if found, and the list without that attribute.
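
The shape of the combinators described above can be sketched as follows. This is Python rather than OCaml, so it carries none of the static guarantees the comment is about, and it does not use Xmlm: the event tuples, `ParseError`, `attr`, `no_attrs` and `many` are all hypothetical names chosen for this illustration. A parser is a function from an event list and a position to a value and a new position; `el` mirrors the `el` signature quoted in the comment by returning a pair of (attribute result, body result).

```python
class ParseError(Exception):
    """Raised when a parser does not match at the given position."""


# Events loosely imitate a streaming decoder's signals (an assumption, not
# Xmlm's actual API): ("start", name, attrs), ("data", text), ("end",).

def data(events, pos):
    """Parse one text node and return (text, next position)."""
    if pos < len(events) and events[pos][0] == "data":
        return events[pos][1], pos + 1
    raise ParseError(f"expected text at event {pos}")


def attr(name):
    """Attribute parser: return the value of `name` and the remaining attributes."""
    def run(attrs):
        if name not in attrs:
            raise ParseError(f"missing attribute {name!r}")
        rest = {k: v for k, v in attrs.items() if k != name}
        return attrs[name], rest
    return run


def no_attrs(attrs):
    """Attribute parser that extracts nothing."""
    return None, attrs


def el(name, attr_parser, body):
    """Parse <name ...>body</name>, returning ((attr result, body result), next position)."""
    def run(events, pos):
        if pos >= len(events) or events[pos][:2] != ("start", name):
            raise ParseError(f"expected <{name}> at event {pos}")
        a, _remaining = attr_parser(events[pos][2])
        b, after_body = body(events, pos + 1)
        if after_body >= len(events) or events[after_body][0] != "end":
            raise ParseError(f"expected </{name}> at event {after_body}")
        return (a, b), after_body + 1
    return run


def many(p):
    """Apply `p` zero or more times and collect the results in a list."""
    def run(events, pos):
        out = []
        while True:
            try:
                value, pos = p(events, pos)
            except ParseError:
                return out, pos
            out.append(value)
    return run


# Events for: <links><a href="x">one</a><a href="y">two</a></links>
events = [
    ("start", "links", {}),
    ("start", "a", {"href": "x"}), ("data", "one"), ("end",),
    ("start", "a", {"href": "y"}), ("data", "two"), ("end",),
    ("end",),
]

link = el("a", attr("href"), data)
links = el("links", no_attrs, many(link))
result, end = links(events, 0)
print(result)
```

In OCaml the same structure gives each parser a static result type, which is where the type safety mentioned above comes from.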


Cool! What was the motivation behind CDuce? I’m super interested in exploring the language because all the new things look like old things. For instance—SwiftUI looks like XML and therefore why couldn’t it have been done with Nibs? XML seems like a treasure trove.


CDuce was the result of my PhD thesis (about 20 years ago); mostly just a research prototype with enough engineering effort to make it usable for small enough projects. It came after XDuce, which introduced the idea of building a functional language around regular expression types (as used in XML schema languages such as DTD, XSD, RELAX NG). My work focused on distilling the theory from XDuce into more primitive constructs from type theory (products, unions, recursion), and embedding them into a more expressive type system and language (with set-theoretic intersection and negation, function types, extensible records -- used to model XML attributes, etc), also with a powerful XML pattern matching engine and an efficient implementation of type-checking (just deciding subtyping is in theory exponential in the size of the schema, but works well in practice). The theory could probably serve as the basis of statically-typed languages working on, say, "typed" JSON structures. The work was/is continued by my PhD advisor and other colleagues to include parametric polymorphism (original CDuce supported ad hoc overloading polymorphism only).

The idea was just that if your language could directly express constraints on your document types in its native type system, the compiler could directly type-check statically complex transformations and make sure they produce documents from the expected output schema (assuming the input complies with the announced input schema). This is more direct than having to rely on mappings between XML and "native" data types, which (usually) don't fully preserve constraints imposed by XML schema languages, and are themselves tedious and fragile to write. This works well for XML->XML transformations. Of course, in most applications, XML parsing and/or generation is just a tiny part, which shouldn't affect the choice of an implementation language. With OCamlDuce, I explored the idea of extending OCaml to include XML types. The combination felt a bit ad hoc, but was ok. Today, it could indeed be rebuilt around PPX extension points + some type-checking hooks in the OCaml compiler.
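
As a small illustration of the point about mappings to "native" data types losing schema constraints, here is a sketch in Python. The schema (a `book` is a `title` followed by one or more `author` elements), the `Book` class and the helper names are all assumptions made up for this example. The native type can only say "a list of authors", so the "one or more" constraint has to be re-checked at runtime on the generated XML.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Book:
    title: str
    # The hypothetical schema says "one or more authors";
    # the native type can only say "any number of authors".
    authors: List[str]


def book_to_xml(book: Book) -> ET.Element:
    """Serialize a Book to <book><title/><author/>...</book>."""
    node = ET.Element("book")
    ET.SubElement(node, "title").text = book.title
    for name in book.authors:
        ET.SubElement(node, "author").text = name
    return node


def conforms(node: ET.Element) -> bool:
    """Runtime check for the assumed schema: book = title, author+."""
    tags = [child.tag for child in node]
    return (
        len(tags) >= 2
        and tags[0] == "title"
        and all(tag == "author" for tag in tags[1:])
    )


valid = book_to_xml(Book("TAPL", ["Pierce"]))
# Accepted by the native type, but violates the "author+" constraint.
invalid = book_to_xml(Book("Anonymous", []))

print(conforms(valid), conforms(invalid))
```

With schema constraints expressed directly in the type system, as described above, the second construction would be a compile-time error rather than something a runtime validator has to catch.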


OCaml and Haskell people are used to everything being strongly typed. So they dislike it when the languages we use to query things (XPath, SQL, etc) are basically string blobs embedded untyped in their programs. Hence the search for typed approaches, especially ones which safely cross between the programming language and the DSL. (For PostgreSQL I wrote a type safe OCaml extension called PG.OCaml which type-checks the SQL and integrates it into OCaml code). I'm not familiar enough with SwiftUI to talk about it.


I actually spent the last week looking for something like XPath done right, because I'm doing a lot of XML-based tree processing. I had resigned myself to implementing something on top of the generative XPath VM, but CDuce looks amazing!

How difficult would you say it would be to bring it up to date with the current OCaml? How did you solve the output problem in production?


I did not try it myself, but here are a few ideas: Scala has good support for pattern matching, and it has dedicated syntax for XML literals. It might allow for very concise code for XML processing.


I'm also exploring that avenue, but with Haxe, which also supports XML literals, macros and pattern matching, but can target any language.


Do you mean xslt? Xpath always looked like the right-est xml technology to me, but xslt like an interesting idea hampered by a horrible implementation language.


Well I guess I'm just hooked on jq. Functional pipelines with named blocks you can recall. Here's a recent script I wrote: https://gist.github.com/turbo/36e87947a56cfaacec9d0356b3e521...

I'm mad that I don't have that for XML. I do a lot of tree and graph processing in Postgres, and XPath lends itself well to that. However, debugging sucks! Also, since you can't store results along the line (like in jq), querying order-dependent pairs needs a union. Not so in XPath 2, but PG dropped work on XML right when JSON was ramping up. I'm still mad! There was a discussion whether to integrate Zorba into Postgres (which speaks JQ and XPath, using the same type system and query engine) but that never got any traction. So XPath was bastardized into its "modern" variants, but something (subjectively) intuitive like jq never developed for it. This is where I see the need for CDuce.


> Well I guess I'm just hooked on jq. Functional pipelines with named blocks you can recall. Here's a recent script I wrote: https://gist.github.com/turbo/36e87947a56cfaacec9d0356b3e521...

> I'm mad that I don't have that for XML.

Really not fond of that (past the first 3 lines I'd rather break out a proper scripting language), but it feels closer to xquery.


ocamlcduce (CDuce + camlp4 syntax extension) is probably the thing to concentrate on / rescue because as I mention in the comment above that's the only really practical way to use cduce.

Unfortunately OCaml itself has moved beyond camlp4 extensions, so you'd have to port the whole thing to use the new OCaml PPX extension points. That in itself is a significant chunk of work. (You could also stick with camlp5 which is a fork of the old camlp4, but incompatible in other ways, so still a lot of work). Then you'd have to keep maintaining it. So a large amount of work, and a lot of ongoing work too.

At the end of the day, too, CDuce is not a standard way to manipulate XML, and XPath and XSLT are. I think the work would be better spent on fixing up one of the two existing type safe XPath implementations and going from there. See: http://alan.petitepomme.net/cwn/2018.06.12.html#1


EDIT: Sorry, I overlooked that you are looking for something to plug into OCaml. I'll leave this comment here nevertheless, since I put so much work into it ;-)

> I actually spent the last week looking for something like XPath done right

Have you tried XQuery? XQuery is a superset of XPath, currently (like XPath as well) at version 3.1. There are three major processors:

- eXist-db (much more than just an XQuery processor, but I believe, it can be used stand-alone)

- Saxon (only the commercial PE/EE editions implement the current standards; Saxon-HE, however, should fully support XQuery 1.0, which is based on XPath 2)

- BaseX (my favorite). It can be used stand-alone, in REPL mode, behind a web server (Jetty, Tomcat), or through a very easy-to-use, simple but powerful IDE, and it is extremely well done. Simple install.

    declare function local:make-dynamic-element-node(
      $name as xs:QName,
      $attributes as map(*),
      $contents as item()*) 
    as element(Q{http://example.org/ns/joecool}joecool) 
    {
      element { $name } {
        map:for-each($attributes, function($key, $value) {
          attribute { $key } { $value }
        })
      , $contents
      }
    };
    
    declare function local:make-literal-element-node(
      $foo as xs:string?,
      $bar as xs:string?
    )
    as element() 
    {
      <joecool xmlns="http://example.org/ns/joecool" foo="{$foo}" moo="{$bar}">lorem ipsum</joecool>
    };
    
    
    let $at := map{
      "foo": "bar"
    , "moo": "test"
    }
    return (
             local:make-dynamic-element-node(QName("http://example.org/ns/joecool", "joecool"), $at, "lorem ipsum")
           , local:make-literal-element-node("value1","value2")
           )
    
    Result:
    
    <joecool xmlns="http://example.org/ns/joecool" foo="bar" moo="test">lorem ipsum</joecool>
    <joecool xmlns="http://example.org/ns/joecool" foo="value1" moo="value2">lorem ipsum</joecool>

Of course, this is just a small glimpse of this language, which can be typed or not, just as you prefer (I prototype untyped, then increase typing incrementally and release fully typed). It is a functional application programming language that does great as a backend; REST is supported, and so much more. For a glimpse of the built-in functions of BaseX (these are in addition to the XPath 3.1 function catalog, which now has approx. 200 functions), see http://docs.basex.org/wiki/Module_Library


I sometimes use BaseX as an XSLT/XQuery processor from Python. It is possible to do this with Saxon, but it is a little easier to use BaseX. Maybe someday we'll get proper Python bindings for Saxon-C, but last I looked, it hadn't been done.


Oleg Kiselyov's ssax for scheme is very neat. I have been using it for some time as a more easily programmable XSLT. I heard a rumour that there is a Haskell port as well.

Anyway, here is the web page: http://ssax.sourceforge.net/


A bit off-topic :

I recently started wondering what the preferred language would be for coding a general "web service plumbing" tool: quickly generate a client to access a web API, and easily provide data mapping and data "passing" from endpoint to endpoint.

I started investigating languages that let you create DSLs, but I'm wondering whether this field has already been explored, and whether tools have been created for that purpose.


This is the JSX before JSX was a thing.



You must be thinking of Coldfusion.


That name may have somewhat unfortunate connotations in Italy…


Not sure about intent, but I assumed it was at least partly named in analogy to XDuce: http://xduce.sourceforge.net/


It doesn't sound all that great in English either...


Out of curiosity I clicked on a page (the scars of XSLT and XPath are almost healed), but the complete lack of any kind of syntax highlighting in the code examples (even books 20 years ago had bold/italic/etc) immediately made me close it.



