The multicore team has done an excellent job of communicating their plans and progress over the past few years. Innovative research, solid engineering, and a top-notch communications game. Whether or not you care about OCaml, many other SE projects could learn a lot from studying this team.
My wish for OCaml is that it could somehow be popular enough to have more… casual users, let's say. I enjoy playing with it more than any other language I've run into, but everyone else falls so deeply into it that I don't generally follow their discussions very well. (Maybe I'll fall in deep myself in time, but it remains to be seen.)
I don't know if you've seen this language yet (and I haven't used it myself), but Grain seems to take a lot of inspiration from Reason. It seems like they're trying to target a more casual audience.
With all the problems arising from leaky abstractions and trying to adapt to a foreign runtime? Imho there's no point. Especially now that OCaml Multicore is on the way. A much better effort, for anyone interested, would be to vastly simplify the OCaml build and package management story to be more Go-like.
> A much better effort, for anyone interested, would be to vastly simplify the OCaml build and package management story to be more Go-like.
OCaml's package management and build system is not complicated for consumers and builds are very fast. What do you feel is complicated about opam and dune?
What is complicated (but got better) is submitting packages into the official package repository. However, this ensures that packages have correctly versioned dependencies, which is good for consumers.
Go does not rely on a central package repository and this makes it easier to use by essentially just pointing to GitHub. OCaml and Go differ in the way they try to use updated dependencies for a build. OCaml by default is aggressive whereas Go prefers stability.
> With all the problems arising from leaky abstractions and trying to adapt to a foreign runtime?
Huh? The point is Go might be a lousy language, but I can't think of anything OCaml needs to do that the Go runtime cannot do. This wouldn't be leaky and the FFI could be quite good.
> Especially now that OCaml Multicore is on the way.
As I wrote in the other reply, this would be a naked ploy for new users, and a way of flexing that the language isn't wedded to a single runtime.
Multicore OCaml isn't just a runtime change; it's also a demonstration of the new effects system, showing that the language can do parallelism without bad concurrency problems. All that language-side work would carry over to the other runtime: you'd get to show off goroutines done more safely!
Does Go's runtime do generational collection? No. Does it support tail calls? I have no idea, but I doubt it. Without these there can't be much chance for OCaml-on-Go, and more blockers would probably come to mind quite quickly.
The GC would be shitty, yes. I doubt the runtime itself really cares about tail calls. There could be minor calling-convention issues, but you can share a runtime while using a different calling convention, I assume.
People forget that the call stack data structure naturally allows tail calls, and restrictions against it invariably have a certain degree of artificiality.
Not sure about now with multicore, but I'm pretty sure OCaml previously had a "bump" memory allocator. These are only a little slower than stack allocation, since they just involve a pointer increment, and that matters in functional languages since they allocate a LOT. Go's memory allocator is pretty fast (I played with making a bump allocator for it and the bump allocator was only 4-5x faster, so Go's is already quick), but the slowdown would be noticeable. Also, Go uses a "mark and sweep" style collector (which is why it can't use a bump allocator: those require moving objects on collection, which Go can't do), and that style just isn't designed for the amount of garbage FP languages generate. It might work, but it would be quite a bit slower.
Go uses a simple non-generational collector because it avoids generating garbage on the heap by doing escape analysis and allocating on the stack. I get that OCaml does a lot of recursion, being functional. But could stack allocation also work for OCaml when using growable stacks?
c-cube mentioned a couple, here's another: algebraic effects, or to put it another way, 'resumable exceptions'. Go doesn't even have exceptions natively. The level of indirection this alone would introduce to handle it, would make it an extremely leaky abstraction.
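To make the "resumable exceptions" framing concrete, here's a minimal sketch using OCaml 5's Effect module (handlers have no dedicated syntax yet, hence the record of functions; the `Ask` effect is just an illustrative name):

```ocaml
(* A computation performs an effect; the handler resumes it with a value,
   continuing exactly where `perform` left off. *)
open Effect
open Effect.Deep

type _ Effect.t += Ask : int Effect.t

let comp () = perform Ask + perform Ask

let () =
  let result =
    match_with comp ()
      { retc = (fun v -> v);
        exnc = raise;
        effc = (fun (type a) (eff : a Effect.t) ->
          match eff with
          | Ask -> Some (fun (k : (a, _) continuation) ->
              continue k 21)   (* resume the suspended computation *)
          | _ -> None) }
  in
  assert (result = 42)
```

Unlike an exception, `continue k` jumps back into the interrupted computation, which is exactly the capability Go's runtime has no native counterpart for.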
Beware that in your naked grab for users, you don't disappoint them and end up actually driving them away.
This is a very bad idea considering the Go runtime is optimized for far fewer allocations, due to 1) how good Go is at stack allocation and 2) how mutable everything is.
OCaml on the other hand has an extremely well optimized runtime for lots of small heap allocations (typical in FP), and produces good asm code for functional patterns.
In that case interoperability with Go libraries might be better: just running on the same runtime wouldn't bring many advantages (other than having a single GC instead of 2) without being able to easily call libraries written in another language.
(Although TBH I'd run them as 2 separate processes and have them communicate via some form of RPC, OCaml would just need good integration with <insert-favourite-GO-rpc> protocol)
Agreed. If OCaml added 10 more users, it would more than double its user base. Also, it's never too late to remind everyone again that Jane Street uses OCaml. If Go were any good, Jane Street would have used that.
With F# or TypeScript or similar things you can compile down to a common bytecode or to the base language and use the result. Unless you were writing an OCaml to Go transpiler you'd need to ship a copy of the Go source code alongside a new OCaml compiler and still have to build a new build system and ecosystem and such. You couldn't just bolt this on to an existing Go installation or something.
Is the previous plan for adding effects in OCaml 5.1 still in place?
I am very excited not only for the native parallelism that's coming in 5.0 but also about the effect handlers! I am sure many others are looking to start creating very interesting things with them, e.g. Rust-alike async capabilities or Erlang's preemptive green threads / actors runtime.
Oh. I haven't followed lately. Thanks a lot for the updated info!
Really looking forward to what the community will build with OCaml 5.00! IMO it will shoot the language straight into the mainstream and I can't wait. OCaml is applicable for like 90% of what's out there, including as a Python replacement. And its syntax is often much terser than Rust's (although the higher-level typing constructs can be confusing to read).
After that, all that's left is a tool like Elixir's mix or Rust's cargo and the language is basically not only in the 21st century but much farther than many others! Looking forward to it.
This so much. Mix is so pleasant to work with, but getting an OCaml project off the ground requires messing with dune + esy if you want consistent and isolated package management. It's a massive pain.
Yes, but note the absence of syntactic support. Effects will be usable for writing libraries and experimentation, but routine use won't be very ergonomic until a later release.
You may find the “nnpchecker” configure option in 4.13.0 useful. This will print a detected use of naked pointers to stderr, which can hopefully be triggered by test suites.
Noalloc and registering roots should all work the same, as do callbacks (for sequential code, which is what all existing code will be).
To echo what avsm said, our intention is to preserve the C API. With the exception of naked pointers, if you follow the rules around the existing C API then your extensions should continue to work in sequential code running on 5.0.
We have a scheduled build and test of every package in opam with multicore: http://check.ocamllabs.io:8082/ to try to shake out C API incompatibilities and that's proved fruitful.
If you do find things that don't work on 5.0, please let us know (and if you can, get it in to opam so we test it automatically!).
I think the challenge will be how to convert existing C FFI/stubs to work correctly with code that wants to take advantage of multicore(domain) capabilities.
i.e. previously some code could in theory have assumed it only ever gets called by one thread at a time due to the global lock (and that lock would get dropped in a controlled manner by the usual release/acquire). I don't necessarily mean just global state in the C stubs (I don't recall seeing many of those), but implicit global/shared state in the C functions being called.
Reading the man pages will usually tell you whether a given function is safe to call (and when it isn't, there are sometimes _r variants that are).
This is no different from writing regular multithreaded C code: one has to be more careful about which functions can safely be called. (And in fact current OCaml code can already be multithreaded, so if a C stub was meant to work correctly with multithreaded OCaml code with C->OCaml callbacks, it should already have taken the necessary precautions.)
I like the approach taken wrt backward compatibility of sequential code, though: only code that wants to take advantage of multicore will need to audit its C stubs for safety; if you don't use multicore, everything will work as before.
How do you recommend to find these kinds of multicore-specific bugs in existing (or newly written) OCaml-C stubs?
Would tools like parafuzz help here?
I've searched a bit for this and haven't been able to find a good answer. Let's say that I have m tasks that I want to run concurrently on n cores, for example handling HTTP requests or searching for something in files. I'd also like to not have to manage manually how the tasks are going to be distributed. Basically, have the same experience as when writing Go code. Is there a way to do this in multicore OCaml? I've found the task pool in domainslib [1] but I'm not sure if it's what I'm looking for.
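For reference, a rough sketch of what that looks like with domainslib's task pool. The exact names have shifted between releases (e.g. `num_additional_domains` vs `num_domains`, and `Task.run` is a later addition), so treat this as approximate:

```ocaml
(* Hedged sketch: distribute m independent tasks over a pool of domains
   using Domainslib.Task; API names approximate, varying by version. *)
let () =
  let open Domainslib in
  let pool = Task.setup_pool ~num_additional_domains:3 () in
  let results =
    Task.run pool (fun () ->
        (* spawn 100 tasks; the pool schedules them across domains *)
        let tasks = List.init 100 (fun i -> Task.async pool (fun () -> i * i)) in
        List.map (Task.await pool) tasks)
  in
  Task.teardown_pool pool;
  assert (List.nth results 10 = 100)
```

That pool-based scheduling is the closest analogue to "just spawn goroutines and let the runtime balance them".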
You could use eio with an event loop per domain and the domain manager to distribute work to other domains. The restriction at the moment is that the tasks you spin off to other domains can't do asynchronous io.
There is work on-going at the moment to bridge or even unify eio (concurrency via effects) and domainslib (nested parallelism via domains and effects) but it's a few months out.
- domainslib schedules all tasks across all cores (like Go).
- eio keeps tasks on the same core (and you can use a shared job queue to distribute work between cores if you want).
Eio can certainly do async IO on multiple cores.
Moving tasks freely between cores has some major downsides - for example every time a Go program wants to access a shared value, it needs to take a mutex (and be careful to avoid deadlocks). Such races can be very hard to debug.
I suspect that the extra reliability is often worth the cost of sometimes having unbalanced workloads between cores. We're still investigating how big this effect is. When I worked at Docker, we spent a lot of time dealing with races in the Go code, even though almost nothing in Docker is CPU intensive!
For a group of tasks on a single core, you can be sure that e.g. incrementing a counter or scanning an array is an atomic operation. Only blocking operations (such as reading from a file or socket) can allow something else to run. And eio schedules tasks deterministically, so if you test with (deterministic) mocks then the trace of your tests is deterministic too. Eio's own unit-tests are mostly expect-based tests where the test just checks that the trace output matches the recorded trace, for example.
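A tiny sketch of that single-core guarantee (eio's API is approximated here from memory): two fibers can update a plain ref with no lock, because a fiber only yields at suspension points, never in the middle of these loops:

```ocaml
(* Hedged sketch: cooperative fibers on one core need no mutex for this,
   since neither loop contains a suspension point. Eio API approximate. *)
let () =
  Eio_main.run @@ fun _env ->
  let counter = ref 0 in
  Eio.Fiber.both
    (fun () -> for _ = 1 to 1_000 do incr counter done)
    (fun () -> for _ = 1 to 1_000 do incr counter done);
  assert (!counter = 2_000)
```

The same code with preemptively scheduled threads or multiple domains would be a data race.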
Thank you for the clarification. So if I understood correctly, distributing independent jobs that use async IO between cores is okay? For example, I recently wrote a program that reads all the files in a folder, calculates a hash of their contents, and then renames them as their hash. I did this in Go. Since all operations are "perfectly independent", I only had to do synchronization for whole-program stuff: make sure the program doesn't exit while goroutines are sleeping, and use channels to avoid exhausting file descriptors. From what I understand, I can launch every open-hash-rename operation in a separate goroutine, and Go will take care of making everything async and multicore at the same time.
Now let's imagine I want to do the same program in OCaml. I think my options are:
- on current OCaml, thread-based concurrency but no parallelism
- on current OCaml, monadic concurrency (Lwt, Async) but no parallelism
- on multicore OCaml, direct/effect-based (I'm not sure of the right word) concurrency with eio, which is deterministic. If I want parallelism here, I have to explicitly create and use a shared job queue, while the Go runtime does this implicitly. Since the standard library Queue is not thread-safe, I would have to use a Mutex to avoid concurrent access.
Is this correct? I've read the eio documentation but it's hard to wrap my head around all of that without examples. I've found the Domain_manager which looks like what I want. For example, I could have the main thread fill the queue and for each core available, I could launch a Domain_manager.run toto, with toto taking jobs from the queue that would be shared between all domains?
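To make the explicit shared-queue option concrete, here's a rough sketch using only the OCaml 5 stdlib (Domain, Mutex, Queue); `process` is a hypothetical stand-in for the hash-and-rename work:

```ocaml
(* Hedged sketch: a mutex-protected job queue drained by several domains. *)
let jobs : string Queue.t = Queue.create ()
let lock = Mutex.create ()

(* Pop one job under the lock; None means the queue is drained. *)
let next_job () =
  Mutex.lock lock;
  let j = Queue.take_opt jobs in
  Mutex.unlock lock;
  j

(* Stand-in for the real open-hash-rename work. *)
let process path =
  if not (Sys.is_directory path) then ignore (Digest.file path)

let rec worker () =
  match next_job () with
  | None -> ()
  | Some path -> process path; worker ()

let () =
  (* Fill the queue before spawning, so no synchronization is needed there. *)
  Array.iter (fun f -> Queue.add f jobs) (Sys.readdir ".");
  let domains = List.init 3 (fun _ -> Domain.spawn worker) in
  List.iter Domain.join domains
```

This is roughly what Go's scheduler does for you implicitly; here the queue and the worker loop are spelled out by hand.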
I continue to be rather impressed by raku's 'Atomic Int' type plus the atomic operators (and while normally I dislike emoji in identifiers, them using the atom symbol to make the atomic operators stand out when debugging actually seems rather neat in this scenario).
I think libraries like Lwt will be the ones to offer the level of abstraction which you're describing. I too would like to see the simplicity of goroutines make it into OCaml 5 ASAP.
This is only glancingly related to the topic of multicore but I figure I’ll venture the question anyways:
There were three big reasons pushing me towards other languages (C++, Rust) for a certain class of throughput-focused workflows:
1. Lack of multicore execution
2. Lack of control over how many bytes a datatype is (useful for many things, like ensuring something fits in a machine word or can avoid chasing a pointer in a loop)
3. Difficulty controlling where memory is allocated/copied vs moved (though some of this is less relevant when everything uses reference semantics and mutability is tracked with ref/mutable)
I’ll be glad to see (1) fading into history! Do you have any personal tips/anecdotes for memory optimizations in practice, or any suggestions for something to read/follow/search?
A bunch of questions: Is multi-threading happening at the level of user-defined functions? How are threads scheduled? What's the underlying method or library for enabling multi-threading (Cilk, openMP, some LWT library...)? To what extent has this level of granularity been tested against other levels of nested parallelism (e.g. SIMD or otherwise parallel operators)? Have you tested performance by OS, and if so, have you noted any necessary OS-level modifications related to thread management? Is this part of a broader roadmap for accelerator integration?
1. Domains are the unit of parallelism. A domain is essentially an OS thread with a bunch of extra runtime book-keeping data. You can use Domain.spawn (https://github.com/ocaml-multicore/ocaml-multicore/blob/5.00...) to spawn off a new domain which will run the supplied function and terminate when it finishes. This is heavyweight though, domains are expected to be long-running.
- Is merging this PR going to be more or less a formality now? I'm assuming subsequent PRs will make further improvements/fix any further bugs discovered. When is the merge to trunk likely to happen?
- (Not that it really matters, but I'm curious). Will this PR be merged as-is with nearly 4000 commits or will it be squashed? The history exists in the multicore repo but will the history be brought into the ocaml repo?
1. Before multicore got to this PR stage it went through two phases of detailed review by the core team. A summary of this is on November's Multicore Monthly: https://discuss.ocaml.org/t/multicore-ocaml-november-2021-wi... . The tasks coming out of that which are marked post-MVP will be follow-up PRs before the 5.00 release.
2. That is unclear at the moment. There's a lot of useful history in those commits (which link out to issues and PRs on ocaml-multicore's repo) but at the same time, it also includes a lot of experiments that were ultimately backed out.
It's similar to Rust in certain respects, but it feels very different in several others.
* There's no ownership tracking. This might be obvious - it doesn't need it, it has GC - but is actually kind of unfortunate; even in GC'd languages, linearity can be used to safely model state machines/typestate or for optimization.
* There's no traits/ad-hoc polymorphism at all.
* There's no impls, though a common pattern with a similar feel is to define types inside modules with the actual implementation of the type hidden, and provide an interface via functions to create or interact with instances of the type.
* It uses exceptions for error handling (when it doesn't use an option type) and even as a lightweight control flow mechanism. And now there's algebraic effects as well.
* It has an extraordinarily powerful module system, with functors (pass modules into other modules to produce new modules) and first-class modules (modules that exist as values at run-time that can be passed around to functions).
* It has some more advanced type machinery - GADTs, polymorphic variants, and a very interesting, powerful, almost completely unused object system.
* This bleeds a little into "ecosystem" and a little into "no ad-hoc polymorphism" and a little into the oddness of the syntax and a little into subjectivity, but in general doing really bog-standard imperative stuff in OCaml, while possible, tends to feel more awkward than it does in Rust.
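The "hidden implementation behind a module interface" pattern from the third bullet can be sketched like this (`Counter` is just an illustrative name):

```ocaml
(* The signature exposes an abstract t, so callers can only construct and
   manipulate values through the functions listed here; the fact that t is
   an int ref is invisible outside the module. *)
module Counter : sig
  type t
  val create : unit -> t
  val bump : t -> unit
  val value : t -> int
end = struct
  type t = int ref
  let create () = ref 0
  let bump c = c := !c + 1
  let value c = !c
end

let () =
  let c = Counter.create () in
  Counter.bump c;
  assert (Counter.value c = 1)
```

It gives much of what impls and encapsulation give in Rust, just expressed through the module system instead of the type system.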
* exceptions for error handling: you can also use the 'result' type (an OK of 'a|Error of 'b type in the stdlib) if you don't like exceptions.
At least the function's signatures will give more details in how it can fail and force you to handle the failures (or pass them through to your caller).
Unfortunately you lose backtraces that way (though you can use wrappers like reword_error, or logging to give more details on what goes wrong, but backtraces can be convenient in locating the source of an error sometimes).
A compromise is to use a wrapper like trap_exn (from Bos) that puts the exception with its backtrace in the Error type and then you propagate that...
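A minimal sketch of the result-based style being described (names here are illustrative):

```ocaml
(* The failure mode is part of the signature, so the compiler forces the
   caller to handle (or propagate) the Error case. *)
let safe_div x y : (int, string) result =
  if y = 0 then Error "division by zero" else Ok (x / y)

let () =
  (match safe_div 10 2 with
   | Ok n -> Printf.printf "10 / 2 = %d\n" n
   | Error e -> prerr_endline e);
  assert (safe_div 1 0 = Error "division by zero")
```

Compared to exceptions, nothing escapes invisibly; the trade-off, as noted above, is losing the automatic backtrace.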
* There’s no explicit typing either; types are inferred, and everything is polymorphic by default
* Compiler errors won’t help you much
* The tooling is lacking compared to cargo. It’s pretty much makefiles written in a Lisp-like s-expression language (better love parens)
* The ordering of your code matters, and shadowing functions and structs is a thing
* `open A` (OCaml’s `use A::*`) is pretty much the default, and transitive dependencies are imported at the top level, so be ready to be confused about the origin of a function or module or variable
I really wish more people would write OCaml so that we’d have a better ecosystem.
If you’re looking to get into a new language, you should seriously consider it as there are SO MANY low hanging fruits and opportunities to have an impact on the language.
I started using OCaml in 2017, and had a rough start. But since then Dune (https://dune.build/) and the now-excellent LSP implementation have resolved the biggest issues I had. I haven't had such a pleasant development environment since Turbo Pascal! (To be fair, IntelliJ was also pretty good other than the shortcomings of Java.)
I like OCaml a lot but I agree that it suffers from this tooling/ecosystem problem. Even un-fancy languages with good tooling are just going to win, at the end of the day. Go is the main example of this. Rust also has great tooling that's gotten better over time.
I think over time Jane Street will (continue to) open source more of its libraries, and hopefully soon will also migrate onto Dune (internally they use jenga[0]). This should mean that the ecosystem within Jane Street more closely matches the external environment and tooling should get better as they push patches upstream.
I mean, maybe. But Core and Async have been open source for a long time, and there's still plenty of people out there using Batteries/Containers/Iter/BOS/Lwt.
I definitely will use it as my primary server language on side projects after it has a standardized async lib (effect handlers and eio). I was also hoping that Rust would be my primary lang, but it's a little too verbose with all the "<>" for it to be my guilty pleasure. Been dabbling in OCaml since 2018.
Last time (ca. 1 year ago) I tried learning OCaml, I ended up reading the beta version of Real World OCaml 2nd Ed. IIRC and for some reason all I remember now is that I didn't feel confident regarding the learning material.
The RWO-website https://dev.realworldocaml.org claims that the 2nd. Ed. has been published in Q4 2021, but I can't find it anywhere.
Can somebody with experience tell me if RWO 2nd Ed. is the way to go?
We are just finishing edits of a few chapters (the tooling, testing and GADT ones), and then it’ll be off to the publishers early in the new year. The online one is therefore pretty up to date.
since you're here, what do you think about starting a new OCaml project with esy as best practice today? I was working on https://o1-labs.github.io/ocamlbyexample/ and I was thinking of starting with that approach as it seems to provide a more modern experience than just opam/dune. (I read your book as introduction to OCaml and still struggled a ton with all the edge-cases of dune and opam. Even today I'm iffy every time I start working on some OCaml code again because I know I'll have to deal with opam/dune issues.)
Please let's just all be friends. :) It's fair game to discuss and differentiate the two languages on features, but "better for engineers" is going to take us right into the valley of flame-wars, by way of No-True-Scotsman town. :)
Haskell and OCaml are both brilliant languages, with excellent feature sets and amazing dev teams... and lots of "engineers" who like them.
Kototama: ...I have the feeling that OCaml may be better for engineers.
You: let's not say that OCaml is better for engineers.
People are allowed to feel things.
But I'll give you a concrete reason why OCaml actually can be better for engineers: named and optional arguments. This single feature does wonders for code readability and maintainability.
Actually I'll throw in a bonus reason: no laziness by default, i.e. a small, simple, predictable runtime that behaves almost exactly like a C executable.
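Since named and optional arguments carry the argument above, here's a small illustration (`draw` is a made-up function):

```ocaml
(* Named and optional arguments: call sites document themselves, argument
   order stops mattering, and defaults live in one place. The trailing ()
   tells the compiler when to apply the optional default. *)
let draw ?(thickness = 1) ~color ~radius () =
  Printf.sprintf "circle color=%s radius=%d thickness=%d" color radius thickness

let () =
  assert (draw ~color:"red" ~radius:10 ()
          = "circle color=red radius=10 thickness=1");
  assert (draw ~thickness:3 ~color:"blue" ~radius:5 ()
          = "circle color=blue radius=5 thickness=3")
```

In Haskell the equivalent usually ends up as a record of settings plus a `defaultConfig`, which is workable but noticeably more ceremony.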
But yes, the lack of optional arguments is a big problem at scale. And laziness is the worst antifeature I could imagine.
But. You can disable laziness and get great performance. You just need to be careful when using libraries and pick ones that have the right strictness.
Let's not say that OCaml is better for engineers, because that's impossible to prove and we can easily get into a "What is an engineer?" question. It's quite clear that engineers like Haskell _and_ OCaml, so the notion that one is better than the other for that class is just inviting flame wars.
Agreed. Being able to drop seamlessly into an imperative style is very useful.
I’m someone who isn’t an experienced functional programmer, or that deeply knowledgeable about type systems and category theory etc, but OCaml is still approachable to me, I feel I can get stuff done in it. I didn’t feel that way with Haskell - I tried, but found the number of ways to do things overwhelming, and a lot of it felt very theoretical. Of course, Haskell’s pureness is an impressive and interesting feature; but it isn’t for me.
OCaml also has much better systems programming abilities compared with Haskell, IMO. It feels like OCaml is the Rust of the functional programming world (and this is no accident, the Rust compiler was first implemented in OCaml in fact).
Totally agree. OCaml is, in general, a better choice for engineering and real-world application. I think Haskell gets a lot of attention because it's interesting (for lack of a better word). I don't mean "OCaml is a boring language", but rather "Haskell was specifically designed as a playground for advanced ideas in type theory, so lots of weird things happen in that ecosystem." For doing personal projects for fun, I'd rather work with Haskell, but for getting things done probably not.
Actually OCaml has more operator stew than Haskell. Haskell's use of typeclasses/HKT avoids a lot of operator/function-name noise in the code.
Where OCaml wins is that it does not hate mutation. This makes the code more impure but simpler. Sometimes just incrementing a mutable variable makes more sense than introducing a state monad!
I'd say OCaml wins in term of operators and maybe as a language until you start using monads. My entirely subjective feeling is that modern OCaml codebases use more and more monads, the language is becoming less elegant and tedious as a result.
The introduction of let+ and let* made using monads fairly elegant in my opinion but this is a recent addition so you might have never seen it used. Outside of concurrent code monads are not that common in my experience.
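A quick illustration of the `let*` style (available since OCaml 4.08), here bound to `Option.bind`:

```ocaml
(* Binding operators flatten monadic chains: each let* short-circuits to
   None on failure, with no visible nesting of match expressions. *)
let ( let* ) = Option.bind

let parse_sum a b =
  let* x = int_of_string_opt a in
  let* y = int_of_string_opt b in
  Some (x + y)

let () =
  assert (parse_sum "1" "2" = Some 3);
  assert (parse_sum "1" "oops" = None)
```

The same operators work for result, Lwt promises, and so on, which is why monadic OCaml reads much better than it used to.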
I'd even argue that OCaml has way less operator stew than Haskell if we include Haskell's strictness annotations, type class instance selectors, and the large number of operators they have in the standard library. By comparison OCaml has a very modest number of operators.
You don't have to introduce a state monad to increment a mutable variable in Haskell, you can do that with IORefs pretty much the same way you do it in OCaml. But in a multi-core environment you will eventually want a guaranteed STM.
You are correct. However, IORefs are not idiomatic in "normal" Haskell, whereas mutable variables are very much accepted in OCaml.
Adding mutability to the mix via IORefs + Haskell's laziness by default means you don't _really_ know when a particular IORef mutation may really be executed.
In OCaml, evaluation is eager and you know when it will happen. Even when you introduce bangs and special Haskell language extensions, fully removing laziness from your code in Haskell is difficult.
TL;DR: Because of Haskell's laziness you are in some ways forced to be purely functional. OCaml allows you to embrace mutation in ways Haskell does not, in my view.
> But in a multi-core environment you will eventually want a guaranteed STM.
You actually may not always want STMs. STMs are awesome, but something more basic like locks may do the job; it can be more performant to use, say, a mutex. If STMs were superior on every parameter they would have taken over the world. They have not: they are good in some scenarios and not so good in others...
> Even when you introduce bangs and special Haskell language extensions, fully removing laziness from your code in Haskell is difficult.
Why would one prefer to fully remove laziness?
> TL;DR: Because of Haskell's laziness you are in some ways forced to be purely functional. OCaml allows you to embrace mutation in ways Haskell does not, in my view.
Unless there's an effort to stay more on the side of the Safe[1] language subset, Haskell can be told to embrace the same mutation properties with a combination of `unsafePerformIO` and `seq`. People tend to avoid it because the language offers tooling to achieve better mutation properties. Overall, my argument here is that OCaml's idiomatic approach to permissible mutation doesn't seem to be a competitive advantage for software development in general, but rather a flavour that imperative programmers prefer due to their familiarity with the concept.
> You actually may not always always want STMs.
Whenever there's a non-trivial retry policy, one should prefer to choose STM + queues over debugging adventures with Locks and Mutexes. Almost all of multithreaded software I can immediately think of have non-trivial retrying logic.
Laziness in itself does not prevent one from properly evaluating resource utilisation. In fact, laziness helps with resource preservation. What makes it less predictable is the various optimisation techniques that come to the aid of the lazy environment: strictness analysis, rewrite rules, and stream fusion[1][2]. These may change significantly between compiler releases and affect a previously optimised codebase.
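(The code block the next comment refers to didn't survive the copy; from the description it was presumably along these lines:)

```ocaml
(* An int ref is a typed mutable box: read with !, write with := *)
let () =
  let box = ref 0 in
  print_int !box;      (* prints 0 *)
  box := !box + 1;
  print_int !box       (* prints 1 *)
```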
will print first `0` and then `1`. You cannot put a non-integer into the box because the box has type `int ref`, meaning only values of type `int` can be put into it with the `:=` operator.
GP is (rhetorically, I assume) asking about whether or not the action of incrementing the mutable variable is typed, not that the variable itself is typed.
For example in normal Haskell the code that uses mutation and code that does not use mutation have different types. A function may have the type
f :: Int -> Int
but with mutation it can become
f :: Int -> ST s Int
or
f :: Int -> IO Int
or
f :: Int -> STM Int
depending on the circumstances. That's not the case in OCaml. The kind of mutation in OCaml references is the most similar to Haskell's ST monad (not to be confused with State or StateT), but in Haskell the type tells us a mutation is happening inside the function call.
One of the things that Haskell and monads bring to the table is typed side effects. Each action in a monadic computation has a type and cannot be freely mixed with other actions of different types. Of course, it's much easier and more convenient to go side-effecting willy nilly, as OCaml and imperative languages let you do, but that creates too many places for bugs to creep in and hide. Monadic programming gives you more control over what side effects are allowed when.
When I tried to dabble in OCaml, it definitely looked like operator stew to me, but I’m coming to it from the C, Python, JS, etc world so maybe Haskell is even worse in this regard?
Yeah, I played around with Reason but at the time the tooling kind of sucked and it only seemed to be used to generate JavaScript via BuckleScript. Not sure if things are better these days or not.
OCaml definitely has a lot of built-in operators, but Haskell makes it absurdly easy to define your own operators as well, with built-in syntax for using binary operators as prefix functions and vice-versa. I'm not sure there's a definitive answer, but that's probably in large part because the idea of "operator stew" is pretty subjective.
Ocaml makes it super easy to define your own operators as well. I was debugging a performance issue in an Ocaml codebase when I was very junior and new to the language, and proudly reported my finding that we shouldn't be using the (**) power operator in some algorithm but just multiplying and surely this was the cause of the problem. Turned out somebody had defined (**) as 64-bit multiply :)
I don't remember exactly how it works in OCaml (it might be similar!) but in Haskell, functions whose names consist of certain characters are automatically operators, and you can use them prefix by wrapping them in parens, e.g. `2 + 2` is the same as `(+) 2 2`. Similarly, a two-parameter function can be used infix by wrapping it in backticks. Off the top of my head, I remember getting warnings from the linter about writing something like "elem some_elem some_list" to check whether an element exists in a list instead of the preferred "some_elem `elem` some_list". The idea that functions and operators are interchangeable, with first-class syntax for swapping between them, is a lot of fun, but it's also one of those things where I'm kinda glad it's not mainstream, given the shenanigans I might have to deal with in production code.
It’s been a while but it’s definitely similar in Ocaml, you define a function with a name that consists purely of the set of allowed symbols and it’s useable as an operator. The thing I found really fiddly/tricky is that the first character is used to define whether it’s prefix or infix, and its associativity. Which like you said, it’s very clever and concise and powerful, but can be a bit of a nightmare if you have these operators defined all over the place and have to piece together how they interact.
I don’t think Ocaml had that backtick infix syntax for non-symbol function names though!
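To make that concrete, here's a minimal OCaml sketch (the operator name `+|` is made up for illustration): the leading character of the name fixes the fixity and precedence, and parenthesizing the operator gives you the prefix form.

```ocaml
(* Hypothetical operator (+|): addition mod 10. Because its name starts
   with '+', OCaml parses it as an infix operator with the same
   precedence and associativity as (+). *)
let ( +| ) a b = (a + b) mod 10

let () =
  assert (7 +| 5 = 2);      (* used infix *)
  assert (( +| ) 7 5 = 2);  (* parenthesized: used as a prefix function *)
  print_endline "ok"
```

Had the name started with `*` instead, it would have picked up the higher precedence of `( * )`, which is exactly the "piece together how they interact" problem.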
There are many paradigms in programming. Each has its strengths. A purely functional approach à la Haskell is not the only way. Based on your comment, it would seem no one should use C/C++. Yet many do. It depends on what you want to achieve, your abstraction budget, your performance requirements, legacy code...
OCaml offers a pragmatic functional approach to programming. And now you are going to be able to have your OCaml code run in a truly parallel fashion on your multicore CPU.
In the future there are plans to add typing to effects, though. (There is support for effects currently, but it's untyped and experimental.) When that happens you can track changes to state (which is a kind of "effect") if you want to...
> Based on your comment, it would seem no one should use C/C++
Maybe; there are better options out there these days for parallel and concurrent programs. Parallel programming in C and C++ is extremely fraught for the very reasons the parent brings up. There are so many footguns, starting from the fact that the favorite debugging method of C programmers, printf(), isn't even guaranteed to be thread safe by the C standard.
It’s so bad that Rust markets as a feature “fearless concurrency”, capitalizing on the recognition that the prevailing emotional state of a C or C++ dev writing concurrent or parallel programs is one of fear.
And the very thing that makes Rust concurrency fearless over C and C++ is that borrowing and mutability are explicitly tracked. As we enter a world where over a dozen CPU cores are the norm, we are learning what works and what doesn’t in writing programs for these machines, and integrating those learnings into new languages.
One detail that is usually left out of the fearless concurrency story is that it only works in-process, across threads; it does very little to help with distributed concurrency across multiple processes accessing shared data, possibly written in various languages.
Which in the age of micro-services is quite relevant as well.
Definitely better than other languages, but it still doesn't spare one from thinking about end-to-end overall system design.
I prefer printf because it works pretty much anywhere, handles concurrency by default (as in you can see the interleavings, though the log call itself is locked), and allows me to have a custom tailored view of the state I want to see.
The last point may not be obvious, but debuggers have tons of noise for complex programs. In practice I just want to see how my program state changes over time while a debugger shows the entire program state or a large subset of it.
I think the future of debugging is going to be structured program state logging. Ideally we should be able to take our logs and partially reconstruct program state over time. For example, in addition to source location, we should save the lexical information for each variable logged so you can have interactivity with your logs and source code.
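A rough sketch of what that could look like, with a hypothetical `log_var` helper (the JSON-ish format and the helper's name are made up for illustration): each entry carries the source location via `__LOC__` and the variable's name alongside its value, so tooling could later map log entries back to program state.

```ocaml
(* Hypothetical structured logger: one record per logged variable,
   carrying source location, variable name, and the current value. *)
let format_entry ~loc ~name value =
  Printf.sprintf "{\"loc\": %S, \"var\": %S, \"value\": %S}" loc name value

let log_var ~loc ~name value = print_endline (format_entry ~loc ~name value)

let () =
  let counter = 42 in
  (* __LOC__ expands to the current "file:line,col" position. *)
  log_var ~loc:__LOC__ ~name:"counter" (string_of_int counter)
```

With the lexical name and location recorded, a log viewer could in principle jump back to the defining source line and reconstruct a partial view of the state over time.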
I sympathize with what he's saying, and I imagine he's correct, for his context, but some of us don't work on never-ending code and just want to write the best code we can, as quickly as possible, so we can move on to the next challenge. For those that are more pragmatic, and enjoy working on the hard problems the code is trying to solve rather than the hard problems of the code, using a debugger can be beneficial.
There are tons of ways to use debuggers, even in places where it might seem one has to content themselves with printf debugging; what many miss is learning what is actually available.
So you end up with legions of Linuses hating on debuggers and priding themselves on never having to use one, some kind of macho thing I guess.
I've been programming for years and I still find printfs an extremely useful debugging technique. They let you debug an entire program in parallel, as opposed to the more focused single-snapshot debugging you do with a debugger.
> A purely functional approach ala Haskell is not the only way.
that wasn't the OP's argument though, the argument was that OCaml is somehow generally better for engineers.
> OCaml offers a pragmatic functional approach to programming.
an evaluation of something as pragmatic depends purely on what one wishes to practice. There's no universally objective notion of pragmatism.
> And now you are going to be able to have your OCaml code run in a truly parallel fashion on your multicore CPU.
no, it won't be able to do that automatically. Your code will have to respect certain invariants to function properly, and you as a developer will have to enforce these invariants with the available tooling at hand. Haskell has purity, guaranteed STM, and `par` labels for that. OCaml doesn't have those and the existing codebases will have to eliminate their thread-unsafe public interfaces first.
> When that happens you can track changes to state
how are you planning to track state changes without purity?
> > And now you are going to be able to have your OCaml code run in a truly parallel fashion on your multicore CPU.
> no, it won't be able to do that automatically. [...]
I was comparing it to the old situation in OCaml, where it was impossible for threads executing pure OCaml code that were _not_ IO bound to execute truly in parallel. That limitation is removed.
To me it is pretty obvious that you will need to use things like atomics, thread safe data-structures, mutexes etc. to ensure your code runs properly in multicore OCaml. It was implicit in my response. But I should have been more explicit.
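For instance, with the Domain and Atomic modules as they eventually landed in OCaml 5, a shared counter has to be an atomic rather than a plain `ref` for the result to be deterministic; a minimal sketch:

```ocaml
(* Two domains (OS threads) increment a shared counter in parallel.
   Atomic.incr makes the updates safe; a plain `ref` with `incr` here
   could lose updates under true parallelism. *)
let parallel_count n =
  let counter = Atomic.make 0 in
  let work () = for _ = 1 to n do Atomic.incr counter done in
  let d1 = Domain.spawn work in
  let d2 = Domain.spawn work in
  Domain.join d1;
  Domain.join d2;
  Atomic.get counter

let () = assert (parallel_count 1_000 = 2_000)
```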
> that wasn't the OP's argument though, the argument was that OCaml is somehow generally better for engineers.
Engineers tend to be pragmatic. OCaml is pragmatic. So in quick short form, OCaml may be better to solve a certain kind of problem than something that is more pure and abstract like Haskell. It was intended as an informal argument and not an argument in a court of law :-).
Maybe. But at this point it just looks like a mean dig at Haskell. A better word would be opinionated. Haskell is opinionated, and that is fine.
The biggest bait in this thread is using the term "engineers". It should rather be: people trained in imperative programming in mainstream imperative languages. Then it makes sense.
Gosh, yes, engineers would rather not use math. If there are approximations that work well enough (and that, when incorrect, fail in the way that leads to buildings not falling down), engineers definitely like to use less math.
For one thing, less math means less room to make mistakes, and often ugly but works well is better than elegant with a higher chance of failing due to people screwing up.
> And approximations are not math all of the sudden?
The context was Haskell versus OCaml, so yeah — Haskell is the more academic, pure math. Ocaml (in this context) is the more approximate, but more practical option for larger projects. In practice people use things like C and Python because spherical cows are close enough.
Ok, you are another person in this chain who decided to just drop "more practical" and call it a day. Why bother.
It is especially funny in the thread about OCaml getting soon (2022 maybe) multicore support. Something this less practical ivory tower Haskell had before it was even cool.
> How can parallel untracked mutations of untracked state be better for engineers?
What is untracked? Is
unsafePerformIO (putStrLn ...)
tracked? What is tracked? Is
foo :: IO ()
thread-safe? Maybe, maybe not. What meaningful information does this signature give me? That it does some IO? That's extremely useless information, especially if most of your code is IO something.
What granularity does IO have? Does
foo :: IO ()
throw any exceptions? Maybe, maybe not.
The need for effect tracking for writing correct programs is way overstated by some Haskell programmers. It's usually much more prudent to write DSLs which hide effects like variable mutations and logging than to expose them.
For example you can start your program with a pure correct-by-construction core DSL, and then add logging, mutable variables where needed underneath the DSL's terms without breaking the semantics of the DSL. With effect tracking you are doomed to either reinvent custom effects to be able to switch interpreters painlessly, or you'll need to break the DSL by the addition of effects.
Neither is prudent in the real world; neither gives more value than it takes. What is really funny is that some Haskell programmers believe that logging should be tracked, but allocation apparently should not (yes, it's a side effect).
Whereas in fact all the "effects" need to be tracked only when they are meaningful, i.e. when they are part of our DSL's semantics, and not some hidden part of the interpreter I don't need to know about.
Imagine a theorem prover function which creates a conjunction of two terms:
val conj : term -> term -> term
It doesn't matter if it allocates or logs, at the precision that is interesting to us, it's just a function creating a conjunction of two terms.
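A tiny OCaml sketch of that view (the `term` type and the hidden counter are hypothetical): from the caller's perspective, `conj` is exactly the pure `term -> term -> term` of the signature above, while the implementation is free to do bookkeeping that is not part of the type.

```ocaml
(* Hypothetical DSL core: callers only see term -> term -> term. *)
type term = Var of string | And of term * term

let calls = ref 0  (* hidden effect: invisible in conj's type *)

let conj a b =
  incr calls;      (* internal logging/bookkeeping under the DSL *)
  And (a, b)

let () =
  let t = conj (Var "p") (Var "q") in
  assert (t = And (Var "p", Var "q"))
```

The DSL's semantics ("build a conjunction") are unchanged whether or not the interpreter logs or allocates underneath.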
This repetitive talking point is getting boring. Go figure whether it's tracked now:
{-# LANGUAGE Safe #-}
-distrust-all-packages
> What meaningful information does this signature give me? That it does some IO? That's extremely useless information, especially if most of your code is IO something.
That's an extremely narrow view which you wouldn't have if you ever tried to implement a safe sandboxed environment.
> What granularity does IO have? Does [...] throw any exceptions? Maybe, maybe not.
what does it have to do with tracked parallel mutations?
> Neither is prudent in the real world; neither gives more value than it takes
define prudent and define real-world.
> It doesn't matter if it allocates or logs
have you heard about referential transparency? It's the thing your "theorem prover" example does not have.
Is there an ELI5 writeup somewhere about what's so nice about algebraic effects?
I skimmed through some preprint on arxiv, and I got the impression that one could implement async/await as well as something like green threads if you select a suitable runtime. But beyond implementing other concurrency abstractions, why should I care?
Effect handlers are a foundation of all non-local control flow abstractions. Anything that requires fancy control-flow can be implemented with effect handlers. This includes green threads, async/await, generators (or iterators as some languages call it), but also higher-level applications such as algorithmic differentiation, probabilistic programming, model checking and fuzzing of parallel programs, etc.
A few of these applications were discovered only very recently. The aim is that this language primitive will inspire further discoveries for expressing useful abstractions elegantly.
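As an illustration of the "fancy control flow" point, here is a generator built from an effect handler, using the Effect module as it eventually shipped in OCaml 5 (at the time of this thread the API was still experimental and untyped, so treat this as a sketch of the idea rather than the then-current interface):

```ocaml
open Effect
open Effect.Deep

(* An effect is declared much like an exception; performing it suspends
   the computation and hands control, plus a resumable continuation, to
   the nearest enclosing handler. *)
type _ Effect.t += Yield : int -> unit Effect.t

let produce () = List.iter (fun i -> perform (Yield i)) [1; 2; 3]

(* Run f, collecting every yielded value into a list. *)
let collect f =
  let acc = ref [] in
  match_with f ()
    { retc = (fun () -> List.rev !acc);
      exnc = raise;
      effc = (fun (type a) (e : a Effect.t) ->
        match e with
        | Yield i ->
            Some (fun (k : (a, _) continuation) ->
              acc := i :: !acc;
              continue k ())  (* resume right after the perform *)
        | _ -> None) }

let () = assert (collect produce = [1; 2; 3])
```

The same handler shape, with a scheduler queue instead of a list, is essentially how green threads and async/await are built on top of this primitive.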
I don't know much about OCaml, only Scala and Cats Effect/ZIO/Monix, but I would appreciate a gist of what Multicore OCaml brings to the table.
Is there somebody who's familiar with both worlds and could compare them and explain how Multicore OCaml (and possibly the new effect system) work?
Multicore OCaml works using a thread-safe generational garbage collector. Each domain (i.e. thread) gets its own minor heap, and the major heap is shared. Each heap is GCd independently. The domain-level minor heaps are restricted so different domains can't write to each other's minor heap. And the shared major heap is protected with more bookkeeping. More details in https://kcsrk.info/multicore/gc/2017/07/06/multicore-ocaml-g...
Scala of course just uses the set of GCs that are available on the JVM. OCaml's GC has an advantage in that it's optimized for quickly creating and collecting many small objects–perfect for functional programming. But some advanced JVM GCs probably come close to that.
The new effect system is basically 'resumable exceptions'. An effect is declared somewhat similarly to an exception. When the effect is 'thrown', it is handled by the nearest effect handler up the callstack. The difference from an exception is that once (if) it is handled, it actually resumes at the exact position in the code where it left off, not in a continuation created by a closure, which is effectively what Scala and every other userland effect system does.
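A minimal sketch of that 'resumable exception' behaviour, again using the OCaml 5 Effect API (the effect name `Ask` is made up for illustration): control leaves the computation at `perform` and, once the handler answers, resumes on the very next expression.

```ocaml
open Effect
open Effect.Deep

type _ Effect.t += Ask : int Effect.t  (* declared much like an exception *)

let comp () =
  let x = perform Ask in  (* control transfers to the handler here... *)
  x + 1                   (* ...and resumes here with the handler's answer *)

let result =
  match_with comp ()
    { retc = (fun v -> v);
      exnc = raise;
      effc = (fun (type a) (e : a Effect.t) ->
        match e with
        | Ask -> Some (fun (k : (a, _) continuation) -> continue k 41)
        | _ -> None) }

let () = assert (result = 42)
```

With an exception, `raise` would unwind the stack for good; here `continue k 41` re-enters `comp` as if `perform Ask` had simply returned 41.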
No. F# is an ML dialect, like OCaml. But you should be aware that the "O" in "OCaml" actually means something: OCaml comes with a beautiful object system, prototypal in nature and structurally typed, integrated into the language. F#, on the other hand, has the object system inherited from .NET, which is simplistic in comparison.
Also, modules and functors.
So, no, F# is not OCaml. If you absolutely need to use .NET, and know some ML, you can reach for F#, but its limitations will drive you nuts quickly. F# is for C# programmers - they have "full power of .NET" still accessible, plus a few nice things like HM type inference, immutability by default, and maybe computational expressions.
F# is not really for C# programmers, C# programmers just get by the F# features that keep being copied into C#.
F# is an ML that is kind of allowed to play on .NET because at some point management agreed to integrate it into Visual Studio, and it keeps looking for the sweet spot that will take it out of the VB/C# shadow, always a second thought when the .NET team designs new architecture features taking only VB and C# into consideration.
F# twittersphere is its own bubble, not always with nice opinions about Microsoft and the .NET platform it depends on.
I completely disagree that F# is for C# developers. For one, C# developers are usually fine sticking with C# and its massive ecosystem. F# and .NET more seamlessly run everywhere than Ocaml, which last time I tried is a lot of trouble on Windows. F# has had multicore support from the beginning and hasn’t needed a multi-year development process to get there.
I’d like to learn Ocaml (more than just the parts I’ve learned from F# and SML), but F# is by no means a simplistic language in terms of what you can accomplish. It’s easily one of the best designed and most pragmatic languages out there. From the syntax I have seen, F# is cleaner than Ocaml.
> F# is an ML dialect, like OCaml
That’s kind of splitting hairs. The object system is different for obvious reasons. F# clearly started as a port of a subset of Ocaml to .NET and was even valid Ocaml code for some time in the early days. That’s exactly a dialect.
Don Syme explains in the history of F# that there were multiple efforts to port OCaml to .NET (Nemerle comes to mind), but none of those felt natural to use.
For F#, they started with OCaml syntax and semantics, then modified them to make it a true .NET language. They added features that were not in OCaml, like inheritance and interfaces, and removed features that were unnecessary in .NET, like functors.
It was wrong of me to only mention the "limitations" (compared to OCaml) that "will drive you nuts", while saying nothing about extensions/advantages of F# - because, obviously, it does have some. This:
printfn "%s" ((3.0 + 3.0).ToString())
may seem normal enough to onlookers, but being able to write it (more precisely, the mechanisms that make it possible) can be seen as a definitive advantage of F# over OCaml, where normal functions cannot be overloaded to work on multiple types, at least not without explicitly handling it yourself, which can be a lot of work.
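For contrast, the OCaml side of that same expression: arithmetic isn't overloaded, so floats take the dedicated `+.` operator and printing needs a type-specific conversion (a minimal sketch).

```ocaml
(* No overloading in OCaml: (+) is int-only, floats use (+.), and
   turning the result into a string is an explicit, type-specific call. *)
let () = print_endline (string_of_float (3.0 +. 3.0))  (* prints "6." *)
```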
What I really wanted to say, but somehow failed, is that F# is not OCaml - it's a language in its own right, with both good and bad sides, which don't align with those of OCaml particularly well (and sometimes not at all). F# and OCaml share some semantic and syntactic features, but they are not "dialects" of each other (no matter which way you want to spin it), they have very different sets of trade-offs, and probably shouldn't be directly compared (even if such comparison was OK historically).
I think I failed to convey this properly because I was angry that "someone on the Internet!!!" mentioned F# in an OCaml thread "AGAIN!!!", even though it's definitely one of the biggest developments/successes on the OCaml side in recent memory. I mean, it's like you're having a party to celebrate your promotion, only to have that one person tell you all about how their cousin is better than you in every respect, and he didn't even need to work that hard for it, so you're just slow. It was strangely disconcerting, but I should have reacted less emotionally.
F# is a great language, and definitely an ML. But it falls a bit short of being an OCaml. :) The type-system features are quite different -- e.g. F# has no functors and no GADTs, at least not the last time I checked. I think that F# has a different feel and perhaps a different target audience.