Should we think of BuildFlow as an alternative to workflow managers like Prefect or Kubeflow, or is it a higher-level library for stream processing like Beam?
We don't support any snapshotting or checkpointing directly in BuildFlow at the moment, but these are great features we should support.
We do have some fault tolerance baked into our I/O operations, though. Specifically, for Google Cloud Pub/Sub, the acks don't happen until the data has been successfully processed and written to the sink, so if there is a bug or some transient failure, the message will be redelivered later, depending on your subscriber configuration.
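To make the ack-after-write idea concrete, here is a minimal sketch of the pattern in plain Python. It is not BuildFlow's actual implementation and it doesn't use the real Pub/Sub client: `FakeSubscription`, `run`, and `flaky_upper` are hypothetical names made up for illustration, and the in-memory queue only approximates redelivery.

```python
from collections import deque


class FakeSubscription:
    """Hypothetical in-memory stand-in for a Pub/Sub subscription."""

    def __init__(self, messages):
        self._pending = deque(messages)
        self.acked = []

    def pull(self):
        # Return the next pending message, or None if nothing is left.
        return self._pending.popleft() if self._pending else None

    def ack(self, message):
        # An acked message is never delivered again.
        self.acked.append(message)

    def nack(self, message):
        # An unacked message goes back on the queue for redelivery.
        self._pending.append(message)


def run(subscription, process, sink, max_deliveries=10):
    """Process messages, acking only after the sink write succeeds."""
    deliveries = 0
    while deliveries < max_deliveries:
        message = subscription.pull()
        if message is None:
            break
        deliveries += 1
        try:
            sink.append(process(message))
        except Exception:
            # Processing or the write failed: leave it for redelivery.
            subscription.nack(message)
            continue
        # Only reached once the result is in the sink.
        subscription.ack(message)
    return deliveries


# Simulate one transient failure while processing "b".
failures = {"b": 1}


def flaky_upper(message):
    if failures.get(message, 0) > 0:
        failures[message] -= 1
        raise RuntimeError("transient failure")
    return message.upper()


sub = FakeSubscription(["a", "b", "c"])
sink = []
deliveries = run(sub, flaky_upper, sink)
print(sink, deliveries)
```

In this toy run "b" fails once, gets redelivered after "c", and still ends up in the sink. The trade-off is at-least-once delivery: if a write succeeds but the ack is lost, the same message can be processed twice, so the sink write should ideally be idempotent.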
All of our processing is done via Ray (https://www.ray.io/). Our early benchmarks show about 5k messages per second on a single 4-core VM, but we believe we can increase that with some more optimizations.
This benchmark consumed a Google Cloud Pub/Sub stream and wrote the output to BigQuery.