Distributed SQL database (TiDB) source code explained (pingcap.github.io)
102 points by simonz05 on Jan 20, 2017 | 18 comments


I am watching this project very closely. So far I think it has the potential to be a full-featured replacement for FoundationDB. Note that they rewrote the underlying distributed, sorted, transactional KV store from Go to Rust, which is the piece I am most interested in.


Thanks for your attention.


Within the spectrum of so-called "NewSQL" databases, there are basically two kinds:

* ones built on top of a mature relational database (e.g. Vitess, CitusDB)

* ones built on top of some custom or unproven KV store (e.g. TiDB, CockroachDB, NuoDB)

Ultimately, despite claims of benchmarks/elegance/MySQL not being web scale enough, I'm unsure if (at least in the immediate future) it's a wise move to do the latter. Solutions that use a proven storage engine seem like an easier pill to swallow.


Disclaimer: I work on Vitess.

I'm actually a fan of TiDB and Cockroach. I've met engineers from both, and they're super-sharp.

I am biased towards something like Vitess that builds on top of existing tech. The main advantage is that we can push-down the work to the lowest level, and leverage efficiencies that are already built in MySQL.

But the newer architectures offer better consistency models. Some customers may care more about that.

In the long run, I think these trade-offs should converge. Overall, any NewSQL is a better alternative than using traditional key-value stores, because it gives you better functionality while not giving up on scalability.


Hey! I really enjoyed the talk you gave at Square :p

I agree that we can't build on the same base forever, but I also think it's really hard to get companies to trust completely unproven solutions -- ground-up rearchitectures like TiDB and Cockroach -- with their data. I'm not sure that there's an easy way to get around that.

Re: consistency -- wouldn't it be possible to e.g. make Vitess work nicely with MySQL's Paxos-based group replication?


Hi, thanks for the compliments :).

MySQL group replication has some issues: It can fail your commits if multiple masters have conflicting transactions, which is a problem for cross-shard distributed transactions. Additionally, group replication is too chatty and doesn't work well cross-dc.

I actually have a counter-proposal that addresses the above concerns here: http://ssougou.blogspot.com/2016/09/distributed-durability-i....


> Within the spectrum of so-called "NewSQL" databases, there are basically two kinds:

The guy that invented the term "NewSQL" seems to disagree with you:

https://sigmodrecord.org/publications/sigmodRecord/1606/pdfs...


Haha. Yeah, I've read this paper.

My argument was mostly based on the actual technical differences, so I didn't really count DBaaS as a separate category. At the very least, I don't think I know enough about how DBaaSes work differently underneath the UI :P


TiDB and CockroachDB both use Facebook's RocksDB. I would not call it an unproven KV store. Both also use etcd's Raft implementation, which is also quite well proven.


Sure, but I wouldn't exactly call them mature yet. I think MySQL is still much easier to trust than RocksDB. (I don't actually know much about RocksDB, but I remember that LevelDB had a bunch of data corruption issues. Hopefully RocksDB doesn't.)

To quote the famous video[1]: "Relational databases have been around since the fucking 70s and are some of the most mature technology you can find."

1: https://www.youtube.com/watch?v=b2F-DItXtZs


RocksDB has been widely used in many projects for years; I think it's quite mature. It's not just TiDB and CockroachDB: MyRocks and MongoRocks use RocksDB as their storage engine too, and plenty of other projects use it as well. If you are worried about new features being added to RocksDB, you are right to be, but all storage engines are changing rapidly (even InnoDB, which MySQL uses as its default engine). Anyway, happy hacking :)


I'd be more interested to see some articles and posts on how to use TiDB and/or TiKV in interesting ways.

There's been plenty of explanation on how it's built, their process, the architecture, etc. Some practical application would be helpful now to help contextualize those decisions.


Hi from a TiDB developer. We have many users in China, and some of them already use TiDB in production. Most of them treat TiDB as a drop-in replacement for a MySQL sharding solution, and some use TiDB as an ad hoc OLAP database with transactional insert/update/delete support.


I think TiKV on its own looks quite interesting. I see that the readme considers it to be part of TiDB, but I can see uses for TiKV on its own, without TiDB. Do you have anyone doing this in production?



This takes a lot of work to do, kudos! Don't listen to any haters, this is the type of document you can use to reply to them. Great job, keep it up. Cheers! ~ A friendly distributed DB competitor


Thanks, that means a lot to us. By the way, happy Chinese New Year! 新年快乐! :)


Thank you. We are still working on the next RC release.



