Building a Forex Trading Platform Using Kafka, Storm and Cassandra (insightdataengineering.com)
139 points by jecs321 on Oct 23, 2014 | 39 comments



This is a somewhat related question about MetaTrader, which is the most popular FX platform.

It could be considered paranoid and jingoistic, so I apologize in advance to all the great Russian hackers out there.

Does anyone know anything about MetaQuotes (the maker of MetaTrader)? They're a trusted solution provider to many of the biggest banks in the world, and yet in my experience it's difficult to find anything about them online (the company, not the software).

I know it's a broad stereotype, but many Russian companies don't have the best reputations when it comes to business ethics. It's an open secret that many of the top oligarchs formed more or less a kleptocracy with the Putin administration, and that the rule of law is pretty shoddy when it comes to Western companies or individuals seeking justice.

Traders all over the world plug in their algorithms in plain text right into MetaTrader. Realistically, do they have anything to worry about, or am I being completely paranoid?

I don't mean to disparage MetaQuotes. As I said, I don't know anything about them, and they partner with big financial institutions all over the world. Many of these companies provide customized versions of MetaTrader, so I would imagine they might have access to some or all of the MetaTrader source code. I guess I'm just looking to be reassured.

What steps could a trader take on any platform in order to reduce exposure to potential bad actors? Is using an API from a broker the only solution?

Anyway, again my apologies if I offended anyone.

Update: their Wikipedia page has a few more articles on them than I remembered. Also, technically they are registered in Cyprus, but were founded in and are primarily based in Russia. Still, I'd love to hear people's thoughts on this issue:

http://en.wikipedia.org/wiki/MetaQuotes_Software


I can't think of a single bank using metatrader, you're thinking of brokers.


Just like Intellij! Omg the Russians own our codes


JetBrains (the company behind IntelliJ) is headquartered in Prague - Czech Republic.


I used to trade on Metatrader back in 2007~2009.

There was a huge scandal because it was revealed that MetaQuotes had implemented a function to deliberately disrupt incoming orders. Brokers could use this function to forcibly widen the spread at any given time. I remember a lot of traders, myself included, just completely lost trust in retail forex. It was nothing but a sham.


If you do still trade, which application do you use today?


> It is a truly global marketplace that only sleeps on weekends.

Sure, there are 4 trillion reasons why this is enough and yes, I know it's very much about traders themselves, and that if there were an incentive to change it probably would change pretty fast, but anything that stops on the weekend seems very silly in the 21st century. (Even EVE Online's daily downtime is a bit ridiculous.)


You need liquidity on Saturday and the banks don't work.


Yep, you can actually execute trades on Saturdays with some retail brokers (they usually act as the counterparty) if you're willing to accept the humongous spreads during those hours.


Why is liquidity for settlement so strongly linked to making contracts (trading)?


Banks are also counterparties to effectively all large contracts, and there's an avalanche effect that keeps most traders out of the market.


I am no expert in forex platforms, but this statement seems a bit off:

"individual investors only have a few simple tools at their disposal, e.g., Meta Trader or Ninja Trader"

Both meta trader and Ninja trader have powerful event-based scripting languages (MQL5 [1] and NinjaScript [2]) with a solid library of charts, strategies, indicators and order execution. More than that, they have a huge community providing all kinds of services around those platforms[3][4]

Other than that, Wolf seems like a nice piece of software.

[1] http://www.mql5.com/en/docs

[2] http://www.ninjatrader.com/support/helpGuides/nt7/

[3] http://www.mql5.com/en/signals/mt5

[4] http://www.metatrader5.com/en/automated-trading/mql5market


Simple... versus the tools inside the top investment banks. I'd estimate that the publicly available tools are at least 10 to 15 years behind.


More than that. Further still, the publicly available tools tend to cover only the "algorithm" part of the problem, because that's what's sexy. What's missing are things like risk management, money management, allocation, OMS vs EMS, etc.

Trading for most prosumers is a vanity hobby, though they don't know it.


Plus tickets, sales tools, blotters, order management, routing, reports (reg and otherwise) etc...


Lots of brokers also offer vanilla FIX connections to their systems as well, so a GUI isn't necessary.


Interactive Brokers offers a FIX interface to their system.


Virtual Brokers comes to mind.


I fail to see the point in using Hadoop if your universe is limited to foreign exchange... you can probably fit all ticks of all pairs in existence since the 70s in less than a TB


That is a very good point. An initial idea of Wolf was to create a platform that could trade virtually anything (Craigslist, Amazon, Ebay, stock market, Forex, etc.) as long as there's a trading API, and could make use of data from just about anywhere. Not just ticks. Imagine you can write a trading rule like this: "if Mr. X invested $ in company A and Ms. Y recommends buying stocks of company B or an epidemic spreads in country C or product D received great consumer reviews then trade E for F". That would require a lot of data: news articles, financial data, customer reviews, geopolitical events, etc. One of the Insight data engineers found some cool correlations between geopolitical events from GDELT and Forex. Wolf was created end-to-end in 6 weeks so... I focused on Forex. P.S. One year of compressed Forex ticks from HistData.com is about 100GB.
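For illustration only, here is one way the quoted rule could be written down as code. This is a minimal sketch, not how Wolf expresses rules: the signal names are made up, and it assumes "and" binds tighter than "or", which the original sentence leaves ambiguous.

```python
def rule_fires(signals):
    """Evaluate the example rule against a dict of boolean signals.

    All keys are hypothetical names invented for this sketch.
    Assumption: the rule reads as (X and Y) or epidemic or reviews.
    """
    return bool(
        (signals.get("x_invested_in_a", False)
         and signals.get("y_recommends_b", False))
        or signals.get("epidemic_in_c", False)
        or signals.get("great_reviews_for_d", False)
    )


def decide(signals):
    # Return a trade instruction when the rule fires, otherwise None.
    return ("trade", "E", "F") if rule_fires(signals) else None


print(decide({"x_invested_in_a": True, "y_recommends_b": True}))
print(decide({"x_invested_in_a": True}))
```

The hard part in practice is producing those booleans from news, reviews and event feeds, which is where the large data volumes come from.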


Assuming 40 pairs, 2 ticks per second and 1kb to store a tick (including timestamps, indexes etc), I reckon that's about 70TB for 40 years worth of data, not counting weekends.

I've assumed only outright ticks, hence the 40 pairs - in reality most banks store the crosses as well, which can be up to 1,600 pairs (40 x 40), which will get you into the PB range for the 40 years.
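The estimate above can be reproduced with a few lines of arithmetic. This is a rough sketch under the same assumptions (2 ticks per second around the clock, 1 KB taken as 1000 bytes per tick), plus one of my own: 260 trading days per year, i.e. 52 weeks of 5 weekdays.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TRADING_DAYS_PER_YEAR = 260  # assumption: 52 weeks x 5 weekdays, no holidays


def tick_storage_bytes(pairs, ticks_per_second, bytes_per_tick, years):
    """Total raw storage for uncompressed, fixed-size tick records."""
    ticks_per_pair = (
        ticks_per_second * SECONDS_PER_DAY * TRADING_DAYS_PER_YEAR * years
    )
    return pairs * ticks_per_pair * bytes_per_tick


outrights = tick_storage_bytes(pairs=40, ticks_per_second=2,
                               bytes_per_tick=1000, years=40)
crosses = tick_storage_bytes(pairs=1600, ticks_per_second=2,
                             bytes_per_tick=1000, years=40)

print(f"outrights: {outrights / 1e12:.1f} TB")  # about 72 TB
print(f"crosses:   {crosses / 1e15:.2f} PB")    # just under 3 PB
```

The result is dominated by the bytes-per-tick figure, so that is the assumption worth arguing about.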


One kilobyte per tick seems quite generous. All of the EURUSD tick data for 2013 from histdata.com (the source mentioned in the article) is only 515MB (~20GB for 40 years, ~824GB for 40 pairs).


I would say a "tick" comprises the timestamp (stored as a long int) and ten levels of the order book (bid price, ask price, bid size, ask size) each stored as a double-precision float or a long int, so that's

  (1 + 4 * 10) * 8 = 328 bytes
per tick, so 1KB isn't far off. Obviously not every level changes on every tick, so there are opportunities for compression, which can be significant.

Note that the "tick data" from histdata.com gives you prices sampled every 1 second (so not every tick) for the top level of the order book, and doesn't give you any size information at all.
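To make the 328-byte figure concrete, here is a sketch that packs one such record with Python's `struct` module. The field order and little-endian layout are my assumptions; the comment above only specifies a timestamp plus ten levels of four 8-byte values.

```python
import struct

LEVELS = 10

# "<" = little-endian with no padding; "q" = 8-byte signed integer,
# "d" = 8-byte double. One timestamp, then 4 values per book level.
TICK_FORMAT = "<q" + "dddd" * LEVELS
TICK_SIZE = struct.calcsize(TICK_FORMAT)


def pack_tick(timestamp_ms, levels):
    """levels: sequence of (bid_price, ask_price, bid_size, ask_size)."""
    flat = [value for level in levels for value in level]
    return struct.pack(TICK_FORMAT, timestamp_ms, *flat)


# Made-up order book, just to have something to pack.
book = [(1.2650 - i * 0.0001, 1.2651 + i * 0.0001, 1e6, 1e6)
        for i in range(LEVELS)]
record = pack_tick(1414022400000, book)

print(TICK_SIZE, len(record))  # both are (1 + 4 * 10) * 8 = 328
```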


No surprise: you need less than 3 bytes on average to store a tick...
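The comment doesn't say how to get down to a few bytes per tick. One common approach (my assumption, not necessarily what the commenter had in mind) is to store the difference from the previous tick and write it as a variable-length integer, so small changes take one or two bytes. A minimal sketch for top-of-book ticks with integer timestamps and prices:

```python
def zigzag(n):
    # Map signed to unsigned so small negatives stay small: 0,-1,1,-2 -> 0,1,2,3
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def unzigzag(z):
    return (z >> 1) if z % 2 == 0 else -((z + 1) >> 1)


def write_varint(value, out):
    # 7 payload bits per byte; the high bit marks "more bytes follow".
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)


def read_varint(data, pos):
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:
            return result, pos
        shift += 7


def encode_ticks(ticks):
    """ticks: list of (timestamp_ms, price_as_integer) pairs."""
    out = bytearray()
    prev_ts = prev_px = 0
    for ts, px in ticks:
        write_varint(zigzag(ts - prev_ts), out)
        write_varint(zigzag(px - prev_px), out)
        prev_ts, prev_px = ts, px
    return bytes(out)


def decode_ticks(data):
    ticks, pos, ts, px = [], 0, 0, 0
    while pos < len(data):
        d_ts, pos = read_varint(data, pos)
        d_px, pos = read_varint(data, pos)
        ts += unzigzag(d_ts)
        px += unzigzag(d_px)
        ticks.append((ts, px))
    return ticks


# Made-up sample: prices are integers in units of 0.00001.
sample = [(1414022400000, 126500), (1414022400120, 126501),
          (1414022400480, 126499)]
encoded = encode_ticks(sample)
print(len(encoded), "bytes for", len(sample), "ticks")
```

Only the first tick pays for the full timestamp and price; later ticks cost a few bytes each. How close real data gets to 3 bytes depends on tick spacing and price moves, and usually takes a general-purpose compressor on top.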


The library definitely deserves attention considering the fact that the current MetaTrader and NinjaTrader platforms are like dinosaurs, they feel very old. In the article you said that the stream of information is essential to the operation of Wolf. I also think that this is the fundamental part of a forex platform - the data should be accurate because you have to validate the forex strategy as much as you can during the backtesting phase. Have you considered replacing the current data provider (HistData.com) with something better? Actually we are on the same track because I am also implementing a forex trading platform and I have experienced a lot of problems just because the HistData database is not accurate enough for backtesting - it has a lot of empty records, sometimes a whole day of tick data can be missing.


Thank you for the comment. I agree that having sound data is essential to building any trading algorithm. That's probably why good data is so expensive. I won't complain about HistData.com: it's a free service, and they only charge a small fee for convenient FTP access to their site. At the moment, that's the most you can get for free with millisecond resolution. Despite its drawbacks, it works well for backtesting, IMHO.

A great alternative is finance.yahoo.com. It is reasonably rate limited, see http://stackoverflow.com/questions/9346582/what-is-the-query.... You won't be able to do any high-frequency trading anyway without dedicated hardware in a dedicated data center.


Were you actually able to get 1ms latency from Kafka? How long does it take an event to enter then exit Kafka?


No. I only had boxes with SATA drives. I don't see how you could get below 1ms latency with spinning disks. You would have to use an in-memory queue to get this sort of latency. Secondly, I had Storm consuming off Kafka every millisecond, so in the worst-case scenario a message would have to wait one millisecond in the queue anyway. Here's a great article about throughput/latency of Kafka: https://engineering.linkedin.com/kafka/benchmarking-apache-k....


"The foreign exchange market, or forex, is the biggest and the most liquid exchange service in the world with over $4 trillion worth of trades made every day."

Yes, and you are shut out of that market unless you're on the interbank (which generally means you are an actual bank). Retail forex "markets" are bucket shops that get first look at all their customer bids. The actual forex market is OTC.

http://www.ecnforex.co.uk/interbank-forex-markets-explained/

Trading retail forex is a mug's game.


I would disagree that it is entirely a mug's game.

There are some EAs for MetaTrader that will check if your broker is manipulating your SL/TP and stop the trade. Also if you're a medium/long term trader then it doesn't really matter what your broker does.


In that case: you should be trading futures.


Anyone know how I can run this? I was into FOREX a few months back and this tool looks very interesting. I need to exchange a large sum and want to make the most of the volatile FOREX market.


https://github.com/slawekj/wolf#getting-started

It suggests deploying modules on different machines, but you could probably try this out on 3-4 VMs (use SSDs).

There are rather simple deployment scripts (it uses apt-get, so get Ubuntu VMs), for some of the modules.

Set up a Hadoop cluster (HDFS, Hive, Camus) with YARN. You can run Storm on it, but there is also a deployment (install & run) script for running Storm standalone (with a standalone Zookeeper node).

So, it's a lot of work, but sounds really interesting, fun and immensely didactic!


A pilot installation of Wolf was made on 10 EC2 boxes, mainly m1.medium and m1.large if I remember correctly: 3 boxes for C*, 3 boxes for Hadoop, 1 for Kafka, 1 for Storm, 2 for everything else. You might be able to squeeze it into fewer machines, as Amazon is upgrading specs all the time. Deployment scripts are very simple at the moment, just a bunch of shell commands. That might be upgraded in the future to some magic automation. The scripts were tested on Ubuntu 12.04 but are not specific to Ubuntu, other than the package manager (apt-get). If you sed s/apt-get/yum the scripts, they will probably work on CentOS/Fedora. If you have a really strong machine you might even be able to squeeze everything into one box. I wouldn't recommend it though: keeping the state of all the clusters in one Zookeeper is very risky business, plus there are reliability issues. Unfortunately, it's not so simple to get Wolf from nothing to a working, trading system. To actually trade you will also need a data feed and a brokerage account.


I'm not ultra tech savvy. Just getting into programming, but I know bits and pieces. Is what the guy made superior to MetaTrader 5 or 4? How is it different from them? I'm running Ubuntu 14.04 as my default machine. Also, why is a huge cluster of Amazon servers required? Isn't it possible to run it on a home computer? I don't see any heavy computation. From what I understood, it harvests data from sources and displays it in plot form.


And it's able to go and get any historic data. It can scale up to thousands of machines (petabyte range), it can scale up to simultaneously ingest thousands of data flows, and so on, and continue to do whatever you want based on event processing (Storm can watch for any strange rules you can come up with and fire off events [triggers] to another component to execute trades), without stopping the flow for even a second.

It's a proof of concept in distributed systems engineering.


Because running a trading algorithm on a non-idempotent stack is such a good idea.


Very good remark! Storm by default implements at-least-once semantics. If a bolt process executed an event and then failed right away without acknowledging it, the execution would get replayed (so you would buy/sell twice). However, there's a remedy! By sacrificing latency you can have exactly-once semantics, see Trident, Summingbird.
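To show the replay problem in miniature: with at-least-once delivery the same event can arrive twice, so the order handler has to be idempotent. The sketch below is not Trident's mechanism and not Wolf's code. It is a simpler dedup-by-event-ID idea, and the class, the `place_order` callback and the event IDs are all hypothetical.

```python
class IdempotentExecutor:
    """Executes each event at most once by remembering event IDs."""

    def __init__(self, place_order):
        self._place_order = place_order
        # Assumption: an in-memory set is enough for a demo. A real system
        # would need durable storage that survives a process restart.
        self._seen = set()

    def handle(self, event_id, order):
        if event_id in self._seen:
            return False  # replayed event: skip, do not trade twice
        self._place_order(order)
        self._seen.add(event_id)
        return True


placed = []  # stands in for a broker API in this sketch
executor = IdempotentExecutor(placed.append)

order = {"pair": "EURUSD", "side": "buy", "units": 1000}
first = executor.handle("evt-1", order)
replay = executor.handle("evt-1", order)  # same event delivered again

print(first, replay, len(placed))
```

A crash between placing the order and recording the ID would still produce a duplicate, so in practice the dedup key usually has to be enforced on the broker side too, for example via a client-supplied order ID if the broker's API supports one.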


Yes. Because that's what was missing in this world. Yet another foreign exchange lottery platform for the rich kids to gamble around.

Edit: Oh boy - the all-I-wanna-be-is-rich egotistical nerds downvoted me. My soul is crushed.



