Even ignoring the practicality part, it becomes a timing game, because "empty" m...

schoen · on June 25, 2018

> Even ignoring the practicality part, it becomes a timing game, because "empty" messages - even if they were filled with unintelligible "random" hex - would traverse the network differently than ones with variable length/size content and would be able to be filtered out pretty quickly.

To eliminate the statistical observability of metadata, the padding needs to reach or exceed the maximum capacity of the channel. So you can't have people sending more messages than the padded channel permits per time period. In your example, packets "with variable length/size content" would need to be absolutely prohibited, or else all packets' length would need to be randomized, and message data would need to be sent following strictly the same distribution as padding messages.

For example, you and I could have a rule of exchanging exactly 1 MB of data per day, at a specified time, every day. Then an observer wouldn't be able to tell whether, on a particular day, we had actually communicated something to each other or just allowed the padding data to go out. Clearly in this system we're not ever allowed to use it to transmit more than 1 MB per day, without destroying the metadata unobservability property. An attacker still knows that you and I are part of a system that offers us an otherwise unobservable channel, but not when we do or don't make use of that channel.
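A minimal sketch of that fixed-size daily exchange, just to make the rule concrete. The names `PAYLOAD_SIZE`, `build_daily_payload`, and `extract_message` are hypothetical, and the sketch assumes the whole payload would be encrypted before it goes on the wire (not shown), so that an observer cannot read the length header and tell real data from padding.

```python
import os
import struct

# Hypothetical fixed daily quota: exactly this many bytes go out every day.
PAYLOAD_SIZE = 1_000_000
# 4-byte big-endian header holding the length of the real message.
HEADER = struct.Struct(">I")


def build_daily_payload(message: bytes = b"") -> bytes:
    """Return exactly PAYLOAD_SIZE bytes, whether or not there is a message."""
    capacity = PAYLOAD_SIZE - HEADER.size
    if len(message) > capacity:
        # Sending more than the quota would break the unobservability property.
        raise ValueError("message exceeds the fixed daily quota")
    padding = os.urandom(capacity - len(message))
    return HEADER.pack(len(message)) + message + padding


def extract_message(payload: bytes) -> bytes:
    """Recover the real message (possibly empty) from a daily payload."""
    if len(payload) != PAYLOAD_SIZE:
        raise ValueError("payload does not have the agreed fixed size")
    (length,) = HEADER.unpack_from(payload)
    if length > PAYLOAD_SIZE - HEADER.size:
        raise ValueError("corrupt length header")
    return payload[HEADER.size:HEADER.size + length]
```

Calling `build_daily_payload()` with no argument on a quiet day and `build_daily_payload(b"meet at noon")` on a busy one both yield blobs of the same size, which is the point: the size and timing are fixed in advance, so neither carries information.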

There are lots of variants that also allow many-to-many messaging, again at a high cost in overhead, latency, and availability.

jancsika · on June 26, 2018

> Clearly in this system we're not ever allowed to use it to transmit more than 1 MB per day, without destroying the metadata unobservability property.

You're also not ever allowed to transmit links or anything else that goads the user into fetching a remote resource in response to a message.

394549 · on June 25, 2018

>>>> but you can't completely conceal metadata.

> For example, you and I could have a rule of exchanging exactly 1 MB of data per day, at a specified time, every day.

Depending on the size and popularity of the relay network, the fact that the two parties are connected to it could itself be valuable metadata.

If you really wanted to minimize the amount of metadata to something that's almost useless, you'd probably need to use something like a continuously-operating broadcast numbers station.

https://en.wikipedia.org/wiki/Numbers_station

AndrewKemendo · on June 25, 2018

On its face such a scheme seems theoretically robust, but only against frequency correlation. I'd be curious whether in practice it would be possible to eliminate all the other sources of variability, of which there are many. For example, I'm unaware of any true solution to latency triangulation.

My hunch is that it wouldn't be possible, and there would be a side-channel vulnerability somewhere.

schoen · on June 25, 2018

I'm not proposing a low-latency interactive approach, so latency triangulation shouldn't apply. In my example mechanism, we always have to wait a full day until sending any reply, so there's no event that an attacker can use to measure latency from.

Edit: this line of research begins with the Dining Cryptographers problem.

https://en.wikipedia.org/wiki/Dining_cryptographers_problem

Although Chaum's solution has terrible availability properties, it's unconditionally secure against outsiders!
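For anyone who hasn't seen it, here is a rough sketch of a single one-bit round of a dining-cryptographers network. The function name `dc_net_round` and its parameters are made up for illustration, and the sketch assumes every pair of participants already shares a secret coin flip, which for three diners matches the classic setup. Each participant announces the XOR of the coins they share, and the sender additionally XORs in the message bit. Every coin appears in exactly two announcements, so the coins cancel when all announcements are XORed together and only the message bit remains.

```python
import secrets
from itertools import combinations
from typing import List, Optional, Tuple


def dc_net_round(n: int, sender: Optional[int], bit: int = 0) -> Tuple[List[int], int]:
    """Simulate one round with n participants; sender=None means nobody sends."""
    # One secret coin per pair of participants (assumed pre-shared in practice).
    coins = {pair: secrets.randbits(1) for pair in combinations(range(n), 2)}

    announcements = []
    for i in range(n):
        value = 0
        for pair, coin in coins.items():
            if i in pair:
                value ^= coin  # XOR in every coin this participant shares
        if i == sender:
            value ^= bit  # only the sender flips in the message bit
        announcements.append(value)

    # Anyone can combine the public announcements; the coins cancel out.
    result = 0
    for value in announcements:
        result ^= value
    return announcements, result
```

The combined result equals the sender's bit, or 0 if nobody sent. The availability problem is visible even in this toy: if any one participant withholds or garbles an announcement, the round's output is useless for everyone.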