This distinction between informational asymmetry and security through obscurity seems artificial. Doesn't security through obscurity rely on informational asymmetry by definition? What is the distinction here? It would be more honest to say that security through obscurity sometimes works, and sometimes there is no alternative to it. Bot prevention is such a case. I'm not aware of any open-source bot prevention system that works against determined attackers.
Any real world security system relies to some extent on security through obscurity. No museum is going to publish their security system.
It's only in the digital world that certain things, such as encryption, can be secure even when an adversary understands the entire system; security through obscurity is frowned upon in that context because it shouldn't be necessary.
But this is a special case. Security is mostly a Red Queen's race, and "obscurity" or "informational asymmetry" is an advantage the defenders have.
The concept that both Alec and Cory are dancing around but do not name
directly is basically Kerckhoffs's principle [1].
They're both right: Alec in saying that open, detailed knowledge of
moderation algorithms would harm the mission, and Cory in saying that
a protocol/value-level description of moderation gives insufficient
assurance.
That's because abuse detection isn't cryptography in which the
mechanism can be neatly separated from a key.
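To make that contrast concrete, here's a minimal sketch (plain Python standard library, nothing to do with any real moderation system): in cryptography the whole mechanism can be public because the security lives entirely in the key.

    # Minimal illustration of the mechanism/key separation (Kerckhoffs).
    # The algorithm (HMAC-SHA256) is completely public; only the key is secret.
    import hashlib
    import hmac
    import secrets

    key = secrets.token_bytes(32)      # the only secret in the whole design
    msg = b"approve transfer #1234"

    tag = hmac.new(key, msg, hashlib.sha256).digest()

    # Anyone holding the key can verify; an attacker who knows everything
    # about the mechanism but lacks the key cannot forge a valid tag.
    assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest())

Abuse detection has no equivalent 32-byte secret that can be rotated while everything else is published; the detection mechanism itself is part of what the attacker must not know.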
>This distinction between informational asymmetry and security through obscurity seems artificial. Doesn't security through obscurity rely on informational asymmetry by definition?
It depends on your definition:
- (1) "security through obscurity" is an unbiased term with no negative connotations which simply describes a situation without judgement. Parsing that phrase in this purely logical way means "information asymmetry" is a distinction without a difference. This neutral meaning is what your comment is highlighting.
... or ...
- (2) "security through obscurity" is a negative cultural meme and the recipients of that phrase are people who are incompetent in understanding security concepts. E.g. they don't realize that it's a flaw to hide the password key in a config file in a undocumented folder and hope the hackers don't find it. It's this STO-the-negative-meme that the blog post is trying to distance itself from by emphasizing a alternative phrase "informational asymmetry". Keeping the exact moderation rules a "secret" is IA-we-know-what-we're-doing -- instead of -- STO-we're-idiots.
The blog author is differentiating from (2) because that's the meaning Cory Doctorow used in sentences such as "In information security practice, “security through obscurity” is considered a fool’s errand." : https://www.eff.org/deeplinks/2022/05/tracking-exposed-deman...
I, and I suspect many like me, realise that the truth lies somewhere in the middle.
Do I use security-by-obscurity? Of course! I know that my server is going to get hammered with attempts to steal info, and I can see /path/to/webroot/.git/config getting requested several times an hour, so I don't put important stuff in places where it might be accessed. Even giving it a silly name won't help; it has to simply not be there at all. Relying on that kind of security-by-obscurity is asking for trouble.
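For what it's worth, "I can see it getting requested" is just a quick pass over the access log, something like this rough sketch (the log path and the "first field is the client IP" assumption are specific to common/combined Apache/nginx logs; adjust for your own setup):

    # Rough sketch: count which IPs are probing for /.git/config in an access log.
    # Log path and format are assumptions, not anyone's real configuration.
    from collections import Counter

    hits = Counter()
    with open("/var/log/nginx/access.log") as log:
        for line in log:
            if "/.git/config" in line:
                hits[line.split()[0]] += 1    # first field is the client IP

    for ip, count in hits.most_common(10):
        print(ip, count)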
Sure as hell though, if I move ssh off of port 22 then the number of folk trying to guess passwords drops to *zero*, instantly.
I’ve found dictionary attacks against non-22 ssh to be effectively zero. Maybe once in a while you get a port scan and a poke or two, but anyone able to move ssh off of 22 is not likely to have the password “p4ssw0rd”.
> I'm not aware of any open source bot prevention system that works against determined attackers.
It works just fine if you're willing to move to an invite only system and ban not just the bot, but the person that invited them. Possibly even another level up.
The problem with this system is that it leads to much less inflated numbers about active users, etc. So very few companies do it.
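A rough sketch of the mechanics (entirely hypothetical names and policy, just to show how small the moving parts are):

    # Invite tree with cascading bans: each account remembers who vouched for
    # it, and a ban can walk one (or more) levels up the tree. Hypothetical.
    invited_by = {}   # account -> the account that invited it (None for founders)
    banned = set()

    def register(account, inviter=None):
        invited_by[account] = inviter

    def ban(account, levels_up=1):
        banned.add(account)
        parent = invited_by.get(account)
        while parent is not None and levels_up > 0:
            banned.add(parent)        # the inviter vouched for a bot: they go too
            parent = invited_by.get(parent)
            levels_up -= 1

    register("alice")
    register("bot123", inviter="alice")
    ban("bot123")                     # bot123 and alice are now both banned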
Such a system is still vulnerable (I'd daresay even more so) to account takeovers. And it might even have cascading effects depending on how your ban-one-level-up goes. To a first approximation, even if one user can only invite 2 users, exponential growth will mean that bots may still pose a problem.
> vulnerable (I'd daresay even more so) to account takeovers.
Not more so. Vulnerability is a function of defensive capacity, and there is no reduced defensive capacity here. If anything, knowing who invited whom enables web-of-trust checks on suspicious logins, allowing for more stringent guards.
> To a first approximation, even if one user can only invite 2 users, exponential growth will mean that bots may still pose a problem.
In these types of systems users earn invites over time and as a function of positive engagement with other trusted members. Exponential growth is neutered in such systems because the lag imposed on bad actors, and the natural pruning of the tree of bots and other abusive accounts, lead to a large majority of high-quality trusted accounts. This means that content flagging is much more reliable.
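As a rough illustration of that lag (every threshold below is a made-up number, not any real system's policy):

    # Invites accrue slowly, gated on account age and positive engagement, so a
    # fresh ring of bot accounts can't fan out exponentially. Numbers invented.
    def invites_available(age_days, upvotes, flags, invites_used):
        if age_days < 30 or flags > 2:
            return 0                           # too new or too suspicious to vouch
        earned = min(age_days // 90, 1 + upvotes // 50)
        return max(earned - invites_used, 0)

    print(invites_available(age_days=200, upvotes=120, flags=0, invites_used=1))  # 1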
So, yes, bots are still a (minor) problem, but the system as a whole is much more robust, and unless there is a severe economic incentive to do so, most bot operators understand that the lower-hanging fruit is elsewhere.
You misunderstand some of the vulnerabilities then. Bad actors on the systems are not the only weaknesses of the system.
Other systems are potential weaknesses of your system.... But what do I mean by that?
If other systems have better ease of use while blocking 'enough' bad actors, it is likely your exceptionally defensive system will fail.
"I got blocked from SYSTEM1 for no reason, hey everyone, let's go to SYSTEM2" is a risk if one of the blocked people is high visibility, and these kinds of accounts tend to lead the operator to make special hidden rules, which tend to fall under security-by-obscurity of the rules.
I do not think every weakness is a vulnerability. Were it so, then all things would be vulnerabilities to some agent or entity in some sense, since weakness is often defined in relative terms.
If another system is easier to use for, say, signup, then of course the system I propose will have to leverage its strengths to make up for the fact that it is harder to join. But there are plenty of nightclubs and restaurants that one simply cannot get into unless they know someone. They're often the most acclaimed.
If the system I propose leads to a celebrity's ouster then the mechanisms and business orientation of the system would need to leverage that ouster to its benefit after, of course, making sure that the system as designed only ousts truly irritable people.
One may say, "but if a given celebrity attracts many people and their ouster would lead many of them off of the platform how can this be used to our benefit?"
But it is precisely this dynamic that is the chief strength of the system I advocate for! Those who would leave merely at the call of a scorned leader, despite having a more fruitful and productive conversation on the platform, are probably the least likely to positively engage with the platform in the first place.
Um, so you're reinventing the private club... quite an original idea.... But not really a useful one in any sense. You don't need any special technology to implement this idea: just have a membership board that votes members in or out if it's such an exclusive group. This in itself is a rather solved problem, and not one worth discussing when addressing issues on a larger scale.
What I advocate for is not a simple up-and-down vote by the existing members. What I advocate for is a web-of-trust without the cryptography and with some sane AI to handle things like responding to malicious account takeovers and other tricky bits like dealing with forged identities.
This isn't my invention. This has been used before in a limited form at Dribbble and it worked to keep the riff-raff out.
It's way too much friction. It's why we are commenting here, on this open site, instead of lobste.rs
As an absolute last resort, yes, invite only does work. But people will only seek out invites if there is something extremely desirable to be found on the site like a private torrent tracker.
That's all well and good, but I can no longer tell the difference between Big Tech's "Abuse Prevention" and abuse. They need transparency not because it's going to make their job easier. They need transparency because literally millions of people hate their companies and don't have one iota of trust in their internal decision-making. Big Tech workers might think all those people are morons and can be ignored indefinitely. In reality, it simply doesn't work this way.
You think you want transparency and that it’ll make you trust them, but it won’t. Even if you found out how those decisions are made, it won’t make a difference.
Here’s something I wrote a couple days ago (https://news.ycombinator.com/item?id=33224347). It’ll tell you how one component of Meta’s content moderation works. Read it and tell me if it made you suddenly increase the level of trust you have in them.
What will actually happen is that you’ll cherry pick the parts that confirm your biases. Happy to be proven wrong here.
Reading this article does, in fact, increase my trust that my Facebook account won't be randomly, irrevocably banned one day a la Google.
The trouble is, that's not the main thing I distrust about Facebook; I don't trust that the power they have to shape people's opinions by deciding what to show them won't be abused to make people think things that are good for Facebook but bad for society at large.
So while that article does increase my trust in Facebook in general, the magnitude of that increase is minuscule, because what it addresses is not the reason for my lack of trust.
But you're right that transparency wouldn't solve that, because it's only the first step. If Facebook were to transparently say "we are promoting far-right conspiracy theories because it makes us more money", and provide a database of exactly which things they were boosting, then perhaps I would "trust" them, but I certainly wouldn't "like" them.
It would be nice to know why the meme I posted got flagged when it didn't meet Facebook's vague 'Community Standards'. These platforms are enormous black boxes where their decision is final and there is no way to appeal, short of literally going into their building and asking to talk to the manager, which is beyond most people's reach and not worth the effort. They would rather let content get censored than go out of their way to appeal.
It does look like they do tell you which part of the standards was violated, and are pretty detailed / well-defined about what those are? Never really had that happen to me, though, since I barely use FB.
Right. Read that first. Also the Santa Clara Principles that Doctorow mentions.[1]
Now, a key point there is freedom from arbitrary action. The Santa Clara Principles have a "due process clause". They call for an appeal mechanism, although not external oversight. Plus statistics and transparency, so the level of moderation activity is publicly known, to keep the moderation system honest.
That's really the important part. The moderation process is usually rather low-quality, because it's done either by dumb automated systems or people in outsourced call centers. So a correction mechanism is essential.
It's the failure to correct such errors that gets companies mentioned on HN, in those "Google cancelled my account for - what?" threads.
The "Abuse prevention is tradecraft" author has hold of the wrong end of the problem.
Note that Facebook has the Oversight Board to handle appeals and I assume such appeals must necessarily reveal the underlying decision making process. https://www.oversightboard.com/
> I assume such appeals must necessarily reveal the underlying decision making process.
Probably not the parts they keep secret. The Oversight Board can make a decision about content based on the content itself and publicly-available context.
Whatever tells the automated system that initially flagged it relied on don't need to be revealed, and the feedback from the Oversight Board probably isn't "make these detailed changes to the abuse detector algorithm" but a more generalized "don't remove this kind of stuff".
I associate infosec with code. I associate content moderation with humans. Where things get challenging is when code is doing content moderation. The executive privilege I extend to human content moderators to discuss in private and not to explain their decision becomes a totally different thing when extended to code.
I think the real problem is that content moderation is expensive, you're going to have to pay for it one way or another, and the current system doesn't pay enough for that kind of content moderation.
His point that you basically have to go full Reddit, with community-integrated moderation, is pretty much right: that's going to be the only sustainable, stable system IMO.
However this is actually an argument for finer-grained, better resourced and (ideally) community-integrated moderation — so the communities themselves can police their own membership — noting in passing that such will of course permit (e.g.) white supremacists to protect themselves from harmful, hurtful ideas such as liberalism, equality and equity.
I think the problem is that if Facebook, Twitter and similar platforms were to publicly present an unambiguous definition of what 'abusive content' is, then it would become fairly clear that they're engaging in selective enforcement of that standard based on characteristics of the perpetrator such as: market power, governmental influence, number of followers, etc.
For example, if the US State Department press releases start getting banned as misinformation, much as Russian Foreign ministry press releases might be, then I think this would result in a blowback detrimental to Facebook's financial interests due to increased governmental scrutiny. Same for other 'trusted sources' like the NYTimes, Washington Post, etc., who have the ability to retaliate.
Now, one solution is just to lower the standard for what's considered 'abusive' and stop promoting one government's propaganda above another's, and focus on the most obvious and blatant examples of undesirable content (it's not that big of a list), but then, this could upset advertisers who don't want to be affiliated with such a broad spectrum of content, again hurting Facebook's bottom line.
Once again, an opportunity arises to roll out my favorite quote from Conrad's Heart of Darkness:
"There had been a lot of such rot let loose in print and talk just about that time, and the excellent woman, living right in the rush of all that humbug, got carried off her feet. She talked about ‘weaning those ignorant millions from their horrid ways,’ till, upon my word, she made me quite uncomfortable. I ventured to hint that the Company was run for profit."
None of this tech nerdery means a whole lot without "skin in the game." First, ensure that we have real liability and/or regulation in place, similar to the FDA and such, and THEN begin to work on solutions. I'm certain answers will reveal themselves much quicker.
There's a lot of debate about liability right now in the context of section 230 and it's not obvious to me that more liability will create better outcomes. It could just as easily lead to either an unmoderated digital hellscape or all social media being shut down.
People get super confused about the differences between abuse prevention, information security, and cryptography.
For instance, downthread, someone cited Kerckhoffs's principle, which is the general rule that cryptosystems should be secure if all information about them is available to attackers short of the key. That's a principle of cryptography design. It's not a rule of information security, or even a rule of cryptographic information security: there are cryptographically secure systems that gain security through the "obscurity" of their design.
If you're designing a general-purpose cipher or cryptographic primitive, you are of course going to be bound by Kerckhoffs's principle (so much so that nobody who works in cryptography is ever going to use the term; it goes without saying, just like people don't talk about "Shannon entropy"). The principle produces stronger designs, all things being equal. But if you're designing a purpose-built bespoke cryptosystem (don't do this), and all other things are equal (ie, the people doing the design and the verification work are of the same level of expertise as the people whose designs win eSTREAM or CAESAR or whatever), you might indeed bake in some obscurity to up the costs for attackers.
The reason that happens is that unlike cryptography as, like, a scientific discipline, practical information security is about costs: it's about asymmetrically raising costs for attackers to some safety margin above the value of an attack. We forget about this because in most common information security settings, infosec has gotten sophisticated enough that we can trivially raise the costs of attacks beyond any reasonable margin. But that's not always the case! If you can't arbitrarily raise attacker costs at low/no expense to yourself, or if your attackers are incredibly well-resourced, then it starts to make sense to bake some of the costs of information security into your security model. It costs an attacker money to work out your countermeasures (or, in cryptography, your cryptosystem design). Your goal is to shift costs, and that's one of the levers you get to pull.
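A toy version of that cost framing, with invented numbers, just to show the shape of the trade-off:

    # An attack is rational only when expected payoff exceeds the cost of an
    # attempt. Defenses don't need to be absolute; they need to push attacker
    # cost above attack value. All figures are invented for illustration.
    def attack_is_rational(payoff, success_rate, cost_per_attempt):
        return payoff * success_rate > cost_per_attempt

    # Cheap, fully understood countermeasures: attacking still pays.
    print(attack_is_rational(payoff=5.00, success_rate=0.20, cost_per_attempt=0.10))  # True

    # Obscure, frequently rotated countermeasures: each attempt costs more and
    # lands less often, so the same attack stops being worth mounting.
    print(attack_is_rational(payoff=5.00, success_rate=0.02, cost_per_attempt=0.50))  # False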
Everybody --- I think maybe literally everybody --- that has done serious anti-abuse work after spending time doing other information security things has been smacked in the face by the way anti-abuse is entirely about costs and attacker/defender asymmetry. It is simply very different from practical Unix security. Anti-abuse teams have constraints that systems and software security people don't have, so it's more complicated to raise attacker costs arbitrarily, the way you could with, say, a PKI or a memory-safe runtime. Anti-abuse systems all tend to rely heavily on information asymmetry, coupled with the defender's ability to (1) monitor anomalies and (2) preemptively change things up to re-raise attacker costs after they've cut their way through whatever obscure signals you're using to detect them.
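In sketch form, that monitor-and-change-things-up loop looks something like this (the feature names and thresholds are invented, not anyone's real detector):

    # Score accounts on a secret subset of signals; when the catch rate sags
    # (a hint that attackers have adapted), rotate to a different subset to
    # re-raise their costs. Purely illustrative.
    import random

    ALL_SIGNALS = ["ua_entropy", "scroll_jitter", "ip_reputation",
                   "signup_hour", "jpeg_quant_table", "tls_fingerprint"]

    class Detector:
        def __init__(self, n_active=3):
            self.n_active = n_active
            self.active = random.sample(ALL_SIGNALS, n_active)  # the secret subset

        def looks_like_bot(self, features, threshold=2):
            # features: dict mapping signal name -> 0/1 for one account/request
            return sum(features.get(sig, 0) for sig in self.active) >= threshold

        def rotate_if_evaded(self, recent_catch_rate, expected=0.9):
            # A collapsing catch rate suggests the obscure signals were
            # reverse-engineered; swap them out rather than publish them.
            if recent_catch_rate < expected / 2:
                self.active = random.sample(ALL_SIGNALS, self.n_active)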
Somewhere, there's a really good Modern Cryptography mailing list post from... Mike Hamburg? I think? I could be wrong there --- about the Javascript VM Google built for Youtube to detect and kill bot accounts. I'll try to track it down. It's probably a good example --- at a low level, in nitty-gritty technical systems engineering terms, the kind we tend to take seriously on HN --- of the dynamic here.
I don't have any position on whether Meta should be more transparent or not about their anti-abuse work. I don't follow it that closely. But if Cory Doctorow is directly comparing anti-abuse to systems security and invoking canards about "security through obscurity", then the subtext of Alec Muffett's blog post is pretty obvious: he's saying Doctorow doesn't know what the hell he's talking about.
Having worked in anti-abuse for nearly 20 years this is spot on. Even if it were possible, publishing “the algorithm” isn’t going to solve anything. It’s not like it can be published in secret or avoid being instantly obsolete.
All of this is an exercise balancing information asymmetry and cost asymmetry. We don’t want to add more friction than necessary to end users, but somehow must impose enough cost on abusers in order to keep abuse levels low.
Unfortunately for us, it generally costs far less for attackers to bypass systems than defenders to sustain a block.
As defenders we work to exploit things in our favor - signals and scale. Signals drive our systems, be it ML, heuristics, or signatures (or more likely a combination). Scale lets us spot larger patterns in space or time. At a cost. 99%+ effective systems are great, but at scale 99% is still not good enough. Errors in either direction will slip by in the noise, especially targeted attacks.
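To put rough numbers on why 99% isn't good enough at scale (all figures invented for illustration):

    # With a billion items a day and a tiny abuse base rate, even small error
    # rates dominate: plenty of abuse still slips through, and far more
    # legitimate users get wrongly flagged. Invented numbers.
    items_per_day  = 1_000_000_000
    abuse_rate     = 0.001      # 1 in 1000 items is actually abusive
    true_pos_rate  = 0.99       # 99% of abuse is caught
    false_pos_rate = 0.01       # 1% of legitimate items get wrongly flagged

    abusive = items_per_day * abuse_rate
    legit   = items_per_day - abusive

    print("missed abuse per day:   ", round(abusive * (1 - true_pos_rate)))   # ~10,000
    print("false positives per day:", round(legit * false_pos_rate))          # ~9,990,000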
As a secondary step, some systems can provide recourse for errors. Examples might include temporary or shadow bans, rate limiting, error reporting, etc. Unfortunately, cost asymmetry comes into play again: it is far more costly to effectively remediate a mistake than it is to report one.
All of this is suboptimal. If we had a better solution, it would be in place. Building and maintaining these systems is expensive and won’t go away unless something better comes along.
I think a big part of why this is a focus nowadays is because some "community standards" started crossing into political canards as abuse types, so normies who are not spammers are starting to bump into anti-abuse walls, which don't create real appeal processes because that is too expensive. Now the political class is starting to demand expensive things as a result, and they have the guns.
In the past the rules were obvious easy wins like "no child porn" and "no spam". Nobody really gave a shit about most anti-abuse, and in fact welcomed it, because they never ran into it with their normie behavior.
These platforms, to reduce the 'political' costs of their anti-abuse systems, need to drop community standards that are becoming political canards, and say that if political canards are to be enforced one way or another, it has to become law. That creates a much higher barrier for the political class to enact, because they have another political camp on the other side of the aisle fighting them tooth and nail; all political canards have multiple sides.
That might mean dropping painful things like enforcement against coronavirus misinformation, against violent hate speech toward LGBT groups in certain countries, and even against voting manipulation, because you have to let the political class determine the rule set there, not the company itself. Otherwise it will be determined for you, in a really bad way, even in the USA.
I mean, all of this might be true or it might not be true, but either way: if Cory Doctorow is appealing to "security through obscurity" to make his argument, he's making a clownish argument.
Yeah, I'm not even thinking about Cory, just talking about this general issue and why it has become an issue in the past 7 years, vs any other time. I really think it comes down to enforcing political things as rules, and I'm suggesting to any lurker who works in anti-abuse in big tech that you need to start putting a price on enforcing political rules, much like you do in many other parts of anti-abuse as you explained, or you're going to destroy the company eventually.
I know that would also be really hard in most big tech, because unfortunately there is a specific political opinion culture there, and basically suggesting that you stop enforcing LGBT hate speech rules is not going to go well with the general employee population. Puts them between a rock and a hard place, so it would probably have to be done confidentially.
Having moderated, it is obvious to any moderator that a bit of opaqueness goes a long way; the reasons that posts get filtered as spam are never publicly disclosed, for instance.
However, I don’t really know if secret courts, where posts are removed and people are banned based on secret laws, are really the way to go, regardless of their effectiveness, just because of Facebook's claims of benevolence.
Except the examples given of IA are so broad as to eliminate the distinction between IA and STO. Knowing a value that is in a space larger than 2^64 possibilities is qualitatively different from knowing something in a space of only millions of possibilities. The real difference with content moderation is that it's a cat-and-mouse game (or Red Queen's race, as another commenter said).
It's more like being able to teleport all the keys in all the houses from under the doormat to under a rock in the garden once you notice thieves are checking the doormat. This would, in fact, appreciably increase house security on average, while still being STO.
Upon further reflection, the question is "how hard is it to find the needle in the haystack?"
If you use a 128 bit key, but use a non-time-constant compare somewhere, then it's pretty darn easy to find the needle.
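Concretely, that failure mode looks like the difference below (simplified; real timing attacks are statistical, but this is the shape of the leak):

    # A naive byte-by-byte compare returns earlier the sooner the guess is
    # wrong, which leaks how many leading bytes match and turns a 2^128 search
    # into a byte-at-a-time one. hmac.compare_digest takes the same time no
    # matter where the mismatch occurs.
    import hmac

    def naive_compare(a: bytes, b: bytes) -> bool:
        if len(a) != len(b):
            return False
        for x, y in zip(a, b):
            if x != y:
                return False      # early exit: timing reveals the matching prefix
        return True

    secret = bytes(16)            # stand-in for a real 128-bit key
    guess  = bytes(15) + b"\x01"

    naive_compare(secret, guess)          # timing depends on the matching prefix
    hmac.compare_digest(secret, guess)    # constant-time: no such leak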
This is why the JPEG fingerprinting example from TFA doesn't qualify to be in the same category as a properly secured cryptographic key. They can notice that non-picture posts are not blocked, but picture posts are, which already greatly narrows it down. They could post a picture generated from the actual client, and see it go through, and narrow it down even more. That's not even that hard of a one for an attacker to figure out. It's much closer to "key under doormat" than "random key"
The degree of nuance in the parent comment is why I find informational asymmetry a useful way of understanding my frequent uneasy intuitions in discussions of security.
What I mean is that informational asymmetry frames security within engineering practice, though perhaps at the price of winning the internet.
I kinda agree (and I wrote the cited article) but as soon as you pick a number (2^40? 2^64? 2^80? 2^128?) you are painting a huge target on your forehead, when it's better to teach people that the point is the asymmetries (plural) and how you use, combine and compose them.
Content hosters (YouTube, Facebook, Twitter, etc.) need to delegate moderation to third parties and allow users to choose which third party they want moderation from. They should only take action for everyone when they are legally required to.
If you want that, you can get most of the way there with ActivityPub (Mastodon/Pixelfed/Friendica/etc...) and your choice of service provider. The problem, of course, is that the big social platforms so dominate content discovery that things not shared there are unlikely to find a large audience.
I think they could both be right. Sure, you don't want to give away the technical tells (TLS client version, etc). But if something is being moderated for its actual content, then I think it could be beneficial to say why. While you don't want nefarious groups corrupting the public perception through misinformation, you also don't want platforms doing this by suppressing legitimate speech.
I'm having a complicated thought... the same points he makes about information asymmetry in relation to the preservation of value are at play in political (i.e. public) games.
I didn't even know there were Santa Clara Principles; in a rough sense, this is maintaining some sort of value asymmetry between the people who have read them and those who don't even know such principles exist.
I seem to be thinking that information asymmetry is statecraft, a "super-set" of the notion of abuse prevention (IA and security through obscurity) as tradecraft (because the state contains the market/trade).
You are now stumbling into the dirty secret of how a large part of the world works, and the #1 priority for remediation if you have even a modicum of intention to make inroads at all into substantially changing things.
Info Asymmetry is the basis of power/entrenchment.