Hacker News

> I’m trying to create a space where people can gather like the RSS feed mentioned, but where they own their own writing, and can profit off of it if they want to opt in to letting it be trained. It sounds a lot easier than it is, the problem is a little weird.

I mean, maybe I'm just defeatist, but it sounds near-impossible to me. The companies that train AI models have already shown that they don't give a damn about creator rights or preferences. They will happily train on your content regardless of whether you've opted in.

So the only way to build a "space" that prevents this is by making it a walled garden that keeps unauthorized crawlers out entirely. But how do you do that while still allowing humans in? The whole problem is that bots have gotten good enough at (coarsely) impersonating humans that it's extremely difficult to filter them out at scale. And as soon as even one crawler manages to scrape your site, the cat's out of the bag.

You can certainly tell people that they own their content on a given platform, but how can you hope to enforce that?




The counter is poisoning the well of training data, not trying to hide it.

Crawling the web is cheap. Finding hidden land mines in oceans of data can be next to impossible: a publisher can tell when something isn’t being crawled, but the people doing the crawling can’t inspect even a tiny fraction of what’s being ingested.


You’re getting at what I am saying a little, I think. You want to scrape my data, and I can prove that you did? The way the legislation is going in certain areas, I’m pretty sure there will be a crackdown. I am also pretty sure a sufficiently large userbase could mess something up for a scraper. I think, anecdotally, we’re seeing evidence of this type of warfare already. And yeah, the challenge is not letting bots in. But you don’t even have to worry about that so much as this: if the data can be shown to be manipulated and twisted to serve an agent’s nefarious interests, whatever they may be, you’re gonna get a flood of users that look and act real but aren’t.

It’s an interesting problem that I think is solvable. Traction is one issue; another is building a product appealing enough that people feel comfortable they’re not being exploited.

Like, if you want this to work properly, you have to shut it off from every other part of the internet that can become bothersome with bot behavior: federated logins, social media, secure proxies, etc. Nothing touches it. Treat it like the Blackwall in Cyberpunk (actually what inspired me). I would pay for this. Like, a lot. But that is a difficult sell, because migrating off these apps requires legit lifestyle changes, and people (rightfully) want both.

I get worked up sometimes on the topic. I am dubious (but sometimes wrong) about AI capabilities, but if I take some of what is said at face value, I strongly believe a day is coming, and may already be here, when you will have zero guarantee that someone you are talking to isn’t a bot, or an AI, or even a video/voice-like agent based on a real person. That future is a destroyed internet. I think people should probably get around to thinking about what a disaster that would be.


I am glad people smarter than me are thinking about it. I seriously suck at networking so I don't trust my thoughts on potential solutions. Maybe the issue is that we are trying to solve it as a technical problem, while the problem is we don't know who is really human, which seems a little closer to meatspace.


Part of the issue for me is that it feels like pissing into the wind. Some of my ideas can’t even be implemented on current-gen iOS, because of all the “smart” AI features that try to gobble up everything I do on it. I know better now than to give that stuff out, and most people have some sense, but most apps work a lot better when that’s turned on and completely break when it’s off, so aren’t they kind of being coercive there? Windows is becoming more like this too. I don’t use Android much if I can help it, because being tied into Google’s system is often incredibly annoying. So what is left? No one can just recreate a sane, semi-comfortable, less invasive internet from scratch. But if I can’t even trust my own tech not to tattle on me for crimes as “egregious” as the accelerometer in my phone reporting that I braked a little too hard, welp, congrats, now my policy went up even though I didn’t do anything wrong and likely even avoided an accident. That’s my main gripe with this tech: just let me do what I want without trying to subtly influence or manipulate me, and let me get off of these invasive applications.


Oddly, this is why Russia's tests[1] and China's firewall[2] may end up being the unfortunate end state of the splinternet. I am on Android and have played with some of the alternatives. Sadly, you are right: it is genuinely hard to get away from the convenience, and I care. It would be impossible to get someone who doesn't care to make that jump.

I don't want to start my rant on cars, because even now I am debating buying an old clunker just to avoid some of the technology in modern cars (not that it would stop cell snooping on me).

[1] https://www.pcmag.com/news/russia-tests-cutting-off-access-t...

[2] https://en.wikipedia.org/wiki/Great_Firewall


There have probably been things written about it in more technical detail, but the Chinese firewall became an issue in World of Warcraft when players discovered the packet sniffing and used addons to take out Chinese players and guilds by dropping malicious keywords into packets (WoW addons also have the additional problem of having way too much machine access). Honestly, I take no political position here at all, because I think most countries are engaging in some form of this behavior. But in the case of WoW, it basically forced Chinese players, who were becoming quite competitive, out of the game entirely. I felt really bad for them. How do you even mitigate issues like that in a global internet? I have no clue. I’m sure scrapers from all sorts of countries are tripping over all sorts of censorship stuff, which is why I think a tightly protocoled, isolated section of the web would likely be a good thing to have as a safety raft.



