I see that the perpetually-offended have already shown up.
My advice is to either simply ignore them, or point and laugh (your choice).
The one thing you can count on with them is that they are never going to be satisfied. Most likely they're not even actually offended, not really. They just enjoy bullying other people.
That's such a coincidence. I was browsing Replicate last night for new models, played with this very same one, and was impressed by the results (well, mostly). Kudos to you for turning it into something more accessible and for the background removal idea. For everyone else, Replicate is well worth playing around with if you want to tinker with some models without much commitment.
Super cool! I would add a checkbox to toggle whether you want background removal, or make that an optional second step. In about half the cases I tried, the background was an important part of my image and without it only weird artifacts remained.
Just came off your Elixir talk photos on Twitter to find this on HN. Please share your ElixirConf talk recording. Producing a nice emoji/avatar for an AI agent seems like a cool idea.
Doesn't this cost money? Could you use a model from Hugging Face too?
I'm trying to get into this kind of stuff, but I want to use free or local AI models to avoid bills.. lol
Yes! I love Elixir :) Phoenix LiveView is really amazing. I feel so fast working in it. I got hooked after watching Chris McCord's 'Build a real-time Twitter clone in 15 minutes' demo, and things have improved a lot since then.
Elixir is a killer language and tool. It's considered esoteric these days but I think more and more folks are going to wake up to the inherent beauty of the Erlang BEAM and OTP applications pretty soon here.
It never took off as the Ruby killer. It hurts to say so because I agree with all the positive Elixir comments around here - IMO, people are missing out by not knowing of it.
I tried the prompt "electric bass" -- the generated image was OK (it was a three string bass, but whatever) but the background removal not only removed the background but most of the body of the bass because (I assume) the pickguard was the same color as the image background.
As others have pointed out, it tends to associate certain professions with specific races, and crime-related emojis with Black people. It's a Stable Diffusion issue.
Text-to-image models represent the data that is being fed to them. If you feed one 100 images and 95 of those are of white male politicians and 5 are of politicians of color, that's what you are going to get back. These datasets are gathered automatically from various internet sources and represent the data that is available out in the wild. For Stable Diffusion to "fix" the bias in the dataset would simply be impossible. That would require hiring an army of people whose sole purpose would be to manually "rebalance" the dataset according to some internal objective schema.
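As a toy illustration of the point (the numbers here are hypothetical, not from any real dataset): a model that samples from its training distribution simply reproduces the dataset's proportions.

```python
import random

# Hypothetical 95/5 split, mirroring the example above.
training_set = ["white male politician"] * 95 + ["politician of color"] * 5

# Sampling from that distribution gives back roughly the same proportions.
samples = [random.choice(training_set) for _ in range(10_000)]
print(samples.count("politician of color") / len(samples))  # ~0.05
```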
I just saw this was on fly and thought the same thing lol. But I also thought to myself “this is actually a good use for fly, nice simple app to see if there’s traction before moving to something reliable.”
Fly told me that in 7 days they would automatically update my Redis database. My plan was to manually update it that weekend. 3 days later, I got an alert that they had migrated the DB early. Because I didn't have storage enabled, all the data was gone.
Fly.io definitely has an inappropriately casual attitude, and it's a big deal that they migrated you early.
However, they do point out that literally any reboot would have wiped your data, and that could have happened at any time previously. I think there's some truth to the suggestion I've seen on HN that the users running into the worst fly.io issues are the ones aiming to spend single-digit amounts per month, not paying for enough machines and disk space for reliability to even be possible.
Very anecdotal so take this lightly but the past year or so of Fly.io threads have had a lot of comments expressing negative experiences with reliability. I have no experience myself so I can't comment, but if you search for a fairly recent thread you'll probably find some people's experiences.
I was about to start building a new application and I really wanted to give Fly a shot, but the CLI literally wouldn’t connect to their builder API. The status page was all green, but clearly things weren’t working.
I moved to Render and I’ve had a much better experience.
I had something deployed on fly for a few months and was regularly running into random restarts and connection issues. I ended up switching to DO’s app platform and haven’t seen any of that since.
However, I also observe that it's not very good at creating emoji that are legible at, well, emoji (body text) size.
Emoji and icon creation generally isn't as straightforward as you'd think -- you need to essentially create a series of clear shapes that are intelligible at body text size (32x32 pixels at retina, or even smaller), and then add texture and smaller details in a way that pops at higher resolutions, but "disappears" when downscaled.
I wonder if there's a pipeline that could implement that today. E.g. training on emoji at 32x32 resolution to generate those, and then some kind of intelligent upscaling algorithm also trained on low- and high-res emoji that is able to insert appropriate texture and details according to the emoji description.
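Even without that full pipeline, it's easy to sanity-check legibility after the fact: just downscale a candidate to body-text size and see which shapes survive. A minimal sketch with Pillow, assuming a generated emoji.png on disk:

```python
from PIL import Image

# Downscale a generated emoji to typical body-text sizes so you can eyeball
# whether the silhouette still reads. "emoji.png" is a placeholder filename.
img = Image.open("emoji.png").convert("RGBA")
for size in (64, 32, 16):
    img.resize((size, size), Image.LANCZOS).save(f"emoji_{size}px.png")
```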
I love the idea that in the not-too-distant future people will just be dictating their desired emoji, they'll save and reuse the ones they love from their friends, and we'll get yearly most-popular-new-emoji lists that reveal the national/worldly mood/zeitgeists...
Protip: if you remove the "hidden" class name from the div containing the word "Recents", you can see recent emojis made by others (which is both fun and disturbing).
It seems to use %s.jpg as a filename and to request the browser to download the picture you clicked on as that filename, so probably some larrikin typed 2girls1cup in as a prompt and you happened to click on whatever picture came out and was in the feed.
> Larrikin is an Australian English term meaning "a mischievous young person, an uncultivated, rowdy but good hearted person", or "a person who acts with apparent disregard for social or political conventions".
Interested to know what can be done with the up/down votes, some kind of RLHF for image generation?
Unfortunately it seems some "tiled images" must have made it into the dataset, as half the time it generates an array of tiny images instead of a single emoji. Out of about 5 or 6 tries I got one good one, and even that wasn't centered correctly.
Yeah, my thought was the up/down votes could be used to have a "Trending" or "Most Popular" section on the site. But that RLHF idea is interesting too — I've had the same problem with tiled images, so fine-tuning on highly upvoted examples may help.
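Something like this is roughly what I have in mind for turning votes into fine-tuning data (the field names and score threshold here are made up for illustration):

```python
import json

# Hypothetical vote log: each record has a prompt, a generated image URL, and a net score.
votes = [
    {"prompt": "a happy avocado", "image": "https://example.com/1.png", "score": 12},
    {"prompt": "grapefruit", "image": "https://example.com/2.png", "score": -3},
]

# Keep only well-liked generations as candidate fine-tuning pairs, which also
# filters out the tiled/spritesheet failures that people tend to downvote.
with open("finetune_candidates.jsonl", "w") as f:
    for v in votes:
        if v["score"] >= 5:
            f.write(json.dumps({"text": v["prompt"], "image": v["image"]}) + "\n")
```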
This is so fun! I've been sharing the link all day with friends and colleagues.
BTW, the official new 15.1 Unicode emoji are being released on Monday -- if you're in the Bay Area, come to the release party at the Computer History Museum in Mountain View, and there's a livestream too: https://computerhistory.org/events/new-emoji-release-party/
A few years ago I made a chat bot at work that read the last few messages on Mattermost and then used GPT-2 to predict how the conversation would continue. The bot would then post that message. I added emoji reactions as well so the bot could also react on posts. Shortly afterwards, somebody noticed an empty emoji reaction on a post. It turned out that the emoji names were just strings, and you could react even with non-existent emojis. The bot had invented the “phantom-dummy-tune” emoji. Promptly a colleague made an emoji for it and uploaded it under that name, and it would show up. Glad to see that this process can now be automated as well :-)
When you set up SD1.5 with enough extras, the finger problem is effectively gone. For those very few that slip through, a new seed or inpainting takes care of it.
The generator isn't very accurate. I prompted "a sad bald construction worker" and got a smiling construction worker wearing a hard hat. Only 1 of the 3 descriptors made it into the image.
I tried a few times to get a good "donkey kong thumbs up" emoji but the results didn't make much sense (spritesheet-like, missing any kind of thumbs up). Most of them did contain Donkey Kong, in some weird AI-deformed way.
Awesome idea, but I gave it a couple of tries for "software integration" and "integrated", and both times it gave me a massive 9x9 grid of disturbing emoji faces peeling off the screen like stickers. Likely broken.
Does it make sense to prompt it based on the names of existing Apple emojis? I can’t tell if it’s helping or not. Some tests are exceptionally good, some are inexplicably awful. Very cool either way! I had a lot of fun experimenting.
Doesn't work at all for me. I tried "pirate flag" and got something ugly and not at all piraty that was then made uglier and essentially blank by the background removal.
Yes, I tried a few other prompts: from what I recall, "syndicalist flag" gave a similar result, "hacker emblem" was refused as potentially NSFW, and "anarchy" was the only one to give an actual result… of a cop emoji! Definitely not working well.
It produces some "interesting" results. I put in "robber", expecting that one emoji everybody thinks exists but doesn't, and I got a black guy with a white beanie/hat.
Apart from that, it produces insanely convincing results. I believe people would not bat an eye if these were added in an update to iOS!
Heh, I put in "grapefruit" and it gave me just an absolute mess of like 100 grapefruit slices and then added a bunch of smoke over the top when it "removed the background".
Then I tried "day gecko" and the output absolutely rocked. So I guess my first try was unlucky.
AI models and their biases are reflections of the biases of society. I don’t like them, but it’s hard to argue with the fact that it’s just math producing the most semantically likely output based on publicly sourced training data.
See, that's the thing about content creation — you entirely lose the "it's just a tool and tools are inherently neutral" argument. Nobody's claiming that the implementor created deliberately racist content, but you can't disavow responsibility for the content you generate merely because the root of the problem lay in the inputs and algorithms you chose. Like it or not, what's included in the source data for these models is an editorial choice, and which model you use and how you use it are also editorial choices.
Actually, I think “it’s a reflection of society’s bias” is a totally reasonable statement to make if your product is a reflection of society’s generated content.
Rather than impugning the model makers for not curating society’s content to erase its biases, to my mind it demonstrates what’s broken with society, and should be used as an indictment of how we encode our society in our media - the fact that an oracle produces racist output when you ask it a question is more an indictment of society itself, because the media of our society IS racist. Sweeping it under the rug serves no one, IMO.
Instead the story is “AI models are racist,” which misses the real problem. The real problem is that when a human in our society wants to portray a robber, they use a black man. That should be the story, and pitching it as a flaw with AI models is like criticizing the color of the paint when the foundation is cracked.
It's not perpetuating stereotypes, it's showing that these stereotypes exist in society. Similar to how a comedian might point out the absurdity of racism by writing and delivering a joke about race, or how a child might ask why people with certain skin tones tend to have different hair styles. Neither the child nor the comedian is necessarily racist, but they are making observations that may not be considered politically correct.
We, and many other people, wouldn’t be having these conversations if they had curated society’s biases out of their product.
Saying their product perpetuates stereotypes while ignoring that it’s reflecting the entirety of media’s bias is ignoring that all media perpetuates these stereotypes, and their models are no more or less guilty of perpetuating these stereotypes than literally the entirety of all media. The fact that they hold it up for careful examination in an irrefutable way is a feature in my mind, not a flaw.
I would note that SD produces a base model, which can be, and routinely is, fine-tuned. I would rather see a fine-tuning that eliminates the bias in the media than a base model that is fundamentally divergent from the state of the media today. That’s the proper abstraction - a base model that’s the raw output of mass training on available media, and models that specialize for some curation. But I also object to the base models being censored. The reason is that it cuts off a base truth from the underlying semantic models, such that their outputs are at odds with observable reality. Specialized models shouldn’t be doing things like un-censoring, but adjusting base truth to curated views. “Unbiased model” should be a fine-tuning of reality. “Safe for work model” should be a fine-tuning of reality. The challenge is that the model producers don’t trust the model users to be adults making adult decisions and thinking adult thoughts - including about biases, stereotypes, etc.
But regardless, I think base models should never be thought of as final products but as a basis to produce a final product.
> Saying their product perpetuates stereotypes while ignoring that it’s reflecting the entirety of media’s bias is ignoring that all media perpetuates these stereotypes
Not being further up on a hierarchy of many bad actors doesn't absolve you from responsibility. What goes into these models is an editorial decision, as is which one to use in your software. This author isn't distributing models and they aren't being held to task over the model's content— they're generating images from those models and distributing them, and those images are what bother people. If they used the same model but somehow didn't get objectionable results, nobody would know, let alone care.
You say the media is to blame? Sure. When you start generating content with a particular perspective, you are media.
Snakes are, at times, quite dangerous. Granted, not all snakes, but in terms of being risk-averse, and unaware of which snakes are quite possibly deadly, am I not, at least in my current ignorant state, best served to avoid all snakes?
Feel free to substitute "snakes" with any existing stereotype. I personally am a fan of "people who eat pizza with pineapple".
There's a whole lot of information out there about the harm caused by stereotypes. Normally I tell people to look it up because it's not hard and I'm not your research assistant, but here's a freebie: "Are Emily and Greg More Employable Than Lakisha and Jamal?" is a mid-aughts study in which the authors sent out 5k job applications using fabricated, equally weighted resumes that were randomly assigned a stereotypically Black American or stereotypically white American name. Resumes attached to 'white'-sounding names were fifty percent more likely to receive a callback. Would you consider that a reasonable pan-industry attempt by employers to protect themselves from harm based entirely on someone's name?
Your mistake is you're only thinking about it from the perspective of the stereotyper. Now think about the negative impact (career, legal system, etc) on the person being incorrectly stereotyped.
It works by taking your prompt and generating an emoji using https://replicate.com/fofr/sdxl-emoji. Next, I remove the background using https://replicate.com/cjwbw/rembg. Then click to download and add to Slack!
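For the curious, here's roughly what those two steps look like with Replicate's Python client (the site itself is Elixir; the prompt wording, output handling, and unpinned model versions here are just illustrative, so treat it as a sketch rather than the site's actual code):

```python
import replicate  # needs REPLICATE_API_TOKEN set in the environment

# Step 1: generate the emoji. The sdxl-emoji fine-tune uses a "TOK" trigger word.
emoji = replicate.run(
    "fofr/sdxl-emoji",
    input={"prompt": "A TOK emoji of a happy avocado"},
)

# Step 2: strip the background with rembg (assuming the first output URL is the image).
cutout = replicate.run(
    "cjwbw/rembg",
    input={"image": emoji[0]},
)
print(cutout)
```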
It's all open-source, code is here: https://github.com/cbh123/emoji
Let me know if you have any questions!