Emoji Generator with AI (emoji.fly.dev)
366 points by vicgalle_ on Sept 8, 2023 | 154 comments



Hi! Creator here! I just started hacking on this last night and wasn't expecting to see this on HN already :)

It works by taking your prompt and generating an emoji using https://replicate.com/fofr/sdxl-emoji. Next, I remove the background using https://replicate.com/cjwbw/rembg. Then, click to download and add to slack!
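A minimal sketch of that two-step chain, for readers who want to see its shape. The model names come from the comment above and the prompt prefix is the one the site says it prepends; everything else is an assumption made for illustration: `run_model` is a hypothetical callable standing in for whatever client actually calls Replicate, and the input keys `"prompt"` and `"image"` are guesses, not the models' documented schemas.

```python
from typing import Callable

# Model identifiers taken from the comment above.
EMOJI_MODEL = "fofr/sdxl-emoji"
REMBG_MODEL = "cjwbw/rembg"

# Prefix the site says it prepends to every prompt.
PROMPT_PREFIX = "A TOK emoji of a"

# Hypothetical signature: (model name, inputs) -> URL of the output image.
ModelRunner = Callable[[str, dict], str]


def build_prompt(user_prompt: str) -> str:
    """Prepend the style prefix to the user's text."""
    return f"{PROMPT_PREFIX} {user_prompt.strip()}"


def generate_emoji(user_prompt: str, run_model: ModelRunner) -> str:
    """Run the two-step chain: generate an image, then strip its background."""
    # Input key names are assumptions, not the models' documented schemas.
    image_url = run_model(EMOJI_MODEL, {"prompt": build_prompt(user_prompt)})
    return run_model(REMBG_MODEL, {"image": image_url})
```

Passing the runner in as an argument keeps the sketch independent of any particular client library and makes the chain easy to exercise with a stub.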

It's all open-source, code is here: https://github.com/cbh123/emoji

Let me know if you have any questions!


I see that the perpetually-offended have already shown up.

My advice is to either simply ignore them, or point and laugh (your choice).

The one thing you can count on with them is that they are never going to be satisfied. Most likely they're not even actually offended, not really. They just enjoy bullying other people.


That's such a coincidence. I was browsing Replicate last night for new models and played with the very same one and was impressed at the results (well, mostly). Kudos to you for turning it into something more accessible and the background removal idea. For everyone else, Replicate is well worth playing around with if you want to tinker with some models without much commitment.


Ahah, that's awesome. Thank you! Let me know if you have any questions about Replicate! Happy to help.


Awesome job! One criticism I have is that these aren't really emoji; they're more like bitmoji or memoji.


Super cool! I would add a checkbox to toggle whether you want background removal, or make that an optional second step. In about half the cases I tried, the background was an important part of my image and without it only weird artifacts remained.


great idea! working on this now


Just came off your Elixir talk photos on Twitter to find this on HN. Please share your ElixirConf talk recording. Producing a nice emoji/avatar for an AI agent seems like a cool idea.


Thanks so much :)

I'll share it as soon as I have access! I'll also post a transcript on my site (charlieholtz.com) soon.


Doesn't this cost money? Could you use a model from Hugging Face too? I'm trying to get into this kind of stuff, but I want to use free or local AI models to avoid bills... lol


I guess even with hugging face you have to pay for the compute costs which could get pretty big pretty fast if something takes off.


Elixir... don't see that every day. I started reading through the code and was like "what the heck is an ex file?!!"

Never really seen Elixir before. Looks pretty nice.


Yes! I love Elixir :) Phoenix LiveView is really amazing. I feel so fast working in it. I got hooked after watching Chris McCord's 'Build a real-time Twitter clone in 15 minutes' demo, and things have improved a lot since then.


Elixir is a killer language and tool. It's considered esoteric these days but I think more and more folks are going to wake up to the inherent beauty of the Erlang BEAM and OTP applications pretty soon here.


Is Elixir really considered esoteric? There’s some pretty big businesses running on it.


It never took off as the Ruby killer. It hurts to say so because I agree with all the positive Elixir comments around here - IMO, people are missing out by not knowing of it.


The beauty of Ruby with the bulletproof-ness of Erlang.


Hi, neat idea. Appear to have hit an issue. https://i.imgur.com/XQGfBed.png Interesting, just not a clean Emoji.

Using :bob: (Thought it might give me "B.O.B.", "Bob Dylan", or "What about Bob")


Oh this is crazy, I didn't realize Replicate had user models like that for chaining

Is there a good video or channel to follow that's all about Replicate and similar services at this level of depth?


What did you use to record the demo in the GitHub readme? It's very smooth on the transitions.


Really cool, I think this would be a nice service for Slack, etc to add custom emoji


I tried the prompt "electric bass" -- the generated image was OK (it was a three string bass, but whatever) but the background removal not only removed the background but most of the body of the bass because (I assume) the pickguard was the same color as the image background.
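The failure mode guessed at here is easy to reproduce with a naive color key, where anything close to the background color is made transparent. This is only an illustration of the commenter's assumption: as far as I know rembg relies on a learned segmentation model rather than a color key, and the pixel values and the `color_key` helper below are made up for the example.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def color_key(image: List[List[Pixel]], background: Pixel,
              tolerance: int = 10) -> List[List[Optional[Pixel]]]:
    """Replace every pixel close to `background` with None (transparent)."""
    def close(p: Pixel) -> bool:
        return all(abs(a - b) <= tolerance for a, b in zip(p, background))
    return [[None if close(p) else p for p in row] for row in image]


WHITE = (255, 255, 255)
BROWN = (120, 70, 30)

# A 1x4 "bass": background, body, white pickguard, body.
row = [WHITE, BROWN, WHITE, BROWN]
result = color_key([row], background=WHITE)
# The pickguard pixel at index 2 is removed along with the background.
```

Any approach that leans on color similarity alone cannot tell a white pickguard from a white backdrop, which is one reason an opt-out for background removal is useful.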


The word confused appears to cause a failure


I really like it. It does seem to be suffering a little from some biases that are probably coming from sdxl though.


What biases?


As others have pointed out it tends to associate some professions with specific races. And crime related emojis with black people. It's a Stable Diffusion issue.


It seems biased towards white people for most anything I input.

criminal - black
man stealing - white
murderer - white
killer - white
thief - white skin, black beard. arab? Indian?
man murdering kittens - white


I just ran "a criminal" 8 times and got only white men.


It's not a stable diffusion issue, it's a culture issue.


"It's not a stable diffusion issue, it's a culture issue."

parroting and perpetuating the biases will ensure it will continue to be a culture issue.


The text to image models represent the data that is being fed to it. If you feed it 100 images and out of those 100 images 95 are of white men politicians and 5 are of politicians of color, that's what you are going to get back. These datasets are gathered automatically from various internet sources and represent the data that is available out in the wild. For stable diffusion to "fix" the bias in the data set would be simply impossible. That would require hiring an army of people whose sole purpose would be to "rebalance" the data set on some internal objective schema manually.
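A toy version of the 95/5 arithmetic in this comment, assuming a "model" that does nothing more than sample in proportion to how often each label appears in its training data. Real diffusion models are far more complicated than this; the placeholder labels, the sample size, and the `sample_outputs` helper are all invented for the illustration.

```python
import random
from collections import Counter

# Toy training set matching the 95/5 split in the comment; labels are placeholders.
training_labels = ["group_a"] * 95 + ["group_b"] * 5


def sample_outputs(labels, n, seed=0):
    """Draw n outputs with probability proportional to training frequency."""
    rng = random.Random(seed)
    return Counter(rng.choice(labels) for _ in range(n))


counts = sample_outputs(training_labels, 10_000)
share_a = counts["group_a"] / 10_000  # lands close to 0.95
```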


I'd rather train it on real data and fix the root cause than interject some version of utopia that won the minds of developers.


Remarkably complete, it even declined to make a dirty eggplant.


Wonderful, but the background removal breaks the image - the original final is great, then it destroys it trying to remove the background...

Try this prompt, and it gives a full image, then depending, it culls far too much when removing background, please make background removal optional?

"round brook as designed by comfort tiffany and alphonse mucha in green black and gold"


Yeah I’ve noticed that too. Adding a toggle soon!


The broken AI hands with extra fingers as the thumbs up / thumbs down button is a nice touch.


I assumed it was an intentional “two thumbs up/down” that amusingly highlighted questionable AI hands at the same time. :)


I know, I thought it was a mistake but then I literally laughed out loud once I understood.


Amen! Had a nice chuckle. Kudos for self-deprecating humor on the part of the authors.


Yes, that is freakin' awesome.


    Good news from Fly.io! We don’t collect bills smaller than $5.00. 

    Bad news, you made it on the HN homepage and your CC is now on fire.


I just saw this was on fly and thought the same thing lol. But I also thought to myself “this is actually a good use for fly, nice simple app to see if there’s traction before moving to something reliable.”


Do you have any specific references or instances that indicate fly.io as being unreliable?


FlyIO deleted my database and I lost all data.

Fly told me they would automatically update my Redis database in 7 days. My plan was to manually update it that weekend. 3 days later, I get an alert that they migrated the db early. Because I didn't have storage enabled, all data was gone.

Support ticket:

https://community.fly.io/t/forced-migration-to-v2-with-decei...


Fly.io definitely has an inappropriately casual attitude, and it's a big deal they migrated you early.

However, they do point out that literally any reboot would have wiped your data and that that could have happened at any time previously. I think there's some truth to the suggestion I've seen on HN that the users running into the worst fly.io issues are the ones aiming to spend single digit amounts per month and not paying for enough machines and disk space to have reliability be possible


Like I said in the forum history, I should have paid for persistent storage, but this data loss could have been avoided during a planned reboot.

FlyIO knew that there was no storage attached to the redis instance and there would be data loss, but they still chose to reboot it ahead of schedule.


Very anecdotal so take this lightly but the past year or so of Fly.io threads have had a lot of comments expressing negative experiences with reliability. I have no experience myself so I can't comment, but if you search for a fairly recent thread you'll probably find some people's experiences.


I was about to start building a new application and I really wanted to give Fly a shot, but the CLI literally wouldn’t connect to their builder API. The status page was all green, but clearly things weren’t working.

I moved to Render and I’ve had a much better experience.


I had something deployed on fly for a few months and was regularly running into random restarts and connection issues. I ended up switching to DO’s app platform and haven’t seen any of that since.


Here’s a good thread where this is discussed in detail, plus my own anecdotal experience:

https://news.ycombinator.com/item?id=36808296


Does fly do runaway billing?


Bandwidth would be the concern, I think. They won't auto-scale machines up/down for you.


Fly bandwidth pricing is pretty competitive; if bandwidth is an issue your bill will be cheaper on Fly than most other providers.


... And it is hugged to death now.


Curious as a non dev, could this equally have been on vercel or netlify or does this type of app lend itself to fly?


This is super, super cool.

However, I also observe that it's not very good at creating emoji that are legible at, well, emoji (body text) size.

Emoji and icon creation generally isn't as straightforward as you'd think -- you need to essentially create a series of clear shapes that are intelligible at body text size (32x32 pixels at retina, or even smaller), and then add texture and smaller details in a way that pops at higher resolutions, but "disappears" when downscaled.

I wonder if there's a pipeline that could implement that today. E.g. training on emoji at 32x32 resolution to generate those, and then some kind of intelligent upscaling algorithm also trained on low- and high-res emoji that is able to insert appropriate texture and details according to the emoji description.
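The point about detail that should "disappear" when downscaled can be seen with a plain box filter, which averages each block of pixels into one. This is a sketch on tiny grayscale grids, not the pipeline proposed above; the `box_downscale` helper is written for the example and assumes the image dimensions are exact multiples of the factor.

```python
from typing import List

Gray = List[List[float]]


def box_downscale(image: Gray, factor: int) -> Gray:
    """Average each factor x factor block into one output pixel."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Fine texture: a 4x4 checkerboard of black (0) and white (1) pixels.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
# Bold shape: left half black, right half white.
halves = [[0 if x < 2 else 1 for x in range(4)] for y in range(4)]

small_checker = box_downscale(checker, 2)  # every block averages to 0.5
small_halves = box_downscale(halves, 2)    # the black/white split survives
```

The checkerboard collapses to flat gray while the two-tone shape stays readable, which is the property a body-text-sized emoji needs from its large shapes.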

I love the idea that in the not-too-distant future people will just be dictating their desired emoji, they'll save and reuse the ones they love from their friends, and we'll get yearly most-popular-new-emoji lists that reveal the national/worldly mood/zeitgeists...


I think this is based on https://replicate.com/fofr/sdxl-emoji

It's SDXL (Stable Diffusion's latest release), fine-tuned on Apple Emoji images.


I can’t stop thinking that in 2-3 years this kind of functionality might get bundled with popular chat platforms by default.

As I keep trying to generate :sexy-cthulu: and :van-gogh-with-an-eyepatch: , my queries time out, perhaps for the best.


> I keep trying to generate :sexy-cthulu:

The e621 model will probably have that in the training set…


Something similar (maybe not ai, idk) is included in Google's keyboard app. You can combine many emoji. Not all of them, but it's still quite nice!


Protip: if you remove the "hidden" class name from the div containing the key word "Recents", you can see recent emojis made by others (which is both fun and disturbing)


I keep looking for an element with "Recents" in it and Chrome isn't showing anything.


Looks like it was removed


Bummer. I'd love to see what other people have generated rather than just the curated stuff.


Even just knowing what I've entered, I can understand hiding this... especially if you're attaching your own name to things.

Plus, it opens liability issues.


There's a thread on /g/, as you can expect it's very... creative.


I think it has some XSS attack, as on load my iOS Safari asked if I want to download 2girls1cup.jpg


It seems to use %s.jpg as a filename and to request the browser to download the picture you clicked on as that filename, so probably some larrikin typed 2girls1cup in as a prompt and you happened to click on whatever picture came out and was in the feed.
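If the download name really is built from the prompt with a `%s.jpg` pattern, as this comment guesses, one conservative way to do it is to reduce the prompt to a slug first. The app's actual code is not shown in this thread, so the `download_filename` helper, its length limit, and the `emoji` fallback name are assumptions for illustration only.

```python
import re


def download_filename(prompt: str, max_length: int = 50) -> str:
    """Turn a free-text prompt into a conservative '<slug>.jpg' filename."""
    # Collapse every run of characters other than a-z and 0-9 into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Truncate long prompts and fall back to a fixed name if nothing is left.
    slug = slug[:max_length].rstrip("-") or "emoji"
    return "%s.jpg" % slug
```

For example, `download_filename("High Five!")` gives `high-five.jpg`. Slugging changes what the file is called, not whose prompt produced it, so a surprising name in a shared feed would still be surprising.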


> Larrikin is an Australian English term meaning "a mischievous young person, an uncultivated, rowdy but good hearted person", or "a person who acts with apparent disregard for social or political conventions".


Interested to know what can be done with the up/down votes, some kind of RLHF for image generation?

Unfortunately it seems some "tiled images" must have made it into the dataset as half the time it generates an array of tiny images instead of a single emoji. Out of about 5 or 6 tries I got one good one that wasn't centered correctly.


Yeah my thought was the up/down votes could be used to have a "Trending" or "Most Popular" section on the site. But that RLHF idea is interesting too. I've had the same problem with tiled images, so fine-tuning on highly upvoted examples may help
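Both uses of the votes mentioned here reduce to simple operations on a net score. The sketch below assumes a record with separate up and down counts; the `Emoji` class, the `min_score` threshold, and the function names are hypothetical and not taken from the app's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Emoji:
    prompt: str
    upvotes: int
    downvotes: int

    @property
    def score(self) -> int:
        """Net votes: upvotes minus downvotes."""
        return self.upvotes - self.downvotes


def most_popular(emojis: List[Emoji], limit: int = 10) -> List[Emoji]:
    """Highest net score first; candidates for a 'Most Popular' section."""
    return sorted(emojis, key=lambda e: e.score, reverse=True)[:limit]


def fine_tuning_candidates(emojis: List[Emoji], min_score: int = 5) -> List[Emoji]:
    """Keep only well-liked outputs as possible fine-tuning examples."""
    return [e for e in emojis if e.score >= min_score]
```

A "Trending" view would additionally need timestamps so that recent votes count for more; that is left out here.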


Based on a few tests, the examples on the homepage appear to be heavily curated…


Even so, the spider has 10 legs!


This is so fun! I've been sharing the link all day with friends and colleagues.

BTW, the official new 15.1 Unicode emoji are being released on Monday -- if you're in the Bay Area, come to the release party at the Computer History Museum in Mountain View, and there's a livestream too: https://computerhistory.org/events/new-emoji-release-party/

(It's free!)


"threesome" is an interesting one.

Men are always black and the women are always white.

I wonder where it learned that..


Diversity training.


n=?


n=3 for me.

It's also "featured". And the featured one is an example of what I'm talking about.



I think we killed it


working on it!! scaling up now


yep it's dead


A few years ago I made a chat bot at work that read the last few messages on Mattermost and then used GPT-2 to predict how the conversation would continue. The bot would then post that message. I added emoji reactions as well so the bot could also react on posts. Shortly afterwards, somebody noticed an empty emoji reaction on a post. It turned out that the emoji names were just strings, and you could react even with non-existent emojis. The bot had invented the “phantom-dummy-tune” emoji. Promptly a colleague made an emoji for it and uploaded it under that name, and it would show up. Glad to see that this process can now be automated as well :-)


Based on some of these, it seems 'AI' still just can't figure out fingers.


When you set up SD1.5 with enough extras, the finger problem is effectively gone. For those very few that slip through, a new seed or inpainting takes care of it.


The thumbs up icon killed me


The generator isn't very accurate. I prompted "a sad bald construction worker" and I got a smiling construction worker wearing a hard hat. Only 1 out of 3 adjectives was implemented.


Construction workers probably need to wear a hard hat even when they are bald. ;)


True, but I think a human graphic artist would have known what I was asking for


The AI hands for the like button make the whole site for me.

https://imgur.com/a/4q89CQH


> Pro-tip: we'll automatiically pre-pend 'A TOK emoji of a' to your prompt. Try something simple like 'cat' or 'high five'.

What is a "TOK emoji"?


A sure sign that it was trained to replicate a particular style, which the trainer decided to name TOK.

It’s somewhat helpful to pick names that are a single token after byte pair encoding, which TOK happens to be.


> Pro-tip: we'll automatiically pre-pend 'A TOK emoji of a' to your prompt.

What if I write "cat, except give me a real photo"?


Now that this is working it's very cool. The background removal is a little intense, it keeps cutting off people's hair:

https://replicate.delivery/pbxt/EKWIccdjCTrcEdJYFN1kxo9fF5JY...

Some of these are extremely good, I hope you add some kind of curated "best of" gallery at some point.


I built a clone of this without the background removal: https://a.picoapps.xyz/argue-party?ref=hn


Seeing what other users are prompting it with is half the fun of the output.

Nice to see people merely pushing boundaries rather than ruining it for everyone for now.


One thing I've noticed over time is that none of these image generation AIs seem to know what an Echidna is.


Built a version of this using the same emoji model, but without the background removal: https://a.picoapps.xyz/argue-party?ref=hn

Let me know if you have any feedback or feature ideas.


Haha not bad, how do you link them? :anthropomorphic-banana-wielding-a-sword-and-shield:

I was going for this guy: https://www.thingiverse.com/thing:3327431


I tried a few times to get a good "donkey kong thumbs up" emoji but the results didn't make much sense (spritesheet-like, missing any kind of thumbs up). Most of them did contain Donkey Kong, in some weird AI-deformed way.



Awesome idea, but I gave it a couple of tries for "software integration" and "integrated" and both times gave me a massive 9x9 grid of disturbing emoji faces peeling off the screen like stickers. Likely broken.


Does it make sense to prompt it based on the names of existing Apple emojis? I can’t tell if it’s helping or not. Some tests are exceptionally good, some are inexplicably awful. Very cool either way! I had a lot of fun experimenting.


Awesome job! btw, what does "TOK" mean in "a TOK emoji of…"? I've been looking everywhere but I can't find any reference to this acronym.


Doesn't work at all for me. I tried "pirate flag" and got something ugly and not at all piraty that was then made uglier and essentially blank by the background removal.


Did you try more than once? Happened on my first try too, next few were great


Yes, I tried a few other prompts: from what I recall "syndicalist flag" gave a similar result, "hacker emblem" was refused as being potentially NSFW, and "anarchy" was the only one to give an actual result… of a cop emoji! Definitely not working well.


I love how Hulk Hogan is invariably green, since it mixes him with The Hulk

Hilarious


I get an error message when trying to generate:

"We can't find the Internet. Attempting to reconnect"

Then it's back to the start page with no emoji.


The handshake is hilarious. That's so many fingers


Symbolises any contract I signed well.


[X] I have read and agree to the Terms of Service and Sale of Soul Agreements.


>watermelon-guy


I like the Futurama ones. They’re inaccurate but in cute ways. Fry has smaller Frys coming out of his head and Leela has two eyes.


Asking it for one eyed anything produces only outputs with two eyes


You can save these and import them into WhatsApp via some sticker creation apps. This is fun


It produces some "interesting" results. I put in "robber" expecting that one emoji everybody thinks exists but doesn't, and I got a black guy with a white beanie/hat.

Apart from that it produces insanely convincing results. I believe people would not bat an eye if they'd get added in an update to iOS!


Heh, I put in "grapefruit" and it gave me just an absolute mess of like 100 grapefruit slices and then added a bunch of smoke over the top when it "removed the background".

Then I tried "day gecko" and the output absolutely rocked. So I guess my first try was unlucky.


It loves giving me 100 of someone. I'm wondering if there are sticker pack-like images in its training set.

And then, yes, the background removal takes away half of these and leaves smoke.


Yeah, that’s a good point. We’re currently training a new version that reduces this bias. It’s really important to us that we make this better.


Drug dealer, car thief, car jacker, drive-by-shooter, murderer...all of those things and many other crime related terms will give you a black person.

Quite a shit thing to be happening.


AI models and their biases are reflections of the biases of society. I don’t like them, but it’s hard to argue with the fact it’s just math producing the semantic best likelihood based on publicly sourced training data.


See that's the thing about content creation— you entirely lose the "it's just a tool and tools are inherently neutral" argument. Nobody's claiming that the implementor created deliberately racist content, but you can't merely disavow responsibility for the content you generate because the input and algorithms you chose were the root of the problem. Like it or not, what's included in the source data for these models is an editorial choice, and which model you use and how you use it are also editorial choices.


Actually I think “it’s a reflection of society’s bias” is a totally reasonable statement to make if your product is a reflection of society’s generated content.

Rather than impugning the model makers for not curating society’s content to erase its biases, to my mind it demonstrates what’s broken with society, and should be used as an indictment of how we encode our society in our media - the fact that if you asked an oracle a question it produces racist output is more an indictment of society itself because the media of our society IS racist. Sweeping it under the rug serves no one, IMO.

Instead the story is “AI models are racist,” which misses the real problem. The real problem is that when a human in our society wants to portray a robber, they use a black man. That should be the story, and pitching it as a flaw with AI models is like criticizing the color of the paint when the foundation is cracked.


How does perpetuation of stereotypes fix society?


It's not perpetuating stereotypes, it's showing that these stereotypes exist in society. Similar to how a comedian might point out the absurdity of racism by writing and delivering a joke about race. Or how a child might ask why people of certain skin tones tend to have different hair styles as well. Neither the child nor the comedian is necessarily racist, but they are making observations that may not be considered politically correct.



These conversations that we, and many other people, are having wouldn’t happen if they curated society’s biases out of their product.

Saying their product perpetuates stereotypes, while ignoring that it’s reflecting the entirety of the media’s bias, ignores that all media perpetuates these stereotypes, and that their models perpetuate them no more or less than literally the entirety of all media. The fact they hold it up for careful examination in an irrefutable way is a feature in my mind, not a flaw.

I would note that SD produces a base model, which can be, and routinely is, fine-tuned. I would rather see a fine-tuning that eliminates the bias in the media than see a base model that is fundamentally divergent from the state of the media today. That’s the proper abstraction - a base model that’s the basic output from mass training on available media, and models that specialize for some curation. But I also object to the base models being censored. The reason is that censoring cuts the underlying semantic models off from a base truth, so that their outputs are at odds with observable reality. Specialized models shouldn’t be doing things like uncensoring, but adjusting base truth to curated views. “Unbiased model” should be a fine-tuning of reality. “Safe for work model” should be a fine-tuning of reality. The challenge is the model producers don’t trust the model users to be adults making adult decisions and thinking adult thoughts - including about biases, stereotypes, etc.

But regardless, I think base models should never be thought of as final products but as a basis to produce a final product.


> Saying their product perpetuates stereotypes but ignoring its reflecting the entirety of medias bias is ignoring that all media perpetuates these stereotypes

Not being further up on a hierarchy of many bad actors doesn't absolve you from responsibility. What goes into these models is an editorial decision, as is which one to use in your software. This author isn't distributing models and they aren't being held to task over the model's content— they're generating images from those models and distributing them, and those images are what bother people. If they used the same model but somehow didn't get objectionable results, nobody would know, let alone care.

You say the media is to blame? Sure. When you start generating content with a particular perspective, you are media.


What is wrong with stereotypes?

Snakes are, at times, quite dangerous. Granted, not all snakes, but in terms of being risk-averse, and unaware of which snakes are quite possibly deadly, am I not, at least in my current ignorant state, best served to avoid all snakes?

Feel free to substitute "snakes" with any existing stereotype. I personally am a fan of "people who eat pizza with pineapple".


There's a whole lot of information out there about the harm caused by stereotypes. Normally I tell people to look it up because it's not hard and I'm not your research assistant, but here's a freebie: "Are Greg and Emily More Employable Than Lakisha and Jamal" is a mid-aughts study in which the authors sent out 5k job applications using fabricated equally-weighted resumes that were randomly assigned a stereotypically Black American or stereotypically white American name. Resumes attached to 'white' sounding names were fifty percent more likely to receive a callback. Would you consider that a reasonable pan-industry attempt by employers to protect themselves from harm based entirely on someone's name?


Your mistake is you're only thinking about it from the perspective of the stereotyper. Now think about the negative impact (career, legal system, etc) on the person being incorrectly stereotyped.


Bad faith actor


>AI models and their biases are reflections of the biases of society.

It's not one-way, though: it perpetuates the bias. I'm not blaming the author of this, as it's Stability AI that needs to clean their data.


Yeah I’ve also noticed an unintended bias that we’re not proud of.

We’re on it – we’re reviewing the model and fixing the issue ASAP. Thanks for pointing it out and for your patience!


Just take a step back and think about what you've said and the meaning of the word "bias"


:( Everyone really needs to clean their data


Who will clean the cleaners? Are they without bias and without blemish?


The generated "emoji" size is 500kb. That's really a lot.


They're relatively large PNGs, so that's not surprising. Resize and convert to JPG if you want to use them for something and care about the size.


Alien -> uh, oh, failed to generate, looks like NSFW input!


It didn't get "Tiger Woods" correct.


Wow this is fun.


Build database gives me a blank emoji


Tons of fun :) thanks for the laugh!


types "ubutnu distro is sad emoji" it says prompt is likely nsfw try again


It could replace written language.

Needs more training though. "hate" got me a smiling bearded man.

And it's got a NSFW filter. Seriously?


What were you expecting a "hate" emoji to look like?


An angry face at minimum. Bared teeth. Glaring eyes. You know, the obvious.

I mean that is dead obvious, right?

Or, if there is some deeper symbology that I'm not aware of, that. It would be nice to be surprised.


That just sounds like anger to me


Buddy bradley.


I see a bunch of Disney characters there.

They look exactly like how they're supposed to look.

That might be a problem, legally speaking ...


This doesn’t appear to have any filter for generating based on racist / extremist language


useless !



