I work for a remote-only company but use a workspace almost every day. I get to choose my own “office” and the people in it. I also pick the commute I want; this one is just 5 minutes away.
Imagine a future where state actors have hundreds of AI agents fixing bugs, gaining reputation while they slowly introduce backdoors. I really hope open source models succeed.
I work for a large closed-source software company and I can tell you with 100% certainty that it is full of domestic and foreign agents. Being open source means that more eyes can and will look at something. That only increases the chance of malicious actions being found out ... just like this supply-chain attack.
Because in the closed source model the frustrated developer that looked into this SSH slowness submits a ticket for the owner of the malicious code to dismiss.
It’s insane to consider the actual discovery of this to be anything other than a lightning strike. What’s more interesting here is that we can say with near certainty that there are other backdoors like this out there.
> Imagine a future where state actors have hundreds of AI agents fixing bugs, gaining reputation while they slowly introduce backdoors. I really hope open source models succeed.
I guess we can only hope verifiable and open source models can counteract the state actors.
Not necessarily. A frustrated developer posts about it, it catches attention of someone who knows how to use Ghidra et al, and it gets dug out quite fast.
Except, with closed-source software maintained by a for-profit company, such a cockup would mean a huge reputational hit, with billions of dollars of lost market cap. So, there are very high incentives for companies to vet their devs, have proper code reviews, etc.
But with open-source, anyone can be a contributor, everyone is a friend, and nobody is reliably real-world-identifiable. So, carrying out such attacks is easier by orders of magnitude.
> So, there are very high incentives for companies to vet their devs, have proper code reviews, etc.
I'm not sure about that. It takes a few leetcode interviews to get into major tech companies. As for the review process, it's not always thorough (if it looks legit and the tests pass...). However, employees are identifiable and would be taking a huge risk if caught doing anything fishy.
We witnessed Juniper generating their VPN keys with Dual EC DRBG, and then the generator constants being subverted, with Juniper claiming not to know how it happened.
I don’t think it affected Juniper firewall business in any significant way.
... if we want security it needs trust anyway. it doesn't matter if it's amazing Code GPT or Chad NSA, the PR needs to be reviewed by someone we trust.
it's the trust that's the problem.
web of trust purists were right just ahead of the time.
It would actually be sort of interesting if multiple adversarial intelligence agencies could review and sign commits. We might not trust any particular intelligence agency, but I bet the NSA and China would both be interested in not letting much through, if they knew the other guy was looking.
That is an interesting solution. If China, US, Russia, EU, etc all sign off and say "yep this is secure" we should trust it. Since if they think they found an exploit, they might assume the other people found an exploit. This is a little bit like the idea of a fair cut for a cake. If you have two people that want the last slice of cake, you have one cut and the other choose the first slice. Since the chooser will choose the biggest slice, the slicer, knowing they will get the smaller one, will make the cut as equal as possible. In this case the NSA makes the cut (the code), and Russia / China chooses if it's allowed in.
this is why microsoft bought github and has been onboarding major open source projects. they will be the trusted 3rd party (whether we like it or not is a different story)
Imagine a world where a single OSS maintainer can do the work of 100 of today’s engineers thanks to AI. In the world you describe, it seems likely that contributors would decrease as individual productivity increases.
Wouldn't everything produced by an AI explicitly have to be checked/reviewed by a human? If not, then the attack vector just shifts to the AI model and that's where the backdoor is placed. Sure, one may be 50 times more efficient at maintaining such packages but the problem of verifiably secure systems actually gets worse not better.
> […] you need to do well in your classes to get into a good university.
Isn’t this also mostly inherited: do most get into the good universities because of their grades or their connections? Can most get good grades without an “inherited” support system?
> No one in Denmark knows what this case is about, except that these top officials are being prosecuted for vague reasons, and with an undertone that it is all very unfair and probably a bad thing for Denmark.
Everybody in Denmark knows what this is about, it’s a badly kept secret. In fact the supreme court refused to hear the case behind closed doors because the secret is considered “publicly known” at this point[1]
Right around 37 minutes in, Brevik mentions that Mike O'Brien (along with Pat, one of the three co-founders of ArenaNet) was the brains behind Battle.net, and then "a few of the guys from [Blizzard South] moved up north during the last six months of development and started making Diablo into multiplayer and integrating Battle.net into the entire thing."
That lines up with Pat's assessment:
> Initially Collin Murray, a programmer on StarCraft, and I flew to Redwood City to help, while other developers at Blizzard “HQ” in Irvine California worked on network “providers” for battle.net, modem and LAN games as well as the user-interface screens (known as “glue screens” at Blizzard) that performed character creation, game joining, and other meta-game functions.
Thanks for the link but I think David Brevik said a couple guys from blizzard came up to help for 6 months and he had zero multiplayer code built as well as it was his first C program.
I took a take-home assignment from fly.io a while back; they promised that every completion would receive real human feedback.
As far as I remember, I was told it would take a few hours, maybe 4, but the assignment looked rather fun, so I thought, whatever, let’s do it.
Maybe I didn't understand the assignment fully or missed some cues; I don't know. But it took me roughly 10 hours spread across two days. After 5-6 hours, you don't feel like throwing the work away; you just want to finish. It was pretty frustrating.
After returning the assignment, I waited the two weeks they asked for and heard nothing. I sent one email, waited a week, got no reply, sent another, waited some more, and still received no reply. I ended up sending a handful of emails to various addresses I could find. I even sent a DM on Twitter to one of the founders to let them know. No reply anywhere.
The job listing promised mountains: that it would take a few hours, that they’d take time over their feedback, etc.
Except i just got a generic reply that “we’re not satisfied with the solution”. For something that took ~10 hours, i was at least expecting some vague pointers on why. I even sent a one-pager explaining style decisions, caveats, etc. It was pretty insulting.
And when i asked for a follow-up, i got more generic bs about how the evaluation criteria are honed from their years of experience and are not something they share outside.
One good thing is that it made me realise interviewing still sucks, and i just stopped looking for jobs.
I went and looked at your submission, and I am comfortable with how we handled it and the level of detail we gave you. Would you mind sending me an email so I can hear a bit more? It sounds like we might've implied something we didn't mean to.
These should not take 10 hours, unless you're learning Rails from scratch or something (some people do this). We've tried different ways of saying "don't spend more than 2 hours on this unless you really want to" and it doesn't always come through.
The "detail" I received is the same generic reply you've shared in the other thread.
I don't know what fly's internal processes are, but I suspect the reviewer had to have submitted some bit of internal feedback on each submission about why they're okaying it or not. A filtered version of it, or even separate candidate-specific feedback, would be just enough.
I've been on the hiring side of the table for many years, and every time we shared handcrafted feedback with interviewees we had passed on, we heard back good things. They know we respected their time (and them), even if they don't agree with the particular feedback.
This feedback is not meant for them to get "better" or something. My point is, it could even be 2 hours the candidate spends, the least they deserve is knowing that it was actually looked at.
No that's not how ours works. We have, basically, 20 checkboxes for "things that are good about this work". These range from "strong user focus" to "logic to prevent <some problem we've previously had with billing>".
The submissions that we continue with check a bunch of boxes. The submissions that we pass on don't.
The problem is, we're not hiring people to build to a spec. We want to assess your decisions and ability to go from a basic problem to a first implementation. If we shared the rubric with folks, they'd focus entirely on trying to check the boxes and I don't think we'd get an accurate assessment.
Your point about valuing peoples' time is important, though. We have not yet found the right balance for everyone.
"The problem is, we're not hiring people to build to a spec. We want to assess your decisions and ability to go from a basic problem to a first implementation"
You could have told the candidate that, then. That would have been some useful information to have for the hours the candidate put into their submission. Or like "hey, we're going to just skim over this with a subjective list of check boxes".
I have enough experience that I probably would have smelled this before I got that deep into the process and noped out, but for the hopeful mid-level/new-senior devs this sounds pretty demoralizing.
I’ve worked with, and been on hiring committees with people like mrkurt before. What always happens is they reject a bunch of candidates toward the beginning of the process, then eventually the time comes where they MUST hire someone. Because someone or something they are accountable to (investors, their boss or commitments they’ve made) asks why they can’t hire. Then there is a mad dash to interview and hire someone where standards are greatly reduced. They then end up hiring someone with similar skills and risk profile of people they have previously rejected anyways. The net result is just a bunch of time wasted for everyone.
If you ever get asked to do one of these exercises, it’s useful to try to determine where they are in this process. Ask about their hiring timeline, how long they have been interviewing for the position, and try to get a feel for how fatigued they are in general.
If it feels like they are just starting, ask how long you have to complete the exercise. If there isn’t a time limit, it’s better to wait as long as possible, so they can reject other candidates first and burn themselves out. If there is a time limit, ask to pause the interview process (insert some excuse here) before they send out the exercise.
If they are near the end and sound exhausted, that’s a good sign and the effort might not be wasted.
You know, "take-home" projects are so cargo-culted, and so poorly executed at so many places, and in such bad faith, that this I think turns out to be perfectly reasonable, valuable survival advice, even though it applies literally not at all, in any way whatsoever, to the hiring process you're commenting on.
Submissions are supposed to predict and handle issues that you guys have had in production? That doesn't seem very reasonable for a 2 hour task.
I wonder if your hiring process is skewed by the perfectionists who spend tens of hours on their submissions. Arguably that's the type of thing you guys would want to select for but it's not at all fair for those who timebox themselves to 2 hours or whatever.
That's exactly how it works. Spend more time, cover more edge cases.
This sounds like a classic "my developer reckons he can do this in 2 hours (but never actually has)". The reality is your developer is crap at estimating.
A good developer will be lucky to produce 100 lines a day. If you doubt that, check your git history (excluding autogenerated crap).
So unless the project you're setting people needs, on average, 25 lines, whatever you're setting people is clearly not a 2 hour project. And that's not even taking into account whatever hurdles you've inadvertently put in the task: strange build config, obscure libraries, out-of-date libraries, non-standard formatting, etc.
Could you tell us what the average length of a submission is?
One task I got given a year or two back had an old version of Vue, linting rules that conflicted with the defaults of the 'usual' IDE you'd use, wanted you to create 3 new backend API endpoints and a new frontend page with non-trivial functionality. Plus they wanted unit tests.
That's a 2 day job. They also claimed it would take an hour or two.
Sorry, this smells bad - It sounds a lot like the old "We've just solved this really tricky bug that nobody knew how to deal with - it took us a month, but we're experts now - now you do the same in 2 hours"
Even a very experienced engineer might not have encountered this specific bug/situation that you’re testing for in your take home assignment. How do you make sure that you don’t filter out good engineers by testing them on a situation they haven’t encountered in their career?
Presumably the answer is "we don't, because we're ok with filtering out a few very experienced developers if that reduces the risk of a bad hire".
It's pretty rare IME to encounter interviews which are closely related to the actual work you'll be doing, mostly it seems like the main purpose is to filter out the heaps of bad/mediocre developers.
> The submissions that we continue with check a bunch of boxes. The submissions that we pass on don't.
What detail is shared about what "checkboxes" the candidate met and didn't meet? From the parent post and your reply, it doesn't sound like much at all?
Woah, I've been a hiring manager long enough that it's been a while since I've done a take-home code exam myself, but I don't think I would even grade an LLM that way (I've since moved to AI). Unless your code exam is super trivial or the boxes themselves are table stakes ("code runs without errors", "code includes more than one function"), coding is creative enough that it's hard to come up with 20 checkboxes that cover whether a sample is any "good", let alone "shows better decision making".
I've had a few bad experiences when sharing feedback with candidates myself and I would understand doing the checkbox approach for feedback and/or just never sending detailed feedback, but actually grading submissions pass/fail based on a subset of criteria you jealously guard from candidates essentially selects for lucky people. If I wanted to do that, I'd just shuffle the submissions by number of bytes and discard everything that's a multiple of 5 or something.
> Your point about valuing peoples' time is important, though. We have not yet found the right balance for everyone.
Did you try paying candidates for the home assessment? People's time costs money. Paying people not only helps attract candidates but also helps the company reduce the list of candidates to take the assessment.
I'm inclined to say the interview process is imbalanced with more power in the hands of the employer. If you're looking for the right balance, try turning over some power from the employer to the candidate.
The candidate's cost was 2-hours worth of time. Was the employer's cost 2-hours worth of time?
Presumably you can't tell if the applicant spent 2h or 6h, and in practice the 6h solution is probably going to look much better. So sure, in theory you may be right, but in practice I bet you select for people who spent (much) more than 2 hours.
And for every company that means it there are five others that want to find people who will sink massive amounts of time into these so they can do the same once they start.
The poor communication roughly mirrors my experience there as well. I had expressed interest in one of their infrastructure operations positions and was given access to the repo with the project after a wait of about a week to hear back. Due to some life events I wasn’t able to start looking at the project in earnest for about another week, whereupon I had some questions. So I sent questions to the individual that originally contacted me. Never heard back.
Was it my issue waiting a short time to start it? Maybe, but you’d think they’d at least try to answer a question or two. Oh well, I just gave up because if they weren’t going to put in any time, why should I?
It's hilarious watching the CEO here trying to do damage control in a highly visible setting after so many burnt programmers emerge to tell their tales.
I hope my responses don't come across as damage control. Talking through this stuff is super valuable, we tweak our hiring process almost continuously based on feedback. Sometimes when people have a bad experience, we've just fucked up. Other times, we set the wrong expectations and there are simple improvements we can make.
I doubt anyone will read frustrated comments and my responses and think "oh boy, I didn't want to apply there but that one guy responded and made me change my mind".
> Talking through this stuff is super valuable, we tweak our hiring process almost continuously based on feedback.
Isn't that ironic, indeed hypocritical? You take feedback from job applicants, consider it super valuable, but refuse to give anything useful to those very job applicants.
Sounded pretty clear and specific. You value feedback on your interview process but don't extend the same courtesy to applicants who would value feedback on their rejected application.
I think people read it as damage control because you're trying to address the feedback while arguing that you guys aren't wrong. That's pretty typical damage control behavior these days.
Between your responses here and the treatment of candidates that appears to be pretty common, I'm a big no on ever applying to your company. Your hiring process is broken.
Everyone thinks they’re going to give feedback because “we’re different, we’re going to do this right”. Then life happens and that shit gets dropped by the wayside.
We don't really give individual feedback on the take home challenges. Where we struggle is explaining this to people, it's not obvious to every dev _why_ we don't give specific feedback. Here's what we send:
---
Unfortunately, we’re not going to be moving forward with your application for the backend role at this time.
The first thing we want you to know is that we got a lot of applications for these roles. Like, a lot a lot.
The way we evaluate candidate submissions is that we’ve built, over a year or so of running this challenge, a written rubric for what the strongest submissions look like. We started with something sane looking, and then iterated over time as we got a sense of what candidate submissions actually looked like.
We score submissions according to that rubric. Different members of our platform team score different applications; they’re just jumping in and looking at the code and making grading decisions. In the pool of candidates we’re evaluating right now, your code submission missed our cutoff.
A natural thing to want from us next is a copy of the rubric, or specific information about what your submission was missing. We’d want that too! But we can’t give that to you, for (at least) two reasons:
(1) The outcome on your code was a weighted combination of a bunch of different factors, so there isn’t a simple answer that doesn’t just dump our whole rubric to you.
(2) We want to preserve the opportunity for people to apply to these roles in the future, and spoiling the challenge would break that.
What we can tell you is that the most successful submissions had a combination of these attributes:
- Heavy user focus. Successful submissions explained how the system could do good things for users.
- Straightforward database models that make the queries we care about fast to run.
- Transactionally safe sync with Stripe. For example, the best submissions mentioned that Stripe API requests had to be idempotent to ensure the user is not charged multiple times.
We sincerely hope you stay in touch and re-apply in the future. In the meantime, we'd like to give you some Fly.io credits to play around with! If your email address here matches your Fly.io email, click here to get them. If not, please let us know which email you use for Fly.io and we'll set that right up.
Please feel free to reach back out to us about other roles, or about this role in the future. Thank you, again, for doing this.
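For anyone wondering what the idempotency bullet in that email is getting at: Stripe's API lets a client attach an idempotency key to a request, so a retry of the same request does not create a second charge. Below is a toy sketch of the idea only. `FakePaymentGateway` is a hypothetical in-memory stand-in, not Stripe's client library, and none of this is taken from the actual challenge.

```python
import uuid


class FakePaymentGateway:
    """Hypothetical stand-in for a payment API that honours idempotency keys."""

    def __init__(self):
        self._results = {}  # idempotency key -> charge already created for it
        self.charges = []   # every charge that was actually created

    def charge(self, customer, amount_cents, idempotency_key):
        # A retried request with a known key returns the original charge
        # instead of creating a new one.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        charge = {
            "id": f"ch_{len(self.charges) + 1}",
            "customer": customer,
            "amount": amount_cents,
        }
        self.charges.append(charge)
        self._results[idempotency_key] = charge
        return charge


gateway = FakePaymentGateway()
key = str(uuid.uuid4())  # generated once per logical payment, reused on every retry
first = gateway.charge("cus_123", 500, key)
retry = gateway.charge("cus_123", 500, key)  # e.g. the first call timed out
```

The part that matters is that the key is created once per logical payment and stored before the first attempt, so a retry after a timeout reuses it rather than minting a new one.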
Seems like you probably are not explicitly stating this policy beforehand? It's not very surprising that explaining why you don't do as an applicant might expect afterwards is not going well. From the Github examples you've linked earlier, right at the end:
> When you’re ready, let us know and we’ll schedule it for review. We review submissions once a week. You’ll hear back from us no matter what by the end of the /following/ week, possibly sooner if you submit early in the week.
+ 1 sentence expressing "we will tell you if you passed or failed, we have a policy of not providing actionable feedback"
> Remember, we are not timing you and there is no deadline. Work at your own pace. Reach out if you're stuck or have any questions. We want you to succeed and are here to help!
+ 1 sentence "but make sure to not spend significantly more time than X hours"
I think that your heart is in the right place with a message like this, but my feedback would be to keep it shorter, perhaps to a paragraph or two. There is a phrase, “the medium is the message”, and I think it applies to hiring.
The best way to reject a candidate IMO is to call the candidate and deliver short feedback over the phone: “Some of the feedback we got from the interviewers was that we like to see the problem solved in a shorter amount of time / with less space complexity / whatever.” If I was on the fence, I would let them know that if they want to try again they can give it another shot in 6 months, a year, whatever. To me this seems like the most humane way to do it. (This was actually my experience with Google, maybe 8-10 years ago.)
At the core of everything people-related a company does or does not do is the fear of a lawsuit, and in the case of interviewees, the only thing that leads to a lawsuit is telling the interviewee anything at all - therefore, no feedback.
No, replace "sued" with "humiliated". Every time. It just sounds more prudent to say "legal risk" than "we're worried a member of staff will make us look like absolute morons" or "we're lazy". It's obvious when you think about it.
"We didn't hire her because she seemed like she might get pregnant" isn't the kind of feedback most people write anyway, and if they did, a 2 minute pass via HR is enough to eliminate the risk.
The companies that are actually worried take pains to tell you what not to say during the interview. This is rarer than you'd think. Most companies aren't actually that worried about getting sued over interviews. That doesn't stop them from tossing this lame excuse at you though.
I agree with this, a LOT of programming assignments have numerous effective approaches to a solution, and it's quite often, I believe, that the interviewers are simply unfamiliar with the submitted solution and don't know how to evaluate it or what to look for, not that it doesn't work.
And of course, this is more likely when the interviewee submits a more advanced or higher-level solution, meaning the best strategy is to "dumb-down" the submission rather than trying to use the latest language features.
Similar experience with DuckDuckGo - except I didn’t get past their bullshit first part which was “write 7 pages describing a project you designed in the last year”.
They claimed it would take only a couple of hours but for the 15 or so bullet point requirements that was simply not realistic. Spent a couple of days on it and the only feedback I got was no thanks.
They had previously stated they pay for every stage regardless of outcome. After sending my details they never paid me.
Feels vastly hypocritical given their stated people first work culture.
Had a similar experience and made it to the 2nd stage, and also got paid for both.
I even took the time to learn enough Perl to write the solution. They said it wasn't required but would be a nice touch. It was an interesting project to work on and it was fun coming up with a solution. I thought it fulfilled their requirements pretty well, but also just got a short "no thanks" email.
I was left wondering for a long time what they didn't like about it..
Someone suggested they may use these interview submissions as cheap labor to get ideas to solve internal problems. I thought that was ridiculous until I reread the disclosure I had to sign at the beginning:
> [DDG] will be the exclusive owner of all right, title, and interest in and to the work product resulting from the project, including all intellectual property and proprietary rights.
Could have just been boilerplate disclosure stuff, but seems a little weird.
It’s sad seeing so many comments saying they got paid while I didn’t! I was interviewing for their new privacy based browser, it was a C#/.NET desktop app role.
I just add them to my open source portfolio on github and make it look like I love side projects
I also use them as starting points for the next take home assignment and even live interviews (I’ve been able to keep the project open in the IDE on a separate monitor and copy and paste code even when doing a camera-on + screenshare interview)
We're not. We've had issues losing track of candidates over the last year. This is one of those all or nothing problems, a 0.5% failure rate is pretty bad for many human beings.
That's very frustrating, I'm sorry you had that experience. The most time I've spent is ~4 hours and I've definitely gotten to that point where I'm like I don't know if it's worth it but I should just finish it. And yea, to not get anything back from the reviewer/recruiter after that just feels bad.
Hi Martin: I'm really sorry we ghosted you. That's not good, and it's happened more often than I'd like. We struggled with this in March, especially, because we were dealing with reliability issues.
I'll make sure we get you a response within the next week.
If you want some feedback email me and I’ll check your work. I don’t work at Fly.io but I have over a decade of backend experience including Google, Apple, etc.
If they said it would take 4 hours and you spent 10 hours on it, it’s likely your approach was ineffective compared to a more skilled engineer.
In my experience, the number of hours the company suggests is always BS. It takes 2–3x as much time to experiment and then polish your solution. It might take the stated amount of time if you’re the person who came up with the problem and knows how to solve it right away because you’ve put much thought into it and have seen dozens or hundreds of solutions already.
Let's say I come up with an assignment and I pass it to a junior to see how long it takes. The junior does the work, likely reports the time rounded down, but also with no stakes. Their job is secure.
As an interviewee you're gonna spend time polishing the code. Every variable well named, lots of comments, good functions, well considered parameter order etc. It's gonna take longer because, for you, the stakes are higher. You'll likely build some code to test it, and so on.
This is before we discuss domain knowledge that the junior had, and also that the company implemented the task in the first place and didn't just spitball the time expected.
Yeah I’d be very surprised if even half of take home assignments were “tested” on juniors vs just having time estimates spitballed. You know, like every dev does when asked “so how long do you think this ticket will take to implement?”
With the unfortunate difference that when estimating a ticket you will have to complete, most people have learned to give safe over-estimates, but they don't want to look bad when a prospective hire completes the job in 1/4 the time estimate...
When I've created these sorts of take homes I always do them myself and have at least 1 other person on the team do them. My thinking was that we should be able to do them in 1.5 hours if we tell the candidate it will take them up to 3 hours. I would give my own time to complete it as the expected time for a candidate.
You can't create the take home yourself because you're drawing on your own personal domain experience.
For example I'm an expert on real time video streaming in embedded systems. It's trivial to me to program a real time system that will stream video at a lower resolution in real time from scratch. If I pass that problem to your average backend web developer, they're going to take 100x longer than me.
The issue is, while it may seem obvious to you that the domain is too specialized, when you yourself spend too much time in a specific domain you become biased and you start to think such knowledge is basic because you get too good at it. It's invisible to you.
Time shouldn't be the factor measured here because people have such varied backgrounds.
The take homes I created in the past were not so extremely domain specific. They were relatively common problems like "read these JSON files and generate a tree of dependencies between them based on references, where file1.json might contain an item like 'file2.json'".
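To give a sense of scale for that kind of task, here is a minimal sketch. It assumes a "reference" is any string value ending in `.json` anywhere in the document, and that cycles and missing files should just be marked rather than raised; the function names and those rules are my assumptions, not the actual assignment.

```python
import json
from pathlib import Path


def find_refs(node):
    """Yield every string value ending in '.json' found in a parsed JSON document."""
    if isinstance(node, str):
        if node.endswith(".json"):
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from find_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item)


def build_tree(directory, name, seen=()):
    """Return {name: {child: {...}}} for the file `name` inside `directory`.

    A file already on the current path is marked "<cycle>" instead of being
    expanded again; a referenced file that does not exist is marked "<missing>".
    """
    if name in seen:
        return {name: "<cycle>"}
    path = Path(directory) / name
    if not path.exists():
        return {name: "<missing>"}
    data = json.loads(path.read_text())
    children = {}
    for ref in find_refs(data):
        children.update(build_tree(directory, ref, seen + (name,)))
    return {name: children}
```

Even at this size there are judgment calls (what counts as a reference, what to do about cycles, whether to cache files that are referenced twice), which is where the time tends to go.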
If I'm hiring for a real time video streaming platform, I probably don't want to hire a backend dev who has 0 experience in the field and takes 100x the time, so this seems fine.
The point of the assignment isn't to make it "fair" (in some loose sense), but to filter out candidates.
True to some extent, but you usually don’t want to filter out all candidates who don’t have extensive experience in the exact domain you’re working in, otherwise you’d never hire anyone.
It is also going to be very hard to stand out on them without taking extra time because like it or not for a top company the competition is going to be spending that 2-3x time.
At least in a technical interview the problem is time boxed so that the other solutions you are being compared against also only got 45 minutes.
We design these so people with relevant experience can bang them out in about two hours. They're not trick challenges, though, they're very straightforward. Getting people to believe they should only spend two hours on them is the challenge.
It's not BS in the sense they are lying. It's BS because they're wrong.
The reason they end up getting it wrong is usually because they pick something domain related to what the company does. So these guys literally work in that domain every single day of course they know it in and out.
When they throw it at someone completely foreign to the domain it's going to take a longer time to figure it out. Most interviewers just lack the ability to see this.
Often the difference in time is simply whether you’ve seen a similar problem before or not. I wouldn’t jump straight to suggesting someone is less skilled. You can also end up spending more time on something simply because you’re trying to stand out or have some reason to believe more effort translates to better results. If you’re a test-driven fan, simply making your code more testable and adding meaningful tests can add time.
Hard-delete isn't required by GDPR. The data itself just has to be made non-identifiable. You don't actually have to remove the database records, for instance.
There’s identifiable and sufficiently deidentified to meet the legal standard. Removing the userid meets the GDPR definition, but I bet you could reidentify based on patterns or fingerprints, if you really wanted to.
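A minimal sketch of what "keep the record, drop the identity" could look like, using an in-memory SQLite table. The schema and column names are hypothetical, and whether scrubbing like this is enough to meet the legal standard is exactly the question raised above; the remaining columns could still act as a fingerprint.

```python
import sqlite3


def anonymize_user(conn, user_id):
    """Blank the identifying columns of one user row without deleting the row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE users SET name = NULL, email = NULL, anonymized = 1 WHERE id = ?",
            (user_id,),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, "
    "plan TEXT, anonymized INTEGER DEFAULT 0)"
)
conn.execute(
    "INSERT INTO users (name, email, plan) VALUES ('Ada', 'ada@example.com', 'pro')"
)

anonymize_user(conn, 1)

# The row survives, so foreign keys and aggregate stats still work,
# but the directly identifying fields are gone.
row = conn.execute(
    "SELECT name, email, plan, anonymized FROM users WHERE id = 1"
).fetchone()
```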
I would love to expand it to other sports, the only real challenge is finding an API with appropriate stats and writing a simple algorithm to decide what constitutes "watchability" based on those. I would happily take advice on that re: UFC and add it to the site