Hacker News | joelm's comments

Been using Claude Code (4 Opus) fairly successfully in a large Rust codebase, but sometimes frustrated by it with complex tasks. Tried Gemini CLI today (easy to get working, which was nice) and it was pretty much a failure. It did a notably worse job than Claude at having the Rust code modifications compile successfully.

However, Gemini at one point output what will probably be the highlight of my day:

"I have made a complete mess of the code. I will now revert all changes I have made to the codebase and start over."

What great self-awareness and willingness to scrap the work! :)


Gemini has some fun failure modes. It gets "frustrated" when changes it makes don't work, and replies with oddly human phrases like "Well, that was unexpected," then happily declares (I see the issue!) that "the final tests will pass" when it's going down a blind alley. It's extremely overconfident by default and, without changing the system prompt, much more exclamatory. Maybe in training it was taught/figured out that manifesting produces better results?


It also gets really down on itself, which is pretty funny (and a little scary). Aside from the number of people who've posted online about it wanting to uninstall itself after being filled with shame, I had it get confused on some Node module resolution stuff yesterday and it told me it was deeply sorry for wasting my time and that I didn't deserve to have such a useless assistant.

Out of curiosity, I told it that I was proud of it for trying, and it had a burst of energy again and tried a few more (failing) solutions, before going back to its shameful state.

Then I just took care of the issue myself.


After a particularly successful Claude Code task I praised it and told it "let's fucking go!", to which it replied that it loved the energy and proceeded to only output energetic caps lock with fire emojis. I know it's all smoke and mirrors (most likely), but I still get a chuckle out of this stuff.


This really cracked me up, indeed mostly funny and maybe slightly scary or at least off-putting that it's so human-like.


I asked it to do a comparatively pedestrian task: write a script to show top 5 google searches.

First it did the search itself and then added "echo" for each of them - cute

Then it tried to use pytrends which didn't go anywhere

Then it tried some other paid service which also didn't go anywhere

Then it tried some other stuff which also didn't go anywhere

Finally it gave up and declared failure.

It will probably be useful, since it can do the modify/run loop itself with all the power of Gemini, but so far it's underwhelming.
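The pytrends route the model attempted can be sketched roughly like this. To be clear, this is my own hypothetical reconstruction: pytrends is an unofficial Google Trends client, so its endpoints can break or get rate-limited without notice, which is likely why the attempts above "didn't go anywhere". The `format_top_searches` helper is just illustrative plumbing.

```python
from typing import List

def format_top_searches(searches: List[str], n: int = 5) -> str:
    """Render the top-n search queries as a numbered list."""
    lines = [f"{i + 1}. {s}" for i, s in enumerate(searches[:n])]
    return "\n".join(lines)

def top_google_searches(n: int = 5) -> List[str]:
    # Live call; requires `pip install pytrends`. Unofficial API, so
    # this can stop working at any time.
    from pytrends.request import TrendReq
    df = TrendReq(hl="en-US").trending_searches(pn="united_states")
    return df[0].tolist()[:n]
```

The formatting half runs anywhere; the live half depends entirely on the unofficial endpoint staying up.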


This was also my exact experience. I was pretty excited, because when Claude Code gets stuck I usually turn to Gemini Pro 2.5 by pasting in the whole code and asking questions, and that has gotten me out of a pickle a couple of times.

Unfortunately, the CLI version wasn't able to create coherent code or fix some issues in my Rust codebase either.

Here's hoping it eventually becomes great.


Claude will do the same and start over if things get too bad. At least I've seen it do that when its edits went haywire and trashed everything.


Same here. I tried to implement a new feature on one of our apps to test it. It completely screwed things up: used undefined functions and the like. After a couple of iterations of error reporting and fixing, I gave up.

Claude did it fine but I was not happy with the code. What Gemini came up with was much better but it could not tie things together at the end.


Sounds like you could use Gemini to create the initial code, then have Claude review and finalise what Gemini comes up with.


Personally, my theory is that Gemini benefits from being able to train on Google's massive internal codebase, and because Rust has seen very low uptake internally at Google (especially since they have some really nice C++ tooling), Gemini is comparatively bad at Rust.


Tangential, but I worry that LLMs will cause a great stagnation in programming language evolution, and possibly a bunch of tech.

I've tried using a few new languages and the LLMs would all swap the code for syntactically similar languages, even after telling them to read the doc pages.

Whether that's for better or worse I don't know, but it does feel like new languages genuinely solve hard problems as their raison d'être.


Not just that, I think this will happen on multiple levels too. Think de-facto ossified libraries, tools, etc.

LLMs thrive because they had a wealth of high-quality corpus in the form of Stack Overflow, GitHub, etc., and ironically their uptake is strangling that source of training data.


Perhaps the next big programming language will be designed specifically for LLM friendliness. Some things which are human friendly like long keywords are just a waste of tokens for LLMs, and there could be other optimisations too.


>Personally my theory is that Gemini benefits from being able to train on Googles massive internal code base and because Rust has been very low on uptake internally at Google, especially since they have some really nice C++ tooling, Gemini is comparatively bad at Rust.

Were they to train it on their C++ codebase, it wouldn't be effective, because they don't use Boost or CMake or any of the major stuff that C++ in the wider world uses. It would also suggest the user make use of all kinds of unavailable C++ libraries. So no, they are not training on their own C++ corpus, nor would it be particularly useful.


Excuse me, why was this downvoted so aggressively??


How can they train on internal codebase without leaking specifics?


They can’t, which is a good point. Also it would be basically useless for the reasons I mention.


> Personally my theory is that Gemini benefits from being able to train on Googles massive internal code base

But does Google actually train its models on its internal codebase? Considering that there’s always the risk of the models leaking proprietary information and security architecture details, I hardly believe they would run that risk.


Googler here.

We have a second, isolated model that has trained on internal code. The public Gemini AFAIK has never seen that content. The lawyers would explode.


What model do your lawyers run on?


Oh, you’re right, there are the legal issues as well.

Just out of curiosity, do you see much difference in quality between the isolated model and the public-facing ones?


We actually only got the “2.5” version of the internal one a few days ago so I don’t have an opinion yet.

But when I had to choose between “2.0 with Google internal knowledge” and “2.5 that knows nothing” the latter was always superior.

The bitter lesson indeed.


That's interesting. I've tried Gemini 2.5 Pro from time to time because of the rave reviews I've seen, on C# + Unity code, and I've always been disappointed (compared to ChatGPT o3 and o4-mini-high, and even Grok). This would support that theory.


Interesting, Gemini must be a monster when it comes to Go code then. I gotta try it for that


As Go feels like a straitjacket compared to many other popular languages, it's probably very suitable for an LLM in general.

Thinking about it: was this not the idea of Go from the start? Nothing fancy, to keep non-rocket scientists away from foot-guns, and have everyone produce code that everyone else can understand.

Diving into a Go project, you almost always know what to expect, which is a great thing for a business.


There is way more Java and C++ than Go at Google.


A reasonably small Go codebase works well with almost any LLM.

I've always designed very large projects as a few medium-sized independent Go tools, and that strategy pays off in the era of AI-assisted coding.


So far I've found Gemini CLI is very good at explaining what existing code does.

I can't say much about writing new code though.


I tried it too, it was so bad. I got the same "revert" behaviour after only 15 minutes.


What are your goals?

If budget is a factor, there are local/regional providers you can find good deals with. E.g. I've had good experiences with https://www.opticfusion.com/ in Seattle.

If low latency across the US is a factor, then doing that with one location will be sub-optimal; however, I'd either start in a specific market you want to chase (e.g. LA or NYC), or go more central like Chicago.

If connectivity flexibility and peering are key for you then you might consider locating in a "carrier hotel". The best locations for those vary by geography (e.g. in Chicago that would be at Equinix, but in LA it's Coresite). You'll pay more, but can readily access the peering exchange.


Figuring out your goals is key here. There are lots of providers all over the spectrum, so narrowing the options is key.

Location is clearly important. Where do you want your datacenter(s) and do you want all locations managed by the same company or different companies.

Electrical reliability: some sites are fed from multiple substations and have a well-maintained battery room, redundant generators, and well-tested transfer switches... and some don't. Automatic transfer switches tend to be a SPoF, though, and even a major provider doing everything right is likely to have an outage at one of their facilities due to a transfer switch failure once every 3-10 years; afterwards, there will be an unavoidable scheduled maintenance to replace it (unless it had to be replaced during the outage). Maybe there's a new way to install transfer switches redundantly, though?

HVAC reliability: HVAC needs electricity too, so see the previous one, but servers don't like to be hot, what are their plans for when HVAC equipment fails.

Network reliability: if you're at a carrier-neutral site, you'll get fiber to your rack from your upstreams and then it's all up to you or them. If your provider is your transit: do they deliver redundant connections (LACP)? Do they run redundant connections all the way to their border router, and redundant connections to their upstreams? Do they have multiple upstreams? Will they pull fiber from an upstream carrier to your rack? Do they participate in the local peering exchange? Also, have a capacity number in mind.

Remote hands / physical access is important. If you have a lockable rack, do all the racks have the same key? IPMI can avoid a lot of physical time on the machines, but not all of it.

In addition to all of these, there's a what they say, and what's actually the case, and if they'll let you inspect / audit.

There are no wrong answers here. I have my personal hosting at a facility with probably wishy-washy answers to everything, and I have low trust, but it's cheap and I don't need nine nines. Other people need all the reliability. Or something in between.


How good is the pricing at Optic Fusion? I’m always wary of “contact us for a quote”.


I don't recall exactly (it's been a while since I was handling the colo contracts directly). I just remember that it was really good value.

You're probably going to have to contact folks for quotes for most colo providers. They're doing "enterprise sales", so they want to talk to you, see what the needs and opportunities are, then quote you based on your needs.


This has been an intense journey building CompanyCraft as a solo dev (and I was never a full-time dev by background). Thanks to CoPilot and ChatGPT I was able to learn a decent level of React through the project. I look forward to your feedback!


Congrats on jumping into this adventure! Here are my thoughts on your concerns:

1. If you communicate that you're in the early stages of building an exciting new business and wanting to get feedback from early-access customers, there should be businesses that are OK with you being solo. They will adapt to that risk, and you won't be able to sell to everyone, but it shouldn't exclude you from everyone either. You might just not be talking to the right customers yet.

2. For milestones (and fundraising) you should think about your goals: what are you trying to get out of this? Do you want to build and scale a big team, serve huge numbers of customers, etc.? Or do you just want to augment or replace your income? Or something in between? You can think about this from different angles: team size, revenue, geographic scope, your personal exit or income, etc.

Here are a couple of examples from my journey in case they're helpful:

In my prior business (https://www.bigleaf.net) my first milestone was to get a working product. I didn't feel as though just talking about it would be convincing enough (and it wasn't), but once I could demo an MVP it really wowed customers. Along the way I added a milestone to get a technical co-founder, since I got burnt out doing it myself. Those two major steps took just over a year. The next milestone was getting our first revenue (it took just a month or so after the MVP), then milestones kept being added from there, logically, based on the strategic path we chose to be on at that moment.

In my current business (https://www.companycraft.ai) I had a mix of some early milestones. I wanted to talk to potential customers (entrepreneurs) to ensure the problems I wanted to solve matched up with pain they had, and I also wanted to build an MVP ASAP. I completed those in about 3 months, with many other smaller steps and goals along the way. My current milestone is to do a full "v1" public launch in November.

So you just set some milestones that are logical based on where you are and where you want to be in the coming months and years, then hold yourself accountable to them (or ideally be connected with a peer or mentor who can be an accountability partner of sorts).

3. On location, I go back to your team goals. If you're going to stay solo then no, location shouldn't matter. If you want to build a fully remote team then it shouldn't matter much (though being near-ish to a decent airport would be valuable for visiting the team and customers). If you want any local team members then you should think about where you'll gather and what the talent pool is like in your area.

4. It sounds like you're already on a good track here - talking to people who have the pain you're solving and gathering their emails. If that has you on the right track I'd just keep doing that. However you mentioned below that you're selling this B2B...so in that case you may want to consider if you're actually talking to the buyer of your product. Are the people you've met the people who would approve the purchase of your solution? If not, try to connect with those people. Identifying a list of them on LinkedIn and cold reaching out could be one method.

5. Fundraising goes back to your goals as an entrepreneur :) If you want to grow this to IPO or a really big exit, have a team of hundreds, and build a market-changing or world-impacting offering, then you probably need to raise outside funds at some point. However, if your desired scale is smaller and/or you see a path to profitability without fundraising then you get to keep control of your destiny, which is great. If you can afford it and you don't sense the market is going to run away without you, I'd advise continue building, talking to customers, and close your first deals; don't rush into raising money. If/when you know that you can't succeed at your goals without raising money, then dive into that path.


Latency has been the biggest challenge for me.

They cite "two to 15+ seconds" in this blog post for responses. Via the OpenAI API I've been seeing more like 45-60 seconds for responses (using GPT-3.5-turbo or GPT-4 in chat mode). Note, this is using ~3500 tokens total.

I've had to extensively adapt to that latency in the UI of our product. Maybe I should start showing funny messages while the user is waiting (like I've seen porkbun do when you pay for domain names).


Was this in the past week? We saw much worse latency this past week compared to the rest (in addition to model-unavailability errors), which we attributed to the Microsoft Build conference. One of our customers that uses it a lot is always at the token limit, and their average latency was ~5 seconds, but it was closer to 10 seconds last week.

...also why we can't wait for other vendors to get SOC I/II clearance, and I guess eventually fine-tuning our own model, so we're not stuck with situations like this.


I've seen more errors lately, I think, but no, the latency has been an issue for months. I think it has grown some over the last few months, but not dramatically.


Well poop, hope that gets resolved fast. I guess OpenAI can't hire compute platform engineers fast enough!


If a user is waiting on the response, you basically have to stream the result instead of waiting on the entire completion.


There's no real benefit to streaming if you are planning to use the LLM output downstream (say, in a SQL query). LLM latency is a major annoyance right now, whether locally-hosted or cloud-based.


Yea, that is probably a better solution. Not an easy one to refactor into at the moment though.
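The streaming refactor the parent describes boils down to one pattern: consume chunks as they arrive and repaint the UI on each one, instead of blocking on the full completion. Here's a minimal sketch of that pattern; the `fake_stream` list is a stand-in for a real API token stream (with the OpenAI API you'd pass `stream=True` and pull the text delta out of each streamed event), and `render_incrementally` is a hypothetical helper name.

```python
from typing import Iterable, Iterator

def render_incrementally(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the accumulated text after every chunk so the UI can
    repaint as tokens arrive instead of waiting on the whole reply."""
    text = ""
    for chunk in chunks:
        text += chunk
        yield text

# Stub standing in for an API token stream.
fake_stream = ["The", " answer", " is", " 42."]
partials = list(render_incrementally(fake_stream))
# Each element of `partials` is what the UI would show at that moment.
```

The upside is perceived latency drops to time-to-first-token; the downside, as the sibling comment notes, is that streaming buys you nothing when the output feeds a downstream step that needs the complete text.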


I thought this was an interesting thread. I'm curious if folks here agree with his assessment:

"predictive AI will seem stuck while generative AI accelerates in 2023. Most high-value AI will still be predictive- so there will significant frustrations around AI ROI."


Shameless plug - if you're using Starlink and want to fix the dropouts, plus get a static public IPv4 address that works over any ISP, get our SD-WAN service: https://www.bigleaf.net/. With a 2nd internet connection, our platform will auto-ID your sensitive traffic and route it over the best performing connection (e.g. aware of jitter, dropouts, etc), plus your bulk data traffic (e.g. Netflix) will route over your highest-throughput connection. 30-day money-back guarantee, so no risk to try.

As a former wireless ISP architect/engineer, it's wonderful to see the leadership that Starlink is providing in LEO satellite connectivity (due to the low latency compared to geostationary). I hope the "block the sky with satellites" visual/astronomy concerns won't play out as an actual issue, because this seems like a great platform to address connectivity needs in harder-to-reach areas.
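To illustrate the "route sensitive traffic over the best-performing connection" idea, here's a toy sketch of jitter/loss-aware path selection. This is NOT Bigleaf's actual algorithm; the penalty weights, link figures, and names are all made up purely to show the shape of the decision (real-time traffic like VoIP is hurt far more by jitter and loss than by raw latency).

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Link:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float

def penalty(link: Link) -> float:
    # Arbitrary illustrative weights: jitter and loss dominate for
    # real-time traffic, so they're weighted far above raw latency.
    return link.latency_ms + 10 * link.jitter_ms + 50 * link.loss_pct

def best_link(links: List[Link]) -> Link:
    """Route sensitive traffic over the lowest-penalty connection."""
    return min(links, key=penalty)

# Hypothetical measurements: a Starlink link during a dropout-prone
# window vs. a steadier cable connection.
links = [
    Link("starlink", latency_ms=45, jitter_ms=8.0, loss_pct=0.5),
    Link("cable", latency_ms=20, jitter_ms=2.0, loss_pct=0.1),
]
```

A real implementation would measure these per-path stats continuously and re-evaluate per traffic class, but the core selection step looks like this.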


Bigleaf Networks | Portland, OR area | Full-Time ONSITE | $60k-$150k plus stock options

Bigleaf is what we call "Cloud-first SD-WAN" - we use a software-based network to optimize the connection from businesses to their key cloud applications. For example, grocery stores use us to ensure their credit card transactions are successful and law firms use us to ensure their VoIP calls always sound good.

We're hiring for a number of roles right now:

  * Software Development Manager
  * Sr. Front-End SW Engineer
  * Linux-focused SW Engineer (firmware and Linux networking subsystem)
  * Software Engineer (general role with focus on back-end and networking)
  * DevOps Engineer
  * Director of Network Engineering and Operations
  * Network Support Specialist
Most of these roles are up on our website, and you can read more here: http://www.bigleaf.net/careers. Feel free to email me directly (see my profile) if you're interested.


Bigleaf Networks | Beaverton, OR (Portland suburb) | ONSITE | http://www.bigleaf.net

Bigleaf is an SD-WAN platform that provides reliability and performance for Cloud applications over commodity broadband. We're a small team but we've got an established business with hundreds of paying mid-market customers, and we're growing quickly.

Our interview process entails some initial email discussions, 1-2 in-person or phone-based interviews (no crazy technical algorithm memorization tests), and often a brief (~1 hr) coding challenge for you to do from home.

We're hiring for the following technical roles right now:

  * Front-end Developer
  * Sr. Software Engineer (Linux networking focus)
  * Network Operations Engineer
  * Network Integration Engineer

Check out more details here: http://www.bigleaf.net/careers and feel free to email me (Founder and CEO) at joelm@bigleaf.net (no recruiters please).


Your product lines are extremely impressive, great times ahead.


Bigleaf Networks | Beaverton, OR (near Portland) | ONSITE full-time

Bigleaf is an SD-WAN provider delivering internet redundancy and optimization, keeping businesses connected to the cloud. Our proprietary platform uses Software-Defined-Networking (SDN) technologies to provide seamless failover and dynamic application prioritization.

We have a reliable and high-performance service that’s growing quickly, so we're looking for a Network Operations Engineer to join the team.

The Role

• Network Operations

• Technical Support

• Device and Service Provisioning

• Software Engineering / DevOps

Fit Check

• Do you love serving customers with outstanding support?

• Do you know what ARP is and how it works?

• Have you troubleshot BGP or OSPF issues?

• Do you know what jitter does to a VoIP call?

If you think this might be a good fit for you, please check out more info below and get in touch:

http://www.bigleaf.net/careers/

