I hear a lot of people say good things about Copilot too, but I absolutely hate it. I still have it enabled for some reason, but it constantly suggests incorrect things. There have been a few amazing moments, but man, there are a lot of "bullshit" moments.
Even when we get a gen AI that exceeds all human metrics, there will 100% still be people who with a straight face will say "Meh, I tried it and found it to be pretty useless for my work."
To be fair, LLMs are pretty good natural language search engines. Like when I'm looking for something in an API that does something I can describe in natural language, but not succinctly enough to show up in a web search, LLMs are extremely handy, at least when they don't just randomly hallucinate the API. On the other hand I think this is more of a condemnation of the fact that search tech has not 'really' meaningfully advanced beyond where it was 20 years ago, more than it is a praise of LLMs.
> LLMs are extremely handy, at least when they don't just randomly hallucinate
I work in tech and it’s my hobby, so that’s what a lot of my googling goes towards.
LLMs hallucinate almost every time I ask them anything too specific, which at this point in my career is all I’m really looking for.
The time it takes for me to realize an LLM is wrong is usually not too bad, but it’s still time I could’ve saved by googling (or whatever trad search) for the docs or manual.
I really wish they were useful, but at least for my tasks they’re just a waste of time.
I really like them for quickly generating descriptions for my D&D settings, but even then they sound samey if I use them too much.
Obviously they’d sound samey if I made up 20 at once too, but at that point I’m not really being helped or enhanced by using an LLM, it’s just faster at writing than I am.
I don't mean this as a slight, just an observation I have made many times: people who struggle to get utility from SOTA LLMs tend not to have spent enough time with them to feel out good prompting. In the same way that there is a skill for googling information, there is a skill for teasing consistently good responses from LLMs.
Why spend my time teasing and coaxing information out of a system which absolutely does make up nonsense when I can just read the manual?
I spent 2023 developing LLM-powered chatbots with people who, purportedly, were very good at prompting, but never saw any better output than what I got for the tasks I’m interested in.
I think the “you need to get good at prompting” idea is very shallow. There’s really not much to learn about prompting. It’s all hacks and anecdotes which could change drastically from model to model.
None of which, from what I’ve seen, makes up for the limitations of LLMs, no matter how many times I try adding “your job depends on formatting this correctly” or reordering my prompt so that more relevant information comes later, etc.
Prompt engineering has improved RAG pipelines I’ve worked on though, just not anything in the realm of comprehension or planning of any amount of real complexity.
People also continue to use them as knowledge databases, despite that not being where they shine. Give the model enough context (descriptions, code, documentation, ideas, examples) and have a dialog with it; that's where these strong LLMs really shine.
I see it do a lot that's interesting but for programming stuff, I haven't found it to be particularly useful.
Maybe I'm doing it wrong?
I've been writing code for ~30 years, and I've built up patterns and snippets, etc... that are much faster for me to use than the LLMs.
A while ago, I thought I had a eureka moment with it when I had it generate some Node.js code for streaming a video file - it did all kinds of cool stuff, like implement offset headers and things I didn't know about.
I thought to myself, "self - you gotta check yourself, this thing is really useful".
But then I had to spend hours debugging & fixing the code, which was broken in subtle ways. I ended up on Google anyway, learning all about it, and rewrote everything it had generated.
For that case, while I did learn some interesting things from the code it generated, it didn't save me any time - it cost me time. I'd have learned the same things from reading an article or the docs on effective ways to stream video from the server, and I'd have written it more correctly the first go around.
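For anyone wondering what the "offset headers" involve: presumably these are HTTP `Range` request headers, which let a video player ask for a byte span of the file instead of the whole thing. The parsing is exactly the kind of place where subtle bugs hide (suffix ranges, open-ended ranges, ends past the file size). Below is a minimal sketch of that logic in Python rather than Node.js, handling only a single range; `parse_byte_range` and `content_range` are hypothetical helper names, not from any library, and this is not the code from the anecdote.

```python
import re
from typing import Optional, Tuple

# Matches a single byte range such as "bytes=0-99", "bytes=500-" or "bytes=-200".
# Multi-range requests ("bytes=0-9,20-29") are deliberately not supported here.
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (start, end) byte positions for a Range header.

    Returns None when the header is malformed or cannot be satisfied for a
    file of `size` bytes. This sketch does not distinguish the two cases;
    a real server would typically answer 416 for an unsatisfiable range and
    ignore a malformed header.
    """
    match = _RANGE_RE.match(header.strip())
    if match is None or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix range: "bytes=-N" means the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    if start >= size:
        return None
    # Open-ended ("bytes=500-") or over-long ranges are clamped to the last byte.
    end = size - 1 if last == "" else min(int(last), size - 1)
    if end < start:
        return None
    return start, end


def content_range(start: int, end: int, size: int) -> str:
    """Build the Content-Range value for a 206 Partial Content response."""
    return f"bytes {start}-{end}/{size}"
```

A server would then read bytes `start` through `end` inclusive from the file, send them with status 206 and a `Content-Length` of `end - start + 1`, and fall back to a full response or an error when the function returns None.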
So if LLMs weren't surprising to you, it would imply you expected this. If you did, how much money did you make on financial speculation? It seems like being this far ahead should have made you millions even without a lot of starting capital (look at NVDA alone).
> So if LLMs weren't surprising to you, it would imply you expected this.
I do claim that I have a tendency to be quite right about the "technological side" of such topics when I'm interested in them. On the other hand, events turn out to be different because of "psychological effects" (let me put it this way: I have a quite different "technology taste" than the market average).
In the concrete case of LLMs: the reason the market behaved so differently from what I expected is psychological. I believed that people wouldn't fall for the marketing and hype of LLMs and would consider the excessive marketing to be simply dupery. The surprise to me was that this wasn't what happened.
Concerning NVIDIA: I believed that - considering the insane amount of money involved - people/companies would write new languages and compilers to run AI code on GPUs (or other ICs) from different suppliers (in particular AMD and Intel), because it is a dangerous business practice to make yourself dependent on a single (GPU) supplier. Even serious reverse-engineering endeavours for doing this should have paid off considering the money involved. I was again wrong about this. So here the surprise was that lots of AI companies made themselves so dependent on NVIDIA.
Seeing lots of "unconventional" things is very helpful for doing math (often the observations that you make are the start of completely new theorems). Being good at stock trading and investing, on the other hand, in my opinion requires a lot of "street smarts".
Re: NVIDIA. I wholeheartedly agree. Google/TPU is an existence proof that it is entirely possible and rational to do so. My surprise was that everyone except Google missed this.
OpsGenie has that. We use it at my job. I'm not sure what problem OP is having. The phone call, text message, and app alert from OpsGenie are more than enough. The notification configuration is extremely flexible and each user can customize it as needed. From a user perspective, I don't know what else you could want.
I have no affiliation with OpsGenie outside of using it at work.
And we have one user with another brand that also locks down the notification/alert settings and kills apps in an attempt "to save battery", which can't be controlled.
I've learned so much about Splunk this month. I hate it. The UX is hot garbage. Why are settings scattered everywhere? Why does a simple word search not return any results? Why is there no obvious way to confirm data is being forwarded, as in actual packets and not just which connections are configured?
Yeah. There is a lot of overhyped and overfunded nonsense that comes out of NASA. Some of it is hype from the marketing and press teams; other hype comes from misinterpretation of releases.
None of that changes that there have been major technical breakthroughs, and entire classes of products and services that didn't exist before those investments in NASA (see https://en.wikipedia.org/wiki/NASA_spin-off_technologies for a short list). There are 15 departments and dozens of agencies that comprise the US Federal government, many of which make investments in science and technology as part of their mandates, and most of that is delivered through some structure of public-private partnerships.
What you see as overhyped and overfunded nonsense could be the next groundbreaking technology, and that is why we need both elected leaders who (at least in theory) represent the will of the people, and appointed, skilled bureaucrats who provide the elected leaders with the skills, domain expertise, and experience that the winners of the popularity contest probably don't have.
Yep, there will be waste, but at least with public funds there is the appearance of accountability that just doesn't exist with private sector funds.
"Incidents of this nature do occur in a connected world that is reliant on technology."
- Mike Maddison, CEO, NCC Group
Until I see an explanation of how this got past testing, I will assume negligence. I wasn't directly affected, but it seems every single Windows machine running their software in my org was affected. With a hit rate that high I struggle to believe any testing was done.
The point is that you set a goal and made progress towards it. Ideally you choose goals for yourself and it's expected that you won't complete them all.