Any non-digital system you describe "as Code" is not a source of truth, but a Source of Hope. The code describes the intended state; something still has to reconcile it with reality.
This is the same as having it in unstructured documents, which means, funnily enough, that auditing is still required.
So yes, this could be done. I'd love to see what runs in CI/CD for a change. When someone works on the wrong thing, or breaks compliance IRL, how do you backport it into this? "Alice is a software engineer, and created this SaaS account with her email when the company was founded. The admin email cannot be changed, and she has admin even though another role should control that."
Yes, this is ridiculous. Far too much churn in this ecosystem. This decreases my confidence in this Mintlify company; given this, they seem like the type to randomly rug-pull when they feel like it.
Yeah, I can understand where you're coming from. I guess there's a balance, in that being overly stubborn is also not a great thing. Prioritization at an early-stage startup is a tough task.
There is no world outside of a cult or hype bubble where a 6 day turnaround time from announcement to deprecation is acceptable for a standard.
Standards are something that multiple independent groups can build on and should be somewhat stable.
If this was labeled as experimental, then maybe, although it's crazy to waste time publishing a standard you're still rapidly iterating on. It's not labeled that way, though, because everyone in this AI boom is rushing to be the winner-takes-all, and is trying their best to signal to the crowd that they are that winner.
Nvidia sees the forest for the trees. The consequence of the US government buying a stake in Intel is that there will be Federal requirements for US companies to use Intel. This is entirely about the foundry business. Nvidia is at risk when 100% of the production of its intellectual property occurs in Taiwan. They're more interested than anyone else in diversifying their foundry options. Intel has just been a terrible partner and totally disregards its customers. It's only because of the new strategic need for the US to have a foundry business that the government is saving Intel. NVIDIA is understandably supportive of this.
Open models are going to win long-term. Anthropic's own research has to use OSS models [0]. China is demonstrating how quickly companies can iterate on open models, giving smaller teams access to a model's abilities, and the means to augment them, without paying the training cost.
My personal prediction is that the US foundational model makers will OSS something close to N-1 for the next 1-3 iterations. The CAPEX for foundational model creation is too high to justify OSS for the current generation, unless the US Gov steps up and starts subsidizing power, or Stargate delivers 10x what is planned right now.
N-1 model value depreciates insanely fast. Making an OSS release of them and allowing specialized use cases and novel developments allows potential value to be captured and integrated into future model designs. It's medium risk, as you may lose market share. But also high potential value, as the shared discoveries could substantially increase the velocity of next-gen development.
There will be a plethora of small OSS models. Iteration on the OSS releases is going to be biased towards local development, creating more capable and specialized models that work on smaller and smaller devices. In an agentic future, every different agent in a domain may have its own model. Distilled and customized for its use case without significant cost.
Everyone is racing to AGI/SGI. The models along the way are to capture market share and use data for training and evaluations. Once someone hits AGI/SGI, the consumer market is nice to have, but the real value is in novel developments in science, engineering, and every other aspect of the world.
I'm pretty sure there's no reason Anthropic has to do research on open models; it's just that they produced their result on open models so that you can reproduce it without having access to theirs.
[2 of 3] Assuming we pin down what "win" means... (which is definitely not easy)... What would it take for this to not be true? There are many ways, including but not limited to:
- publishing open weights helps your competitors catch up
- publishing open weights doesn't improve your own research agenda
- publishing open weights leads to a race dynamic where only the latest and greatest matters, leading to a situation where the resources sunk exceed the gains
- publishing open weights distracts your organization from attaining a sustainable business model / funding stream
- publishing open weights leads to significant negative downstream impacts (there are a variety of uncertain outcomes, such as: deepfakes, security breaches, bioweapon development, unaligned general intelligence, humans losing control [1] [2], and so on)
I don't think there will be such a unique event. There is no clear boundary; this is a continuous process. Models get slightly better than before.
Also, another dimension is the inference cost to run those models. It has to be cheap enough to really take advantage of them.
Also, I wonder what would be a good target for making a profit, for developing new things. There is Isomorphic Labs, which seems like a good target. This company already exists now, and people are working on it. What else?
> I don't think there will be such a unique event.
I guess it depends on your definition of AGI, but if it means human level intelligence then the unique event will be the AI having the ability to act on its own without a "prompt".
> the unique event will be the AI having the ability to act on its own without a "prompt"
That's super easy. The reason they need a prompt is that this is the way we make them useful. We don't need LLMs to generate an endless stream of random "thoughts" otherwise, but if you really wanted to, just hook one up to a webcam and microphone stream in a loop and provide it some storage for "memories".
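A minimal sketch of what that loop could look like, in Python. capture_frame, transcribe_audio, and call_model are made-up stand-ins, not any real vendor API; swap in an actual webcam library, speech-to-text client, and LLM client:

```python
# Toy autonomous loop: sensor input -> model -> "memory", repeating with no
# human prompt. All three functions below are hypothetical placeholders.
import time

def capture_frame() -> str:
    return "an empty room"           # stand-in for webcam capture + captioning

def transcribe_audio() -> str:
    return "(silence)"               # stand-in for mic capture + speech-to-text

def call_model(prompt: str) -> str:
    return f"noted: {prompt[-60:]}"  # stand-in for a real LLM call

memories: list[str] = []             # crude append-only "memory" store

while True:
    observation = f"see: {capture_frame()}; hear: {transcribe_audio()}"
    recent = "\n".join(memories[-20:])   # only recent memories fit in context
    thought = call_model(f"memories:\n{recent}\nnew input: {observation}")
    memories.append(thought)             # model output becomes future context
    time.sleep(1.0)                      # pace the loop
```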
I'm a layman, but it seems to me that the industry is going towards robust foundational models onto which we plug tools, databases, and processes to expand their capabilities.
In this setup, OSS models could be more than enough and capture the market, but I don't see where the value would be in a multitude of specialized models we have to train.
I have this theory that we simply got over a hump by utilizing a massive processing boost from GPUs as opposed to CPUs. That might have been two to three orders of magnitude more processing power.
But that's a one-time success. I don't think hardware has any large-scale improvements coming, because 3D gaming already plumbed most of that vector-processing hardware development over the last 30 years.
So will software and better training models produce another couple orders of magnitude?
Fundamentally we're talking about nines of accuracy. What is the processing power required for each nine of accuracy? Is it linear? Is it polynomial? Is it exponential?
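Putting the question in symbols (a sketch: n nines of accuracy means an error rate of 10^-n, and C(n) is the compute needed to reach it):

```latex
\begin{aligned}
C(n) &\propto n            && \text{(linear: every extra nine costs the same)} \\
C(n) &\propto n^{k}        && \text{(polynomial)} \\
C(n) &\propto b^{n},\ b>1  && \text{(exponential: every extra nine multiplies the cost)}
\end{aligned}
```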
It just seems strange to me that, with all the AI knowledge sloshing through academia, I haven't seen any basic analysis at that level, which is something that's absolutely going to be necessary for AI applications like self-driving once you get the insurance companies involved.
Could be that you need massive amounts of data from those super expensive production training runs, and it's tough to figure that out from publicly available data and academic computing resources. Maybe the combination of gradual efficiency improvements, bigger compute clusters, and test-time reasoning keeps the cloud models in the lead. Plus, even if it's exponential scaling, wouldn't that still favor the big data centers? That would put local/edge models at a serious disadvantage.
This implies LLM development hasn't plateaued. Sure, the researchers are busting their asses quantizing, adding features like tool calls and structured outputs, etc. But soon enough N-1 ~= N.
To me it depends on two factors: hardware becoming more accessible, and closed-source offerings becoming more expensive. Right now (1) it's difficult to get enough GPUs to do local inference at production scale, and (2) it's more expensive to run your own GPUs vs. closed-source models.
[3 of 3] What would it take for this statement to be false or missing the point?
Maybe we find ourselves in a future where:
- Yes, open models are widely used as base models, but they are also highly customized in various ways (perhaps by industry, person, attitude, or something else). In other words, this would be a blend of open and closed.
- Maybe publishing open weights of a model is more-or-less irrelevant, because it is "table stakes" ... because all the key differentiating advantages have to do with other factors, such as infrastructure, non-LLM computational aspects, regulatory environment, affordable energy, customer base, customer trust, and probably more.
- The future might involve thousands or millions of highly tailored models
This will not be true for future frameworks, though it is likely true for current ones.
Future frameworks will be designed for AI enablement. There will be a reversal of convention-over-configuration. Explicit referencing and configuration allow models to make fewer assumptions with less training.
All current models are trained on good and bad examples of existing frameworks. This is why asking an LLM to “code like John Carmack” produces better code. Future frameworks can quickly build out example documentation and provide it within the framework for AI tools to reference directly.
Because there's enough Rails code in the training data to determine the proper conventions :) If you're making something new without this glut of data, it's going to be much more difficult for a coding assistant to match a convention it's never seen before.
The thing is, with some elbow grease, you can write a great plugin for your preferred editor. No need for dubious LLM results, especially when the difficult part, code intellisense, is already solved with LSP. If you're a shop that has invested in a framework, it would be cheaper and more productive.
True, but the conventions it has seen are the same across all similar domains, not just the same framework/language; Copilot "picks up" the similarity.
What I mean is: if you name your modules consistently, say Operation::Object::Verb or Action::ObjectVerb or ObjectManager.doSomething it's really easy for the LLM to guess the next one, just as it is easy for a human.
Add a new file actions/users/update.rb and start typing "Act", and it may guess "class Actions::Users::Update" and start to fill in the code based on nearby modules; switch to the corresponding unit test and it'll fill that in too.
Source: we have our own in-house conventions and it seems copilot gets them right most of the time, ymmv.
There's an awkward reckoning coming in open source software about inclusivity and protecting the long-term security of projects.
Authors from several countries, such as Iran, were already treated with suspicion. Anyone from Russia, China, or unknown places is a potential risk now.
Combined with recent inclusive ideologies, it's gonna cause hard conversations. It will further the segmenting of the Internet. Why fight to contribute to an open source project when you could fork it and contribute with your allies?
For true enemies, there's no risk from licensing or copyright issues. You can merge changes from the original, no problem. China even falls into this, as there's limited ability for US companies to litigate within the country.
People think the Network State is hot, but at the end of the day, the Internet still has borders.
I don't see how blocking contributions from people in Russia etc will help. Malicious actors can simply falsely claim to be American. Is GitHub going to start verifying citizenship? Even if GitHub did that, it likely wouldn't be too hard to fake.
And to be honest, it's not like getting US citizenship for their agent is difficult for a government agency. The same goes for most other countries.
Keep in mind that most places allow you to literally buy citizenship through investment. The amount you need for a country like the US is prohibitive for the vast majority but, again, is not really a problem for another government.
As an American company they must presumably already do this to avoid violating sanctions, at least for anyone giving them money. It's not a huge stretch to imagine they could also do so for free-tier users.
I don't think they need to verify citizenship. I think IP geolocation is sufficient to comply with sanctions. That's not going to stop a malicious actor though.
this is a wild prediction to make and disturbingly regressive
FOSS is one of the most beautiful examples of supranational collaboration, and is in my experience much more integrated than the web at large, in a way that has nothing to do with "recent inclusive ideologies"
Hashi never sold me on the integration of their products, which was my primary reason for not selecting them. Each is independently useful, and there is no nudge to combine them for a 1+1=3 feature set.
Kubernetes was the chasm. Owning the computing platform is the core of utilizing Vault and integrating it.
The primary issue was that there was never a "One Click" way to create an environment using Vagrant, Packer, Nomad, Vault, Waypoint, and Boundary for a local developer-to-prod setup. Because of this, everyone built bespoke, and each component was independently debated and selected. They could have standardized a pipeline and allowed new companies to get off the ground quickly. Existing companies could still pick and choose their pieces. In both cases, you sell support contracts.
I hope they do well at IBM. IBM's cloud services strategy is about building a holistic platform, so there is still a chance Hashi products will get the integration they deserve.
FWIW, "HashiStack" was a much discussed, much promised, but never delivered thing. I think the way HashiCorp siloed their products into mini-fiefdoms (see interactions between the Vault and Terraform teams over the Terraform Vault provider) prevented a lot of cross-product integration, which is ironic for how "anti-silo" their go to market is.
There's probably an alternate reality where something like HashiStack became this generation's vSphere, and HashiCorp stayed independent and profitable.
I was an extremely early user and owner of a very large-scale Vault deployment on Kubernetes. I worked closely with a few of their sales engineers on it, and was always told early on that although they supported Vault on Kubernetes via a Helm chart, they did not recommend using it on anything but EC2 instances (because of "security", though their reasoning never really made sense). During every meeting and conference I'd ask about Kubernetes support, give suggestions and feedback, and show the problems we encountered. I don't know if the rep was blowing smoke up my ass, but a few times he told me that we were doing things they hadn't thought of yet.
Fast forward several years: I saw a little while ago that they no longer recommend EC2 as the only way to run Vault, fully support Kubernetes, and that several of my ideas/feedback are listed almost verbatim in the documentation (note, I am not accusing them of plagiarism - these were very obvious complaints, and I'm sure I wasn't the only one raising them after a while).
It always surprised me how these conversations went. "Well we don't really recommend kubernetes so we won't support (feature)."
Me: "Well the majority of your customers will want to use it this way, so....."
Just was a very frustrating process, and a frustrating product - I love what it does, but there are an unbelievable number of footguns laden in the enterprise version, not to mention it has a way of worming itself irrevocably into your infrastructure, and due to extremely weird/obfuscated pricing models I'm fairly certain people are waking up to surprise bills nowadays. They also rug-pulled some OSS features, particularly MFA login, which kind of pissed me off. The product (in my view) is pretty much worthless to a company without that.
> was always told early on that although they supported Vault on Kubernetes via a Helm chart, they did not recommend using it on anything but EC2 instances (because of "security", though their reasoning never really made sense).
The reasoning is basically that there are some security and isolation guarantees you don't get in Kubernetes that you do get on bare metal or (to a somewhat lesser extent) in VMs.
In particular for Kubernetes, Vault wants to run as a non-root user and set the IPC_LOCK capability when it starts to prevent its memory from being swapped to disk. While in Docker you can directly enable this by adding capabilities when you launch the container, Kubernetes has an issue because of the way it handles non-root container users specified in a pod manifest, detailed in a (long-dormant) KEP: https://github.com/kubernetes/enhancements/blob/master/keps/... (tl;dr: Kubernetes runs the container process as root, with the specified capabilities added, but then switches it to the non-root UID, which causes the explicitly-added capabilities to be dropped).
You can work around this by rebuilding the container and setting the capability directly on the binary, but neither the upstream build of the binary nor the one in the container image comes with that set (because the user should set it at runtime if running the container image directly, and the systemd unit sets it if running as a systemd service, so there's no need to do that except to work around Kubernetes' ambient-capability issue).
> It always surprised me how these conversations went. "Well we don't really recommend kubernetes so we won't support (feature)."
> Me: "Well the majority of your customers will want to use it this way, so....."
Ha, I had a similar conversation internally in the early days of Boundary. Something like "Hey, if I run Boundary in Kubernetes, X won't work because Y." And the initial response was "Why would you want to run Boundary in Kubernetes?" The Boundary team came around pretty quick though, and Kubernetes ended up being one of the flagship use cases for it.
Thanks for the detailed explanation - some of what you say sounds familiar, but this was nearly 5 years ago, so my recollection of their reasoning is fuzzy. I recall it being something like they didn't trust etcd on Kubernetes not to be compromised. My counterargument internally was "if your etcd cluster is compromised by a threat actor, you have way bigger problems to worry about than secrets".
My vague recollection is that the concern was that the etcd store (specifically the keys pertaining to the Vault pod spec) could be modified in some way that would compromise the security of the encrypted Vault store when a Vault pod was restarted. It's been a long time since that was a live concern, though, so I've mostly recycled those neurons...
(I have no idea what your infra is so don’t take this as prescriptive)
My feeling is that for the average company operating in a (single) cloud, there's no reason to use Vault when you can just use AWS Secrets Manager, or the equivalent in Azure or GCP, and not have to worry about fucking etcd quorums and so forth. Just make simple API calls with the IAM creds you already have.
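A minimal sketch of that with boto3, assuming the instance or function role already has secretsmanager:GetSecretValue on the secret:

```python
import boto3

# Uses whatever IAM credentials the instance/function role already provides;
# nothing to unseal, no quorum to babysit.
client = boto3.client("secretsmanager")

def get_secret(name: str) -> str:
    # Raises a ClientError if the role lacks access or the secret doesn't exist.
    return client.get_secret_value(SecretId=name)["SecretString"]
```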
> Caveat: the HCP hosted vault is reasonably priced and works well.
HCP hosted Vault starts at ~$1200/month; you'd have to use a metric shit-ton of secrets in AWS or GCP to come close to that amount. Yes, Vault does more than just secrets, but claiming anything HC sells is reasonably priced is a reach.
Ah, they have changed the public pricing page. Maybe we were on a grandfathered-in deal. They had a starter package between free and enterprise, with configurable cluster options, that was $60-ish a month. We heavily used the policies, certs, and organization features, which made it a no-brainer at that price point for things outside AWS, like Heroku.
We were running about $12/mo in AWS secrets with no caching and no usage outside our AWS services. I taught the team how to cache the secrets in the Lambda function, and it dropped to a buck a month or less.
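For anyone curious, the trick is just moving the fetch out of the handler so it runs once per cold start instead of per invocation. A sketch, with a made-up secret name:

```python
import boto3

# Fetched once at module load (cold start); warm invocations reuse it, so you
# pay for one GetSecretValue API call per container lifetime, not per request.
_secret = boto3.client("secretsmanager").get_secret_value(
    SecretId="prod/app/db-password"  # hypothetical secret name
)["SecretString"]

def handler(event, context):
    # ... use _secret here instead of calling Secrets Manager on every invoke ...
    return {"statusCode": 200}
```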
If they killed off the starter package then you are right: there are only outrageous options, and HCP would not be worth considering for small orgs.
Others' concerns are valid; the separation of concerns makes infra changes safer and easier to understand. Infra-tooling is slow because of the inherent risk of managing stateful services.
Mixing the infra and application logic is the obvious path forward, though. Just as most applications don't need more than Rails and a single Postgres, most apps don't need customized infra. Simplifying the 80% case unlocks cycles for more creative work. If you're successful enough to need custom infra at some point, good for you - that's a great problem to have.
AI is going to standardize architectures going forward. Providing a simple set of tools in the same code as the logic makes planning easier for copilots and reduces context windows. Terraform and application code have implicit dependencies; colocating them lets you define explicit dependencies that are more understandable.
Thanks for the insight. Your last point on reducing context windows for AI is exactly what Nitric is trying to achieve for humans as well: reduced cognitive complexity for 80% of the work (from your second point).
I can see a path forward for Nitric as a platform engineering solution where standardization of architecture is achieved (with the aid of AI as well) through building Nitric custom providers: https://nitric.io/docs/reference/providers/custom/building-c...
It is worth noting that the Nitric team (whom I work with) aren't suggesting that Nitric is an alternative for every Terraform project out there.
However, there are many use-cases without bespoke infra needs where it does suit.
Our hope is that people who genuinely are looking for assistance with their cloud deployment needs evaluate Nitric against their own project's requirements.
I agree here. Formal models have to become easier to create, though.
Today's ecosystem requires advanced knowledge of system design, and still requires coding ability.
To democratize model generation, we need a more iterative and understandable way of defining intended execution. The problem is that this devolves into just coding the damn thing pretty quickly.
For sure! I agree: it needs better languages, education, and tooling. It's not about making a hard problem harder; it's about making it more accessible and straightforward to teach and use in day-to-day work.
Being more clear and precise in our specifications would only benefit us and the AI/ML tools generating the code. We could lean more on the correctness built into the entire stack rather than having to proofread a mess of inferred code, something we're terribly ill-equipped to do.