CraigJPerry's comments



>> You can see my instructions in the coding session logs

Such a rare (but valued!) occurrence in these posts. Thanks for sharing.


At 650TB it's not a memory-bound problem.

Working memory requirements:

    1. Assume a date is 8 bytes
    2. Assume 64-bit counters

So for each distinct date in the dataset we need 16 bytes to accumulate the result.

That's ~180 years' worth of daily post counts per MB of RAM (1 MiB / 16 bytes = 65,536 days) - but the dataset in the post covered just 1 year.
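A minimal sketch of that accumulator in plain Python (the "dates" iterable is a stand-in for whatever reader feeds the loop; note that a real CPython dict also carries per-entry overhead well beyond the 16-byte payload):

    from collections import defaultdict

    counts = defaultdict(int)   # date -> 64-bit post count
    for date in dates:          # hypothetical stream of 8-byte dates
        counts[date] += 1

    # Payload per distinct date: 8-byte key + 8-byte counter = 16 bytes.
    # 1 MiB / 16 bytes = 65,536 entries, i.e. ~180 years of daily counts.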

This problem should be mostly network-limited in the OP's context: decompressing snappy-compressed parquet runs at circa 1GB/sec, and the "work" of parsing a string to a date and accumulating isn't expensive compared to the snappy decompression.
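The whole job boils down to a single aggregation. A hedged sketch of both engines, assuming a recent Polars and DuckDB, a "date" column, and a made-up file glob:

    import duckdb
    import polars as pl

    # Polars: lazy scan, so only the date column is read and decompressed
    daily = (
        pl.scan_parquet("posts/*.parquet")
          .group_by("date")
          .agg(pl.len().alias("posts"))
          .collect()
    )

    # DuckDB: the equivalent query over the same files
    daily = duckdb.sql(
        "SELECT date, COUNT(*) AS posts FROM 'posts/*.parquet' GROUP BY date"
    ).pl()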

I don't have a good explanation for the ~33% runtime difference between DuckDB and Polars here.


Imagine thousands of Helm charts. Your only abstraction tools are an umbrella chart or a library chart; there isn't much more in Helm.

I liked KRO's model a lot, but stringly typed text templating at the scale of thousands of services doesn't work - it's not fun when you need to make a change. I kinda like Jsonnet plus the Google CLI whose name escapes me right now, and the abstraction the Grafana folks built too, but ultimately I decided to roll my own thing and leaned heavily into type safety. It's ideal. With any luck I can open source it. There are a few similar ideas floating around now - Scala Yaga is one.


I'm curious which Google CLI you're referring to. Could it be kubecfg (https://github.com/kubecfg/kubecfg)?

I've used it in the past (for quite a small deployment, I must say), but have been very happy with it. The diff mode specifically is very powerful for seeing what changes you'll apply compared to what's currently deployed.
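Something like this, if memory serves - assuming the diff subcommand and a made-up Jsonnet entrypoint:

    kubecfg diff main.jsonnet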


Yeah, that's the one - and the Grafana one is Tanka.

>> optimizing python performance while keeping idiomatic?

That's impossible[1].

I think it is impossible because when I identify a slow function using cProfile and then use dis.dis() on it to view the instructions executed, most of the overhead - by which I mean time spent doing something other than the calculation the code describes - goes on determining what each "thing" is. It's all trails of calls working out "this thing can't be __add__()'ed to that thing, but maybe that thing can be __radd__()'ed to this thing instead". Long way of saying: most of the time-wasting instructions I see can be attacked by supplying type information (ctypes or some other approach like that - mypyc, maybe Cython, etc.) - but at that point you're well beyond "idiomatic".
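A minimal sketch of that workflow - accumulate() here is a stand-in for whatever hot function the profile points at:

    import cProfile
    import dis

    def accumulate(values):
        total = 0
        for v in values:
            total += v  # resolved dynamically every iteration: __add__/__radd__ dispatch
        return total

    cProfile.run("accumulate(list(range(1_000_000)))")  # locate the hot function
    dis.dis(accumulate)  # BINARY_OP (3.11+) is where the "what is this thing" work hides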

[1] I'm really curious to know the answer to your question, so I'll post a confident (but hopefully wrong) answer so that someone feels compelled to correct me :-)


I don't know, but given that you can't define happiness for someone else - it's a very personal thing - surely it's more insightful to flip the question on its head and figure out how to minimise suffering instead?

Don't ask "what stops you being happy?"; instead, ask whether they're suffering - hopefully most of them are not, but if they are, what can be done about it?

I just have an aversion to someone trying to inflict their version of happiness on others, I think.


Often, measures of happiness really mean contentment. That's not a lack of suffering, but how well you accept it.

Nordic countries, for instance, are often ranked the "happiest" even though their winters are terrible, no one smiles in the streets, and they have severe issues with alcoholism, resulting in some of the strictest regulations on alcohol sales. But because they accept their situation and support each other, they are considered happy.


Larger attack surface - it only takes one of those N dependencies to fall for a spear-phishing attack and you're cooked. A larger N is necessarily worse.

It depends on the software being written, but if it's a product your business sells or otherwise has an essential dependency on, then the best model available right now is vendoring dependencies.

You still get all the benefits of standing on top of the libraries and frameworks of your choice, but you've introduced a single point of entry for externally authored code. There are many ways you can leverage that to good effect (vuln scans, licence audits, adding patch overlays, etc.), and you've improved the developer experience: when developers check out the code, ALL of the code needed to build and run the project is already present - no separate npm install step.

You can take this model quite a bit further and buy some really useful capabilities for a development team. Dependency upgrades, for example, become a very deliberate thing: you can treat them like any other PR to your code base and review the changes of each upgrade easily.

There are challenges too - maybe your npm dep builds a native binary as part of its install, so you now need to provide that build infra/tooling, and very likely you also want robust build-artifact and test caching to avoid wasting lots of time.
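npm has its own tooling for this; as a sketch of the same pattern in pip terms (the paths are made up), the flow is roughly:

    # Fetch every dependency into a directory you commit alongside the code
    pip download --dest vendor/ -r requirements.txt

    # Later, install strictly from the vendored artifacts - no network, no registry
    pip install --no-index --find-links vendor/ -r requirements.txt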


Mojo is aiming at that. I've decided it's this year's Advent of Code language for me, and I'm kinda looking forward to learning more about it.


That's a fine approach for "plumbing"-type work - you know, "join this thing to that thing, then call that thing" - and that is most of the code in the world today, but it falls apart in math-heavy code.

You really just want operators when you're performing tons of operations; it's an absolute wall of text when it's all method calls.
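A toy illustration - the Vec class and the variables are made up for the example:

    class Vec:
        def __init__(self, x, y):
            self.x, self.y = x, y

        def __add__(self, other):
            return Vec(self.x + other.x, self.y + other.y)

        def __mul__(self, k):
            return Vec(self.x * k, self.y * k)

        add = __add__   # method-call spellings of the same operations
        mul = __mul__

    a, b, c, dt = Vec(1, 2), Vec(3, 4), Vec(5, 6), 0.1

    p = a + b * 2 + c * dt                # reads like the formula
    p = a.add(b.mul(2)).add(c.mul(dt))    # the same computation as a wall of method calls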

