Please put units in names (ruudvanasseldonk.com)
1284 points by todsacerdoti on March 21, 2022 | hide | past | favorite | 594 comments



Built-in unit types are underrated. F# has them. Other languages should seriously consider adding them, despite the fact that feature bloat is a serious language problem and they're relatively niche: they are that useful, and I honestly don't think they would interfere much with other features.

Something like:

    unit type Meters = m
    unit type Seconds = s

    function sleep(time: Int[s]): void

    val speed = 5.4 m/s // Type = Float[m/s]
    val distance = parseFloat(prompt('enter time')) * 1 m // Convert unitless to meters just by multiplying
    val time = distance / speed // Type = Float[s]
    print("${time}") // Prints "# s"
    print("${time / 1 s} seconds") // Prints "# seconds"

    val complexUnit = 7 * 1 lbf/in2 // 7 lbf/in2
    // != 7 psi (too hard to infer) but you can write a converter function
    function toPsi<N : Numeral>(value: N[lbf/in2]): N[psi] {
        return value * 1 psi*in2/lbf
    }
It requires extending number parsing, type parsing (if the brackets aren't already part of the grammar, e.g. in TypeScript), and extending types to support units of measurement, at least for values statically known to be subtypes of `Numeral`.

Naming variables with their units doesn't solve the issue of mis-casting and using units incorrectly, and newtypes are too inconvenient (and sometimes impossible without affecting performance) so nobody uses them. Even as a very software-focused programmer I encounter units like seconds, bytes, pixels, etc. all the time; they are almost never newtypes, and I get bugs from forgetting to convert or converting incorrectly.


Yup, I have been using branded types in TS to avoid similar issues, and I still do miss F# units of measure.

Even though branded types prevent assignment of values with invalid units, opaque types are inferior in that the compiler doesn't know the relationship between the types (as exemplified in the parent comment) and you need a slew of helper functions & casting.
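The same limitation is easy to demonstrate with Python's `typing.NewType`, a rough analogue of TS branding: the checker guards assignments, but it knows no unit algebra, so every derived unit needs a hand-written helper with an explicit cast. A minimal sketch (all names made up for illustration):

```python
from typing import NewType

# "Branded" units as NewTypes: assignment is checked statically,
# but arithmetic erases the brand.
Meters = NewType("Meters", float)
Seconds = NewType("Seconds", float)
MetersPerSecond = NewType("MetersPerSecond", float)

def speed(d: Meters, t: Seconds) -> MetersPerSecond:
    # The checker cannot infer that Meters / Seconds = MetersPerSecond;
    # we have to assert the relationship ourselves with a cast.
    return MetersPerSecond(d / t)

v = speed(Meters(10.0), Seconds(4.0))
print(v)  # 2.5 -- at runtime it is just a plain float
```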


I just realized that you could probably do a decently complex unit algebra (at least to the basic level of F#'s unit of measure support) using the unit strings as brands and Typescript's template string types to match them. I'm not sure if that's a good idea or not in practice, but after seeing Wordle done with template string types I'm more certain than ever it is possible to do some interesting things with it.


I use a variant of branding in TS for this too, but it's very much shoving semantics into the language that aren't meant to be there. Subclasses of primitive types would be semantically closer to F#, but at the expense of ever being able to treat the values as values.


In C++, units can be, and often are, implemented on top of the existing type system.

See std::duration/time_point specifically for times, and boost.units for generalized unit support.

It is implemented via (templated) wrappers, but they are very close to zero overhead if not completely free. User defined literals also allow for a very succinct syntax, but to be honest, I usually don't bother. I normally write:

    auto delay = std::chrono::seconds{3};
instead of:

    auto delay = 3s;


Using templates I once implemented logic for dealing with SI units[1]. I also included angles, which is a bit odd but useful in practice.

It is really nice to have a type system that checks your formulas. Saved me a couple of times.

[1] https://gitlab.com/roeles/zen/-/blob/master



In C++ you can use user defined literals to make the units stuff compile time checked, similar to this https://github.com/bernedom/SI for example.


My TI-89 has this. It's actually really useful, because I don't always remember the magical ways units fit together and it solves and simplifies the units for me.


Long before that, very flexible units management was seen on HP-28 then HP-48 series.


So from your example I understand that F# doesn't know how to do more complex unit conversion? Does it know N = kg m/s^2, for example? That's related to the issue I often encountered with unit types: they work great for simple use cases and give you a sense of security, but then they fail and you end up writing lots of code just to make the unit types happy (you might argue that it makes the code safer, but it can be a lot of time spent). One thing that falls over in pretty much all implementations I've encountered is log units (dBm, dBW, ...)


> Does it know N=kg m/s^2 for example?

F# does if you tell it, i.e.:

   [<Measure>] type kg
   [<Measure>] type m
   [<Measure>] type s
   [<Measure>] type N = kg * m / s^2
Although, of course, in practice you'd probably just use whatever's defined in https://fsharp.github.io/fsharp-core-docs/reference/fsharp-d...

Not for log units tho.


Can someone break this down for me? val speed = 5.4 m/s // Type = Float[m/s]

I understand that the speed variable is automatically assigned a type of Float[m/s] based on the m/s there, but I'm confused about how the units are just placed at the end of the value assignment: val speed = {value} {unit}

Is this just an F# feature, that units can be added at the end of a value assignment, and F# interprets them as units correctly?

Also, anyone know of methods for incorporating units as types in Python? It would be great to have an equivalent method as demonstrated here, where dividing types automatically creates a new type unit, and perhaps even associates with a related type, i.e: type(kg / (m * m * m)) == density


I think the syntax for expressions was extended to allow a `<unit expression>`, with or without angle brackets, at the end of any numeric expression. They're erased at compile time, so at runtime or via reflection you can't inspect the dimensions of a value. The only value of the units - which isn't insignificant - is the compile time checking.

Unfortunately the same isn't as easily done in Python:

1. You would need to reify these units as actual constants with overridden operators to track values, and some wacky (if even possible) mypy type shenanigans.

2. The types aren't erased and so would impact the performance of any critical code. F# isn't quite used in high frequency trading, but it does target numeric heavy users like those wanting to write scientific workloads, and units help improve correctness there. But if the types are computed at runtime and not erased, it would be an enormous hit to performance.

Unless you can figure out some way to get "3.0 * m" to be a plain old datatype (float) in Python while still retaining type info in mypy/pylance/pyright. Perhaps there's a way, but I'm not sure.

Edit: This Python library seems to be trying to solve for these issues! https://pint.readthedocs.io/en/0.10.1/

I'm not sure if they've actually solved the runtime performance problem, but I suppose as long as you're not doing unit assignment/conversion in a loop, it should be fine.
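To make point 1 concrete, here's a minimal sketch of reifying units with overloaded operators (deliberately not pint's actual API, just an illustration). Note that every number gets boxed in a wrapper object, which is exactly the runtime cost point 2 worries about:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # dimension exponents, e.g. (("m", 1), ("s", -1)) for m/s

    def _combine(self, other: "Quantity", sign: int) -> "Quantity":
        # Add (for *) or subtract (for /) the dimension exponents.
        exps = dict(self.dims)
        for unit, exp in other.dims:
            exps[unit] = exps.get(unit, 0) + sign * exp
        dims = tuple(sorted((u, e) for u, e in exps.items() if e != 0))
        value = self.value * other.value if sign == 1 else self.value / other.value
        return Quantity(value, dims)

    def __mul__(self, other): return self._combine(other, 1)
    def __truediv__(self, other): return self._combine(other, -1)

m = Quantity(1.0, (("m", 1),))
s = Quantity(1.0, (("s", 1),))

speed = (Quantity(5.4, ()) * m) / s
print(speed)  # Quantity(value=5.4, dims=(('m', 1), ('s', -1)))
```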


your example is a bit confusing because you mislabeled the prompt for distance in

    val distance = parseFloat(prompt('enter time')) * 1 m // Convert unitless to meters just by multiplying


See https://unyt.readthedocs.io/en/stable/ in Python. This sort of feature fits really well within Python's featureset.


In the land of JavaScript, one could implement unit types as a babel plugin.


"Could" and actually doing it are two very different things, however.


I don't see how you'd go about doing this. Babel doesn't have a type checker, and for type level features you'd need support from typescript.

So you need a TypeScript plugin; however, the TypeScript compiler API (which btw is not officially supported) is not flexible enough to support new type syntax (for things like m/s), so you'd need a fork.


babel-plugin-typecheck, babel-plugin-tcomb and babel-plugin-jsdoc-type-checker (among others) are able to do similar things, so there must be a way. And I don't see why not, since babel is "just" an AST parser/transformer.


Those plugins primarily help with runtime type validations - the static type checking is delegated to Flow/TS.

The benefits around F# units of measure that the parent talks about are primarily focused on static type checking.

If you are fine with the overhead of runtime boxing, it is relatively straightforward to have different container classes for different units and have bridge apis to convert/operate between them. A babel plugin could potentially help with operator overloading here, but this is quite far removed from F# units of measure at this point.


Is writing a static unit type checker as a plugin a daunting task, then? I've never thought it through in detail, but it would seem that unit type checking rules for basic operations like add/subtract and multiply/divide would be straightforward, and everything else could be derived from there. The laborious part is just the conversion formulae.


I used to work at a company with a database field called speed_kmph.

The documentation read "speed_kmph - This field contains the travelling speed in MILES PER HOUR - please do not be confused by the name".

Woo, great job guys.


DB field names can be really hard to change. All those Crystal Reports to update, and the sales manager's pet Access DB that the hero set up for him that you don't know about.


>DB field names can be really hard to change.

I find it really hard to accept any excuse for something like that. It's why enterprise code is so sloppy, people doing the most expedient thing rather than what is correct.


How do you write code so that it can accommodate DB field name changes without breaking SQL reports / BI?

Might be easy if you own the whole codebase and all reporting, but you are also going to break all those linked BI reports that the analysis team has written.


Breaking the report is a feature.

If you don’t, it will assume kmph instead of mph and just present the wrong data! But sure…those graphs still look pretty…


Which links back to the original point from quickthrower that database field names are really hard to change, then.

If you can’t make a change without it potentially breaking other stuff, it goes in the “not easy” category for me. Especially if you probably won’t know if stuff is broken until the change is pushed to prod.


Create a new column speed_mph. Create a trigger that makes writes to either column gets written to both (make sure to write it to avoid infinite recursion). Copy all data from speed_kmph to speed_mph. Deprecate speed_kmph. Change existing usages at your leisure. Delete speed_kmph.


You can't just copy the data, though; the name says it's off by a factor of k (i.e. 10^3), hahaha. And even switching to "speed_mph" keeps the warning/comment relevant: now it just means clarifying that "m" is for "miles" and not "meters".


Fair enough, although it sounds like it's really hard to change then, back to quickthrower's point.


Most often those reports are built on top of table views, not direct table contents. If you have an abstraction layer in between, it's not hard to change what's underneath.

But that assumes you didn't take any shortcuts.


Considering I was actually consuming the data, if there was an abstraction layer then the field name was in that.


What is "correct" is up for debate, and often depends on cost tradeoffs. Sometimes this is the "correct" thing to do, no matter how gross we think it is.


Not often. What has always surprised me is how regularly these sorts of problems are self-inflicted. A lot of people will just do what is expedient regardless of what the management position is.


This is why it's dangerous to encode the unit in the variable name: the unit will eventually change. (We had code measured in seconds that had to be changed to measure in 20ths of a second for more granularity; if the timer variable had been named "timer_seconds" it would have had to be changed everywhere.)


If you change the unit, I think the variable should be changed everywhere.

It's hard to know what assumptions code is making and by forcing you to change the code everywhere (which should be pretty easy in any modern IDE) you at least have a chance to evaluate any issues that unit change could cause.


Someone doing a half-assed refactor doesn't mean it's "dangerous" to encode a unit in a variable or function.

I do agree that generally it's better to use something like a struct that's more flexible, but doing that for every function is also quite verbose. For some functions the time requirement may never change as well.


In this specific case, it seems like creating a new property with the new data under a new name would be the safest path forward. Ensuring the units are in both property names will prevent confusion.


This sounds like a feature to me, and modern IDEs make this type of simple refactoring quite straightforward for most languages.


It's even better when the documentation gets stale too. And now this is in meters/second but the name implies km and docs specify miles.


Recently I was looking at some pcap parsing code which was storing the timestamp field into a timeval structure (that nominally has usec precision), but then treating it as nanoseconds (as pcap supports both resolutions). It made for some very confusing reads!


Java and Kotlin have a nice Duration class. So in Kotlin you can do

  delay(duration = Duration.ofMinutes(minutes = 1))
which is equivalent to

  delay(timeMillis = 60_000)
Using the optional argument names here for clarity; you don't have to of course.

Sticking with the JVM, a frequent source of confusion is the epoch. Is it in millis or in seconds? With 32-bit integers it has to be seconds (which sets you up for the year 2038 problem). However, Java has always used 64-bit longs to track the epoch in milliseconds, even back when the year 2000 problem was still a thing. Knowing which you are dealing with is kind of relevant in a lot of places, especially when interfacing with code or APIs written in other languages with more 32-bit legacy.

Things like network timeouts are usually in milliseconds. But how do you know this for sure? Specifying them in seconds means there's less risk of forgetting a 0 or something like that, so using seconds is pretty common too. You can't just blindly assume one or the other. Even if you use the Duration class above, you'd still want to put the magic number it takes as a parameter in some configuration property or constant. Those need good names too, and they should include the unit.


The Duration.ofMinutes thing doesn't address the exact same problem. Anyone can put the units on the "right side". You don't even need a fancy "Duration.ofMinutes" helper function. Clarity-wise that's not different than just putting a comment saying how long it is and what units. The problem in the article is getting the units on the "left side," which, yes, you have to put the units in the name of the function.


Ah no. The function accepts a `Duration` type which could be precisely 1 min 23 s 14 ms 10 us 8 ns. And so you don't need `delay_ms` vs. `delay_ns` or anything, because the type encodes the precise duration. If you pass a nanosecond `Duration` into `delay` it will delay ns; if you pass a millisecond `Duration` it will delay ms.


You have to enforce the units within the function, but that’s not the same as putting them in the name of the function.

For example, in Erlang, the sleep function could/should have been written to take tagged tuples: `timer:sleep({seconds, 3})`


Go has time.Duration type as well and works like this:

time.Sleep(10 * time.Millisecond)


The Go compiler is also happy with time.Sleep(1000), which the new "expert full stack" dev just PR'd.

To me Durations are a net-zero gain, because in this case they hide relevant detail (int64 nanoseconds) without enforcing usage. Compare it against what Golang could easily provide: time.SleepMilli(100), which is 40% fewer characters for my aging eyes to parse than time.Sleep(100 * time.Millisecond).

After all, in the very same package we have a unit in the name:

   t := time.Time{}
   t.UnixMilli() // 1647952024456


    timeout = timedelta(seconds=300)
    frobnicate(timeout)
Working with GCP or Azure's Python SDK is like navigating a jungle of types. Some calls return a `compute_engine_list_item` while others return a `compute_engine` type and these are difficult to inspect and reason about, because Python classes default to printing something along the lines of `<__main__.myclass at 0x7fa8864a1040>`, making heavily typed Python code quite difficult to work with.

No paradigm is going to save you from spaghetti-code, but being able to pass a list and get an integer in return, makes it very easy to reason about what you can do with these values, whereas it can be quite difficult to reason about custom types/objects.

My point is that knowing whether `frobnicate(timeout=300)` is in seconds or minutes can be just as difficult (or even more difficult) as knowing what specific object I need to instantiate and pass to `frobnicate` (in the above case a `timedelta`).
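That said, `timedelta` at least dodges the inspectability complaint: unlike an opaque SDK class, it prints readably and converts explicitly. A quick illustration:

```python
from datetime import timedelta

timeout = timedelta(seconds=300)

# Unlike <__main__.myclass at 0x7fa8864a1040>, timedelta has a useful repr
# and explicit conversions, so it is cheap to inspect and reason about.
print(repr(timeout))            # datetime.timedelta(seconds=300)
print(timeout.total_seconds())  # 300.0

def frobnicate(timeout: timedelta) -> float:
    # Hypothetical callee: the unit question disappears, because the caller
    # had to name the unit in the constructor.
    return timeout.total_seconds()
```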


> being able to pass a list and get an integer in return, makes it very easy to reason about what you can do with these values

I disagree; to paraphrase Ian Malcolm: what you can do with these values is less important than what you should do with these values. For example, we can add a distance to a currency, if they're both int or float; that doesn't mean we should.

The most obvious example of this is "stringly-typed programming", where pretty much everything is "string". Can I append a user-input string to an SQL statement string? Sure; but I shouldn't. Can I append a UserInput to an SQLStatement? Not without conversion (i.e. escaping)!


In general you're right: strong typing needs good tooling and thorough language support to be fun. Python is a bit lacking in both of these.

In this specific example I disagree; timedelta is part of the standard library and should be widely known and familiar.


Badly designed type systems should not be the measuring stick.

But in your example above, typed Python would instruct your editor and tools like mypy to flag any improper use of frobnicate. You can set up your editor to offer you a tip on what type to use as soon as you type in `frobnicate(`.

In cases where typing is not really available, I prefer to put types into APIs instead of variable names (and Python makes that great: `frobnicate(duration_as_timedelta=timeout)`).


I do not think the fact that Google APIs have terrible types is a good argument against typing in general.


When I dealt with the Azure SDK for Python, I frequently wrote small scripts that created the types in question and then called breakpoint() so that I could examine the types interactively.


Fully agree with everything in this article. Unitless sizes and intervals are definitely a code smell to me. Whenever I'm doing a code review and I see some property or variable name like "timeout" or "size", I will ask the developer to change the name to make it clear what unit it is meant to convey.

When possible, I also encourage the use of better types than simple integer values (like TimeSpan in .NET), as these further reduce ambiguity and the potential for mistakes.

This is such a simple thing to find and fix but it definitely helps in the long term.


I also agree, but would offer a third suggestion: add the unit to the function name as well, so you see this line in code: `timeoutSec(timeout)`. You can't force people to name the argument correctly, but this way anyone reading doesn't have to go digging to know the parameter's unit. Also, stuff like `timeoutSec(timeoutMs)` stands out like a sore thumb.


> I know that whenever I’m doing a code review and I see some property or variable name like “timeout” or “size” I will ask the developer to change the name to make it clear what the unit is trying to portray.

In a dynamic language, maybe.

In a static language, especially Haskell/Scala/Ocaml/F#, I hate duplicating the type.


I would go one step further and suggest that all physical quantities should either have the units in the identifier name or encoded in the type system. Meters, seconds, milliamps, bytes, blocks, sectors, pages, rpm, kPa, etc. Also it's often useful to explicitly distinguish between different unit qualifications or references. Seconds (duration) versus seconds-since-epoch, for example. Bytes versus page-aligned bytes, for another.

Having everything work this way makes it so much easier to review code for errors. Without this, as a reviewer you must either trust that the units and conversions are correct, or do some spelunking to make sure that the inputs and outputs are all in the right units.


100%. This is a baseline requirement where I work. If you don't either include the units in the identifier name, or use the type system where possible, your code is not getting merged.

The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.


> The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.

I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Asking bash coders to write "sleep 5s" instead of "sleep 5" - I doubt you'd get any objections at all.

But if you're putting a foo.bar.Duration on a getDefaultTimeoutFromEnvironment on a setConnectTimeout on a HttpRequest on a HttpRequestInitializer on a NetHttpTransport on a ProviderCredential to make a simple get request? People who've come from less ceremony-heavy languages might feel less productive, despite producing 10x the lines of code.


Well ignoring the silly stuff this would just boil down to something like:

    request = NetHttpTransport.Request()
    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
which is verbose, but at least it's fairly clear.

And yes I'm calling a class named HttpRequestInitializer silly, I don't care if some language decided it should exist.


    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
This is actually a very good example of what not to do, with the mistake being in whoever implemented setConnectTimeout()

I actually don't know this particular API, but I'm used to timeouts being in milliseconds, so that code looks wrong to me.

Much better API, and what the article is talking about, is to change this to:

    .setConnectTimeoutMilliseconds(getDefaultTimeoutFromEnvironment().seconds)
Now the mistake is obvious, and even if the original developer doesn't notice, it will stick out in a PR or even a casual glance.


That's one of the benefits of having it exposed as a type-enforced parameter. setConnectTimeout() could take a Duration, which contains the amount and the unit, and therefore wouldn't care if consumer A provided a timeout in seconds, and consumer B provided a timeout in milliseconds.


Totally agree, but then I would expect the code would be:

    request.setConnectTimeout(getDefaultTimeoutFromEnvironment())
with getDefaultTimeoutFromEnvironment() returning a Duration.

Ideally this is consistent throughout the codebase, so that anything that uses a primitive type for time is explicitly labelled, and anything using a Duration can just be called "Timeout" or whatever.


This example is nice, but if you put argument metadata in the function name you have to have one main argument, and the function name can prove cumbersome if you have 3 or 4 arguments with units, like

    .setPricePerMassInCentsPerKilogramsWithTimeoutInMilliSeconds(100,2,300)


I think you should rather do

  .setPrice(priceInCents=100, massInKg=2, timeoutInMs=300)


I'd argue there are better API patterns for this though -- keeping in mind this values code readability (and correctness) over micro-optimization:

    .setPriceInCents(100);
    .setMassInKilograms(2);
    .setTimeoutInMilliseconds(300);
or

    .calculate({
        priceInCents: 100,
        massInKilograms: 2,
        timeoutInMilliseconds: 300,
    });


Values with an intrinsic unit scale up to many units and values. If you declare the parameters of a "calculate" function as USACurrency, Weight and Duration you can write

    calculate(100cent, 2Kg, 300s)   
    calculate(0.1dollar,4.7lb/*approximate*/,5min)
    calculate(something.price(),whatever.weight(),options.getDuration("exampleTimeout"))
    calculate(USD(0.1),Kg(2),Minute(5))


All fair points. In this example I was suggesting what this might look like at the border of the application where you need to talk to some (standard) library which doesn't use the same convention.


A class named 'HttpRequestInitializer' and taking 10 lines to set a timeout on a HTTP request isn't merely hypothetical: https://developers.google.com/api-client-library/java/google... - and that's not counting any import statements.

(although getDefaultTimeoutFromEnvironment was artistic license on my part)


True, but like I said that doesn't make it not silly. Just harder to fix.

Edit: Also, overriding a class method dynamically inside a function? I usually program python these days and even I think that's wild.


It's an interface with only one method. I suspect that most of us would just use a lambda now.


I'll never quite understand why it wasn't simply a function in the first place.


Java pre-8 only had anonymous classes, there were no lambdas


And Java lambdas are still syntax sugar for one-method anonymous classes.


This is where I diverge. You just hard-coded seconds into your test. Now the tests that cover this must take seconds to finish; thankfully you were not testing hours or days!

My last shop was a Go shop and one test I think shows this off was an SMTP server and we needed to test the timeouts at different states. The test spun up half a dozen instances of the server, got each into the right smtp state and verified timeouts in under 10ms.

The environment that set the timeout would either be "timeout_ms=500" or "timeout=500ms" (or whatever). This is where you handle that :)


Not sure I fully understood your objection, but the reason I specified seconds is because I presumed the setConnectTimeout to be part of the default HTTP library, which likely doesn't adhere to the same conventions, and that it expected seconds (which seem to be the usual for http libraries as far as I can tell).

Of course if the setConnectTimeout method was part of the same application you could just pass the timeout directly, but at the boundary of your application you're still going to have to specify at some point which unit you want.


If you're testing things with timeouts it's often a good practice to mock out the system clock anyway. That allows testing your timeouts to be nearly instantaneous and also catches edge cases like the clock not moving forward, having a low resolution, moving backward, ... deterministically.


The test could simply mock the default value to something reasonable like '.1 seconds' and test that duration instead, so I don't think this is a real problem.


This is actually revealing a different problem: the system clock as an implicit dependency. YMMV depending on support of underlying libraries, but I will typically make the clock an explicit dependency of a struct/function. In Go, usually it’s a struct field `now func() time.Time` with a default value of `time.Now`.
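A rough Python equivalent of that pattern (names are illustrative): make the clock an explicit, injectable dependency, which also makes timeout tests instant and deterministic.

```python
import time

class Throttler:
    """Tracks whether a timeout has elapsed; the clock is injectable."""

    def __init__(self, timeout_seconds: float, now=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.now = now            # defaults to the real monotonic clock
        self.started = self.now()

    def timed_out(self) -> bool:
        return self.now() - self.started >= self.timeout_seconds

# In tests, swap in a fake clock and step it deterministically:
fake = iter([0.0, 0.1, 0.6])
t = Throttler(0.5, now=lambda: next(fake))
print(t.timed_out())  # False: 0.1 - 0.0 < 0.5
print(t.timed_out())  # True:  0.6 - 0.0 >= 0.5
```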


Many timeout functions take seconds as a floating point, so you could time out after 0.05 seconds (50 ms). But now the code is clear and less prone to bugs.


In Go, you don't pass a float, you pass a duration


Which is unitless, hence there is no problem. `time.Duration.Seconds()` returns a floating point.


The 80/20 approach of renaming "timeout" to "timeoutSeconds" or "timeoutMillis" is also valid. The key takeaway is to not make assumptions.


> I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Ideally it'd be as simple as:

typealias Dollars = BigDecimal

typealias Cents = Long

that's valid Kotlin, but the equivalent is doable in most languages nowadays (Java being a big exception).


I recommend against that, and use proper (wrapper) types.

I don't know Kotlin, but in most languages if you alias 2 types to the same base type, for example Seconds and Minutes to Long, the compiler will happily allow you to mix all 3 of them, defeating the protection full types would bring.
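A quick Python sketch of the same hazard, and of the wrapper fix (illustrative only):

```python
# Alias-style "types": both names are just int, so mixing them type-checks fine.
SecondsAlias = int
MinutesAlias = int
mixed = SecondsAlias(3) + MinutesAlias(4)  # silently 7, the bug we want to prevent

# A wrapper type makes the same mix-up fail loudly instead.
class Seconds:
    def __init__(self, value: int):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Seconds):
            return NotImplemented  # refuses bare ints and other units
        return Seconds(self.value + other.value)

print((Seconds(3) + Seconds(4)).value)  # 7
# Seconds(3) + 4 raises TypeError instead of silently producing 7.
```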


that's correct. typealiases are the wrong solution here. the better solution would be value classes, but of course, the unit shouldn't be the type.


> typealias Dollars = BigDecimal

> typealias Cents = Long

If I ever have to deal with monetary values in a program where someone thought this was a good idea, ... well, it really won't be the worst thing I’ve ever dealt with, but, still.

(If you have dollars and cents as types in the same program, they darn well better work properly together, which is unlikely to work if they are both just aliases for basic numeric types.)


don't do this!

    typealias Cents = Long
    typealias Meters = Long
    
    Cents(3) + Meters(4) = 7L
that's exactly the thing we want to prevent.

the classes shouldn't be named for units at all; the type should represent the kind of quantity.

so instead of a class Seconds you should have a class Duration, and instead of a class Meters you should have a class Distance. that's because the units of the different types can be converted between each other.

Dollars and Cents are a bit of a bad example because currencies can't be easily converted between each other, as conversion depends on many factors that change over time. Meters, yards, feet, lightyears, miles, chains, furlongs, whatever, all describe multiples of the same thing though, so a different type for each unit isn't necessary, as the input that was used to create the instance isn't usually needed. The counterexample would be datetimes with timezones: a datetime with a timezone conveys a lot more information than the datetime converted to UTC.
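That "one type per quantity, one constructor per unit" idea might look like this in Python (names illustrative; the conversion factor is the international mile):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Distance:
    """Stores one canonical unit internally; unit names live only at the edges."""
    _meters: float

    @classmethod
    def from_meters(cls, m: float) -> "Distance":
        return cls(m)

    @classmethod
    def from_miles(cls, mi: float) -> "Distance":
        return cls(mi * 1609.344)  # 1 international mile = 1609.344 m

    @property
    def meters(self) -> float:
        return self._meters

d = Distance.from_miles(2.0)
print(d.meters)  # 3218.688
```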


What type of industry/product do you work in/on? And what sort of languages do you work in?


Not the commenter, but I work in scientific software development and it's just a minefield of different units, so being explicit is generally very useful. Even if you could assume you can stick to metric (you can't), what different users want varies across countries. E.g. here in the UK we often want to represent data as millilitres, but in France the same measurements are often centilitres.

I don't use libraries to enforce it though, we did try this but found it quite clunky: https://pint.readthedocs.io/en/stable/


It varies across different users from the same city! The same family, even!

One piece of equipment I just finished working on measured and displayed vacuum at different sensors in PSIg, kPa, Pa, mmHg, and inHg. The same machine, the same measurement at different stages in the process, five different units!


Not OP, but I see that a lot in the aerospace and heavy industry sectors.

We keep laughing about "if we engineered bridges as we engineer software" ... the truth is that the areas where correct software matters tend to write very robust code, and the rest of the industry would be well advised to take notice and copy what they see.

Of course, writing robust code is a skill, and it takes extra time.


It takes time to learn, and to learn the value, and time to agree with the team that it's sensible. With this sort of thing - proper naming of variables - I disagree that it takes longer at point of use.


> and the rest of the industry would be well advised to take notice and copy what they see

I don't agree.

There is a good reason the aerospace industry writes robust code - it invests time (money) to avoid disasters that could cause, among other things, loss of human life.

On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.


>On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

Who knows. Maybe a billion dollar company that can't fulfill orders. Maybe a million people who suddenly can't use their bank online.


Yes, sure, but a "billion dollar company" in this case does not represent the whole industry.

You can probably find some specific non-critical case in the aerospace industry, but surely based on that example one would not suggest that the whole aerospace industry should just copy what they see in frontend dev.

Context matters, there are exceptions, but the standard practices are based on average scenario, not on extremes.


I'm not saying that all code should be developed under standards designed for embedded control system software. I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.


>>> and the rest of the industry would be well advised to take notice and copy what they see

>> I don't agree.

>> [web form validation example]

>> That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.

> I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.

This feels like a straw man.

The original argument is that the rest of the industry (that includes web, but a lot of other parts also) should copy what they see in aerospace industry.

I believe that would not be appropriate for the rest of the industry to just copy practices from any other part because each segment has its own risk factors and expected development costs and with that in mind developed their own standard practices. Nowhere did I state that the "web stuff can't be important" nor that there is no example of web development (form validation) where the errors are insignificant.

That said, I will go back to the "billion dollar company that can't fulfill orders" / "million people who suddenly can't use their bank online" catastrophe; this happens all the time. Billion dollar company doesn't fulfill orders, error is fixed, orders are fulfilled again, 0.0..01% of revenue is (maybe) lost.

In aerospace industry a bug is deployed, people are dead. No bugfixing will matter after that moment.

How can these two industries have the same standards of development?


Yes, as a comparison, let's just take all the units (em, px, %) out of writing CSS and see how fun that becomes to review and troubleshoot.


I'm certain that the Mars Climate Orbiter had a lot to do with this practice.


Also not OP, but I work on graphics software and we frequently deal with different units and use strict naming systems and occasionally types to differentiate them.

Even more fun is that sometimes units alone aren't sufficient. We need to know which coordinate system the units are being used in: screen, world, document, object-local, etc. It's amazing how many different coordinate systems you can come up with...

Or which time stream a timestamp comes from, input times, draw times (both of which are usually uptime values from CLOCK_MONOTONIC) or wall times.


As a bonus, coordinate systems can be left-handed or right-handed, and the axes point in different directions (relevant when loading models for example).


What does a type for ‘seconds’ do that an ‘integer’ doesn’t?

I may be misunderstanding this.


The type is "duration", not "seconds". "seconds" is the unit. You can think of the unit as an operator that converts an integer to a duration.

The advantages are:

- An integer doesn't tell you if you are talking about seconds, milliseconds or anything like that. What does sleep(500) mean? Sleep 500s or 500ms? sleep(500_ms) is explicit

- It provides an abstraction. The internal representation may be a 64-bit number of nanoseconds, or a 32-bit number of milliseconds, the code will be the same.

- Conversions can be done for you, no more "*24*60*60" littered around your code if you want to convert from seconds to days, do fun(1_day) instead.

- Safety, prevents adding seconds to meters for instance. Ideally, it should also handle dividing a distance by a duration and give you a speed and things like that.

Under the hood, it is all integers of course (or floats), which is all machine code, but handling units is one of the things a high level language can do to make life easier on the human writing the code.
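A minimal sketch of such a duration type in Python (all names hypothetical; the internal nanosecond representation is an arbitrary choice, illustrating the points above):

```python
class Duration:
    """Hypothetical duration type; internally a count of nanoseconds."""

    def __init__(self, ns: int):
        self._ns = ns

    # Explicit unit constructors: sleep(Duration.milliseconds(500)) is unambiguous.
    @classmethod
    def milliseconds(cls, n: float) -> "Duration":
        return cls(int(n * 1_000_000))

    @classmethod
    def seconds(cls, n: float) -> "Duration":
        return cls(int(n * 1_000_000_000))

    @classmethod
    def days(cls, n: float) -> "Duration":
        return cls.seconds(n * 24 * 60 * 60)  # no "*24*60*60" at call sites

    def to_seconds(self) -> float:
        return self._ns / 1_000_000_000

    def __add__(self, other: "Duration") -> "Duration":
        # Safety: Duration + Duration is fine, Duration + bare int is not.
        if not isinstance(other, Duration):
            return NotImplemented
        return Duration(self._ns + other._ns)


def sleep(time: Duration) -> None:
    """Takes a Duration, so an ambiguous sleep(500) is impossible."""


print(Duration.days(1).to_seconds())  # 86400.0
```

The internal representation could change to 32-bit milliseconds without touching any call site, which is the abstraction point above.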


A function might take multiple integer arguments, each in different units. Separate types for each unit guarantees you won't pass the wrong integer into the wrong argument.

Eg

  func transferRegularly(dollars(10000), days(30)) 
Meaning clear.

  func transferRegularly(10000, 30)
Meaning obscure, error prone and potentially costly


With some languages like Python, you can use keyword arguments too (even out of order).

Eg. you could simply do

  transferRegularly(amount=10000, period_in_days=30)
I am always amazed how new languages never pick up this most amazing feature of Python.

Though obviously, this code smells anyway because 1. repetitive transfers are usually in calendar units (eg. monthly, weekly, yearly — not all of which can be represented with an exact number of days), so in Python you'd probably pass in a timedelta and thus a distinct type anyway, and 2. amounts are usually done in decimal notation to keep adequate precision and currency rounding rules (or simply `amount_in_cents`).

Still, I am in favour of higher order types (eg. "timedelta" from Python), and use of them should be equally obligatory for unit-based stuff (eg. volume, so following the traditional rule of doing conversions only at the edges — when reading input, and ultimately printing it out).
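For example (function and parameter names hypothetical), keyword-only arguments plus a timedelta give both readability at the call site and a distinct type:

```python
from datetime import timedelta

def transfer_regularly(*, amount_in_cents: int, period: timedelta) -> str:
    # The bare * makes both arguments keyword-only: call sites must name them.
    return f"{amount_in_cents} cents every {period.days} days"

print(transfer_regularly(amount_in_cents=1_000_000, period=timedelta(weeks=4)))
# transfer_regularly(1_000_000, timedelta(weeks=4))  # TypeError: must use keywords
```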


I see keyword arguments as slightly different though. The keyword is like the parameter name. The value is still a plain integer and (theoretically) susceptible to being given the wrong integer. In contrast, unit types allow for hard checking by the compiler.

In practice, with good naming it won't make much difference and only shows up when comparing the docs (or intellisense) for an API with how it is actually used.


> transferRegularly(amount=10000, period_in_days=30)

dollars(10000) is still better than this example, because: 10000 what? Pennies? USD? EUR?


Someone else caught me out on that too by suggesting making it `amount_in_dollars` elsewhere in the thread ;)

Now you can say how there are also AUD, CAD...

The point was simply that if units are needed due to lack of specific type being used, it's nicer to have that in the API when language allows it.


PHP got this feature in PHP 8.

One problem with this approach is refactoring. If you wanted to refactor your example with the parameter "amount_in_dollars" then you would either have to continue maintaining the legacy "amount" argument, or break existing code.


So you mean just like with, eg. renaming a function? I agree it's an issue compared to not doing it, but a very, very minor one IMHO, and the legibility improvements far outweigh it.


Renaming a function comes with the explicit implication that the API has changed. But it might not be clear to someone maintaining a Python application that changing a parameter name might change an argument - that is not the case in any other language (until PHP 8).

Guess how I discovered this issue :)


Well, if the approach was more pervasive, you'd be used to it just like seasoned Python developers are. :)


If you type check, it ensures that only a 'second' can be passed to the function. This requires you to either create a second, or explicitly cast to one, making it clear what unit a function requires.

As per the article, if you don't have proper names and just an 'int', that int can represent any scale of time... seconds, days, whatever.

In Python you'd need something like mypy, but in Rust you could have the compiler ensure you are passing the right types.
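A lightweight Python version of this uses typing.NewType, which mypy checks statically at no runtime cost (Seconds(500) is just 500 at runtime):

```python
from typing import NewType

# Distinct unit types over plain int; mypy treats them as incompatible.
Seconds = NewType("Seconds", int)
Milliseconds = NewType("Milliseconds", int)

def sleep(duration: Seconds) -> None:
    ...

sleep(Seconds(500))         # OK: the unit is explicit at the call site
# sleep(500)                # mypy error: "int" is not "Seconds"
# sleep(Milliseconds(500))  # mypy error: wrong unit
```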


Having a type system to figure this out for us would be great, but there are languages where this may not be possible. As far as I know, Typescript is one such example, isn't it?


Depends. Yes, newtyping is pretty awful in TS due to its structural typing (instead of nominal like Rust for example).

You could perhaps newtype using a class (so you can instanceof) or tag via { unit: 'seconds', value: 30 } but that feels awful and seems to be against the ecosystem established practices.

This is indeed one of my gripes with TS typing. I'm spoiled by other languages, but I understand the design choice.


I was recently dealing with some React components at work, where the components would accept as an input the width or the height of the element.

Originally, the type signature was

    type Props = {height: number}
This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

I've ended up changing the component to only accept pixels and changed the argument to be a string like this:

    type Props = {height: `${number}px`}
Of course, if passing a "0" makes sense, you could also allow that. If you want to also accept, say, `50em`, you could use a union-type for that.

I think this could actually work for other units as well. Instead of having `delay(200)`, you could instead have `delay("200ms")`, and have the "ms" validated by type system.

Maybe the future will see this getting more popular:

    type WeightUnit = 'g' | 'grams' | 'kg' | ...;
    type WeightString = `${number}${WeightUnit}`;
    function registerPackage(weight: WeightString): void;


> This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

Because it's React, the expectation is "px", using a string with a suffix to override it: https://reactjs.org/docs/dom-elements.html#style


You can do something like this:

    export interface OpaqueTag<UID> {
        readonly __TAG__: UID;
    }
    export type WeakOpaque<T, UID> = T & OpaqueTag<UID>;
And then use it like this to create a newtype:

    export type PublicKey = WeakOpaque<Uint8Array, {readonly PublicKey: unique symbol}>;
To create this newtype, you need to use unsafe casting (`as PublicKey`), but you can use a PublicKey directly in APIs where a Uint8Array is needed (thus "weak opaque").


I tried to solve this problem in a way that's reasonably pleasant here https://github.com/spion/branded-types


The pattern you want here is branding.


You can simulate type opaqueness/nominal typing with unique tag/branded types in ts. We’re using it on high stake trading platform and it works very well.


Most answers here are answering your question literally, and explaining why you'd add a type for "seconds". But in reality you shouldn't create a type for seconds, the whole premise of the question is wrong.

Instead of a type for seconds, you'd want a type for all kinds of duration regardless of the time unit, and with easy conversion to integers of a specific time unit. So your foo() function would be taking a Duration as an input rather than seconds, and work correctly no matter what unit you created that duration from:

    foo(Duration.seconds(3600))
    foo(Duration.hours(1))
    foo(Duration.hours(1) + Duration.seconds(5))
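Python's standard-library datetime.timedelta is exactly this kind of type: one duration type, constructed from whatever unit is convenient, and compared by value:

```python
from datetime import timedelta

# Equal durations, regardless of the unit used to construct them.
assert timedelta(hours=1) == timedelta(seconds=3600)

total = timedelta(hours=1) + timedelta(seconds=5)
print(total.total_seconds())  # 3605.0
```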


You can construct a linter that will prevent you from trying to add seconds to dollars.


A better example would be seconds to minutes. I think I recall a JWT-related CVE involving timestamps being misinterpreted between seconds and milliseconds, for example.


Adding seconds to minutes can actually make sense. A minute plus 30 seconds is 90 seconds, or 1.5 minutes. Whether your type system allows this, though, is up to the project.

You can't add seconds to kilometers, or to temperature, or to how blue something is.


Ideally you have a compiler that simply refuses to compile ambiguous code


Ideally, there is no compiler.


Ideally, you would detect all errors before runtime. Usually the compiler is the last gate to make that happen.


Ideally, hardware execution would happen on a language that can be proven correct and does not allow the programmer to make syntactic or semantic errors.

The world is far, far from ideal.


Isn't there an implicit conversion?


Well, your function could then accept multiple types (Milliseconds, Seconds, Minutes, Hours) and do the conversion between those implicitly.

The units are also extremely clear when they are carried by the type instead of the variable name, where a developer could e.g. change some functionality and end up with a variable called `timeout_in_ms` while the function that eats this variable might expect seconds for some reason.

If it is typed out, you can just check that the function performs the right action when passed a value of each time unit type, and then you only ever have to worry about mistakenly declaring the wrong type somewhere.

But whether you should really do all that typing depends on how central units are to what you are doing. If you have one delay somewhere, who cares if it is typed or not. If you are building a CNC system where a unit conversion error could result in death and destruction, maybe it would be worth thinking about.


Tells you what the integer represents, so you aren't off by several orders of magnitude.


What does the integer represent? Nano, milli, microseconds, seconds, minutes, hours, days?


I _really_, _really_ like F#'s 'unit of measure': https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...


This. I really miss it when working outside of F#. I work on scientific code bases with a lot of different units and have been burned by improper conversions. Even with a high automated test coverage and good naming practices, such problems can go undetected.


It's a big omission that it doesn't support fractional units. These come up in things like fracture mechanics for stress intensity.


Are you talking about ksi√in or MPa√m? That is not a problem.


F#'s units don't support 1/2 dimensions. So you can't do dimensional analysis through the type system.


God I love F#'s unit types. C# is okay and all, but F# is, IMO, the best FP-language-on-a-corporate-backed-VM ever, even if the integration with the underlying VM and interop can get a bit fiddly in places (Caveat, my opinion is about 7 years old).

Yeah, you heard me Scala.


Sadly it hardly got better, C# gets all the new .NET toys, VB less so, C++/CLI still gets some occasional love (mostly due to Forms / WPF dependencies), and then there is F#.


>Having everything work this way makes it so much easier to review code for errors.

Had one making me pull my hair out the other day in C#. C# DateTimes are measured in increments of 100 nanoseconds elapsed since January 1st 1 AD or something like that. I was trying to convert Unix time in milliseconds to a C# DateTime and didn't realize they were using different units. My fault for not reading the docs, but having it in the name would have saved me a lot of trouble.


What the heck is with MS and weird datetime precision? I figured out some bug with a third party API was due to SQL Server using 1/300th seconds. Who would even think to check for that if you’re not using their products?


1/300 seconds is an odd one for sure. In the case of DateTime however, I'd say it is designed to hit as many use cases as possible with a 64 bit data structure. Using a 1970 epoch as is (logically) used for Unix system times naturally misses even basic use cases like capturing a birthdate.

It is quite hard actually to disagree with the 100ns tick size that they did use. 1 microsecond may have also been reasonable as it would have provided a larger range, but there are not many use cases for microsecond-accurate times very far in the past or in the future. Similarly, using 1 nanosecond may have increased the applicability to higher-precision use cases but would have reduced the range to 100 years. Alternately, they could have used a 128-bit structure providing picosecond precision from big bang to solar system flame-out, with the resultant size/performance implications.


Godspeed trying to parse dates from Excel. They have bugs _intentionally built in_


Who would even think to check if the system is counting from 1970/01/01 if you're not using their products?


If they’re using ISO format I don’t really care what they’re counting from. But I care if some ISO dates are preserved exactly and some are rounded to another value… especially when that rounding is non-obvious. It took me months to even identify the pattern clearly enough to find an explanation. Up to that point we just had an arbitrary allowance that the value might vary by some unknown amount, and looked for the closest one.


This is an established standard.

It's really not any stranger than starting dates at 1 CE, or array indexes starting at 0.


> This is an established standard.

Established doesn't mean it is understandable without documentation. Anyone who is not familiar with it doesn't know why it starts in 1970 and counts seconds. You need to actually open the documentation to know about that, and its name (be it unix time, epoch, epoch time or whatever) doesn't help in understanding what it is and what unit it is using.


The metric system is also not understandable without documentation, but you don't need to explain it every time, because every living person should have gotten that documentation drilled into them at age ten.

UNIX time is an easy international standard, everybody with computer experience knows what it is and how to work with it.


> everybody with computer experience knows what it is and how to work with it.

Thanks for the laugh.

The only ones who need to know about unixtime are:

developers when they do something which takes it/produces it

*nix sysadmins

Everyone else with "computer experience" could live all their life without the need to know what unixtime is.


Yeah, that's what I meant. People who program or do sysadmin, ie anybody who will ever need to call a sleep function, should know what a unixtime is.


> anybody who will ever need to call a sleep function

Bullshit.

I never needed to know what unixtime is when I wrote anything with sleep().

All I needed to know is how long in seconds I want the execution to pause for, never ever I needed to manually calculate something from/to unixtime, even when working with datetimes types.


No, but as a person who has had a need for sleep() before, you are also the type of person who could be expected to know what unix time is.

Nobody is saying that the two things need to be connected, the point is that it can be name dropped in a type definition, and you would know what it means.


https://devblogs.microsoft.com/oldnewthing/20090306-00/?p=18...

Windows uses the Gregorian Calendar as its epoch.


I'm not sure I agree. When I convert fields to DateTime, I remove the unit suffix from the name. The DateTime is supposed to be an implementation-agnostic point in time. It shouldn't come with any units, and nor should they be exposed by the internal implementation.

The factory method used to convert e.g. Unix timestamps to DateTimes, now that should indicate whether we're talking seconds or milliseconds since epoch, for example, and when the epoch really was.


They do:

    .ToUnixTimeSeconds()
    .ToUnixTimeMilliseconds()
    DateTimeOffset.FromUnixTimeSeconds(Int64)
    DateTimeOffset.FromUnixTimeMilliseconds(Int64)

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...


How does the C# DateTime type distinguish between dates and a point in time whose time just happens to be midnight?

Much of the C# code I've seen uses names like start_date with no indication of whether it really is a date (with no timezone), a date (in one particular timezone), or a datetime where the time is significant.

I'm certainly not a C# developer, though my quick reading of the docs suggests that the DateOnly type was only introduced recently in .NET6.


Yeah, before the new DateOnly (and TimeOnly) types, there was no built-in way in C# to specify a plain date. NodaTime[1] (a popular third-party library for datetime operations) did have such types though.

[1]: https://nodatime.org/


F# has unit support in the type system :)


An example for unfamiliar folks.

Many Languages

  var lengthInFeet = 2;
  var weightInKg = 2;
  
  var sum = lengthInFeet + weightInKg; // Runs without issue but is an error
F#

  [<Measure>] type ft
  [<Measure>] type kg
  
  let lengthInFeet = 2<ft>
  let weightInKg = 2<kg>

  let sum = lengthInFeet + weightInKg // Compile time error
More info at https://fsharpforfunandprofit.com/posts/units-of-measure/


What precisely could have been changed to make you realize that C# DateTime is not the same as Unix time? Perhaps Ticks could be renamed to Ticks100ns but I'm not sure how to encode the epoch date such that it is not necessary to read any documentation. I suppose the class could have been named something like DateTimeStartingAtJan1_0001 but obviously would have been ridiculous.

Naming is an optimization problem: minimize identifier length while maximizing comprehension.


> how to encode the epoch date such that it is not necessary to read any documentation

And you need to read the docs to know why some systems use 1970 as the reference point. Should we rename it to Unix1970datetimeepoch everywhere?


The wonderful elm-units is such a pleasure to work with and does just that.

https://package.elm-lang.org/packages/ianmackenzie/elm-units...

Even if you don’t work in Elm take a moment to look at it.


Nitpicks:

Why would your type system have an encoded unit for kilopascals, but not hectopascals, megapascals, micropascals, etc.?

If you only encode base units (e.g. seconds), then we should use exact-precision arithmetic instead of f32 or f64, which is sometimes overkill.

If encoding all the prefixes (kilo/milli/mega etc.), I feel like some units may have name clashes (e.g. "Gy" -- is it giga-years, or gray)?

Should we encode only SI units, or pounds/ounces/pints as well?


(e.g. "Gy" -- is it giga-years, or gray)

In my opinion this is not a real problem, since ideally no-one should use such cryptic abbreviations in code. Just write giga_years or GigaYears or whatever your style is, problem solved, doesn't get any clearer than that.


In defence of Gy, it isn't meaningless in astronomy - it's a very well used unit. Though I do agree that it might be less common in code.


I myself can't remember seeing Gy in astronomical papers, but I've seen Ga (gigaannum).

https://en.wikipedia.org/wiki/Year#SI_prefix_multipliers



Isn’t Gy a bit much even in astronomy? With 15 Gy you have the age of the universe right?


Your comment piqued my curiosity, and I looked at https://en.m.wikipedia.org/wiki/Future_of_an_expanding_unive... and found:

"Stars are expected to form normally for 10^12 to 10^14 (1–100 trillion) years"

So it seems Gy and even Ty units will be a reasonable scale for events during the period of star formation.


Ah, that is a good point. For some reason I was thinking only backwards. I never considered that there’s orders of magnitude more time in front of us.


Does the type system handle equivalent units (dimensional analysis)? eg, N.m = kg.m^2.s^-2.

Does the type system do orientational analysis? If not, it lets you assign a value of work to a value of torque and vice versa, as they both have the above unit.

There are several other similar gotchas with the SI. I think descriptive names are better than everyone attempting to implement an incomplete/broken type system.


The Python package astropy does all these things. There's a graph of equivalencies between units.

0. https://docs.astropy.org/en/stable/units/index.html


Speaking of astropy units. I had a hilarious issue last week, which was quite hard to identify (simplified code to reproduce):

  from astropy import units as u
  a = 3 * u.s
  b = 2 * u.s
  c = 1 * u.s
  d = 4 * u.s
  m = min([a, b, c, d])
  a -= m
  b -= m
  c -= m
  d -= m
  print(a,b,c,d)
Output: 2.0 s 1.0 s 0.0 s 4.0 s

Note the last 4.0, while min value is 1.0

The issue is that a, b, c, d are objects when astropy units are applied, and min (or max) returns not the value but the object with the minimal value. Thus m is c (in this particular case c has the smallest value), so c -= m sets c, and therefore m, to 0, and d remains unchanged. It was very hard to spot, especially when values change and occasionally any one of a, b, c or d has the smallest value.

In-place augmentation of a working code with units may be very tricky and can create unexpected bugs.
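The pitfall is reproducible without astropy: any type whose augmented assignment mutates in place behaves the same way. A toy sketch (class Q is hypothetical, standing in for a unit-carrying quantity):

```python
class Q:
    """Toy mutable quantity; -= mutates in place, like ndarray/Quantity objects."""
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):          # lets min() compare instances
        return self.value < other.value
    def __isub__(self, other):        # in-place subtraction
        self.value -= other.value
        return self

a, b, c, d = Q(3), Q(2), Q(1), Q(4)
m = min([a, b, c, d])  # m IS c: min returns the object, not a copy of its value
a -= m                 # 3 - 1 = 2
b -= m                 # 2 - 1 = 1
c -= m                 # 1 - 1 = 0 ... and m is now 0 too, since m is c
d -= m                 # 4 - 0 = 4: unchanged, the surprising result
print(a.value, b.value, c.value, d.value)  # 2 1 0 4

# One fix: snapshot the minimum's value before the in-place updates:
# m = Q(min([a, b, c, d]).value)
```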


This is really an issue with Python in general (specifically, mutable types).

You'd get the exact same behavior with numpy ndarrays (of which astropy Quantities are a subclass).


> This is really an issue with Python in general (specifically, mutable types).

A unit-aware value type where augmented assignment mutates in place is an odd choice though (normal mutable types do not exhibit this behavior; it's a whole separate behavior which has to be deliberately implemented). It may make sense in the expected use case (and, as you note, reflects the behavior of the underlying type), but more generally it's not what someone wanting unit-aware values would probably expect.


That sounds like a bug in astropy type definitions: did you get a chance to report it as one?

While it can sometimes be undefined behavior (a minimum of incompatible units), in cases like these it should DTRT.


Mutable types are hard.


I see dimensional analysis, but in this table[1], torque and work have the same unit, and that unit is J.

The SI itself states[2]: "...For example, the quantity torque is the cross product of a position vector and a force vector. The SI unit is newton metre. Even though torque has the same dimension as energy (SI unit joule), the joule is never used for expressing torque."

[1]:https://docs.astropy.org/en/stable/units/index.html#module-a... [2]:https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-...


This is the answer from boost units:

https://www.boost.org/doc/libs/1_78_0/doc/html/boost_units/F...

tl;dr: it uses some sort of pseudo-units to make dimensionally similar but incompatible quantities, well, incompatible.


Time units are messy.

Once we get to gigayears, what is the size of a single year? Just 365 days? Or which of Julian, Gregorian, tropical, sidereal? At even the kilo prefix the differences add up. Or would you need to specify it?

Days, weeks and months are also a fun mess to think about.


> Once we get to Gigayears, what is the size of single year?

The appropriate system of units is context-dependent. The astronomical system, for instance, has Days of 86,400 SI seconds and Julian years of exactly 365.25 Days; if you have a general and extensible units library, then this isn't really a difficulty, you just need to make a choice based on requirements.


This is going to depend on the precision that you need.

(Calculations without associated uncertainty calculations are worse than worthless anyway - misleading due to the inherent trust we tend to put in numbers regardless of whether they are garbage.)


You'll want to leave the point floating to floating point numbers, but whenever you interact with legacy APIs or protocols you want a type representing the scale they use natively. You wouldn't want to deal with anything based on TCP for example with only seconds (even if client code holds them in doubles), or with only nanoseconds. But you certainly won't ever miss a type for kiloseconds.


Isn't floating point specifically for dealing with large orders of magnitude?

(Economics of floating vs fixed point chips might distort things though?)

Also, in case you meant this: you might need fractions of even base units for internal calculations.

IIRC banking, which doesn't accept any level of uncertainty, and so uses exclusively fixed precision, uses tenths of cents rather than cents as a base unit?


The standard symbol for a year is "a". Using "y" or "yr" is non-standard and ambiguous, and they should be avoided in situations where precision and clarity matter.


Thanks, didn't know. Although in astronomy they use Gy often. PS: Don't know why people downvote, your comment is useful.


Given a powerful enough type system, you can parameterize your types by the ratio to the unit and any exponent. Then you can allow only the conversions that make sense.


I don't think the parent meant to exclude hecto-pascals from their hypothetical type system.


Why wouldn't the type system be able to take care of that?


In my software project, all measurements exist as engineering units since they all come from various systems (ADCs or DSP boxes). We pass the values around to other pieces of software but are displayed as both the original values and converted units. We have a config file that contains both units and conversion polynomials, ranging from linear to cubic polynomials. Some of the DSP-derived values are at best an approximation so these have special flags that basically mean "for reference only". Having the unit is helpful for these but are effectively meaningless since the numbers are not precise enough, it would be like trying to determine lumens of a desk lamp from a photo taken outside of a building with the shades drawn.


I love when types are used to narrow down primitive values. A users id and a post id are both numbers but it never makes sense to take a user id and pass it to a function expecting a post id. The code will technically function but it’s not something that’s ever correct.


Having units as part of your types improves legibility for whoever's writing code too, not just reviewing. You won't make (as many) silly mistakes like adding a value in meters to another in seconds.


If a value with units lives so long that it's no longer clear what unit it is in or should be in (e.g. it spans more than 20 lines of code), I think you've got a bigger problem with code encapsulation.

I think the practical problem stems from widespread APIs which are not communicating their units, and that's what we should be fixing instead: if APIs are clearer (and also self-documenting more), the risks you talk of rarely exist other than in badly structured code.

Basically, instead of having `sleep(wait_in_seconds)` one should have `sleep_for_seconds(wait_time)` or even `sleep(duration_in_seconds=wait_time)` if your language allows that.

Certainly the use of proper semantic types would be a net win, but they usually lose out due to the inconvenience of typing them out (and sometimes constructing them, if they don't already exist in your language).


The Python package astropy extends numpy arrays to include units.[0]

It can convert between equivalent units (e.g., centimeters and kilometers), and will complain if you try to add quantities that aren't commensurate (e.g., grams and seconds).

The nice thing about this is that you can write functions that expect unit-ful quantities, and all the conversions will be done automatically for you. And if someone passes an incorrect unit, the system will automatically spit out an error.

0. https://docs.astropy.org/en/stable/units/index.html


I think you would enjoy programming in Ada.


When I first started to learn Ada, I found it one of the most verbose languages I’ve ever used. Now I find myself practicing those wordy conventions in other languages too.


Have you come across a system or language that handles units and their combinations / conversions well? I have a project I want to undertake but I feel like every time I start to deal with units in the way I feel is "proper" I end up starting to write a unit of measure library and give up.


> physical quantities should [...] be encoded in the type system.

But types are not just useful to specify the content of a variable, they are also useful to specify the required precision.

So, if there is a type for seconds in the type system, should it be a 32-bit int or a 64-bit float? Only the user can say.


A generic type would be useful then:

type DurationSeconds<Value> where Value: Numeric { ... }

DurationSeconds<UInt32>

DurationSeconds<Float64>
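In Python-ish terms the same idea might look like this (a sketch; `DurationSeconds` is the hypothetical type from the comment above, with the numeric representation left to the user):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

# the numeric representation stays the user's choice
N = TypeVar("N", int, float)

@dataclass(frozen=True)
class DurationSeconds(Generic[N]):
    value: N

    def __add__(self, other: "DurationSeconds[N]") -> "DurationSeconds[N]":
        return DurationSeconds(self.value + other.value)

coarse: DurationSeconds[int] = DurationSeconds(3)
fine: DurationSeconds[float] = DurationSeconds(0.25)
```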


This is exactly what C++ 11 did with std::chrono [0] except it goes one step further and makes the period generic too.

[0] https://en.cppreference.com/w/cpp/chrono/duration


And let's not forget money. Martin Fowler has been (correctly) banging this drum for years: https://martinfowler.com/eaaCatalog/money.html


You don't need a type system to do this. Generic operators/functions and composite structures are sufficient and more flexible. Some languages let you encode this in type systems as well, but that's an orthogonal feature.


Add to this any type of conversion or relations between units.

Naming constants PIXELS_PER_INCH or BITS_PER_LITER rather than SCREEN_SCALE_FACTOR and VOLUME_RESOLUTION avoids all kinds of mistakes, like inverting ratios, etc.


For everyone using Python and willing to try this:

https://pint.readthedocs.io/en/stable/


I'm sorry, but this whole comment section is in some collective psychosis from 2010. You don't need to mangle variable names or create custom types (and then the insane machinery to have them all interact properly)

Have none of you ever refactored code? Or are all of you writing C?

Your library should just expose an interface/protocol signature. Program to interfaces. Anything that is passed in should just implement the protocol. The sleep function should not be aware or care about the input type AT ALL. All it requires is the input to implement an "as-milliseconds()" function that returns a millisecond value. You can then change any internal representation as your requirements change
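In a language with structural typing that's a small protocol plus one method per caller type; a Python sketch of the idea (names like `as_milliseconds` mirror the comment and are illustrative — the `sleep` here returns the millisecond count it would wait, just for demonstration):

```python
from typing import Protocol

class HasMilliseconds(Protocol):
    def as_milliseconds(self) -> int: ...

class Seconds:
    """One possible caller-side representation; could be swapped freely."""
    def __init__(self, n: float) -> None:
        self.n = n
    def as_milliseconds(self) -> int:
        return int(self.n * 1000)

def sleep(duration: HasMilliseconds) -> int:
    # sleep never sees the caller's internal representation
    return duration.as_milliseconds()
```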


> You don't need to ... create custom types
>
> All it requires is the input to implement an "as-milliseconds()" function

A.k.a. custom type!


That's also a custom type. But for libraries, custom types for everything in interface definitions are problematic. Let's say you use library A, which has some functions that take a time delta, and another library B, which also has functions that take time deltas. Now both libraries would define their own time delta type, you'd have two types for time deltas in your application, and you'd most likely need to add an additional 'meta time delta type' which can be converted to the library-specific time delta types. This explodes very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

If you replace 'time delta' with 'strings' then the problem becomes more apparent. Each library defining its own string type would be a mess.


If you are working with a crappy language then that's probably true. But it doesn't have to be a lot of boilerplate and it can be local in scope. In Clojure it's a one-liner.

    (extend-protocol libA/stringable
      myLib/InputThing
      (as-string [x] (myLib/to-string x)))

    (libA/some-func (myLib/make-a-thing))
And sure it can just be doing something as simple as returning an internal value directly or calling some common stringifying method. You do have a good point that you may have redundant stringifying protocols across libraries - which sucks.

> This will explode very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

I feel you were so close and missed :) The natural conclusion would be to have primitive interfaces - not primitive types. This way libraries only need to agree to adhere to an interface (and not an implementation)


> Have none of you ever refactored code..? Or all of your writing C?

You seem to ignore the much broader context here: the article wasn't just about code. It included configuration files and HTTP requests.

It's also worth noting that depending on the language in question and the usage pattern of the function/method, using interfaces or protocols can cause considerable overhead when simply using a proper descriptive name is free.


> [...] create custom types (and then the insane machinery to have them all interact properly)

That's an argument against languages with type-systems that make this a PITA, not against the idea itself.


'Why do at build time what you could do with bugs at runtime'?


premature abstraction is the root of all evil.


You might find this interesting: https://github.com/SciNim/Unchained


There are usually libraries for doing this. For Python, there are several, for instance “Pint”.


I was excited when I found Pint, but then was disappointed : too much extra overhead for the project I was then working on.

(I settled on units in variable names instead, which was essential when some inputs were literally in different units of the same physical dimension.)

EDIT : more on this, and on astropy (which I was not aware of and/or didn't exist back then) :

https://github.com/astropy/astropy/issues/7438


> Option 2: use strong types


This is just about the best justification I have heard for type calculus.


I think Go has a reasonable approach:

  time.Sleep(1 * time.Second)


Like many things with Go, its approach seems reasonable and simple at first, but allows you to accidentally write code that looks right but is very, very wrong. For example, what do you think this code will do?

    delaySecs := 1 * time.Second
    time.Sleep(delaySecs * time.Second)
Now I insist on using the durationcheck lint to guard against this (https://github.com/charithe/durationcheck). It found a flaw in some exponential-backoff code I had refactored but couldn’t easily fully test that looked right but was wrong, and now I don’t think Go’s approach is reasonable anymore.


Perhaps the function shouldn't accept the unit of sec². Not least because I have no idea what a delay in that unit could signify.


It doesn't actually use units. Everything is in nanoseconds, so time.Second is just another unitless number.

  const (
   Nanosecond  Duration = 1
   Microsecond          = 1000 * Nanosecond
   Millisecond          = 1000 * Microsecond
   Second               = 1000 * Millisecond
   Minute               = 60 * Second
   Hour                 = 60 * Minute
  )


Note that the wonderful Go type system interprets time.Second * time.Second as 277777h46m40s with the type time.Second (not sec^2)


  time.Second * time.Second
The type of this is `time.Duration` (or int64 internally), not `time.Second` (which is a const with a value).

I agree, though, that this is not quite sound, because it can be misused, as shown above with `time.Sleep(delaySecs * time.Second)`.

In Kotlin you can do `1.seconds + 1.minutes` but not `1.seconds * 1.minutes` (compilation error), which I quite like. Here is a playground link: https://pl.kotl.in/YZLu97AY8


Certainly, but for that the type system should be rich enough to support unit designators.

I know how to implement that in Haskell, and that it can be implemented in C++ and Rust. I know how to logically implement that in Java or Typescript, but usability will suck (no infix operators).


Go tends to cover such things by incorporating them directly in the language. But then it tends to not cover them at all because it would "overcomplicate" the language...

For a good example of what it looks like when somebody does bother to do it, see F# units of measure.


This looks to me like the semantics are good but the implementation details are broken. 1 * time.Second * time.Second semantically reads to me as 1 second. If time.Second is some numeric value, that’s obviously wrong everywhere unless the type system reflects and enforces the unit conversion.


> 1 * time.Second * time.Second semantically reads to me as 1 second.

Which is wrong, 1s * 1s = 1s².

For example, gravitational acceleration is expressed in m/s², i.e. m/s per second, a change of velocity per unit of time (where velocity is a change of distance per unit of time).


Okay so do I need to consult Relativity to program 1sec + 2min?


Since 1min could be 61 seconds[1], yes?

But assuming your comment is not a joke. You probably want to convert minutes to seconds in order to work with the same units, then add the scalar parts together.

That's how you deal with different quantities: convert to same unit, add values.

This is analog to fractions: 1/2 + 1/4 = 2/4 + 1/4 = (2+1)/4 = 3/4.

  [1] - https://en.wikipedia.org/wiki/Leap_second
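That recipe in code (a sketch which, like most calendar math, ignores leap seconds):

```python
# scale factors to a common base unit (seconds)
SECONDS_PER = {"s": 1, "min": 60, "h": 3600}

def add_durations(a: float, a_unit: str, b: float, b_unit: str) -> float:
    """Convert both operands to seconds, then add the scalar parts."""
    return a * SECONDS_PER[a_unit] + b * SECONDS_PER[b_unit]

print(add_durations(1, "s", 2, "min"))  # 121 (seconds)
```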


In basic middle school math it’s common to multiply different units as a basic conversion mechanism. Multiplying by the same unit is semantically equivalent to “x times 1 is identity(x)”, and other cross-unit arithmetic implies conversion to ensure like units before processing. A typed unit numeric system would imply that to me. It would not imply I’m multiplying the units, but rather the scalar value of the unit.


> In basic middle school math it’s common to multiply different units as a basic conversion mechanism

EDIT: Yes, you multiply the units `2m * 2s` : you first multiply the units to get: `m.s`. This is what I say: you convert everything to the same units before doing the calculations.

> Multiplying by the same unit is semantically equivalent to “x times 1 is identity(x)”

This is wrong.

1kg * 1kg = 1kg² period.

What you're saying is `2kg * 1 = 2kg`, which is right, because `1` is a scalar while `2kg` is a quantity. This is completely different than multiplying 2 quantities.

> It would not imply I’m multiplying the units, but rather the scalar value of the unit.

That's where you're wrong. When doing arithmetic on quantities, you have 2 equations:

  x = 2kg * 4s
  unit(x) = kg * s = kg.s
  scalar(x) = 2 * 4 = 8
  x = 8 kg.s
Or

  x = 5m / 2s
  x = (5/2) m/s
  x = 2.5 m/s
There is a meaning to units and the operation you do with them. `5m / 2s` is 5 meters in 2 seconds, which is the speed `2.5 m/s`.

`2m + 1s` has no meaning, therefore you can't do anything with the scalar values, and the result remains `2m + 1s`, not `3 (m+s)`.
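Those rules (multiply the scalars, sum the unit exponents, refuse addition of unlike units) fit in a few lines. A toy sketch, not a real library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    scalar: float
    units: tuple  # sorted (unit, exponent) pairs, e.g. (("kg", 1), ("s", 1))

    def __mul__(self, other: "Quantity") -> "Quantity":
        # scalars multiply; unit exponents add
        exps = dict(self.units)
        for unit, exp in other.units:
            exps[unit] = exps.get(unit, 0) + exp
        return Quantity(self.scalar * other.scalar,
                        tuple(sorted((u, e) for u, e in exps.items() if e)))

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # division is multiplication by the inverse (negated exponents)
        inverse = Quantity(1 / other.scalar,
                           tuple((u, -e) for u, e in other.units))
        return self * inverse

    def __add__(self, other: "Quantity") -> "Quantity":
        if self.units != other.units:  # `2m + 1s` stays meaningless
            raise TypeError(f"cannot add {self.units} and {other.units}")
        return Quantity(self.scalar + other.scalar, self.units)

print(Quantity(2, (("kg", 1),)) * Quantity(4, (("s", 1),)))
# Quantity(scalar=8, units=(('kg', 1), ('s', 1)))
```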


All unit conversions are actually multiplications by the dimensionless constant 1, i.e., no-ops.

Let's say that you want to convert `2 min` into seconds. You know that `1 min = 60 s` is true. Dividing this equation by `1 min` on both sides is allowed and brings `1 = (60 s) / (1 min)`. This shows that if we multiply any value in minutes by `(60 s) / (1 min)`, we are not actually changing the value, because this is equivalent to multiplying it by 1. Therefore, `2 min = 2 min * 1 = 2 min * (60 s) / (1 min) = 2 * 60 s * (1 min) / (1 min) = 120 s`. We didn't change the value because we multiplied it by 1, and we didn't change its dimensionality ("type") because we multiplied it by a dimensionless number. We just moved around a dimensionless factor of 60, from the unit to the numerical value.

I think that you misremember, or didn't realize, that to convert minutes into seconds you were not multiplying by `60 s` but by `(60 s) / (1 min)`, which is nothing other than 1.


What is the "scalar value of the unit"?

Units can be expressed in terms of other units, and you can arbitrarily pick one unit as a base and then express the rest in it. But the key word here is "arbitrarily".

If multiplying by the same unit yield the same unit, then how did you compute area or volume in school?


Wait, would you really expect 1m * 1m to be anything other than 1m²? When does it ever happen that you want to multiply two non-unitless[1] measurements and not multiply the units???

[1] would that be unitful?


I expect 1 * m = 1m, and 1 * m * m = 1m because applying a unit doesn’t inherently have a value of that unit associated with it. (1 m) (1 m) obviously equals 1m^2, but ((1 m) m) is not the same expression.


Since when `((1 m) m)` is a valid mathematical expression?

You cannot have a unit on its own without a scalar value. It makes no sense.


If you look upthread, there was a mention of F# unit types. Taking off my programmer hat and returning to my middle school anecdote which also evidently made no sense: expression of a unit without a value is (or should be to my mind, based on my education) a cast, not a computation of N+1 values.

- 1 is unitless

- 1 * m casts the value to a value 1 of unit m = 1m

- 1 * m * m casts the value 1 * m = 1m to 1m then casts 1m to m which = 1m

Admittedly my educational background here might be wildly unconventional but it certainly prepared me for interoperable unit types as a concept without changing values (~precision considerations).


> If you look upthread, there was a mention of F# unit types.

And the syntax is `3<unit>` not `3 * unit`

- 1 is a scalar

- 1m is a quantity

- 2 * 1m "casts" 2 to a meter, but really this is just multiplying a quantity by a scalar

- 2 * 1m * 1m "casts" 2 to meter², multiplying 2 quantities, then by a scalar

I insist, `1 * m` does not make sense. This is not a valid mathematical expression, because a unit can never be on its own without a value.

> expression of a unit without a value is (or should be to my mind, based on my education) a cast

There is no casting in math, mainly because there are no types, only objects with operations. A vector is not a scalar and you can't cast it into a scalar.

A quantity is not a scalar either, and you can't cast one into another.

A quantity is an object, you can multiply 2 quantities together, but you can't add them if they are different. You can multiply a quantity to a scalar, but you still can't add a scalar to a quantity.


> And the syntax is `3<unit>` not `3 * unit`

Well, yeah, F# represents this at the type level. Which I’ve said elsewhere in the discussion is preferable. Not knowing Go, but knowing it only recently gained generics, I read multiplying by `time.Seconds` (which does not have a visible 1 associated with it) as perhaps performing an operator-overloaded type cast to a value/type with the Seconds unit assigned to it. I’ve since learned that Go also does not support operator overloading, so I now know that wouldn’t be the case. But had that been the case, it isn’t inconceivable that unitlessValue * valuelessUnit * valuelessUnit = unitlessValue * valuelessUnit. Because…

> I insist, `1 * m` does not make sense. This is not a valid mathematical expression, because a unit can never be on its own without a value.

Well, if you insist! But you seem to be imposing “mathematical expression” on an expression space where that’s already not the case? Whatever you may think of operator overloading, it is a thing that exists and it is a thing that “makes sense” to people using it idiomatically.

Even in languages without overloading, expressions which look like maths don’t necessarily have a corresponding mathematical representation. An equals infix operator in maths is a statement, establishing an immutable fact. Some languages like Erlang honor this; many (most? I strongly suspect most) don’t! Many (I couldn’t guess how many without researching it) also treat infix = statements as expressions which evaluate to a value.

The syntax of infix operators is generally inspired by mathematical notation, but it’s hardly beholden to that. The syntax of programming languages generally is not beholden to mathematical notation. Reacting as if it’s impossibly absurd that someone might read 1 * time.Seconds * time.Seconds as anything other than 1 * 1s * 1s is just snobbery.

Not knowing Go, I focused on the syntax and the explicit values, and tried to build a syntax tree on top of it. I’m not a fan of infix operators, and I am a fan of lisps, so my mental syntax model was (* (* 1 time.Seconds) time.Seconds), which still doesn’t “make sense” mathematically, but it can make sense if `*` is a polymorphic function which accepts unquantified units.


> Not knowing Go, but knowing it only recently gained generics, I read multiplying by `time.Seconds` (which does not have a visible 1 associated with it) as perhaps performing an operator-overloaded type cast to a value/type with the Seconds unit assigned to it.

This sums up your incomprehension. `time.Seconds` is just a constant: an integer with the value `1_000_000_000`, meaning one billion nanoseconds.

In an expression of the form `a * b` you should always read `a` and `b` as constants. This is true for EVERY programming language.

> it isn’t inconceivable that unitlessValue * valuelessUnit * valuelessUnit = unitlessValue * valuelessUnit.

It is. For example, what would be the meaning of this:

  struct Foo {
    // ...
  }

  2 * Foo
Valueless unit (or any type) is just not a thing, not in math, not in any programming language.

> But you seem to be imposing “mathematical expression” on an expression space where that’s already not the case? Whatever you may think of operator overloading, it is a thing that exists and it is a thing that “makes sense” to people using it idiomatically.

Operator overloading works on typed values, not "valueless" types. In some programming languages (like Python), classes are values too, but why implement `a * MyClass` when you can write `MyClass(a)`, which is 100% clearer on the intent?

Using operator overloading for types to implement casting is just black magic.

> expressions which look like maths don’t necessarily have a corresponding mathematical representation

Programming languages and the whole field of Computer Science are branches of mathematics. They are not natural languages like English or German. They are an extension of maths.

> An equals infix operator in maths is a statement, establishing an immutable fact.

An operator only has meaning within the theory you use it.

For example:

  `Matrix_A * Matrix_B` is not the same `*` as `Number_A * Number_B`
  `1 + 2` is not the same `+` as `1 + 2 + 3 + ...`
  `a = 3` in a math theorem is not the same `=` as `a = 3` in a programming language (and that depends on the programming language)
As long as the theory defines the operators and the rules on how to use them, it does not matter which symbol you use. I can write a language where you have `<-` instead of `=`, and the mathematical rules (precedence, associativity, commutativity, ...) will be the same.

> Reacting as if it’s impossibly absurd that someone might read 1 * time.Seconds * time.Seconds as anything other than 1 * 1s * 1s is just snobbery.

First, that's not what I said. You should read that as `scalar * constant * constant`, because reading that as `scalar * unit * unit` does not make sense, neither in math nor in any programming language.

If caring about readability and consistency is snobbery, then so be it.

> Not knowing Go, I focused on the syntax and the explicit values, and tried to build a syntax tree on top of it.

And the syntax is pretty explicit, because it's the same as math or any programming language: `scalar * constant * constant`. This is why using math as a point of reference is useful, you can easily make sense of what you're reading, no matter the syntax.

> I am a fan of lisps, so my mental syntax model was (* (* 1 time.Seconds) time.Seconds))

I still read this as `(* (* scalar constant) constant)`. And I expect your compiler/interpreter to throw an error if `time.Seconds` is anything without a clear value to evaluate the expression properly.

And I would expect to read `(* (* 1 (seconds 1)) (seconds 1))` as `scalar * quantity * quantity`, and I would expect to get square seconds as an output.

Anything else would not be correct and have little to no use.


You can just do `1 * time.Second + 2 * time.Minute` to do that. Adding times works intuitively. It's multiplying durations that gives you accelerations.


I’m not a Go developer, but I understand that from a type and mathematical theory perspective Go’s time.Duration is extraordinarily awful, because of Go’s simplistic type system.

int64 * Duration → Duration and Duration * int64 → Duration both make sense, but I gather this only works with constants. For other values, I believe Go only gives you Duration * Duration → Duration which is just wrong, wrong, wrong, requiring that one of the two “durations” actually be treated as though unitless, despite being declared as nanoseconds.

In the end, it’s probably still worth it, but it’s a case of Go trying to design in a certain way for ergonomics despite lacking the type system required to do it properly. I have found this to be a very common theme in Go. Also that it’s often still worth it, for they’ve generally chosen their compromises quite well. But I personally don’t like Go very much.


My 2c: I think the issue is simply the name collision between units and constants (i.e., "seconds" and time.Seconds).

In reality most programming languages do not have units whatsoever built into the language (maybe tacked on as a library after the fact). They have int64s: unitless values that just keep track of whole numbers of whatever they semantically mean to the developer. If we want to truly have units in values then we either need first-class language support (including syntactical support) or a rich library that doesn't just "put units in [symbols]". One could probably make it happen with a type that keeps track of units:

    ```
    type unitedVal struct {
        numerator   int64
        denominator int64
        numerUnits  string
        denomUnits  string
    }

    func (united *unitedVal) Multiply(by unitedVal) *unitedVal {
        return &unitedVal{
            numerator:   united.numerator * by.numerator,
            denominator: united.denominator * by.denominator,
            numerUnits:  united.numerUnits + " * " + by.numerUnits,
            denomUnits:  united.denomUnits + " * " + by.denomUnits,
        }
    }
    ```
and then filling out for all the other operations you want to support.


I mean in their defense I see almost nobody doing this right... Like to do full SI you really need 7 rational numbers, maybe a sort of inline “NaN” equivalent for when you add things with incommensurate units... you might also want display hints for preferred SI prefixes (might just be a dedicated format type), and while you're at it you might as well use a Decimal type instead of doubles, oh and probably these numbers should have an uncertainty in them, so probably you want a modular system where you can mixin units or mixin uncertainty and you can start from a base of doubles or decimals or, hell, you start experimenting with continued fractions...Sigh. The real world is complicated.

Every implementation of units is secretly trying to be Moment.js, basically.


All you need is a language that actually incorporates units of measure into the type system - i.e. you can define units orthogonal to types (including relationships between units), and then you can specify both the unit and the underlying numeric type in a declaration.

https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...

This can also be pulled off with somewhat less pleasant syntax on top of a sufficiently flexible parametrized type system - e.g. C++ templates.


And yet people actually do manage meaningful unit systems that don't allow this.


C++11's std::chrono::duration manages just fine without that much scope bloat. operator* for two durations is simply not defined, so it will lead to a compile error.


A company I worked for 20 years ago had a commercial engineering modelling program that did all physical types correctly and did scaling for the user, but it was fairly unique in scope and, like you say, almost all programming languages fall short here; it had its own quirks too.


All that, and yet I don't see a single example of what would be "correct", or an example language that does it better. This just comes off as a poorly thought out rant.

In my opinion, it is well designed. First of all, who is multiplying time.Duration against itself? I've been programming Go for a few years basically every day, and I've only ever seen the package constants used by themselves, or with an untyped constant. I think it's a great syntax, better than any example in the article, as you don't have mystery numbers.


One alternative is Rust's Duration, which makes you spell out all your arithmetic operations. But this is exactly the approach taken by Go's time.Time, so the part about the type system being incapable or whatever is kinda misguided. https://doc.rust-lang.org/std/time/struct.Duration.html

C++ chrono::duration allows arithmetic operations with more sensible overloading. https://en.cppreference.com/w/cpp/chrono/duration


Once you go beyond constants (which in practice means literals, I think, but I’m not conversant enough in Go to be confident), Go requires that you multiply Duration by Duration—you can’t multiply it with a scalar outside of constants. In other words, as soon as a scalar multiple becomes a parameter rather than a constant, you can’t just do `n * duration`, but have to do something more like `Duration(n) * duration`, which is obviously physically wrong for a system of units, because the unit should be time squared, not time.

As for languages doing it better, approximately every single language that has strong static typing and uses dedicated types for times does it better. Rust is the one I’m most familiar with and comfortable with.


You seem to be confusing the Duration type with Duration values. Yes, if you insist on using a typed value, instead of an untyped constant, then that value needs to be type Duration.

But Duration(1) is way different from time.Second. Honestly, it just sounds like you don't know the language and aren't willing to learn it. Go is not Rust. Things are different; that doesn't mean Go sucks.


The entire purpose of the distinction that I’m remarking on is that Duration * Duration → Duration is mathematically utterly incorrect, and especially super misleading when the base quantity for the unit is nanoseconds rather than seconds, yet that is what Go requires, beyond constants, which it special-cases. To be sure, with durations, the distinction doesn’t matter much because the arithmetic performed is with constants, and that’s why I say that Go is probably still better with this wonky unit scheme than with entirely unitless quantities, but there are plenty of situations where you will want to multiply durations by typed numbers, and so you’re forced to do the mathematically-ridiculous `Duration(n) * duration` rather than `n * duration`.


I'm sorry, but I just don't think people use code like you're describing. In your mind, you see this:

    hello := 2
    hello * time.Second // oh no
But people actually use code like this:

    hello := time.Second
    hello *= 2
People working with Go code (who know what they're doing), don't declare an int, only to immediately cast it to something else.


You’re looking at this from the application perspective, where with the specific example of durations constant multiplication is certainly far, far more common. But you’re discounting library concerns, where it would not be out of the ordinary to receive a time.Duration and an int64 from parameters or a struct or similar.

This is also pretty typical of the trade-offs Go makes: it focuses on making things nice for the application writer, mostly pretty successfully, but at the regular cost of pain and with serious typing compromises for the library writer.


What is correct is that duration ± duration = duration, duration * scalar = duration, timestamp ± duration = timestamp, timestamp - timestamp = duration, and anything else doesn't compile.
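That rule table maps directly onto operator overloads. Here's a hedged Python sketch where anything outside the table fails with a TypeError (in a statically typed language it simply wouldn't compile; class names are illustrative):

```python
class Duration:
    def __init__(self, ns: int) -> None:
        self.ns = ns
    def __add__(self, other):
        if isinstance(other, Duration):      # duration + duration = duration
            return Duration(self.ns + other.ns)
        return NotImplemented
    def __mul__(self, k):
        if isinstance(k, (int, float)):      # duration * scalar = duration
            return Duration(int(self.ns * k))
        return NotImplemented                # duration * duration: TypeError
    __rmul__ = __mul__

class Timestamp:
    def __init__(self, ns: int) -> None:
        self.ns = ns
    def __add__(self, other):
        if isinstance(other, Duration):      # timestamp + duration = timestamp
            return Timestamp(self.ns + other.ns)
        return NotImplemented
    def __sub__(self, other):
        if isinstance(other, Timestamp):     # timestamp - timestamp = duration
            return Duration(self.ns - other.ns)
        if isinstance(other, Duration):      # timestamp - duration = timestamp
            return Timestamp(self.ns - other.ns)
        return NotImplemented
```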


To call it out specifically: this does not include `duration * duration = duration`

Go is currently allowing that, which makes `delaySecs * time.Second` a billion times larger than it appears to intend. I've personally run across code that has this kind of flaw in it... at least several dozen times. It's the kind of thing that's only noticed when it misbehaves visibly while someone is watching it.

(I read a lot of other-teams' code, which is in various states of quality and disarray)


And it’s not just that Go allows that, but that that’s actually the only general way of doing it, as Go only allows duration * scalar in a constant context (which is admittedly all most people do with durations, which is why I say it’s still probably better that Go did it this way, given their deliberately limited type system).


Not being able to sum timestamps is a bit beyond the scope of units; it is more of a vector vs. point distinction.


If you exponentiate time.Duration, that's a user error.

What you're suggesting would be like a compiler error for multiplying two int64s


What they're suggesting is that `time.Duration` should not be an int64.

  1 second * 1 second = 1 second²
  1 meter * 1 meter = 1 meter²
  1 meter / 1 second = 1 m/s
Those are not user errors, those are physical values. A physical value is two things:

  - a scalar (int64, float, ...)
  - a unit (meter, second, inches, ...)
If the type system of your programming language does not allow you to define units, this should at least be a structure with a scalar and an enum, and functions to cast from one unit to another (if possible).

Working with units is a common thing in science.


> A physical value is two things:

I would argue it is actually three things: a scalar, a unit, and an indication of error – which is at least another scalar, but there are multiple ways of expressing error, so it might require more than just a single scalar (such as an interval and the probability the actual value lies within that interval.)

> If the type system of your programming language does not allow you to define units, this should at least be a structure with a scalar and an enum

Ideally more than just an enum – Newton = kg*m*s^-2 (equivalently kg^1*m^1*s^-2) – which suggests a set of pairs (unit and exponent).


Yes, thank you for the clarifications; this just makes my point stronger: int64/float are very ill-suited to representing such values.

And it's especially true for time units.

For example, "how many seconds is one month?" does not make sense, but "how many seconds is january/february/march?" does make sense. The unit "month" does not really exist, each calendar month is its own unit.

And "february" is not even a "stable" unit because sometimes it's 28 days, sometimes it's 29 days. Even a minute can some rare times be 61 seconds.

This is why in physics, we use seconds multiplied by powers of 10 and nothing else.

To my knowledge, there is not a single programming language that differentiates between a "scalar" and a "quantity" (scalar, unit, error).


It should be fine to multiply two durations, but it should not return the same time.Duration type


> All that, and yet I don't see a single example of what would be "correct", or an example language that does it better.

F#, Rust, C++?..

> First of all, who is multiplying time.Duration against itself

Physicists do, every time they have to deal with acceleration - m/s^2.


You are right; however, physicists do not multiply duration x duration but time x time. Duration is discrete and time is infinitesimal.


The provided example is pretty rough, and I'm sure it's occurred in the wild. Sure, put the units in the variable name, and it encourages this kind of mistake, because time.Duration is not "seconds", it is a duration. The variable name should match the API of time.Sleep, which takes a duration. The variable should be named delay. A variable named delaySecs is the same kind of maintenance headache as a variable named "two_days_ago = 2.days.ago" in ruby.

The variables used for accepting and parsing input are the ones that should have units in them in this example. Although if you need a delay specified, it's valuable to be explicit and robust in the input and accept a string time.ParseDuration understands. Then you don't have this units problem in your variable naming at all, allows easier input of wider ranges of values by the operator, and makes input validation (if only a subset of durations are allowed) more concise and consistent.


I've seen a lot of code at my last job in Go where all duration variables included their units. It was amazingly bad Go code (it was built in part from the project's PoC by new Go devs), but it doesn't help that, for the most part, it did work.

Time was honestly our biggest source of bugs by far. Although adding time.Time ended up being more problematic than durations which were mostly only constructed like that in tests



Just contrived examples, like gravity * t^2 to get distance to fall and such, probably


Go's time package is famously horrible. First they didn't expose any monotonic clocks, only wall time. Then, after some public outages, like time travelling backwards for Cloudflare, they were forced to act. In the end they managed to fold monotonic clocks into the original type to cling on to the "Go just works" mantra, but added even more edge cases.

https://pkg.go.dev/time#hdr-Monotonic_Clocks

> RRDNS is written in Go and uses Go’s time.Now() function to get the time. Unfortunately, this function does not guarantee monotonicity. Go currently doesn’t offer a monotonic time source (see issue 12914 for discussion).

https://blog.cloudflare.com/how-and-why-the-leap-second-affe...

> time: use monotonic clock to measure elapsed time

https://github.com/golang/go/issues/12914


Rust has a similar-ish API for durations, Duration::from_millis(1000) or Duration::from_secs(1) in the type system, and the method can just take a Duration struct and transform it into whatever internal representation it wants.

There is a Duration::new() constructor that's more ambiguous, but it's your choice as a dev to be ambiguous in this instance, and code review should probably catch that.


Yeah, Rust is one of the few languages that gets it right! And before they had the `Duration` type with `thread::sleep(d: Duration)`, there was `thread::sleep_ms(ms: u32)`, which is also unambiguous.


Or Ruby on Rails:

  sleep(5.seconds)
  sleep(1.minute)
  sleep(2.hours)
etc etc


In Ruby that would be

    sleep 3.seconds
Hard to be more concise than that.


Technically Ruby only accepts seconds. You're thinking of ActiveSupport from Rails.


Here's a fun way to annoy a Rails developer.

    (Time.now + 1.month).to_i == Time.now.to_i + 1.month.to_i
    #=> false


Not sure why that's annoying? The right side doesn't really make sense.

Though it does seem pretty easy for a novice to do thinking it's the same, but what does 1.month.to_i even mean!?


> 1.month.to_i

Duration of a month in seconds? Before you balk at the idea, there exists a definition of a constant month duration for accounting stuff. If you hate dates - and yourself - try accounting; there's mind-boggling stuff that makes the engineer mind recoil in absolute terror.


Actually taken almost verbatim from the report summarizing a real and subtle bug (distributed across multiple files) in code written by definitely-not-novices.


I'm not really familiar with RoR. Is it due to Time.now getting called twice, with each call returning a slightly different value?


Alas, no. It's because ActiveSupport's duration arithmetic is not distributive under conversion.

The expression on the left hand side advances time (as a domain object) by exactly a month, then converts the result to integer unix time. The expression on the right adds 2629746¹ to the current unix time.

The conversion becomes dangerously magical in the presence of shared code that accepts both object and integer representations of time & duration. A consumer from one part of a system can inadvertently obtain different results to another unless they use identical calling conventions.

[1] this is 1/12 of the mean length of a gregorian year²

[2] 365.2425 days i.e. 31,556,952 seconds
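The same asymmetry can be reproduced with Python's stdlib, which deliberately has no "month" timedelta. A hedged sketch (the calendar-month add is hand-rolled here, with day clamping so Jan 31 + 1 month gives Feb 28):

```python
from datetime import datetime, timedelta
import calendar

MEAN_MONTH_SECONDS = 2_629_746  # 1/12 of the mean Gregorian year, ActiveSupport's constant

def add_one_month(dt: datetime) -> datetime:
    """Advance by one calendar month, clamping the day (Jan 31 -> Feb 28)."""
    year = dt.year + (dt.month // 12)
    month = dt.month % 12 + 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

now = datetime(2022, 3, 21)
calendar_add = add_one_month(now)                        # 2022-04-21 00:00:00
fixed_add = now + timedelta(seconds=MEAN_MONTH_SECONDS)  # 2022-04-20 10:29:06
print(calendar_add == fixed_add)  # False -- same "add a month", two answers
```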


Oh wow. This totally makes sense when you think about it, but it's something that would never cross my mind when casually checking the code. I guess this is why Python's timedelta doesn't have a month unit, as the length of a month is highly context dependent.


I would just use `1.month.from_now` instead of the additions anyway


That won’t save you; 1.month.from_now is implemented by addition.


Are we golfing? Because that's identical to

   sleep 3


Are we golfing? This whole discussion is about clarifying units.


I must clarify, that was intended rhetorically, and in the most self-serving fashion; I try never to miss a golfing prompt


Or Dart

    Future.delayed(const Duration(seconds: 2))
    Future.delayed(const Duration(milliseconds: 2000))


That’s similar to Crystal. All numbers have built in methods to convert them to a Time::Span object. So I could have a function that takes a Time::Span instead of an Int, like:

    def sleep(num : Time::Span)
      # do something here
    end
I would call it like:

    sleep 300.seconds


I'm not sure why but I have a visceral, negative response to this. It might be the best solution, but it definitely /feels/ like the worst of all the worlds.


Is it? In most languages you do something like `time.Sleep(1 * 1e6)` instead at which point it could be a second, a few minutes, a day, who really knows?

I'm just not seeing any major downsides of this, keep in mind `time.Second` isn't the only one of its kind, you have millisecond, minute, hour, etc etc.


Having used it extensively, it's actually quite nice.


Overloading the asterisk is always weird because multiplication is expected to be associative etc etc.


I'm not sure I follow.

2 * 3 * time.Second is the same whether you group 2 * (3 * time.Second) or (2 * 3) * time.Second (namely, the implicit grouping under left-associativity).

You wouldn't normally write time.Sleep(time.Second * time.Second) because your units wouldn't work out. (Apparently you can write that in Golang; it's just very sketchy and results in a 30-year sleep.)
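The 30-year figure checks out with back-of-envelope arithmetic (assuming Go's representation of time.Second as 1e9 nanoseconds):

```python
# Go's time.Second is 1_000_000_000 (a Duration counted in nanoseconds).
# Duration * Duration multiplies the raw int64s, so Second * Second becomes
# 10**18 "nanoseconds" -- reinterpreted as a sleep of roughly 31.7 years.
second_ns = 1_000_000_000
product_ns = second_ns * second_ns            # 10**18
years = product_ns / 1e9 / 86_400 / 365.25    # ns -> s -> days -> years
print(round(years, 1))  # 31.7
```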


But from a mathematical point of view, the relationship between a unit and its coefficient is that you're multiplying them together. Why would it be weird to overload the multiplication operator to represent multiplication?


There is no operator overloading here. The type Duration is effectively an int64: https://pkg.go.dev/time#Duration


It's not overloaded. It's a unit. You can just type

  time.Sleep(time.Second)
but this reads nicely

  time.Sleep(3 * time.Minute)


time.Second*1 gets the same value; it's not overloaded. Well, ok, so what actually happens is that time.Second is a time.Duration, and Duration*int yields Duration (and int*Duration yields Duration).

But the value of time.Second is actually a Duration with value 1000000, IIRC -- it's microseconds. It's just the type that's special, and the int handling here is general over a lot of types.

It really is nice in practice.


(As noted elsewhere it's nanoseconds.)


It's not really overloading. it is associative:

    2*time.Hour+1*time.Second == time.Second+time.Hour*2


I believe that's illustrating commutativity, not associativity.


The illustration is wrong, but the claim is correct; multiplication between units and scalars is just as associative as you'd expect. Multiplying one kilowatt by an hour gives you exactly the same result as multiplying 1 by a kilowatt-hour.
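A toy Python sketch of why scalar-times-unit multiplication stays associative (hypothetical Quantity type): scaling a tagged quantity never touches the unit, so the grouping of scalars doesn't matter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

    def __rmul__(self, scalar: float) -> "Quantity":
        # scalar * quantity scales the magnitude; the unit is untouched
        return Quantity(scalar * self.value, self.unit)

second = Quantity(1.0, "s")

# Both groupings give Quantity(6.0, "s")
print((2 * 3) * second == 2 * (3 * second))  # True
```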


I hope late millennials and generation Z rediscover types soon.


huh? it's us millennials who decided that all dynamic typing was immoral and wrong. Back in the day, Gen-Xers on HN and Slashdot were talking about how great Common Lisp and Ruby were.


Imma get my cane and hit you with Perl and PHP ;)


Except Common Lisp is a typed language (with a more expressive type system than most).


And now millennials seem to be doing the same with JS. sigh


Big fan of this.


And if you're working with an existing system that doesn't accommodate the suggested options, a well-placed comment can go a long way. Provide the unit as well as WHY that value exists.

    # Session inactivity, in seconds
    time.sleep(300)

    # Prevent pileup of unprocessable requests, in seconds
    request_timeout = 10


Comments are misunderstood far too often as some sort of accompanying inner monologue. Instead, I see a lot of noise like

    # Sleeps before continuing
    time.sleep(…)

    # Time before request timeout
    timeout = …


One more tip: If you're ever designing an API like this, instead of:

time.sleep(300)

Put your units in the method name:

time.sleep_seconds(300)

Likewise, if you're making a .length() method where the units are at all ambiguous (like the length of a string), name your units. Bad: str.len(). Good: str.len_chars() / str.len_utf8_bytes(). (I'm looking at you, Rust!)
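A sketch of the sleep half of this suggestion in Python (wrapper names are hypothetical): thin wrappers whose names carry the unit, so no call site is ambiguous.

```python
import time

def sleep_seconds(seconds: float) -> None:
    """The unit is in the name, so every call site is unambiguous."""
    time.sleep(seconds)

def sleep_millis(millis: float) -> None:
    time.sleep(millis / 1000.0)

sleep_millis(300)   # clearly 300 ms
sleep_seconds(0.3)  # the same delay, spelled in seconds
```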


time.sleep_seconds is appealing, but this approach doesn't work when a function takes additional arguments. for example, if your function is

  poll_file_descriptors(read_descriptors, write_desciptors, timeout_in_seconds)
it would be awkward to change the name of the function just because one of the arguments happens to refer to time.


This is where Objective-C brings subtle magic with mid-function-name arguments. Sure, it's crazy verbose, but insanely readable: printFile(file: File) withDelayInSeconds(delaySeconds: int) andPrintColor(printColor: UIColor)


This is just the builder pattern but worse?


It’s actually just an obj-c method call. There’s no temporary builder object (or associated class) that needs to be implemented and instantiated like you’d need for a builder.


I would always make a local variable for things like this, to avoid magic numbers (especially if a function signature has several numeric arguments). Then you can explicitly put the units in the variable name, e.g. duration_s = 300 and pass that. It's fine if the signature tells you the units, but it doesn't help anyone reading the code if you have poll_file_descriptors(read_descriptors, write_desciptors, 300).


But how do you know the unit to use in the variable?


You have to read the documentation. But you need to do that in order to be able to use the API properly anyway.

The point the article is making is not about making it easier to write a line of code, but about making it easier to read a line of code. Given that most lines of code are read many more times than they are written, this is a good thing to focus on.

"Programs must be written for people to read, and only incidentally for machines to execute."

― Abelson and Sussman, Structure and Interpretation of Computer Programs, 1984


Being forced to read the documentation carefully every time you call the method is really bad UX though. And the approach in the article (let ms = 5; sleep(ms);) can be wrong. Maybe the method is actually taking microseconds not milliseconds and this causes a bug. I can also imagine linters (or other humans) fighting this style, since it’s longer, and something you have to actively do every time you call the method.

Putting the units in the method name (when you can) fixes all of this. Sleep_milliseconds(5) is impossible to misuse or leave undocumented.


I'm sorry, but if a language forced me to write str.len_utf8_bytes() every time I wanted the length of a string, I'd just stop using that language.


Though it might encourage some to consider str.len_extended_grapheme_clusters()


Out of interest, which length did you want?


Number of characters


What's a "character"? A codepoint? A glyph? Is "fi" a single character? How about "ß"? "Ȣ"? "IJ"?

When programmers get answers to these questions wrong, the code that they write ends up being broken for someone out there. And they do get it wrong if their native culture instills a wrong kind of "common sense" about scripts. Which is precisely why we need more stuff like str.len_utf8_bytes() - because it forces the person writing it to consider their assumptions.


> Is "fi" a single character? [...] "Ȣ"? "IJ"?

I don't know these, someone who does could properly tell you.

The implementation could fallback to "glyph" or "codepoint" (don't know what those are exactly) if it is unsure. Mostly right is better than nearly always wrong (= returning number of bytes) IMHO.

> How about "ß"?

That's one character.

> When programmers get answers to these questions wrong, the code that they write ends up being broken for someone out there.

So? Returning the number of bytes would also be wrong.

Renaming len to len_glyphs will just result in most programmers typing more and some accidentally using len_utf8_bytes (returns the number of bytes?) where len_glyphs would be less wrong.


Yep. I think about this stuff a lot and I still mess this up all the time. I’ve come to realise that string.length in javascript is a footgun. Basically every time I use it, my code is wrong when I test with non-ascii characters.

I’ve spent the last decade writing javascript and I’ve never once actually cared to know what half the UTF16 byte length of a string is.

The only legitimate uses are internal in javascript (since other parts of JS make that quantity meaningful). Like using it in slice() or iterating - though for the latter we now have Unicode string iteration anyway.
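Python's len counts code points rather than JavaScript's UTF-16 code units, but the same footgun appears the moment bytes or grapheme clusters matter. A sketch of the three common answers for one emoji:

```python
flag = "🇺🇸"  # a regional-indicator pair: one visible "character"

codepoints = len(flag)                            # 2 code points
utf16_units = len(flag.encode("utf-16-le")) // 2  # 4 -- what JS .length reports
utf8_bytes = len(flag.encode("utf-8"))            # 8

print(codepoints, utf16_units, utf8_bytes)  # 2 4 8
# Counting grapheme clusters (the "1" a user sees) needs a third-party
# library such as `grapheme` or `regex`; the stdlib has no len_graphemes().
```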


The solution to this is as old as salt:

  typedef int seconds;
and

  time.sleep(seconds duration);


That doesn't really seem better, because the fact that the unit is seconds is only obvious if you happen to be looking at the function signature in the header or somesuch. This is because C typedefs are type aliases, not newtypes. If seconds were instead a struct with a single int field, or something like that, then that would help a little. To really solve this problem, you need information hiding, which C doesn't have except through questionable hacks.


I once went in to clean up a project that was designed by a committee of people spread all over the world. The unit was large moving equipment where, if something went wrong, people might die. The unit was composed of several different CPU modules communicating on a proprietary bus. Each module's software was written by a different group in a different part of the world.

The operator's requested speed was input in Feet Per Minute. The output to a Variable Frequency Drive was in tenths of Hertz. The tachometer feedback was in RPM, and to top it off all the internal calculations were done in Radians-Per-Second.

The first thing I did to get the project back on track was to adopt a standardized variable naming convention that included the units. For example the Operator Request became operator_request_fpm_u16. You then knew immediately you were dealing with Feet Per Minutes, and that it was a 16 bit unsigned variable.

After the variable name cleanup many of the bugs became self documented, when you saw something like "operator_request_fpm_u16 / vfd_hz_s32" in the code, you knew there was a problem that needed to be fixed...


This is Hungarian notation. A combination of both flavors into one. https://en.wikipedia.org/wiki/Hungarian_notation.

My take is that Hungarian notation exists to work around a deficiency in tooling. I think this is clearer when encoding e.g. u16 into the identifier, since that information is redundant (the declaration or schema already encodes that it is u16).


Worth mentioning Frink.

> Frink is a practical calculating tool and programming language designed to make physical calculations simple, to help ensure that answers come out right, and to make a tool that's really useful in the real world. It tracks units of measure (feet, meters, kilograms, watts, etc.) through all calculations..

https://frinklang.org/


Frink's units definition file is a great read: https://futureboy.us/frinkdata/units.txt

I use Frink very regularly as a calculator. I've got a keybinding in emacs to bring it up in comint-mode, which works very well.

It's also a general-purpose programming language, at least in theory. I actually tried using it for that purpose once, with I think a few thousand lines of Frink code in total. It was not a pleasant experience. It's fine if you want to write a short script that's dimensionally aware, but for modelling of a complex physical system there are much better tools, such as Modelica.


A really great read. I found particularly interesting the long comment about the Hertz inconsistency, I had no idea the S.I. Hertz, when applied to circular motion, was not a full circle per second.


I typically stick to "SI" units.

Then, somebody asks me to code in the temperature of a system. And I have to think: "Is now really the time that I want to teach people the difference between kelvin and celsius?"

So my rule becomes, SI "except" temperature. Sigh...


We also strictly stick to SI however we usually say kilograms in var names to be clear.

Haven't come across temperature however we would probably stick with kelvin.

We use a strict set of units in databases and while processing, conversions are localized if necessary only at the view layer.

We also only use UTC for date/times.

We only use E164 format (without spaces etc) for phone numbers: e.g. +12345678901 for an example number in OH, US. see National format https://libphonenumber.appspot.com/phonenumberparser?number=...

We only use iso3166-1 country codes and iso3166-2 region codes and translate on view.


This, one million times this. Use SI units. Don't measure distance in hotdogs, time in fortnights and speed in hotdogs per fortnight! It is as stupid as it sounds.

If you do, be explicit about it either in the parameter or function name. I'm not going to put you on my shitlist if you name your function `microsleep`, but if I have to go look into the implementation to see that you count your database timeout in microseconds (looking at you, Couchbase, like you ever could return something from a larger dataset in microseconds, lol) or, even worse, cache expiry time in minutes (hello, unknown developer), I am going to go on the internet and complain about you.


This is actually an issue with thermal imaging cameras. Typically you'll get calibrated readings back in Kelvin, not in Celsius. Usually it's well documented by the camera manufacturer, but if you're providing an API to users you need to make them aware what the units are and make a decision on what you're going to return. For example this crops up if the sensor only provides counts which you need to convert into a temperature.

From a hardware perspective it makes sense to use K because you can encode the image directly using unsigned numbers plus some gain to allow for fractional measurements.


But you're still including the units in your identifier names (or encoded in type system), right?


No. Typically SI is implicit. Everything else is explicit.


SI doesn't prescribe that you have to use a single unit for all measurements. Are distance in meters or kilometers? Weights in kilograms or grams?

I assume you always just use the base units? kg, m, s, etc.? (I always think it odd that the kilogram is the base.) I feel like it could get unwieldy for some applications at different scales, where milligrams, millimeters, kilometers, days, etc. could be clearer. And even if you use "standard" units, if you aren't clear about what standard you use and what that makes the units, people won't always guess the correct option.


which is great when you're writing from scratch, but as soon as you have to start calling a library with functions based in non-SI units then you've got some ambiguity.


There should be an exception in every code standard that says SI units are OK where otherwise all lowercase is enforced, for example to distinguish mega (M) from milli (m).


MKS or CGS?


CGS is so the sixties!


I'm confused, because one delta degree Celsius is exactly the same as one delta degree Kelvin. And you can convert with an offset.


In thermodynamic calculations you likely need absolute (Kelvin) values. But in many calculations where only temperature difference is used, either unit works equally well.


That sounds like a case for Kelvin. I still don't see why Celsius is the one exception that isn't SI.


I'll take a survey tomorrow to see how many people know what the kelvin-to-celsius conversion is off the top of their head.

The other issue is when you get to Candelas!


How often are you doing the conversion manually? It seems like the sort of thing that should happen in the presentation layer so people never see the actual Kelvin amount. If you have a system where you always use SI then it's strange the have a single exception for temperature.


Hm, is it wrong to dream of a world where this, along with other basic science, would be considered basic knowledge?

Not blaming anyone who does not know it, but I would argue for more and better science education ..


in the genre of naming things, some related things to explore:

- avoiding naming things if you can help it

- using tooling to autogenerate names

- avoiding too short, likely overloaded names like `id`, `name`, and `url`

- don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

- encoding types to make Wrong Code Look Wrong, a famous Spolsky opinion (lite hungarian notation)

- coming up with grammars for naming, eg React had an exercise to name its lifecycles combinations of THING-VERB-ACTION, like `componentDidMount`, before open sourcing, which helped learnability

pulled from my collection of Naming Opinions here: https://www.swyx.io/how-to-name-things


>- don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

Isn't this just extra noise? If the type of the variable is an Array wouldn't `nameArray` be superfluous? Worse still is if the type changes but the name stays the same.

I get the advice is probably Javascript specific, but even in a Typescript world it doesn't make much sense to me to do this kind of type-in-name encoding.


Even in typed languages `names` and `name` are too similar to slow code reading down.


Exactly this. In many languages the compiler will help you.

But this has bitten me in Ruby, JavaScript and PHP several times. Runtime errors and downtime. Most recent: autocompleted some updatedCartsItems when it had to be UpdatedCartItems. Both were used in the same class. Had they been named sensibly, like CartListWithUpdatedItems and UpdatedItemList or something better, I'd have saved myself hours of WTFing through CI logs.


Disagree, but in that case wouldn't `nameList` and `name` be best?


> avoiding too short, likely overloaded names like `id`, `name`, and `url`

In a small enough context, no way :)

> don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

This just seems redundant and is a personal annoyance. Is plural meaning list or sequence not widely understood enough?


Next step, please don't call your variables ..._percent but interpret 0.5 as meaning 50% of the reference value.


I 100% agree.

That said, "percent" is a common and precise word, and I haven't come up with anything quite as good for variables in range [0,1]. I generally use "ratio," but I don't love the word. Is there anything better?


From one perspective it is the "normalized" unit, so not giving it a name makes as much sense as trying to find a word for it: if you were going to call something ProgressPercent, the normalized float/double form is just Progress. This is probably the closest to what I try to stick to, but it can be hard, especially if you have six other things that also want to be called "Progress" that aren't normalized floats/doubles.

(Another name for the "normalized" unit is sometimes the "unit" unit. ProgressUnit sounds dumb, but is an available option in English and shorter than ProgressNormalized.)


I use "andel", but if you don't write your variable names in Danish, then it might look a little strange. "Share" would be the direct translation, but that word usually means something else in programming. In English, I use "ratio" also; in context the meaning is always clear enough.


That's actually not a terrible idea! German is known for having words for concepts other languages lack, and I guess that applies to Danish by extension. Maybe I should ditch the thesaurus and teach myself German.


Maybe "fraction" would be suitable.


Bad naming is a universe of programming flaws, much of which boils down to not thinking about how a different person might read your code the first time.

Unfortunately "Theory of Mind" is weak in people on the spectrum.


How do you mean?


Suppose our application deals with classifying cheese, and has a field for how much fat a given sample has, maybe it needs 20% to be classified as Grade A Ste-Madeleine-de-la-Foo. Some developers like to call this field MinimumFatContentPercent and set it to 0.2.
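The mismatch, sketched in Python (variable names hypothetical): the name says "percent" while the value is a fraction, and a comparison against a real percentage silently goes wrong.

```python
minimum_fat_content_percent = 0.2  # author meant 20%, but the name says percent

sample_fat_percent = 18.0  # a reading that genuinely is in percent

# 18.0 >= 0.2 is True, so an 18%-fat sample wrongly passes the 20% threshold
buggy_check = sample_fat_percent >= minimum_fat_content_percent

# Name the fraction as a fraction and convert explicitly at the comparison
minimum_fat_content_fraction = 0.2
fixed_check = (sample_fat_percent / 100.0) >= minimum_fat_content_fraction

print(buggy_check, fixed_check)  # True False
```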


Not GP, but 0.5% == 0.005, not 0.5.


Or join the ranks of your peers who have seen the light and use types.

Documentation should always be secondary to an obvious, descriptive interface. We've evolved beyond register positions and phonebook-style paper documentation. Use the tools available to you.


That would be the second paragraph in the article


The article addresses this somewhat using Python's type hints. If I give you this signature, can you intuitively tell me what unit is being taken in by the function?

    void addPadding(int height) {

    }
Types are useful, and even as a Python programmer I gravitate towards type hints for all new code, but any application where a programmatic type maps to a real-world type with common conversions (like, say, pixels, em, en, millimetres, centimetres, or inches in the above example) tends to be a victim of the same issue, where the type system isn't expressive enough to clearly describe the real-world type being assumed by the function.


That's because you're not using the type system at all in your example. Change it to "addSomething(Length height)", and have constructors for Length.ofCm(2), Length.ofMeters(5), etc.
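A minimal Python sketch of this pattern (names hypothetical): one Length type with unit-named constructors and a canonical internal unit, so call sites read like the units.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Length:
    meters: float  # canonical internal unit

    @classmethod
    def of_cm(cls, cm: float) -> "Length":
        return cls(float(cm) / 100.0)

    @classmethod
    def of_meters(cls, m: float) -> "Length":
        return cls(float(m))

def add_padding(height: Length) -> None:
    print(f"padding: {height.meters} m")

add_padding(Length.of_cm(2))      # padding: 0.02 m
add_padding(Length.of_meters(5))  # padding: 5.0 m
```

The conversion math lives in one place, and a bare `add_padding(2)` no longer type-checks.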


Sure, but this isn't a solution that's unique to statically-typed languages, and the syntax can get more awkward if you use a language without good OO support.

I also haven't come across a thorough implementation of unit conversions to do stuff like this (not that it's not a pattern that works -- I personally like this); it might be common in domains where dealing with real units is central to the business, but in industries where that's done incidentally, convenience classes like this just aren't around (in my experience).


the article mentions strong types


I can understand the reasoning in this specific case. But having every function designate its unit is a new Hungarian notation. Imagine working with:

setSpeedInKilometersPerSecond vs. setSpeed

I'd rather have it baked into the language ala CSS ("100ms").


First, I'd say that you probably want to invest in your IDE or editor.

But more important: there's middle ground. What about setSpeedMmps() if setSpeedInMillimeterPerSecond is too noisy?

My rule of thumb is that the unit or type designation should not overshout the most important info. Which, in the case of setSpeedInMillimeterPerSecond, is arguably the case. But the solution is almost never binary; there is usually a more succinct version. Everyone understands km, mm, sec, hr (though the win with such is silly), px and so on.


> What about setSpeedMmps

Mmps would be confusing and frustrating the first time, I think.


Do they charge you by written bytes? I really don't see the problem with larger function names...


"It's annoying" is a pretty good reason. If you don't mind function names like that, go become a Java Enterprise programmer, see how you like it after a few years.

In addition: it's the wrong solution to the problem. The units shouldn't be baked in to the function definition, just the type of quantity. You should be able to do both:

    setSpeed(10 m/s)
and

    setSpeed(45 km/h)
Which you can do in many languages. In fact, time functions in C++ work exactly like this. If you want to suspend a thread for 10 seconds you do

    std::this_thread::sleep_for(10s);
but if you want to suspend it for 5 milliseconds, you do

    std::this_thread::sleep_for(5ms);
And if you just supply a unitless number, it's a compiler error.


You are missing one crucial element here: `setSpeed(meterPerSecond: float)`


Are you coding without any form of auto completion?


As silly as this may sound, you may run into issues with certain linters or formatting tools in some languages.

A prominent example is PEP8 in Python, which suggests a line length limit of 79 characters (though 80-100 chars are considered to be "OK" as well). Long variable or function names can lead to distracting "forced" line breaks in such scenarios.


Is anyone these days? That said I, personally, do prefer more concise naming.


IMO it's about readability, not the typing that can be automated.


For C++ you can use boost units. It is basically built up of template magic which only compiles when your units are correct. The disadvantage: when it doesn't, it explodes in a huge template error. But when it does, you can be pretty certain (depending on the actual calculation) that your code is correct. It is also fully verified at compile time, so there is no runtime overhead.

https://www.boost.org/doc/libs/1_78_0/doc/html/boost_units.h...


I prefer using timedelta (or TimeSpan for the .NET crowd) but I’ve run into a somewhat funny resulting problem: often these time values are read in from a config file, so now people need to know/remember the serialized time span format. Like what is “00:50”? Is it fifty minutes? Seconds?


Sure, we got used to the format. I have a gist that among other things lists all of the cases, including the harder ones such as "10ms", or "3 days".

FWIW, in "TimeSpan for the .NET crowd", you would write "00:00:50" for (zero hours, zero minutes and) fifty seconds. Hours, minutes and seconds cases are easy now that we know the format.

Yes, you can omit some of the zeros, but your comment above makes the case that for clarity, you should not.


That's why you use a constructor with the unit names, not a conversion from a unitless string.


You don't have to use the default serialization format. You can have `timeout_seconds = «int»` or `timeout = «int»s` in your config file, store it internally as timedelta, and convert when you read the file.
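One way to sketch that in Python (config key names hypothetical): the key carries the unit, and the program converts to timedelta at the boundary so nothing downstream ever sees a bare number.

```python
from datetime import timedelta

config = {"timeout_seconds": 50}  # unit lives in the key, value is a plain int

def read_timeout(cfg: dict) -> timedelta:
    """Convert at the boundary; everything downstream gets a timedelta."""
    return timedelta(seconds=cfg["timeout_seconds"])

timeout = read_timeout(config)
print(timeout)                  # 0:00:50 -- unambiguously fifty seconds
print(timeout.total_seconds())  # 50.0
```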


Seems to negate the motivation for using a timespan datatype in the first place.


Along these lines: if you are designing an API for pricing data or fintech and think “let’s send things in integer cents of the currency so there aren’t any rounding errors” - setting aside that there are many situations where this is not appropriate, if you do still want to do this, for the love of Pete please name your variables and attributes with `_cents` in the names. No exceptions. It is the biggest unit-related footgun imaginable, and since it’s not a “unit” technically, it’s easy to overlook.
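A sketch of that naming discipline in Python (variable and helper names hypothetical): every integer amount carries `_cents`, and the only dollar/cent conversions live in clearly named helpers.

```python
def dollars_to_cents(dollars: float) -> int:
    # round, don't truncate: 19.99 * 100 is 1998.9999... in binary floats
    return round(dollars * 100)

def cents_to_dollars(cents: int) -> float:
    return cents / 100.0

price_cents = dollars_to_cents(19.99)   # 1999
tax_cents = 160
total_cents = price_cents + tax_cents   # all arithmetic stays in cents

print(total_cents)                      # 2159
print(cents_to_dollars(total_cents))    # 21.59
```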


Assuming all currencies have 2 decimal units and subunits have 1:100 ratio is another major footgun: https://en.m.wikipedia.org/wiki/ISO_4217


Absolutely agreed! And a more nuanced approach is what Stripe does, which is to take the "zero-decimal-currency" as the unit of record, be that USD cents or the non-subdivisible JPY. https://stripe.com/docs/currencies#zero-decimal

But in that case, and especially in that case, in any system that might interact with, say, a UI where a field is in decimal USD, Stripe needs to be careful about which things are price_decimal and price_zerodecimal. Unless you have a near-religious fervor that all variables not annotated with _decimal are assumed to be _zerodecimal, in such a situation one might go so far as to auto-reject any PR that does not suffix all variables explicitly. Because the alternative is hard-to-detect 100x-off bugs that occur when battle-tested systems/libraries are reused in different contexts.


As we all know, naming things is so difficult.

Most currencies do not have "Cent" as a base unit. And there are currencies without a smaller base unit, most notably japanese Yen.

The problem is that there exists no established currency-neutral name to distinguish between the two possible units, which coincide in cases such as the Yen.

I was thinking about the following:

  class AmountOfMoney
    property int FinerAmount
    property decimal CoarserAmount
In the case of yen, (FinerAmount == CoarserAmount) is always true.

Any suggestions for better wordings?


Splitting the integer and the decimal part is always annoying because now you have to deal with non-normalized quantities. An option is to just use a custom float type:

  class AmountOfMoney
    int mantissa
    int exponent # base 10


This is a misunderstanding.

If we have for example USD 10.42, the values in my implementation are:

  FinerAmount = 1042
  CoarserAmount = 10.42
  
The profile of the class was abbreviated. Of course, we also need a currency property for the general case:

  Currency = USDollar
The definition of the class for the currency in pseudo-code:

  class Iso4217CurrencyCode 
    int NumericCode
    string AlphabeticCode 
    int DecimalPlaces   
    
  static USDollar = new Iso4217CurrencyCode(840, "USD", 2);
  static Yen = new Iso4217CurrencyCode(392, "JPY", 0);
The constructor for the amount of money only uses FinerAmount, and CoarserAmount is calculated:

  CoarserAmount => Convert.ToDecimal(FinerAmount / Math.Pow(10, Currency.DecimalPlaces));
So

  foo = AmountOfMoney(FinerAmount: 1042, Currency: USDollar);
  
sets foo.CoarserAmount automatically to 10.42.

But

  bar = AmountOfMoney(FinerAmount: 1042, Currency: Yen);
  
sets bar.CoarserAmount automatically to 1042.
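For what it's worth, the same scheme as a runnable Python sketch (names translated from the pseudo-code above; Decimal keeps the derived amount exact):

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Currency:
    numeric_code: int
    alphabetic_code: str
    decimal_places: int  # ISO 4217 minor-unit exponent

USDollar = Currency(840, "USD", 2)
Yen = Currency(392, "JPY", 0)

@dataclass(frozen=True)
class AmountOfMoney:
    finer_amount: int
    currency: Currency

    @property
    def coarser_amount(self) -> Decimal:
        # Derived, never stored: scale the integer amount by the
        # currency's decimal places. Decimal avoids float rounding.
        return Decimal(self.finer_amount) / (10 ** self.currency.decimal_places)
```

So `AmountOfMoney(1042, USDollar).coarser_amount` is `Decimal('10.42')`, while with `Yen` it stays `Decimal('1042')`.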


In Python you can force the use of a parameter’s name with *

  def my_sleep(*, seconds: int):
     pass

  my_sleep(seconds=3)
This will force those who use your function to use the parameter name. A sort of documentation I suppose.


That's in the article...?


Sometimes the verbosity annoys me a little while writing it, but I do think Rust made the right decision making you write code like the following to use the standard sleep function.

  std::thread::sleep(Duration::from_millis(300));


In Ada you can do something similar:

    delay 0.3;
Or with package Ada.Real_Time:

    delay To_Duration (Milliseconds (300));
Or use `delay until`:

    delay until Clock + Milliseconds (300);
`delay until` is useful in loops because `delay` sleeps at least the given duration, so you get drift. (You call Clock only once before the loop, store it in a variable, and then update it by adding the time span each iteration.)
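The same drift-free pattern can be sketched in Python using absolute deadlines (the function name is made up for illustration):

```python
import time

def run_periodically(task, period_s, iterations):
    # Like Ada's `delay until`: compute each deadline from the previous
    # one, so the task's own runtime does not accumulate as drift.
    next_deadline = time.monotonic() + period_s
    for _ in range(iterations):
        task()
        time.sleep(max(0.0, next_deadline - time.monotonic()))
        next_deadline += period_s
```

A plain `time.sleep(period_s)` at the end of each iteration would instead add the task's runtime to every period.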


In fact std::thread::sleep doesn't exist. There's sleep_for and sleep_until. Time deltas and time points are incompatible at the type level, so you have to do:

   std::this_thread::sleep_until(std::chrono::system_clock::now() + 300ms)
That's true for all functions that take timeouts (At least since C++11).

edit: s/thread/this_thread/


GP's snippet was in Rust (not that I know if it's valid Rust code).

However, in C++ std::thread is a class and it does not have a static member function sleep_until. (However, std::this_thread, which is a namespace, does have such a function).


D'oh! Thanks for the correction! Rust using std:: for its standard library does make some code snippets ambiguous!


Yeah, I agree. :-)


If you're passing around parameters, use a Duration class. Most languages have one:

* https://docs.oracle.com/javase/8/docs/api/java/time/Duration...

* https://docs.microsoft.com/en-us/dotnet/api/system.timespan?...

* https://en.cppreference.com/w/cpp/chrono/duration

If you're putting something in a config file, as the blog says - always put the unit in the key name.


If you're using Javascript or Typescript, there is an Eslint rule called "no-magic-numbers" which enforces a rule where numbers are required to be assigned to a variable: https://eslint.org/docs/rules/no-magic-numbers

You can set ones to ignore, a minimum value for the rule to apply, or require the variable to be a const.


I'm a fan of the middle ground here where you limit magic numbers to known constants and arithmetic. This means some constants don't have magic numbers, but also some inline uses don't necessarily have constants.

For instance 7, 24, 60, 1000 allowed in defining units of time, and any other intervals are defined as 5 minutes = 5 x 60 x 1000.

Still, I've had plenty of experiences with 'is this unit in seconds or milliseconds'. As I recall, Gilad Bracha at one point looked at attaching units to numbers but I think he got wrapped around the axle on automatic conversions - if you divide meters by seconds, this is a meters/second unit. But do you convert to newtons or joules automatically too?


60 minutes or 60 seconds?


This is more about units.

int length = 5

Is quite different than:

int length_in_meters = 5


I've heard of some Ada coding standards that require this in all variable names. It must have made the code awfully clumsy, but as the examples in OP show, it can be made nice. So the general advice is good. I'm slightly disappointed that the Haskell library didn't use a newtype, e.g. threadDelay (Microseconds 300).


Haskell has a couple libraries that would be great here; either the excellent time library, or the very powerful dimensional library with full statically typed SI units. Of course for something as core as threadDelay, a newtype is more appropriate as you say.


Another approach (used somewhat by Python, at least for time units) is use SI units or some other standard or convention, and be willing to use floating point numbers for measurements. That was considered bloat in the old days, but reasonable general purpose computers these days almost always have FPU hardware.


"Native" Ada way would be to declare units in types and require explicit casts (also, you can prevent casting to unsupported types, so no retrieving Int64 from hypothetical Duration_Nanoseconds


Most people are talking about time here, but the one I've seen that has caused the most issues in my experience is deg/rad. UI are pretty much always going to be in degrees, and trigonometric functions in radians, but if you forget to convert it's not always going to be obvious.


Yes, don't use Magic Numbers, but use named constants instead.

http://catb.org/jargon/html/M/magic-number.html

https://en.wikipedia.org/wiki/Magic_number_(programming)

    delay_ms = 300;
    [...]
    Thread.sleep(delay_ms);
Works in any programming language.


Fails in any programming language, too.

    delay_seconds = 300;
    [...]
    Thread.sleep(delay_seconds);
This kind of bug can easily surface if you need the constant for calls taking different units and can be hard to spot.


Yes; still parent solution is better than nothing if you have to deal with an existing API. At least if you make a mistake there is a chance someone else might spot it instead of wondering what the original intent was.


C++ has, in recent years, improved support for this. The best state of affairs is with time units:

    using namespace std::chrono_literals;
    auto lesson = 45min;
    auto day = 24h;
    std::cout << "one lesson is " << lesson.count() << " minutes\n"
              << "one day is " << day.count() << " hours\n";
and you can compare durations specified with different units etc. The mpusz/units library, which may go into the standard at some point, lets you do this:

    using namespace units::isq::si::references;

    // simple numeric operations
    static_assert(10 * km / 2 == 5 * km);

    // unit conversions
    static_assert(1 * h == 3600 * s);
    static_assert(1 * km + 1 * m == 1001 * m);
and also write conceptified functions which "do the right thing", e.g.:

    constexpr Speed auto avg_speed(Length auto d, Time auto t)
    {
        return d / t;
    }
this will normalize appropriately. But - the specific units used will need to be known at compile-time. Choosing units at run-time is a different kettle of fish.


I recently worked on some disk file formats (.vmdk, .vhd), and I always put the units in the name, because I'm always switching byte, sectors, blocks. Same for addresses, is it a LBA on the virtual disk, or an offset in the disk image file.


When I learned Objective-C I remember being flabbergasted at all the exceedingly wordy method and argument names. At some point I read a blog post[0] that said that while this is out of the ordinary compared to most code, having long, descriptive names helps when reading code.

Some people complain that it involves too much typing or takes up too much space. In reality, we spend much more time reading code than we do writing it, so names that provide context, direction, and fore-shadowing are really useful.

I still tend to write method names and important variable names like that to this day in pretty much any language I use. IMHO, it's a great hack.

An extreme example is in the NSBitmapImageRep class [1]

[0]: https://www.cocoawithlove.com/2009/06/method-names-in-object...

[1]: https://developer.apple.com/documentation/appkit/nsbitmapima...


Autocomplete handles the "too much typing" case.


> Don’t design your config file like this:

> request_timeout = 10

> Accept one of these instead:

> request_timeout = 10s

> request_timeout_seconds = 10

What does ”s” imply? Can I write “10m” for 10 minutes, or is that 10 months? Non-standardized syntax is dangerous.

For config files, I use RFC 3339 duration syntax; i.e. “PT5M” for five minutes. It was the most standardized syntax for semi-human readable time periods which I could find.
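A minimal, stdlib-only sketch of parsing the time portion of that syntax in Python (the full grammar also allows dates, weeks, and fractional values, which this deliberately rejects):

```python
import re
from datetime import timedelta

_DURATION_RE = re.compile(
    r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?$"
)

def parse_duration(text: str) -> timedelta:
    """Parse durations like 'PT5M' or 'PT1H30M' into a timedelta."""
    match = _DURATION_RE.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(v) for name, v in match.groupdict().items() if v}
    return timedelta(**parts)
```

One nice property: an ambiguous value like "10m" fails loudly instead of being silently read as minutes or months.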


`m` is not a legal value, should be either `min` or `mo` for minutes or months respectively.

`s` is actually one of the SI base units: https://en.wikipedia.org/wiki/International_System_of_Units#...

edit: actually, month should never be used for the case described anyway... 28-31 days?


If the syntax isn’t obvious without looking at the manual, you might as well use an actual standardized syntax. For months (without context, and where you need an absolute interval), I just interpret it as 4 weeks.


I just interpret it as 4 weeks

Funny, I would interpret a month as 30 days (since the lunar cycle is slightly over 29.5 days).


If the syntax isn't obvious, you should use however much verbosity it takes to make it obvious, because people will still ignore the standard and make mistakes.


Had I seen PT5M I would have never guessed it meant five minutes, so you would need to refer to the RFC then.


Agree with the post.

For JSON, I always do { unit:XXXX, value:XXXX }. It is quite verbose, but it helps a lot when you least expect it.

I also wrote a small utility that transforms between values like { unit:'m', value:5 } and { unit:'km', value:0.005 }, which makes it super easy for me to pipe stuff from one context to another when it's needed.


Do you have a link to this utility?


Oh sorry, I haven't open sourced it, I never thought of it actually until now.

As I now realize, it could be quite useful for others.


Java has a good library that supports a variety of measurement units: javax.measure https://docs.google.com/document/d/12KhosAFriGCczBs6gwtJJDfg... http://unitsofmeasurement.github.io/unit-api/site/apidocs/ja...

The java.time API goes in the same direction, but is obviously limited to things around time.

I think units should be really expressed with the type system, because types give meaning _and_ safety. If used APIs require typed units you wouldn't even need a coding convention to put the unit in a variable name.


An often better solution is to have a type which is not an integer/string as input and have various ways to convert from/to integers in a unit showing way.

Types like e.g. Duration (for e.g. sleep), Instant (for calculating time passed based on monotonic timestamps), Distance etc.

Though at the same time, in statically typed languages, being generic over the unit (e.g. Duration&lt;TimeUnit&gt; with types Duration&lt;Seconds&gt;, Duration&lt;Hours&gt;) is often unnecessary and hinders productivity. Similarly, having unit types (e.g. passing instances of Seconds, Hours to sleep) is also often not necessary and more harmful than good. (Exceptions exist, e.g. where you need high precision and range while keeping storage constraints as small as possible and juggling units, as in some scientific or embedded applications.)


Or use an IDE? Here[0] is a screenshot of the Java example in idea, not only does it show an inlayed parameter name hint, but simply mousing over the method will show you the documentation.

[0]: https://i.imgur.com/8UcEO5N.png


This doesn't work when looking at a PR.

It also doesn't make a unit error stick out, as you have to take the time to specifically hover to check. What I mean is: consider that you've opened a mature code base, are browsing around, and scroll past:

    timeout = 60000;
I'd assume this is milliseconds, and whether I check would depend on what I'm doing and probably also my mood.

Whereas if I see:

    timeoutSeconds = 60000;
It'll draw my attention and no matter what I'm doing I'll either open a bug or just fix it.


Programs should be comprehensible as text, without machine assistance.


Preach! This is a big issue across all projects written in any programming languages.

Without this small info, you have to grep the docs or worse, grep the source code itself. There's nothing wrong with looking at the source code but you shouldn't have to if you just want to use the public interface.


Please do not, unless you are actually implementing unit conversion logic.

Pass around types encoding certain quantities. You do not want a Thread to "sleep for x units", but rather "sleep for this specific Duration", therefore pass around Durations and implement appropriate utilities `.toSeconds(Duration d)`, `.fromTime(Time t1, Time t2)`. The very moment your `.frobnicate(int time_in_seconds)` gets wrapped in `foobar(int duration)` all meaning is lost and you have no control over it.

Type systems are there to encode meaning (sometimes including possible values) behind a value - use it. And if you use highly dynamic prototyping language in production... Well, inability to encode and enforce meaning behind a value is part of the compromise.


From the article:

> Option 2: use strong types, An alternative to putting the unit in the name, is to use stronger types than integers or floats. For example, we might use a duration type


This is not exactly the same thing, but for the love of god, if you have a monitoring or logging product, make the date and time zone visible wherever time is displayed. I cannot tell you how many times I've been sent screenshots of graphs, showing some catastrophic scenario at, say, 08:43, with no idea of what time zone the graphs are using, or what day the graph is showing.


make everything UTC, people. It'll make your life way easier. Try it out.


Internally yes, this is a no brainer. However in log files that are often read directly by humans, who want to see things in local time, it can become a balance and you might need to fight some admins for it. A proper structural logging system solves this, but log files on disk and other ad-hoc diy logs is still extremely common to see.


On-disk logs should use ISO-8601 in UTC, ideally first thing on the line.

That way you can merge them from different sources and sort them to help figure out interactions and causality. There might be clock drift from different sources but that's a much smaller problem than trying to eyeball multiple different logs concurrently, or write ad-hoc scripts to enable merging.


For pure servers, what you say is easy and yes, almost exclusively, the preferred option. The tricky situations usually occur in CI, where you script together a lot of weird tools, most of them originally designed to be run by hand on a developer's workstation, where people want to see local timestamps and nobody ever had time to implement log config. Some days you can consider yourself lucky if the log files have any timestamps at all.


Especially in log files I want to see UTC. When you have servers all around the world, I do not want to figure out what timezone a specific server was running in, nor do I want to convert between timezones when comparing logs of different servers.


Just live in GMT+0. Easy!


I have sometimes put my computer in the Iceland time zone for that reason when dual-booting with Linux, since Linux prefers to run the hardware clock on UTC, and the Iceland time zone is UTC+0 with no daylight saving time. Making Windows believe it is in Iceland makes sure it does not mess up the clock. Though it is possible to set Windows to UTC too, with the same effect.


Only if literally _everything_ is UTC.

If you can’t go that far and still need local time zones in some specific part, just embrace your fate and keep and display timezones everywhere, even when it’s UTC.

TBH, I’ve never seen an organization that could 100% move everything to UTC. Especially with countries following DST and other shenanigans, and you want to quickly be able to compare events at the same local time.


Use timezones as an interface accommodation and not a storage format.

If you're going to do it in storage, keep it outside the ISO 8601 string. Chronology is the most important thing to get right; locality comes after.


Still need the timezone in the timestamp, UTC or not.


Absolutely. I support this 100%. But even if everything is in UTC, please, please make sure that it's clear that the time is UTC. That could mean an ISO 8601 timestamp with 'Z' or '+0' at the end, or a 'UTC' in the corner of your graph, or using unix timestamps.

Unless the whole world agrees to communicate times in UTC, at some point, you will share a screenshot or a log message with someone outside of your company, and they cannot assume what time zone it's in.


This is good advice, but ONLY for timestamps.

When people try to apply this to other time related info, it's always a disaster.


One especially frustrating version of this is a log or the like reporting a time as UTC but it’s not. So close.


Yeah, this also happens in code. The solutions are the same: either use a type that is unambiguous (`timestamptz` in Postgres, `datetime` in Python with `.tzinfo is not None`, `DateTimeOffset` in .NET, `UTCTime` in Haskell etc.), or if you have to use naive datetimes, put the time zone in the name and call it `datetime_utc`.
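In Python, for instance, the ambiguity can also be rejected at the boundary with a small guard (a sketch; the function name is made up):

```python
from datetime import datetime, timezone

def require_utc(dt: datetime) -> datetime:
    # Refuse naive datetimes instead of silently assuming a zone.
    if dt.tzinfo is None:
        raise ValueError("naive datetime; attach a time zone explicitly")
    return dt.astimezone(timezone.utc)

aware = datetime(2022, 3, 21, 8, 43, tzinfo=timezone.utc)
```

`require_utc(aware)` passes through unchanged, while `require_utc(datetime.now())` raises, because `datetime.now()` returns a naive value.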


If it was a Google product, it's implicitly Pacific Standard Time.


I prefer the use of Unix timestamps. Easier to filter with standard Unix tools, and if I really need formatted date+time in my output, I can always use awk.


I used to prefer unix timestamps too. But I Saw The Light.

For one: those standard Unix tools can convert from date/time values to timestamps just as easy. Arguably easier.

And a timestamp is merely an int, so many languages, databases, APIs and so on lack ergonomic methods to convert timestamps to datetimes, but have easy ways to do the reverse: it's almost always easier to convert dates, times or datetimes to timestamp ints (or floats) than to convert an int or float back to a date, datetime or time.

A timestamp.toString() is unfamiliar to humans, a datetime.toString() is not. The list of small benefits in favor of actual date or time types just goes on.
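In Python, at least, both directions are one call each; the easy-to-forget part is passing an explicit tz on the way back:

```python
from datetime import datetime, timezone

dt = datetime(2022, 3, 21, 12, 0, tzinfo=timezone.utc)
ts = dt.timestamp()                                  # aware datetime -> Unix seconds
back = datetime.fromtimestamp(ts, tz=timezone.utc)   # Unix seconds -> aware datetime
```

Without `tz=...`, `fromtimestamp` returns a naive local-time value, which reintroduces exactly the ambiguity being discussed.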


I don't like it. Say I open a database table and search for some events, then look at the created_at column and see a timestamp; now I need to copy-paste timestamps into a webpage and convert them to human-readable time, only to notice the event is 5 years old and not what I am looking for.

What do you do in this case (assuming you've worked with database logs)? Export to CSV and then do it with Unix tools?


Depending on the SQL dialect, you can convert it in SQL, or even add a virtual column with the conversion for your convenience.


This usually bites me when I see a function that takes a parameter named angle. I usually have to read the docs or an inline comment to know whether it's in radians or degrees. In the worst case, I have to go read the implementation to figure it out.


C++ gets this right for time durations, and has packages that do it for other units as well.



Yes, the first being part of the standard and the abseil one a 3P implementation.

Boost units add all sorts of dimensional analysis.


user defined literals

using namespace std::chrono;

std::this_thread::sleep_for(30s);


In a lot of software you can enter units into input fields. For example in Blender I can move an object in a direction by entering something like: `(1m+5inch)/2`. Or I can set the max render time to `20sec` or `2min`.

When the input box is a function, the only unit you need to know is 'distance'. So I don't completely agree with the article. A sleep function does not need the unit in its name but should accept a domain unit: a duration. The function itself can convert it to something it can handle.

`sleep(2s+5ms)` should just sleep 2 seconds and 5 milliseconds.
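A sketch of that kind of sleep in Python, where the domain type carries the units and the function converts internally (the function name is made up):

```python
import time
from datetime import timedelta

def sleep_for(duration: timedelta) -> None:
    # The caller spells out the units; the function converts once, internally.
    time.sleep(duration.total_seconds())

# "sleep 2 seconds and 5 milliseconds":
sleep_for(timedelta(seconds=2, milliseconds=5))
```

The call site stays readable and the ambiguity of a bare `sleep(2005)` never arises.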


Good C++ library for that topic is [0]. You can even go further and combine with something like [1] which is super helpful for kalman filters and other stuff where you have heterogeneous units in one vector.

[0] https://github.com/mpusz/units

[1] "Daniel Withopf - Physical Units for Matrices. How hard can it be? - Meeting C++ 2021" https://m.youtube.com/watch?v=4LmMwhM8ODI


but how can we keep the code readable even for people who haven’t encountered time.sleep before?

Contrarian opinion: You don't. You make people look up and internalise such information; that way they'll actually learn, as otherwise they'll forever stay within the realm of "beginner". Perhaps that's a goal for those who want to make programmers fungible (and I suspect a lot of the "readability" movement is merely an extension of that), but I don't think that's something we should encourage.


How is rote memorization of the arguments the stdlib of a language the difference between a beginner and an expert? I forget those kinds of things all the time, even for the language I use 99% of the time.


How is rote memorization of the positions of the keys on a keyboard the difference between a beginner and an expert?

Or for that matter, the spelling of words and their meaning in your choice of (human) language.

We call those who can't do the latter "illiterate".

Imagine if you had to look up almost every word you speak or write in a dictionary.


Being able to touch type makes you a good typist, and being good at spelling makes you good at spelling. Those two don't make you a good programmer or writer.


People have died because airplane mechanics thought they were working with one unit instead of another. Don't be a macho tough guy who thinks you'll never slip up.


Disagree.

What you say works at a small startup. Absolutely.

My reality and lots of other people at companies with larger code bases: you need to constantly look at code you have never seen before in one of many languages used at your company using one framework or another that is out of date by years or just came around the corner and you just heard the name of for the first time.

This is the reality of many an architect or team lead or principal engineer. Of course you will say: why make everything better for them, works fine for me who I only ever work in language X with framework Y? I agree that makes sense for the you of now. Why care?

Think about the future you. The principal engineer you. The architect you. Heck even just the 9 months from now you when you have moved on to the next framework. Never mind any other changes.


You’d prefer we made our code as hard to read as possible to increase job security?


Perhaps sleep in $YOURLANG is essential/trivial, so there's an argument that you should memorize it. Of course! It's just one bit of info. But there are more APIs than sleep that take a number. Are they guaranteed to be consistent? No. What about `someVendorApi.setTTL(ttl: number)`? Is this also another thing to be looked up and internalized? What if there are hundreds of APIs that take numbers?

I would rather have my team spend time internalizing good design rather than the trivia of what units are associated with every numeric argument in every codebase.


Although there is some merit to learning by repetition and using memorization to keep core concepts top of mind, this is a very bad example of it, since it's literally just trivia. There is nothing at all fundamental about time.sleep being in seconds. People memorize it because they have to and have used it enough times.

If time.sleep took a timedelta it becomes impossible to use incorrectly with a type checker since the caller specifies their own units. There is no merit to worshiping ambiguous design.


Sure, but how does the reader of the code know if the writer of the code did internalize this information or just made a mistake?


Swift is pretty good at this. e.g.:

  try await Task.sleep(nanoseconds: 300_000_000_000)


In Swift I quite like DispatchTimeInterval's approach [1] (see Enumeration Cases). In most of my projects I end up adding a simple extension to TimeInterval to get the same behavior, which makes reasoning about time very simple, e.g.:

  Date().advanced(by: .hours(2) + .seconds(10))
I'm honestly not sure why TimeInterval doesn't include this representation by default.

1. https://developer.apple.com/documentation/dispatch/dispatcht...


or just have multi-dispatch, Julia:

  julia> using Dates

  julia> Millisecond(10)
  10 milliseconds

  julia> sleep(Microsecond(100))

  julia> sleep(Millisecond(10))


Maybe this is just me, but this suggestion seems worse than the problem:

  def frobnicate(timeout: timedelta) -> None:
      ...

  timeout = timedelta(seconds=300)
  frobnicate(timeout)
Now instead of remembering that frobnicate takes an argument of seconds, one now needs to remember it takes a completely different type, the constructor for which now also needs to be memorized. When the main problem presented is one of memorization, this seems obviously worse?


The problem is one of readability, not memorization. A lot of readability amounts to taking what the initial author already knows and expressing it so that it's easy for everybody who comes after to know too.

Making implicit units explicit is a great example of that. Look at Martin Fowler's writings on Money types, for example. Or look at the $125 million failure of a space probe because people used implicit units: https://www.simscale.com/blog/2017/12/nasa-mars-climate-orbi...


> the constructor for which now also needs to be memorized

This seems less like a memorisation problem and more of an IDE one. If I'm presented with a type in my IDE, I can just look through its constructors and methods to find the one I require; no need for memorisation.


I suppose that it varies depending on the project/person, but I agree that using timedelta is less intuitive than just using seconds (as is done everywhere in Python).


Your editor will just tell you the type you need, no?


I remember as a kid, my Physics teacher used to penalize us for not writing the unit in the calculations. It seemed absurdly archaic at that time, but it makes so much sense now.


Thread.sleep (the java example) is ancient, but java is at least consistent and uses millis for everything. Almost all other timing functions in java have a 2-arg setup: `.foo(5, TimeUnit.HOURS)`. Where TimeUnit is an enum that has the usual options (HOURS, MILLIS, SECONDS, etc). It doesn't go beyond 'days' (once you get to months in particular, is that 28 days, 30 days, 31 days, 30.437 days (which is the actual average)....).


PHP offers sleep[1] and usleep[2] which sleep for a period in either seconds or microseconds respectively. I don't know why every other programming language ever hasn't adopted this habit. (There's also time_nanosleep[3] which definitely could be better defined, but it's PHP, sleeping for some amount of time between a nanosecond and a microsecond is pretty hilarious when you're running on an interpreter).

1. https://www.php.net/manual/en/function.sleep.php

2. https://www.php.net/manual/en/function.usleep.php

3. https://www.php.net/manual/en/function.time-nanosleep.php


The Unified Code for Units of Measure (UCUM) is a code system intended to include all units of measures being contemporarily used in international science, engineering, and business. The purpose is to facilitate unambiguous electronic communication of quantities together with their units. The focus is on electronic communication, as opposed to communication between humans. A typical application of The Unified Code for Units of Measure are electronic data interchange (EDI) protocols, but there is nothing that prevents it from being used in other types of machine communication.

http://unitsofmeasure.org

An example of use is in the FHIR (Fast Healthcare Interoperability Resources (hl7.org/fhir)) standard as a valueset.

https://www.hl7.org/fhir/valueset-ucum-units.html

[edit to remove dupe link]


This is one of the reasons why I love F#, baked in support for units of measures. So instead of having to rename the function to include the unit I can just declare it to take the type `int<second>` and the compiler will complain if you try to pass an `int` or an `int<minute>` unless you declare an explicit conversion between the units.


Tangentially related:

Would love HN’s recommendations for blogs/writing on how to do better programming.

A lot of writing that I am exposed to is about the business of software, or engineering leadership, or architecture/tech stacks — which, to be clear, I really like reading, and find value in!

But I would love to read more thoughts on how to actually write good code on a regular basis.


Forever and always. :) If I see a numerical value that has units but doesn't in the name, I add it.

Also, I love F#'s units of measure.


PHP mess detector has a rule for this called "Magic Number Detector" (https://github.com/povils/phpmnd) which will cause an error when it encounters numbers that violate the parameters you set.


On a similar topic: instead of var names, in typed languages you can use domain-specific types for primitives.

https://overcoddicted.com/domain-specific-types-for-primitiv...


I end up writing this function in most projects I work on. Python version:

    @lru_cache(maxsize=None)  # memoize
    def seconds(time_code):
        """number of seconds defined in DNS format: '2m30s' = 2 min 30 sec = 150s. h = hour, d = day, w = week"""
        result = 0
        number = ""
        for char in time_code:
            if char.isdigit():
                number += char
            else:
                secs = {"s": 1, "m": 60, "h": 3600, "d": 24 * 3600, "w": 7 * 24 * 3600}[char]
                result += int(number) * secs
                number = ""
        return result


Passing the time unit as a second parameter is used a lot in Java APIs and seems to work well


Java's Duration is also useful for time units

    void foo(Duration duration);

    var result = foo(Duration.ofHours(2));
You can also do simple calculations easily

    var bar = Duration.ofHours(2).plusMinutes(30);


...and so the tradition of loading up variable names with meta information that should really be in the type system is passed down from Hungarian notation to present day...

It's funny how the article talks about using the type system, but the title does not.


My strong recommendation at work is: _do not_ use configuration values such as "int dataCacheTimeMins" as these are prone to error, inflexible (cannot represent 2 mins 30 seconds) and not used consistently (will be mixed in with "int dataRequestTimeoutSeconds" and "int someOtherTimeMillis" and values named _without_ explicit units).

Instead always use "TimeSpan dataCacheTime" as this is more flexible: can hold values from sub-millisecond up to multiple days, and is easier to use with framework methods that will also expect this type.

In other words, use the appropriate type instead of putting the duration unit name in the var.


Eh, it seems the author never worked for a big organization, and when he finally did, he developed a pet peeve.

Lemme tell you, every single organization that respect itself, it will tell you, during code review, that you do not use magic numbers in code. It's a red flag, so his "Do this: frobnicate(timeout_seconds=300)" will actually fail the code review.

Here is what you would do instead. Declare a constant in some constants only unit, with explanation what it does and then use that constant, like this:

//constant unit

const TIMEOUT_SECONDS = 300;//bla bla why this is 300 seconds, usually from client requirement

//code unit

frobnicate(TIMEOUT_SECONDS);


But that still using a magic number, so obviously you should do something like this:

   const TIMEOUT_SECONDS = ONE * HUNDREDS;
The definition of ONE and HUNDREDS is left as an exercise.

Snark aside, the rule zero of all coding conventions is that they should not be applied if they do not make sense in a specific case.


The "bla bla" explanation will suffice. Usually it will include a link to documentation; no need to split hairs further.


In this case, the units are in the variable name. I think that was what the author was arguing for.


Julia seems to have some unit support (via package):

http://painterqubits.github.io/Unitful.jl/stable/


Is this not what comments are for?

frobnicate(300) // Duration in seconds to do X

Sure, change the variable names or whatever as well, but ... kinda seems we already have a way to make less obvious things obvious?


Problem with this is that you now need to ensure everyone adds that comment everywhere the function is used.

If it’s in the parameter name it’s impossible to use it without doing it correctly, which is almost always preferable.


It helps. Even more so if documented with the function signature so documentation (in your editor or ide) can communicate it.

But one better than documenting comments is needing no documenting comments. Self-documenting code is certainly not always feasible, but in this case it's both possible and easy.


in the same vein, please name associative structures (hashes, maps, dictionaries, etc) as value-type-by-key-type (for example "username_by_id", rather than "username_lookup" or such). Conventions like these don't have an impact one-by-one, but rather when applied to a body of code as a whole. When one has to do one lookup by the result of another, the explicit "x-by-y" naming really helps reassure me the right values are being looked up!
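A tiny Python illustration of why the x-by-y convention pays off when lookups chain (the dict names and data are invented for the example):

```python
# value-type-by-key-type naming makes chained lookups self-checking
username_by_id = {42: "alice"}
email_by_username = {"alice": "alice@example.com"}

user_id = 42
# Reads left to right: the key you feed in matches the "_by_" suffix,
# so a mismatched lookup is visible at a glance.
email = email_by_username[username_by_id[user_id]]
```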


In Elixir you can use an unused variable pattern matching:

    Process.sleep(_milliseconds = 300)
I started using it more and more because it makes the code more readable.


There's a good old article about this from Joel Spolsky

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...

It also makes the distinction between "apps hungarian" (good, unit-types, eg: meters, seconds) and "system hungarian" (bad, storage-types: float, int)

Sadly I've never worked on a codebase that follows this.


A while back I made a units and dimensions library where I worked. Everything was built into the type system, with a required SI conversion for each unit of a certain dimension.

This did the job really well, but it was incomplete. Being able to create new types dynamically, when multiplying/dividing was something that was always missing. I guess now you could probably do this with macro programming, but back then I don’t think my language had it.


For an example of physical units done right check out the GEANT4 monte-carlo system. It pretty much "just works"

https://geant4.web.cern.ch/sites/default/files/geant4/collab...


Well, the only correct way to do it then is the way Python does it.

The SI unit is seconds. So unless you have some weird aversion to using decimals to represent smaller units, any kind of delay function should take input in seconds.

Not sure if this should be a cautionary tale about using units so much as using the correct units in the first place.

Apparently there's a Java function that lets you specify time units as a secondary input. Use that to make it unambiguous.


Well, no. There are languages whose default unit is milliseconds.

JavaScript, the foundation of the web, is one of them. Sending seconds through an API to a front-end without documenting it is just confusing and hugely ambiguous.

And milliseconds make more sense from the GUI's perspective. What would "paint the next frame after 0.016s" even mean to a reader? It's much easier to read when written as 16ms.


When dealing with vectors, it is also important to establish your frame of reference. Sometimes, though by no means always, there is a well-established convention. When it is not, however, it can get verbose to encode into identifiers. It can also be tedious, as no-one needs to be reminded of it every time they read an expression. In these cases, documentation is the best choice.


When I put together Kal (a compile to JavaScript language, now defunct), I was particular about this and used the sleep syntax:

    pause for 3 seconds


With some other options for units. https://github.com/rzimmerman/kal#asynchronous-pause


This is a very valid argument. Go has the time.Second duration type that is used often, and it helps a lot in understanding things. But things like JS/pure Ruby have just raw numbers, where you have to either check the docs or just remember that sleep takes milliseconds or seconds, etc.

It seems types are useful piece of information. Who would expect?


I have a similar nitpick with boolean arguments. How many times have you seen something like: DoSomething(true, false, false)? It's much more readable if you use enums (in a language that doesn't let you specify the argument name at the call site). E.G. DoSomething(ENABLE_REPORTING, USE_UTC, OUTPUT_VERBOSE).
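A hedged Python sketch of the same idea using `enum.Flag` (the `do_something` function and `Options` class are invented; the flag names come from the comment above):

```python
from enum import Flag, auto

# Named flags instead of bare booleans at the call site
class Options(Flag):
    NONE = 0
    ENABLE_REPORTING = auto()
    USE_UTC = auto()
    OUTPUT_VERBOSE = auto()

def do_something(options: Options) -> bool:
    # returns whether UTC mode was requested, just to have an observable effect
    return bool(options & Options.USE_UTC)

# DoSomething(true, false, false) becomes self-describing:
result = do_something(Options.ENABLE_REPORTING | Options.OUTPUT_VERBOSE)
```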


I prefer to do this with types versus variable/parameter names where possible, but yep runtime names otherwise. It annoys some people but the thing is almost always either autosuggested or showing up in your editor as you type as docs, so… to quote Mitch Hedberg, “we apologize for the convenience”.


And, please, also attach units to numeric input fields in the UI so users know what they're entering.


I wrote about this a long time ago too: https://explog.in/notes/units.html. Units are critical, particularly when crossing language/engineer boundaries.


I thought this was in reference to the HNews post from WashingtonPost about the Eastern Antarctic being “70 degrees” warmer.

https://news.ycombinator.com/item?id=30733387


Does anyone else like to put data types in their variable names? Sometimes in finance I come across dollars and cents presented as an int and sometimes as decimal(18,2). So I don't confuse them I do something like payment_amt_d1802 or payment_amt_int.
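For illustration, a Python sketch of keeping the two representations distinct and making the conversion explicit (variable and function names are invented; `Decimal` plays the role of decimal(18,2)):

```python
from decimal import Decimal, ROUND_HALF_UP

payment_amt_int = 1999              # cents as an integer
payment_amt_dec = Decimal("19.99")  # dollars, decimal(18,2)-style

def cents_to_decimal(cents: int) -> Decimal:
    """Make the int-cents -> two-place-decimal conversion explicit."""
    return (Decimal(cents) / 100).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```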


Though you don't actually need to enforce keyword arguments, any good IDE will give you the (positional) arg names if you mouseover or w/e, so as long as the positional arg is named `timeout_ms` instead of `timeout` it should still be fine.


I think the problem is a bit more nuanced, but I don't have a perfect answer.

This is fine, but relying on an IDE for code reviews really sucks. I'd much prefer to be able to do it on GitHub. And it sucks for any language which infers types :( Scala in particular is awful for reviewing on GitHub


IDE hints like this are useful when you're writing the code, but not so useful when you're reviewing it in online code review tools.

I wish code review tools could provide this hover functionality too.



dimensioned[1] got me interested in Rust. I’m not far enough to recommend it, but the concept seems right.

[1] https://github.com/paholg/dimensioned


Unconventional take: just take the SI base unit any time. For time it's 1 second.


This is good scouting. In Java there is a naming convention, <name>In<Unit>, like TTLInMillis. In C# you often use TimeSpans, like TimeSpan.FromSeconds(123). However, I see this problem more and more often for some reason.


I ended up building a library for elixir a few years back for this kind of thing: https://github.com/meadsteve/unit_fun.


Probably already suggested but I'm a big fan of .Net attributes or other such decorators. A system level Units decorator which was respected in argument passing and maybe even automatically printed would be dope.


Ruby has a very neat syntax that can solve that. For instance, you'd just:

    sleep 5.seconds
or

    temperature = 0.17.kelvin
Not sure that last abuse actually works, but one may try. :-P


Ha. I learned this decades ago. It is handled at the level of my spinal cord.

Same thing with explicit memory allocation. If for whatever weird reason I need to explicitly allocate RAM I first write deallocation.


This is something that I've done forever.

For example, when I create a static value to hold a constant, I usually do it like so:

    static private let _maximumButtonHeightInDisplayUnits = CGFloat(30)


Is there a standard for Durations, like we have for Time? (Like RFC3339)


ISO 8601 specifies durations [1]. For example, Java's Duration type [2] is based upon it.

Depending on your language, and how much type safety you want, you could use something like Haskell's units library [3].

I believe F# also has units [4], possibly built in to the language? (I've never used F#.)

[1] https://en.wikipedia.org/wiki/ISO_8601#Durations

[2] https://docs.oracle.com/en/java/javase/11/docs/api/java.base...

[3] https://hackage.haskell.org/package/units

[4] https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...
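For illustration, a minimal Python parser for a small subset of ISO 8601 durations (time components only, e.g. "PT2M30S"; a sketch, nowhere near a full implementation of the standard):

```python
import re
from datetime import timedelta

# Handles only the PT...H...M...S time part of ISO 8601 durations
_DURATION_RE = re.compile(r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?$")

def parse_iso_duration(text: str) -> timedelta:
    match = _DURATION_RE.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v}
    return timedelta(
        hours=parts.get("h", 0),
        minutes=parts.get("m", 0),
        seconds=parts.get("s", 0),
    )
```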


Ahh the cycle continues. We have lost touch with the lore of ages past, and only the old crones know of the Hungarian notation [1]. As we discover the ruins of old we glean their knowledge, and take up their practices because they were from a more civil time. What we don't recognize is that the creators of old didn't solve these problems and they were always in competition with themselves over which path is righteous. This ultimately led to their downfall, and the ruins you have discovered.

Take heed young adventurer: This balm will not solve all your ills. It will be your hard work, compassion, and perseverance that will bring the new golden age and sustain it.

TLDR: We've done this before; it solves some problems and causes others.

1: https://en.wikipedia.org/wiki/Hungarian_notation


*cough* Golang *cough*

    time.Sleep(42 * time.Millisecond)
    time.Sleep(42 * time.Second)
    time.Sleep(42 * time.Nanosecond)
    time.Sleep(42 * time.Hour)


The trouble with Go is that you can't multiply ints by Durations, leading to people doing `time.Sleep(time.Duration(x) * time.Second)`, which is fine by itself, but then I often see people refactor and accidentally remove the second part.

The fact you can go from int to duration without specifying the unit does not give you the strong benefits. It's better than nothing though


> The fact you can go from int to duration without specifying the unit

The unit that is converted in, is in the first sentence of the two-sentence documentation of time.Duration():

    A Duration represents the elapsed time between two instants as an int64 nanosecond count. 
https://pkg.go.dev/time@go1.18#Duration


Yes, in the documentation. But if I'm reading the original code and refactoring, having the docs open might not be my top priority


    time.Sleep(time.Second * time.Second)


So? Non-canonical code can be written in any language. The important thing is: there is a canonical way in the stdlib to specify durations that is easy for the programmer to read.


Ok, I'll bite. Why does time.sleep(secs) not accept keyword arguments, but

    def foo(x):
        print(x)
accept `foo(x='I accept keyword arguments!')`


time.sleep is implemented in C. For Python, adding keyword arguments to functions implemented in C takes a not-insignificant amount of additional boilerplate, so it's often not done.


Ah, of course. Posting before my morning coffee. (:


Julia supports zero-overhead units:

https://github.com/PainterQubits/Unitful.jl



If you’re old you call this the Whole Value pattern.

http://fit.c2.com/wiki.cgi?WholeValue


Naming the method sleepMillis or sleepSeconds would help too.


In my opinion that's too specific as you'd need a dozen methods just to accomplish the common needs. Why not use a type as the argument that's explicit and meets even more use cases? Meaning instead of accepting an int just accept a `timedelta` in python or a `time.Duration` in go, or `2.seconds` (I don't recall the type) in rails?
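A sketch of that duration-typed approach in Python (the wrapper name `sleep_for` is invented for the example):

```python
import time
from datetime import timedelta

def sleep_for(duration: timedelta) -> float:
    """Accept a duration type so call sites can't confuse units.

    Returns the number of seconds slept, purely for observability.
    """
    seconds = duration.total_seconds()
    time.sleep(seconds)
    return seconds

# The unit is explicit at the call site, no docs lookup needed:
slept = sleep_for(timedelta(milliseconds=10))
```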


Meanwhile in Java this is handled by the IDE. The IDE will insert the variable name since you are passing in just a number.

  Thread.sleep(millis: 300)


I don't think a language should require an IDE to be usable.


On a related, but not quite the same, it'd be quite cool to be able to express number literals as: 2K (for 2000), or 2K4 (for 2400), or 1Ki for 1024.
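Pending language support, a hedged Python sketch of parsing such suffixed literals, including the 2K4-style infix form (function name and suffix table are my invention):

```python
# suffix -> multiplier; longer suffixes must be matched first ("Ki" before "K")
_SUFFIXES = {"K": 1000, "M": 1000_000, "Ki": 1024, "Mi": 1024 * 1024}

def parse_suffixed(text: str) -> int:
    """Parse '2K' -> 2000, '2K4' -> 2400, '1Ki' -> 1024."""
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        head, sep, tail = text.partition(suffix)
        if sep:
            scale = _SUFFIXES[suffix]
            value = int(head) * scale
            if tail:  # digits after the suffix fill in lower places: 2K4 = 2400
                value += int(tail) * scale // 10 ** len(tail)
            return value
    return int(text)  # no suffix: a plain integer literal
```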


Discord API is guilty of one of these very examples, they send a retry-after header on 429 responses, but the thing is.. is it in Bananas? Apples? Elephants?


The retry-after header is documented as being in seconds (or a date). https://httpwg.org/specs/rfc7231.html#header.retry-after


Funny because Discord implements it as milliseconds


Unfortunately it's part of http. One thing you can do is to send both the standard Retry-After header for tools that rely on it, and a nonstandard but unambiguous Retry-After-Seconds. That makes responses self-documenting. (One downside is that you now have two numbers, and if you introduce a bug that makes them different, it will be more confusing.)
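A framework-agnostic Python sketch of that dual-header idea (the `Retry-After-Seconds` header name is the commenter's invention, not a standard, and `rate_limit_headers` is a hypothetical helper):

```python
def rate_limit_headers(retry_after_seconds: int) -> dict[str, str]:
    """Build both the standard header and an unambiguous duplicate.

    Deriving both values from one argument avoids the
    two-numbers-drifting-apart bug.
    """
    value = str(retry_after_seconds)
    return {
        "Retry-After": value,          # standard, unit fixed by RFC 7231
        "Retry-After-Seconds": value,  # nonstandard but self-documenting
    }
```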


Clicking on the link I thought, "I wonder if it starts with time.sleep"

It was a pretty safe guess, though, the number of bugs related to this is pretty epic.


C#:

    Thread.Sleep(TimeSpan.FromSeconds(3));
or with its int overloaded method:

    Thread.Sleep(millisecondsTimeout: 3000);


Or just document it. And the programmer should read documentation for the method. time.sleep() docs tells you exactly what units it uses.


Or, unless you need absolute precision, wrap all those primitives in functions that use SI units. So seconds for your sleep wrapper.


F# has units of measure built into the language.

    let time: float<s> = 0.1<s>
    let length: float<m> = 10.0<m>
    let speed: float<m/s> = length / time


My preferred way in Python is:

    time.sleep(timedelta(minutes=5).total_seconds())

Or

    time_to_sleep = timedelta(minutes=5)
    time.sleep(time_to_sleep.total_seconds())


Unless I'm confident that I remember `time.sleep` wants a number of seconds, then I could miss an obvious error there.

For example...

    time.sleep(timedelta(minutes=5).total_milliseconds())
...looks just as plausible. So what does this solve?


It communicates my intent to the reviewer and future readers of the code: 1) I mean to set the duration to 5 minutes, which is more easily parsed by humans than 300 seconds. 2) I'm passing seconds into time.sleep, which makes any mistakes I make more obvious.

I'm not saying time.sleep can't be improved but my method makes it easier to find any mistakes I've made in the future.

As a demonstration, reading your code makes it more obvious that you've passed the wrong unit into time.sleep.


Got it — so it's more about mitigating the badness when it's somebody else's API.

Seems reasonable! :)

When I encounter these sorts of methods (which is surprisingly rare these days) I tend to double-check what units the method takes and additionally add a comment above my call saying "`sleep` takes duration in seconds", just to give future readers more ways to check whether I screwed up.


One more related problem is coordinate systems for positions in any kind of robotics software or even 2d UI applications.


I disagree with this; you should infer the units from context. I don't want this Hungarian notation in my code.


Somewhat unrelated, the typography and use of color on this article are sublime. No cookie popup either. What a joy.


Also, when using Go, use the time.Duration type in your API. It makes things unambiguous but adds no overhead.


imho it's bad practice to not use SI units [0] by default.

Why would you want to enter a percentage of a minute?

[0] https://en.wikipedia.org/wiki/International_System_of_Units


It would be much better to (also) put the units int the function name. Don't just have sleep() but sleepSeconds(). This prevents the person writing code from having to look up or guess what units the parameter has, as well as a person reading it later. This is more an API design issue than an API user issue, although both approaches can be used together.
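A hedged sketch of that API style in Python (the wrapper names are invented; each bakes its unit into the name and returns the seconds slept just so the conversion is observable):

```python
import time

def sleep_seconds(seconds: float) -> float:
    """Sleep for the given number of seconds; unit is in the name."""
    time.sleep(seconds)
    return seconds

def sleep_millis(millis: float) -> float:
    """Sleep for the given number of milliseconds; unit is in the name."""
    seconds = millis / 1000.0
    time.sleep(seconds)
    return seconds
```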


But this is not really necessary if you use an IDE with integrated docs, where you see the parameters of a function on mouseover.


Is there a programming language that supports units (and calculation with units) out of the box?


Units in values > units in names.


I regularly program in Go and Python.

When I hover over time.sleep(delay) in my Python IDE the detailed pop up shows me that this sleeps for delay seconds and that delay is an int.

When I hover over time.Sleep(wait) in my Go IDE, the popup shows me that time.Sleep takes a time.Duration.

Please get better tools and stay the heck away from both my languages and coding style guidelines.


In Golang, I've seen time.Sleep(2 * time.Second). That's pretty explicit, so, it seems someone carried the whole concept forward.


Agreed, always one of my peeves.


Haven’t had a mistype or a case of confusion using the Ada language to date.


just today I wasted some debugging cycles when my problem was using "sleep(500)" vs "usleep(500 * 1000)" (500 seconds vs 500 ms).

"why isn't my code doing anything?!"


> Option 2: use strong types

This should've been Option One


> And don’t design your CLI accounting app like this: > show-transactions --minimum-amount 32

I'd argue that it's fine if the accounting app is never going to handle multiple currencies.


Always missed Pascal’s typed scalar ranges.


what about having a measure class instead. Ex:

    Length object = Length.Parse(string);

or

    Temperature object = Temperature.Parse(string);

this would be much simpler!


Rust got it correct imo.


300 * TIME_IN_SECONDS


In JS/TS I’d do:

await sleep(5 * MINUTES)

Where MINUTE and MINUTES are constants for 60 * SECOND.


three_seconds=3000

...

time.sleep(three_seconds)


Ridiculous. You're supposed to know what you're doing. He should stop coding instead and do something that fits his skills.

When it's already too hard knowing the units of the function you're supposed to know, then why does he think he's supposed to code in the first place?

Bat-shit ridiculous. Of course people with low skills will be all over this, happily embracing it, but it's still absolutely ridiculous.

If you can't even remember this, then you should not be programming in the first place!


Wouldn't it help other to learn though if it's clearer? As you have to label variables, why not include units? How does that hinder you?

Do you label your variables a, b, c,...?


Don't be ridiculous. You can't take my complaint about this and generalize it.

It's a function. You're supposed to know what the function is doing and the parameter it is accepting. Someone who can't even do that and instead requires it to be fully laid out in front of him should consider looking for a different profession.

This just replaces knowledge with looking at things. It's not helpful at all. Imagine every variable in every function ever would be written down based on its type. Insanity!

... and yes. My main variables are q,w,e,r,t (DWords) ... qq,ww,ee,rr,ss (QWords) ... etc ... all sitting neatly in a cacheline. Those I use for loops and whatever else I need them. Variables that really need naming I actually name appropriately. Variables that do not need naming I simply don't.

You'd be amazed how easy it is to read my code. It's important to understand that, just because someone can do something, doesn't mean he's good at it. This problem gets even worse when everyone just keeps trying to make it easier for people who would be better off doing something else.

No, this isn't an "elitist" perspective, it's the perspective of someone who understands that constantly "lowering the bar of entry" aka "making everything more and more accessible" makes everyone dumber in the long run, because with each and every step one removes any need to actually think and understand what he's doing.

Do you understand that?


Variables? Point-free programming is the name of the game!



