Please put units in names (ruudvanasseldonk.com)
1284 points by todsacerdoti on March 21, 2022 | hide | past | favorite | 594 comments



Built-in unit types are underrated. F# has them. Other languages should seriously consider adding them, despite the fact that feature bloat is a serious language problem and they're relatively niche: they are that useful, and I honestly don't think they would interfere much with other features.

Something like:

    unit type Meters = m
    unit type Seconds = s

    function sleep(time: Int[s]): void

    val speed = 5.4 m/s // Type = Float[m/s]
    val distance = parseFloat(prompt('enter time')) * 1 m // Convert unitless to meters just by multiplying
    val time = distance / speed // Type = Float[s]
    print("${time}") // Prints "# s"
    print("${time / 1 s} seconds") // Prints "# seconds"

    val complexUnit = 7 * 1 lbf/in2 // 7 lbf/in2
    // != 7 psi (too hard to infer) but you can write a converter function
    function toPsi<N : Numeral>(value: N[lbf/in2]): N[psi] {
        return value * 1 psi*in2/lbf
    }
It requires extending number parsing, type parsing (if the brackets aren't already part of the grammar, e.g. in TypeScript), and extending types to support units of measurement, at least for values statically known to be subtypes of `Numeral`.

Naming variables with their units doesn't solve the issue of mis-casting and using units incorrectly, and newtypes are too inconvenient (and sometimes impossible without affecting performance) so nobody uses them. Even as a very software-focused programmer I encounter units like seconds, bytes, pixels, etc. all the time; they are almost never newtypes, and I get bugs from forgetting to convert or converting incorrectly.


Yup, I have been using branded types in TS to avoid similar issues, and I still do miss F# units of measure.

Even though branded types prevent assignment of values with invalid units, opaque types are inferior in that the compiler doesn't know the relationship between the types (as exemplified in the parent comment) and you need a slew of helper functions & casting.
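The same limitation is easy to demonstrate with Python's `typing.NewType`, a rough analogue of TS branding: the checker guards assignments, but it knows no unit algebra, so every derived unit needs a hand-written helper with an explicit cast. A minimal sketch (all names made up for illustration):

```python
from typing import NewType

# "Branded" units as NewTypes: assignment is checked statically,
# but arithmetic erases the brand.
Meters = NewType("Meters", float)
Seconds = NewType("Seconds", float)
MetersPerSecond = NewType("MetersPerSecond", float)

def speed(d: Meters, t: Seconds) -> MetersPerSecond:
    # The checker cannot infer that Meters / Seconds = MetersPerSecond;
    # we have to assert the relationship ourselves with a cast.
    return MetersPerSecond(d / t)

v = speed(Meters(10.0), Seconds(4.0))
print(v)  # 2.5 -- at runtime it is just a plain float
```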


I just realized that you could probably do a decently complex unit algebra (at least to the basic level of F#'s unit of measure support) using the unit strings as brands and Typescript's template string types to match them. I'm not sure if that's a good idea or not in practice, but after seeing Wordle done with template string types I'm more certain than ever it is possible to do some interesting things with it.


I use a variant of branding in TS for this too, but it's very much shoving semantics into the language that aren't meant to be there. Subclasses of primitive types would be semantically closer to F#, but at the expense of ever being able to treat the values as values.


In C++, units can be, and often are, implemented on top of the existing type system.

See std::duration/time_point specifically for times, and boost.units for generalized unit support.

It is implemented via (templated) wrappers, but they are very close to zero overhead if not completely free. User defined literals also allow for a very succinct syntax, but to be honest, I usually don't bother. I normally write:

    auto delay = std::chrono::seconds{3};
instead of:

    auto delay = 3s;


Using templates I once implemented logic for dealing with SI units[1]. I also included angles, which is a bit odd but useful in practice.

It is really nice to have a type system that checks your formulas. Saved me a couple of times.

[1] https://gitlab.com/roeles/zen/-/blob/master



In C++ you can use user defined literals to make the units stuff compile time checked, similar to this https://github.com/bernedom/SI for example.


My TI-89 has this. It's actually really useful, because I don't always remember the magical ways units fit together and it solves and simplifies the units for me.


Long before that, very flexible units management was seen on HP-28 then HP-48 series.


So from your example I understand that F# doesn't know how to do more complex unit conversion? Does it know N = kg m/s^2, for example? That's related to the issue I often encountered with unit types: they work great for simple use cases and give you a sense of security, but then they fail and you end up writing lots of code just to make the unit types happy (you might argue that it makes the code safer, but it can be a lot of time spent). One thing that falls over in pretty much all implementations I've encountered is log units (dBm, dBW, ...)


> Does it know N=kg m/s^2 for example?

F# does if you tell it, i.e.:

   [<Measure>] type kg
   [<Measure>] type m
   [<Measure>] type s
   [<Measure>] type N = kg * m / s^2
Although, of course, in practice you'd probably just use whatever's defined in https://fsharp.github.io/fsharp-core-docs/reference/fsharp-d...

Not for log units tho.


Can someone break this down for me? val speed = 5.4 m/s // Type = Float[m/s]

I understand that the speed variable is automatically assigned a type of Float[m/s] based on the m/s there, but I'm confused about how the units are just placed at the end of the value assignment: val speed = {value} {unit}

Is this just an F# feature, that units can be added at the end of a value assignment, and F# interprets them as units correctly?

Also, anyone know of methods for incorporating units as types in Python? It would be great to have an equivalent method as demonstrated here, where dividing types automatically creates a new type unit, and perhaps even associates with a related type, i.e: type(kg / (m * m * m)) == density


I think the syntax for expressions was extended to allow a `<unit expression>`, with or without angle brackets, at the end of any numeric expression. They're erased at compile time, so at runtime or via reflection you can't inspect the dimensions of a value. The only value of the units - which isn't insignificant - is the compile time checking.

Unfortunately the same isn't as easily done in Python:

1. You would need to reify these units as actual constants with overridden operators to track values, and some wacky (if even possible) mypy type shenanigans.

2. The types aren't erased and so would impact the performance of any critical code. F# isn't quite used in high frequency trading, but it does target numeric heavy users like those wanting to write scientific workloads, and units help improve correctness there. But if the types are computed at runtime and not erased, it would be an enormous hit to performance.

Unless you can figure out some way to get "3.0 * m" to be a plain old datatype (float) in Python while still retaining type info in mypy/pylance/pyright. Perhaps there's a way, but I'm not sure.

Edit: This Python library seems to be trying to solve for these issues! https://pint.readthedocs.io/en/0.10.1/

I'm not sure if they've actually solved the runtime performance problem, but I suppose as long as you're not doing unit assignment/conversion in a loop, it should be fine.
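To make point 1 concrete, here's a minimal sketch of reifying units with overloaded operators (deliberately not pint's actual API, just an illustration). Note that every number gets boxed in a wrapper object, which is exactly the runtime cost point 2 worries about:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # dimension exponents, e.g. (("m", 1), ("s", -1)) for m/s

    def _combine(self, other: "Quantity", sign: int) -> "Quantity":
        # Add (for *) or subtract (for /) the dimension exponents.
        exps = dict(self.dims)
        for unit, exp in other.dims:
            exps[unit] = exps.get(unit, 0) + sign * exp
        dims = tuple(sorted((u, e) for u, e in exps.items() if e != 0))
        value = self.value * other.value if sign == 1 else self.value / other.value
        return Quantity(value, dims)

    def __mul__(self, other): return self._combine(other, 1)
    def __truediv__(self, other): return self._combine(other, -1)

m = Quantity(1.0, (("m", 1),))
s = Quantity(1.0, (("s", 1),))

speed = (Quantity(5.4, ()) * m) / s
print(speed)  # Quantity(value=5.4, dims=(('m', 1), ('s', -1)))
```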


your example is a bit confusing because you mislabeled the prompt for distance in

    val distance = parseFloat(prompt('enter time')) * 1 m // Convert unitless to meters just by multiplying


See https://unyt.readthedocs.io/en/stable/ in Python. This sort of feature fits really well within Python's featureset.


In the land of JavaScript, one could implement unit types as a babel plugin.


"Could" and actually doing it are two very different things, however.


I don't see how you'd go about doing this. Babel doesn't have a type checker, and for type level features you'd need support from typescript.

So you need a TypeScript plugin; however, the TypeScript compiler API (which btw is not officially supported) is not flexible enough to support new type syntax (for things like m/s), so you'd need a fork.


babel-plugin-typecheck, babel-plugin-tcomb and babel-plugin-jsdoc-type-checker (among others) are able to do similar things, so there must be a way. And I don't see why not, since babel is "just" an AST parser/transformer.


Those plugins primarily help with runtime type validations - the static type checking is delegated to Flow/TS.

The benefits around F# units of measure that the parent talks about are primarily focused on static type checking.

If you are fine with the overhead of runtime boxing, it is relatively straightforward to have different container classes for different units and have bridge apis to convert/operate between them. A babel plugin could potentially help with operator overloading here, but this is quite far removed from F# units of measure at this point.


Is writing a static unit type checker as a plugin a daunting task, then? I've never thought it through in detail, but it would seem that unit type checking rules for basic operations like add/subtract and multiply/divide would be straightforward, and everything else could be derived from there. The laborious part is just the conversion formulae.


I used to work at a company with a database field called speed_kmph.

The documentation read "speed_kmph - This field contains the travelling speed in MILES PER HOUR - please do not be confused by the name".

Woo, great job guys.


DB field names can be really hard to change. All those Crystal Reports to update, and the sales manager's pet Access DB that the hero set up for him that you don't know about.


>DB field names can be really hard to change.

I find it really hard to accept any excuse for something like that. It's why enterprise code is so sloppy, people doing the most expedient thing rather than what is correct.


How do you write code so that it can accommodate DB field name changes without breaking SQL reports / BI?

Might be easy if you own the whole codebase and all reporting, but you are also going to break all those linked BI reports that the analysis team has written.


Breaking the report is a feature.

If you don’t, it will assume kmph instead of mph and just present the wrong data! But sure…those graphs still look pretty…


Which links back to the original point from quickthrower that database field names are really hard to change, then.

If you can’t make a change without it potentially breaking other stuff, it goes in the “not easy” category for me. Especially if you probably won’t know if stuff is broken until the change is pushed to prod.


Create a new column speed_mph. Create a trigger that makes writes to either column gets written to both (make sure to write it to avoid infinite recursion). Copy all data from speed_kmph to speed_mph. Deprecate speed_kmph. Change existing usages at your leisure. Delete speed_kmph.


You can't just copy the data, though; the name says it's off by a factor of k (i.e. 10^3), hahaha. And even switching to "speed_mph" keeps the warning/comment relevant: now it just means clarifying that "m" is for "miles" and not "meters".


Fair enough, although it sounds like it's really hard to change then, back to quickthrower's point.


Most often those reports are built on top of table views, not direct table contents. If you have an abstraction layer in between, it's not hard to change what's underneath.

But that assumes you didn't take any shortcuts.


Considering I was actually consuming the data, if there was an abstraction layer then the field name was in that.


What is "correct" is up for debate, and often depends on cost tradeoffs. Sometimes this is the "correct" thing to do, no matter how gross we think it is.


Not often. What has always surprised me is how regularly these sorts of problems are self-inflicted. A lot of people will just do what is expedient regardless of what the management position is.


This is why it's dangerous to encode the unit in the variable name: the unit will eventually change. (We had code measured in seconds that had to be changed to measure in 20ths of a second for more granularity; if the timer variable had been named "timer_seconds" it would have had to be changed everywhere.)


If you change the unit, I think the variable should be changed everywhere.

It's hard to know what assumptions code is making and by forcing you to change the code everywhere (which should be pretty easy in any modern IDE) you at least have a chance to evaluate any issues that unit change could cause.


Someone doing a half-assed refactor doesn't mean it's "dangerous" to encode a unit in a variable or function.

I do agree that generally it's better to use something like a struct that's more flexible, but doing that for every function is also quite verbose. For some functions the time requirement may never change as well.


In this specific case, it seems like creating a new property with the new data under a new name would be the safest path forward. Ensuring the units are in both property names will prevent confusion.


This sounds like a feature to me, and modern IDEs make this type of simple refactoring quite straightforward for most languages.


It's even better when the documentation gets stale too. And now this is in meters/second but the name implies km and docs specify miles.


Recently I was looking at some pcap parsing code which was storing the timestamp field into a timeval structure (that nominally has usec precision), but then treating it as nanoseconds (as pcap supports both resolutions). It made for some very confusing reads!


Java and Kotlin have a nice Duration class. So in Kotlin you can do

  delay(duration = Duration.ofMinutes(minutes = 1))
which is equivalent to

  delay(timeMillis = 60_000)
Using the optional argument names here for clarity; you don't have to of course.

Sticking with the JVM, a frequent source of confusion is the epoch. Is it in millis or in seconds? With 32-bit integers it has to be seconds (which sets you up for the year 2038 problem). However, Java has always used 64-bit longs to track the epoch in milliseconds, even back when the year 2000 problem was still a thing. Knowing which you are dealing with is kind of relevant in a lot of places, especially when interfacing with code or APIs written in other languages with more 32-bit legacy.

Things like network timeouts are usually in milliseconds. But how do you know this for sure? Specifying them in seconds means there's less risk of forgetting a 0 or something like that, so using seconds is pretty common too. You can't just blindly assume one or the other. Even if you use the Duration class above, you'd still want to put the magic number it takes as a parameter in some configuration property or constant. Those need good names too, and they should include the unit.


The Duration.ofMinutes thing doesn't address the exact same problem. Anyone can put the units on the "right side". You don't even need a fancy "Duration.ofMinutes" helper function. Clarity-wise that's not different than just putting a comment saying how long it is and what units. The problem in the article is getting the units on the "left side," which, yes, you have to put the units in the name of the function.


Ah no. The function accepts a `Duration` type which could be precisely 1 min 23 s 14 ms 10 us 8 ns. And so you don't need `delay_ms` vs. `delay_ns` or anything, because the type encodes the precise duration. If you pass a nanosecond `Duration` into `delay` it will delay ns; if you pass a millisecond `Duration` it will delay ms.


You have to enforce the units within the function, but that’s not the same as putting them in the name of the function.

For example, in Erlang, the sleep function could/should have been written to take tagged tuples: `timer:sleep({seconds, 3})`


Go has time.Duration type as well and works like this:

time.Sleep(10 * time.Millisecond)


The Go compiler is also happy with time.Sleep(1000), which the new "expert full stack" dev just PR'd.

To me Durations are a net-zero gain, because in this case they hide relevant detail (int64 nanoseconds) without enforcing usage. Compare it against what Golang could easily provide: time.SleepMilli(100), which is 40% fewer characters for my aging eyes to parse than time.Sleep(100 * time.Millisecond).

After all, in the very same package we have a unit in the name:

   t := time.Time{}
   t.UnixMilli() // 1647952024456


    timeout = timedelta(seconds=300)
    frobnicate(timeout)
Working with GCP or Azure's Python SDK is like navigating a jungle of types. Some calls return a `compute_engine_list_item` while others return a `compute_engine` type and these are difficult to inspect and reason about, because Python classes default to printing something along the lines of `<__main__.myclass at 0x7fa8864a1040>`, making heavily typed Python code quite difficult to work with.

No paradigm is going to save you from spaghetti-code, but being able to pass a list and get an integer in return, makes it very easy to reason about what you can do with these values, whereas it can be quite difficult to reason about custom types/objects.

My point is that knowing whether `frobnicate(timeout=300)` is in seconds or minutes can be just as difficult (or even more difficult) as knowing what specific object I need to instantiate and pass to `frobnicate` (in the above case a `timedelta`).
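That said, `timedelta` at least dodges the inspectability complaint: unlike an opaque SDK class, it prints readably and converts explicitly. A quick illustration:

```python
from datetime import timedelta

timeout = timedelta(seconds=300)

# Unlike <__main__.myclass at 0x7fa8864a1040>, timedelta has a useful repr
# and explicit conversions, so it is cheap to inspect and reason about.
print(repr(timeout))            # datetime.timedelta(seconds=300)
print(timeout.total_seconds())  # 300.0

def frobnicate(timeout: timedelta) -> float:
    # Hypothetical callee: the unit question disappears, because the caller
    # had to name the unit in the constructor.
    return timeout.total_seconds()
```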


> being able to pass a list and get an integer in return, makes it very easy to reason about what you can do with these values

I disagree; to paraphrase Ian Malcolm: what you can do with these values is less important than what you should do with these values. For example, we can add a distance to a currency, if they're both int or float; that doesn't mean we should.

The most obvious example of this is "stringly-typed programming", where pretty much everything is "string". Can I append a user-input string to an SQL statement string? Sure; but I shouldn't. Can I append a UserInput to an SQLStatement? Not without conversion (i.e. escaping)!


In general you're right: strong typing needs good tooling and thorough language support to be fun. Python is a bit lacking in both of these.

In this specific example I disagree; timedelta is part of the standard library and should be widely known and familiar.


Badly designed type systems should not be the measuring stick.

But in your example above, typed Python would instruct your editor and tools like mypy to flag any improper use of frobnicate. You can set up your editor to offer you a tip on what type to use as soon as you type in `frobnicate(`.

In cases where typing is not really available, I prefer to put types into APIs instead of variable names (and Python makes that great: `frobnicate(duration_as_timedelta=timeout)`).


I do not think the fact that Google APIs have terrible types is a good argument against typing in general.


When I dealt with the Azure SDK for Python, I frequently wrote small scripts that created the types in question and then called breakpoint() so that I could examine the types interactively.


Fully agree with everything in this article. Unitless sizes and intervals are definitely a code smell to me. Whenever I'm doing a code review and I see some property or variable name like "timeout" or "size", I will ask the developer to change the name to make it clear what unit it is meant to convey.

When possible, I also encourage the use of better types than simple integer values (like TimeSpan in .NET), as these further reduce ambiguity and the potential for mistakes.

This is such a simple thing to find and fix but it definitely helps in the long term.


I also agree, but would offer a third suggestion: add the unit to the function name as well, so you see this line in code: `timeoutSec(timeout)`. You can't force people to name the argument correctly, but this way anyone reading doesn't have to go digging to know the parameter's unit. Also, stuff like `timeoutSec(timeoutMs)` stands out like a sore thumb.


> I know that whenever I’m doing a code review and I see some property or variable name like “timeout” or “size” I will ask the developer to change the name to make it clear what the unit is trying to portray.

In a dynamic language, maybe.

In a static language, especially Haskell/Scala/Ocaml/F#, I hate duplicating the type.


I would go one step further and suggest that all physical quantities should either have the units in the identifier name or encoded in the type system. Meters, seconds, milliamps, bytes, blocks, sectors, pages, rpm, kPa, etc. Also it's often useful to explicitly distinguish between different unit qualifications or references. Seconds (duration) versus seconds-since-epoch, for example. Bytes versus page-aligned bytes, for another.

Having everything work this way makes it so much easier to review code for errors. Without this, as a reviewer you must either trust that the units and conversions are correct, or do some spelunking to make sure that the inputs and outputs are all in the right units.


100%. This is a baseline requirement where I work. If you don't either include the units in the identifier name, or use the type system where possible, your code is not getting merged.

The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.


> The only people I've ever met that think this is unnecessary are also the same people that ship a lot of bugs for other people to fix.

I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Asking bash coders to write "sleep 5s" instead of "sleep 5" - I doubt you'd get any objections at all.

But if you're putting a foo.bar.Duration on a getDefaultTimeoutFromEnvironment on a setConnectTimeout on a HttpRequest on a HttpRequestInitializer on a NetHttpTransport on a ProviderCredential to make a simple get request? People who've come from less ceremony-heavy languages might feel less productive, despite producing 10x the lines of code.


Well ignoring the silly stuff this would just boil down to something like:

    request = NetHttpTransport.Request()
    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
which is verbose, but at least it's fairly clear.

And yes I'm calling a class named HttpRequestInitializer silly, I don't care if some language decided it should exist.


    request.setConnectTimeout(getDefaultTimeoutFromEnvironment().seconds)
This is actually a very good example of what not to do, with the mistake being in whoever implemented setConnectTimeout()

I actually don't know this particular API, but I'm used to timeouts being in milliseconds, so that code looks wrong to me.

Much better API, and what the article is talking about, is to change this to:

    .setConnectTimeoutMilliseconds(getDefaultTimeoutFromEnvironment().seconds)
Now the mistake is obvious, and even if the original developer doesn't notice, it will stick out in a PR or even a casual glance.


That's one of the benefits of having it exposed as a type-enforced parameter. setConnectTimeout() could take a Duration, which contains the amount and the unit, and therefore wouldn't care if consumer A provided a timeout in seconds, and consumer B provided a timeout in milliseconds.


Totally agree, but then I would expect the code would be:

    request.setConnectTimeout(getDefaultTimeoutFromEnvironment())
with getDefaultTimeoutFromEnvironment() returning a Duration.

Ideally this is consistent throughout the codebase, so that anything that uses a primitive type for time is explicitly labelled, and anything using a Duration can just be called "Timeout" or whatever.


This example is nice, but if you put argument metadata in the function name you have to have one main argument, and the function name can prove cumbersome if you have 3 or 4 arguments with units, like

    .setPricePerMassInCentsPerKilogramsWithTimeoutInMilliSeconds(100,2,300)


I think you should rather do

  .setPrice(priceInCents=100, massInKg=2, timeoutInMs=300)


I'd argue there are better API patterns for this though -- keeping in mind this values code readability (and correctness) over micro-optimization:

    .setPriceInCents(100);
    .setMassInKilograms(2);
    .setTimeoutInMilliseconds(300);
or

    .calculate({
        priceInCents: 100,
        massInKilograms: 2,
        timeoutInMilliseconds: 300,
    });


Values with an intrinsic unit scale up to many units and values. If you declare the parameters of a "calculate" function as USACurrency, Weight and Duration you can write

    calculate(100cent, 2Kg, 300s)   
    calculate(0.1dollar,4.7lb/*approximate*/,5min)
    calculate(something.price(),whatever.weight(),options.getDuration("exampleTimeout"))
    calculate(USD(0.1),Kg(2),Minute(5))


All fair points. In this example I was suggesting what this might look like at the border of the application where you need to talk to some (standard) library which doesn't use the same convention.


A class named 'HttpRequestInitializer' and taking 10 lines to set a timeout on a HTTP request isn't merely hypothetical: https://developers.google.com/api-client-library/java/google... - and that's not counting any import statements.

(although getDefaultTimeoutFromEnvironment was artistic license on my part)


True, but like I said that doesn't make it not silly. Just harder to fix.

Edit: Also, overriding a class method dynamically inside a function? I usually program python these days and even I think that's wild.


It's an interface with only one method. I suspect that most of us would just use a lambda now.


I'll never quite understand why it wasn't simply a function in the first place.


Java pre-8 only had anonymous classes, there were no lambdas


And Java lambdas are still syntax sugar for one-method anonymous classes.


This is where I diverge. You just hard-coded seconds into your test. Now the tests that cover this must take seconds to finish; thankfully you were not testing hours or days!

My last shop was a Go shop and one test I think shows this off was an SMTP server and we needed to test the timeouts at different states. The test spun up half a dozen instances of the server, got each into the right smtp state and verified timeouts in under 10ms.

The environment that set the timeout would either be "timeout_ms=500" or "timeout=500ms" (or whatever). This is where you handle that :)


Not sure I fully understood your objection, but the reason I specified seconds is because I presumed the setConnectTimeout to be part of the default HTTP library, which likely doesn't adhere to the same conventions, and that it expected seconds (which seem to be the usual for http libraries as far as I can tell).

Of course if the setConnectTimeout method was part of the same application you could just pass the timeout directly, but at the boundary of your application you're still going to have to specify at some point which unit you want.


If you're testing things with timeouts it's often a good practice to mock out the system clock anyway. That allows testing your timeouts to be nearly instantaneous and also catches edge cases like the clock not moving forward, having a low resolution, moving backward, ... deterministically.


The test could simply mock the default value to something reasonable like '.1 seconds' and test that duration instead, so I don't think this is a real problem.


This is actually revealing a different problem: the system clock as an implicit dependency. YMMV depending on support of underlying libraries, but I will typically make the clock an explicit dependency of a struct/function. In Go, usually it’s a struct field `now func() time.Time` with a default value of `time.Now`.
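A rough Python equivalent of that pattern (names are illustrative): make the clock an explicit, injectable dependency, which also makes timeout tests instant and deterministic.

```python
import time

class Throttler:
    """Tracks whether a timeout has elapsed; the clock is injectable."""

    def __init__(self, timeout_seconds: float, now=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.now = now            # defaults to the real monotonic clock
        self.started = self.now()

    def timed_out(self) -> bool:
        return self.now() - self.started >= self.timeout_seconds

# In tests, swap in a fake clock and step it deterministically:
fake = iter([0.0, 0.1, 0.6])
t = Throttler(0.5, now=lambda: next(fake))
print(t.timed_out())  # False: 0.1 - 0.0 < 0.5
print(t.timed_out())  # True:  0.6 - 0.0 >= 0.5
```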


Many timeout functions take seconds as a floating point, so you could time out after 0.05 seconds (50 ms). But now the code is clear and less prone to bugs.


In Go, you don't pass a float, you pass a duration


Which is unitless, hence there is no problem. `time.Duration.Seconds()` returns a floating point.


The 80/20 approach of renaming "timeout" to "timeoutSeconds" or "timeoutMillis" is also valid. The key takeaway is to not make assumptions.


> I find feelings about this depend a lot on how much ceremony the type system involves; and just how many units and references to them there are in the system.

Ideally it'd be as simple as:

typealias Dollars = BigDecimal

typealias Cents = Long

that's valid Kotlin, but the equivalent is doable in most languages nowadays (Java being a big exception).


I recommend against that, and use proper (wrapper) types.

I don't know Kotlin, but in most languages if you alias 2 types to the same base type, for example Seconds and Minutes to Long, the compiler will happily allow you to mix all 3 of them, defeating the protection full types would bring.
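A quick Python sketch of the same hazard, and of the wrapper fix (illustrative only):

```python
# Alias-style "types": both names are just int, so mixing them type-checks fine.
SecondsAlias = int
MinutesAlias = int
mixed = SecondsAlias(3) + MinutesAlias(4)  # silently 7, the bug we want to prevent

# A wrapper type makes the same mix-up fail loudly instead.
class Seconds:
    def __init__(self, value: int):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Seconds):
            return NotImplemented  # refuses bare ints and other units
        return Seconds(self.value + other.value)

print((Seconds(3) + Seconds(4)).value)  # 7
# Seconds(3) + 4 raises TypeError instead of silently producing 7.
```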


that's correct. typealiases are the wrong solution here. the better solution would be value classes, but of course, the unit shouldn't be the type.


> typealias Dollars = BigDecimal

> typealias Cents = Long

If I ever have to deal with monetary values in a program where someone thought this was a good idea, ... well, it really won't be the worst thing I’ve ever dealt with, but, still.

(If you have dollars and cents as types in the same program, they darn well better work properly together, which is unlikely to work if they are both just aliases for basic numeric types.)


don't do this!

    typealias Cents = Long
    typealias Meters = Long
    
    Cents(3) + Meters(4) = 7L
that's exactly the thing we want to prevent.

the classes shouldn't be named for units at all; the type should represent the kind of quantity.

so instead of a class Seconds you should have a class Duration, and instead of a class Meters you should have a class Distance. that's because the units of the different types can be converted between each other.

Dollars and Cents are a bit of a bad example because currencies can't be easily converted between each other, as conversion depends on many factors that change over time. Meters, yards, feet, lightyears, miles, chains, furlongs, whatever, all describe multiples of the same thing though, so a different type for each unit isn't necessary, as the input that was used to create the instance isn't usually needed. The counterexample would be datetimes with timezones: a datetime with a timezone conveys a lot more information than the datetime converted to UTC.
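That "one type per quantity, one constructor per unit" idea might look like this in Python (names illustrative; the conversion factor is the international mile):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Distance:
    """Stores one canonical unit internally; unit names live only at the edges."""
    _meters: float

    @classmethod
    def from_meters(cls, m: float) -> "Distance":
        return cls(m)

    @classmethod
    def from_miles(cls, mi: float) -> "Distance":
        return cls(mi * 1609.344)  # 1 international mile = 1609.344 m

    @property
    def meters(self) -> float:
        return self._meters

d = Distance.from_miles(2.0)
print(d.meters)  # 3218.688
```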


What type of industry/product do you work in/on? And what sort of languages do you work in?


Not the commenter, but I work in scientific software development and it's just a minefield of different units, so being explicit is generally very useful. Even if you could assume you can stick to metric (you can't), what different users want varies across countries. E.g. here in the UK we often want to represent data as millilitres, but in France the same measurements are often centilitres.

I don't use libraries to enforce it though, we did try this but found it quite clunky: https://pint.readthedocs.io/en/stable/


It varies across different users from the same city! The same family, even!

One piece of equipment I just finished working on measured and displayed vacuum at different sensors in PSIg, kPa, Pa, mmHg, and inHg. The same machine, the same measurement at different stages in the process, five different units!


Not OP, but I see that a lot in the aerospace and heavy industry sectors.

We keep laughing about "if we engineered bridges as we engineer software" ... the truth is that the areas where correct software matters tend to write very robust code, and the rest of the industry would be well advised to take notice and copy what they see.

Of course, writing robust code is a skill, and it takes extra time.


It takes time to learn, and to learn the value, and time to agree with the team that it's sensible. With this sort of thing - proper naming of variables - I disagree that it takes longer at point of use.


> and the rest of the industry would be well advised to take notice and copy what they see

I don't agree.

There is a good reason the aerospace industry writes robust code - it invests time (money) to avoid disasters that could cause, among other things, loss of human life.

On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.


>On the other hand if for example some webform validation fails because the code was written as fast (as cheap) as possible, who cares really.

Who knows. Maybe a billion dollar company that can't fulfill orders. Maybe a million people who suddenly can't use their bank online.


Yes, sure, but a "billion dollar company" in this case does not represent the whole industry.

You can probably find some specific non-critical case in the aerospace industry, but surely based on that example one would not suggest that the whole aerospace industry should just copy what they see in frontend dev.

Context matters, there are exceptions, but the standard practices are based on average scenario, not on extremes.


I'm not saying that all code should be developed under standards designed for embedded control system software. I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.


>>> and the rest of the industry would be well advised to take notice and copy what they see

>> I don't agree.

>> [web form validation example]

>> That is just a tradeoff, in aerospace industry you spend more to lower the risk, somewhere else you don't.

> I'm just saying that "oh, it's just web stuff so it can't be important" is ridiculous.

This feels like a straw man.

The original argument is that the rest of the industry (that includes web, but a lot of other parts also) should copy what they see in aerospace industry.

I believe that would not be appropriate for the rest of the industry to just copy practices from any other part because each segment has its own risk factors and expected development costs and with that in mind developed their own standard practices. Nowhere did I state that the "web stuff can't be important" nor that there is no example of web development (form validation) where the errors are insignificant.

That said, I will go back to the "billion dollar company that can't fulfill orders" / "million people who suddenly can't use their bank online" catastrophe; this happens all the time. Billion dollar company doesn't fulfill orders, error is fixed, orders are fulfilled again, 0.0..01% of revenue is (maybe) lost.

In aerospace industry a bug is deployed, people are dead. No bugfixing will matter after that moment.

How can these two industries have the same standards of development?


Yes, as a comparison, let's just take all the units (em, px, %) out of writing CSS and see how fun that becomes to review and troubleshoot.


I'm certain that the Mars Climate Orbiter had a lot to do with this practice.


Also not OP, but I work on graphics software and we frequently deal with different units and use strict naming systems and occasionally types to differentiate them.

Even more fun is that sometimes units alone aren't sufficient. We need to know which coordinate system the units are being used in: screen, world, document, object-local, etc. It's amazing how many different coordinate systems you can come up with...

Or which time stream a timestamp comes from, input times, draw times (both of which are usually uptime values from CLOCK_MONOTONIC) or wall times.


As a bonus, coordinate systems can be left-handed or right-handed, and the axes point in different directions (relevant when loading models for example).


What does a type for ‘seconds’ do that an ‘integer’ doesn’t?

I may be misunderstanding this.


The type is "duration", not "seconds". "seconds" is the unit. You can think of the unit as an operator that converts an integer to a duration.

The advantages are:

- An integer doesn't tell you if you are talking about seconds, milliseconds or anything like that. What does sleep(500) mean? Sleep 500s or 500ms? sleep(500_ms) is explicit

- It provides an abstraction. The internal representation may be a 64-bit number of nanoseconds, or a 32-bit number of milliseconds, the code will be the same.

- Conversions can be done for you, no more "*24*60*60" littered around your code if you want to convert from seconds to days, do fun(1_day) instead.

- Safety, prevents adding seconds to meters for instance. Ideally, it should also handle dividing a distance by a duration and give you a speed and things like that.

Under the hood, it is all integers of course (or floats), which is all machine code, but handling units is one of the things a high level language can do to make life easier on the human writing the code.
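A minimal sketch of such a duration type in Python (all names hypothetical; the internal nanosecond representation is an arbitrary choice, illustrating the points above):

```python
class Duration:
    """Hypothetical duration type; internally a count of nanoseconds."""

    def __init__(self, ns: int):
        self._ns = ns

    # Explicit unit constructors: sleep(Duration.milliseconds(500)) is unambiguous.
    @classmethod
    def milliseconds(cls, n: float) -> "Duration":
        return cls(int(n * 1_000_000))

    @classmethod
    def seconds(cls, n: float) -> "Duration":
        return cls(int(n * 1_000_000_000))

    @classmethod
    def days(cls, n: float) -> "Duration":
        return cls.seconds(n * 24 * 60 * 60)  # no "*24*60*60" at call sites

    def to_seconds(self) -> float:
        return self._ns / 1_000_000_000

    def __add__(self, other: "Duration") -> "Duration":
        # Safety: Duration + Duration is fine, Duration + bare int is not.
        if not isinstance(other, Duration):
            return NotImplemented
        return Duration(self._ns + other._ns)


def sleep(time: Duration) -> None:
    """Takes a Duration, so an ambiguous sleep(500) is impossible."""


print(Duration.days(1).to_seconds())  # 86400.0
```

The internal representation could change to 32-bit milliseconds without touching any call site, which is the abstraction point above.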


A function might take multiple integer arguments, each in different units. Separate types for each unit guarantees you won't pass the wrong integer into the wrong argument.

Eg

  func transferRegularly(dollars(10000), days(30)) 
Meaning clear.

  func transferRegularly(10000, 30)
Meaning obscure, error prone and potentially costly


With some languages like Python, you can use keyword arguments too (even out of order).

Eg. you could simply do

  transferRegularly(amount=10000, period_in_days=30)
I am always amazed how new languages never pick up this most amazing feature of Python.

Though obviously, this code smells anyway because 1. repetitive transfers are usually in calendar units (eg. monthly, weekly, yearly — not all of which can be represented with an exact number of days), so in Python you'd probably pass in a timedelta and thus a distinct type anyway, and 2. amounts are usually done in decimal notation to keep adequate precision and currency rounding rules (or simply `amount_in_cents`).

Still, I am in favour of higher order types (eg. "timedelta" from Python), and use of them should be equally obligatory for unit-based stuff (eg. volume, so following the traditional rule of doing conversions only at the edges — when reading input, and ultimately printing it out).
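For example (function and parameter names hypothetical), keyword-only arguments plus a timedelta give both readability at the call site and a distinct type:

```python
from datetime import timedelta

def transfer_regularly(*, amount_in_cents: int, period: timedelta) -> str:
    # The bare * makes both arguments keyword-only: call sites must name them.
    return f"{amount_in_cents} cents every {period.days} days"

print(transfer_regularly(amount_in_cents=1_000_000, period=timedelta(weeks=4)))
# transfer_regularly(1_000_000, timedelta(weeks=4))  # TypeError: must use keywords
```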


I see keyword arguments as slightly different though. The keyword is like the parameter name. The value is still a plain integer and (theoretically) susceptible to being given the wrong integer. In contrast, unit types allow for hard checking by the compiler.

In practice, with good naming it won't make much difference and only shows up when comparing the docs (or intellisense) for an API with how it is actually used.


> transferRegularly(amount=10000, period_in_days=30)

dollars(10000) is still better than this example, because: 10000 what? Pennies? USD? EUR?


Someone else caught me out on that too by suggesting making it `amount_in_dollars` elsewhere in the thread ;)

Now you can say how there are also AUD, CAD...

The point was simply that if units are needed due to lack of specific type being used, it's nicer to have that in the API when language allows it.


PHP got this feature in PHP 8.

One problem with this approach is refactoring. If you wanted to refactor your example with the parameter "amount_in_dollars" then you would either have to continue maintaining the legacy "amount" argument, or break existing code.


So you mean just like with, eg. renaming a function? I agree it's an issue compared to not doing it, but a very, very minor one IMHO, and the legibility improvements far outweigh it.


Renaming a function comes with the explicit implication that the API has changed. But it might not be clear to someone maintaining a Python application that changing a parameter name might change an argument - that is not the case in any other language (until PHP 8).

Guess how I discovered this issue :)


Well, if the approach was more pervasive, you'd be used to it just like seasoned Python developers are. :)


If you type check, it ensures that only a 'second' can be passed to the function. This requires you to either create a second, or explicitly cast to one, making it clear what unit a function requires.

As per the article, if you don't have proper names and just an 'int', that int can represent any scale of time... seconds, days, whatever.

In Python you'd need something like mypy, but in Rust you could have the compiler ensure you are passing the right types.
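A lightweight Python version of this uses typing.NewType, which mypy checks statically at no runtime cost (Seconds(500) is just 500 at runtime):

```python
from typing import NewType

# Distinct unit types over plain int; mypy treats them as incompatible.
Seconds = NewType("Seconds", int)
Milliseconds = NewType("Milliseconds", int)

def sleep(duration: Seconds) -> None:
    ...

sleep(Seconds(500))         # OK: the unit is explicit at the call site
# sleep(500)                # mypy error: "int" is not "Seconds"
# sleep(Milliseconds(500))  # mypy error: wrong unit
```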


Having a type system to figure this out for us would be great, but there are languages where this may not be possible. As far as I know, Typescript is one such example, isn't it?


Depends. Yes, newtyping is pretty awful in TS due to its structural typing (instead of nominal like Rust for example).

You could perhaps newtype using a class (so you can instanceof) or tag via { unit: 'seconds', value: 30 } but that feels awful and seems to be against the ecosystem established practices.

This is indeed one of my gripes with TS typing. I'm spoiled by other languages, but I understand the design choice.


I was recently dealing with some React components at work, where the components would accept as an input the width or the height of the element.

Originally, the type signature was

    type Props = {height: number}
This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

I've ended up changing the component to only accept pixels and changed the argument to be a string like this:

    type Props = {height: `${number}px`}
Of course, if passing a "0" makes sense, you could also allow that. If you want to also accept, say, `50em`, you could use a union-type for that.

I think this could actually work for other units as well. Instead of having `delay(200)`, you could instead have `delay("200ms")`, and have the "ms" validated by type system.

Maybe the future will see this getting more popular:

    type WeightUnit = 'g' | 'grams' | 'kg' | ...;
    type WeightString = `${number}${WeightUnit}`;
    function registerPackage(weight: WeightString): void;


> This naturally raises the question - number of what? Does "50" mean `50px`, `50vw`, `50em`, `50rem`, `50%`?

Because it's React, the expectation is "px", using a string with a suffix to override it: https://reactjs.org/docs/dom-elements.html#style


You can do something like this:

    export interface OpaqueTag<UID> {
        readonly __TAG__: UID;
    }
    export type WeakOpaque<T, UID> = T & OpaqueTag<UID>;
And then use it like this to create a newtype:

    export type PublicKey = WeakOpaque<Uint8Array, {readonly PublicKey: unique symbol}>;
To create this newtype, you need to use unsafe casting (`as PublicKey`), but you can use a PublicKey directly in APIs where a Uint8Array is needed (thus "weak opaque").


I tried to solve this problem in a way that's reasonably pleasant here https://github.com/spion/branded-types


The pattern you want here is branding.


You can simulate type opaqueness/nominal typing with unique tag/branded types in ts. We’re using it on high stake trading platform and it works very well.


Most answers here are answering your question literally, and explaining why you'd add a type for "seconds". But in reality you shouldn't create a type for seconds, the whole premise of the question is wrong.

Instead of a type for seconds, you'd want a type for all kinds of duration regardless of the time unit, and with easy conversion to integers of a specific time unit. So your foo() function would be taking a Duration as an input rather than seconds, and work correctly no matter what unit you created that duration from:

    foo(Duration.seconds(3600))
    foo(Duration.hours(1))
    foo(Duration.hours(1) + Duration.seconds(5))
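Python's standard-library datetime.timedelta is exactly this kind of type: one duration type, constructed from whatever unit is convenient, and compared by value:

```python
from datetime import timedelta

# Equal durations, regardless of the unit used to construct them.
assert timedelta(hours=1) == timedelta(seconds=3600)

total = timedelta(hours=1) + timedelta(seconds=5)
print(total.total_seconds())  # 3605.0
```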


You can construct a linter that will prevent you from trying to add seconds to dollars.


A better example would be seconds to minutes. I think I recall a JWT-related CVE involving timestamps being misinterpreted between seconds and milliseconds, for example.


Adding seconds to minutes can actually make sense. A minute plus 30 seconds is 90 seconds, or 1.5 minutes. Whether your type system allows this, though, is up to the project.

You can't add seconds to kilometers, or to temperature, or to how blue something is.


Ideally you have a compiler that simply refuses to compile ambiguous code


Ideally, there is no compiler.


Ideally, you would detect all errors before runtime. Usually the compiler is the last gate to make that happen.


Ideally, hardware execution would happen on a language that can be proven correct and does not allow the programmer to make syntactic or semantic errors.

The world is far, far from ideal.


Isn't there an implicit conversion?


Well, your function could then accept multiple types (Milliseconds, Seconds, Minutes, Hours) and do the conversion between those implicitly.

The units are also extremely clear when they are carried by the type instead of the variable name, where a developer could e.g. change some functionality and end up with a variable called `timeout_in_ms` while the function that eats this variable might expect seconds for some reason.

If it is typed out, you can just check that the function performs the right action when passed a value of each time unit type, and then you only ever have to worry about mistakenly declaring the wrong type somewhere.

But whether you should really do all that typing depends on how central units are to what you are doing. If you have one delay somewhere, who cares if it is typed or not. If you are building a CNC system where a unit conversion error could result in death and destruction, maybe it would be worth thinking about.


Tells you what the integer represents, so you aren't off by several orders of magnitude.


What does the integer represent? Nano, milli, microseconds, seconds, minutes, hours, days?


I _really_, _really_ like F#'s 'unit of measure': https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...


This. I really miss it when working outside of F#. I work on scientific code bases with a lot of different units and have been burned by improper conversions. Even with a high automated test coverage and good naming practices, such problems can go undetected.


It's a big omission that it doesn't support fractional units. These come up in things like fracture mechanics for stress intensity.


Are you talking about ksi√in or MPa√m? That is not a problem.


F#'s units don't support 1/2 dimensions. So you can't do dimensional analysis through the type system.


God I love F#'s unit types. C# is okay and all, but F# is, IMO, the best FP-language-on-a-corporate-backed-VM ever, even if the integration with the underlying VM and interop can get a bit fiddly in places (Caveat, my opinion is about 7 years old).

Yeah, you heard me Scala.


Sadly it hardly got better, C# gets all the new .NET toys, VB less so, C++/CLI still gets some occasional love (mostly due to Forms / WPF dependencies), and then there is F#.


>Having everything work this way makes it so much easier to review code for errors.

Had one making me pull my hair out the other day in C#. C# DateTimes are measured in increments of 100 nanoseconds elapsed since January 1st 1 AD or something like that. I was trying to convert Unix time in milliseconds to a C# DateTime and didn't realize they were using different units. My fault for not reading the docs, but having it in the name would have saved me a lot of trouble.


What the heck is with MS and weird datetime precision? I figured out some bug with a third party API was due to SQL Server using 1/300th seconds. Who would even think to check for that if you’re not using their products?


1/300 seconds is an odd one for sure. In the case of DateTime however, I'd say it is designed to hit as many use cases as possible with a 64 bit data structure. Using a 1970 epoch as is (logically) used for Unix system times naturally misses even basic use cases like capturing a birthdate.

It is quite hard actually to disagree with the 100ns tick size that they did use. 1 microsecond may have also been reasonable as it would have provided a larger range, but there are not many use cases for microsecond-accurate times very far in the past or in the future. Similarly, using 1 nanosecond may have increased the applicability to higher-precision use cases but would have reduced the range to 100 years. Alternately, they could have used a 128-bit structure providing picosecond precision from big bang to solar system flame-out, with the resultant size/performance implications.


Godspeed trying to parse dates from Excel. They have bugs _intentionally built in_


Who would even think to check if the system is counting from 1970/01/01 if you're not using their products?


If they’re using ISO format I don’t really care what they’re counting from. But I care if some ISO dates are preserved exactly and some are rounded to another value… especially when that rounding is non-obvious. It took me months to even identify the pattern clearly enough to find an explanation. Up to that point we just had an arbitrary allowance that the value might vary by some unknown amount, and looked for the closest one.


This is an established standard.

It's really not any stranger than starting dates at 1 CE, or array indexes starting at 0.


> This is an established standard.

Established doesn't mean it is understandable without documentation. Anyone who is not familiar with it doesn't know why it starts in 1970 and counts seconds. You need to actually open the documentation to know about that, and its name (be it unix time, epoch, epoch time or whatever) doesn't help in understanding what it is and what unit it is using.


The metric system is also not understandable without documentation, but you don't need to explain it every time, because every living person should have gotten that documentation drilled into them at age ten.

UNIX time is an easy international standard, everybody with computer experience knows what it is and how to work with it.


> everybody with computer experience knows what it is and how to work with it.

Thanks for the laugh.

The only ones who need to know about unixtime are:

developers when they do something which takes it/produces it

*nix sysadmins

Everyone else with "computer experience" could live all their life without the need to know what unixtime is.


Yeah, that's what I meant. People who program or do sysadmin, ie anybody who will ever need to call a sleep function, should know what a unixtime is.


> anybody who will ever need to call a sleep function

Bullshit.

I never needed to know what unixtime is when I wrote anything with sleep().

All I needed to know is how long in seconds I want the execution to pause for, never ever I needed to manually calculate something from/to unixtime, even when working with datetimes types.


No, but as a person who has had a need for sleep() before, you are also the type of person who could be expected to know what unix time is.

Nobody is saying that the two things need to be connected, the point is that it can be name dropped in a type definition, and you would know what it means.


https://devblogs.microsoft.com/oldnewthing/20090306-00/?p=18...

Windows uses the Gregorian Calendar as its epoch.


I'm not sure I agree. When I convert fields to DateTime, I remove the unit suffix from the name. The DateTime is supposed to be an implementation-agnostic point in time. It shouldn't come with any units, and nor should they be exposed by the internal implementation.

The factory method used to convert e.g. Unix timestamps to DateTimes, now that should indicate whether we're talking seconds or milliseconds since epoch, for example, and when the epoch really was.


They do:

    .ToUnixTimeSeconds()
    .ToUnixTimeMilliseconds()
    DateTimeOffset.FromUnixTimeSeconds(Int64)
    DateTimeOffset.FromUnixTimeMilliseconds(Int64)

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...

https://docs.microsoft.com/en-us/dotnet/api/system.datetimeo...


How does the C# DateTime type distinguish between dates and a point in time whose time just happens to be midnight?

Much of the C# code I've seen uses names like start_date with no indication of whether it really is a date (with no timezone), a date (in one particular timezone), or a datetime where the time is significant.

I'm certainly not a C# developer, though my quick reading of the docs suggests that the DateOnly type was only introduced recently in .NET6.


Yeah, before the new DateOnly (and TimeOnly) types, there was no built-in way in C# to specify a plain date. NodaTime[1] (a popular third-party library for datetime operations) did have such types though.

[1]: https://nodatime.org/


F# has unit support in the type system :)


An example for unfamiliar folks.

Many Languages

  var lengthInFeet = 2;
  var weightInKg = 2;
  
  var sum = lengthInFeet + weightInKg; // Runs without issue but is an error
F#

  [<Measure>] type ft
  [<Measure>] type kg
  
  let lengthInFeet = 2<ft>
  let weightInKg = 2<kg>

  let sum = lengthInFeet + weightInKg // Compile time error
More info at https://fsharpforfunandprofit.com/posts/units-of-measure/


What precisely could have been changed to make you realize that C# DateTime is not the same as Unix time? Perhaps Ticks could be renamed to Ticks100ns but I'm not sure how to encode the epoch date such that it is not necessary to read any documentation. I suppose the class could have been named something like DateTimeStartingAtJan1_0001 but obviously would have been ridiculous.

Naming is an optimization problem: minimize identifier length while maximizing comprehension.


> how to encode the epoch date such that it is not necessary to read any documentation

And you need to read the docs to know why some systems use 1970 as the reference point. Should we rename it to Unix1970datetimeepoch everywhere?


The wonderful elm-units is such a pleasure to work with and does just that.

https://package.elm-lang.org/packages/ianmackenzie/elm-units...

Even if you don’t work in Elm take a moment to look at it.


Nitpicks:

Why would your type system have an encoded unit for kilopascals, but not hectopascals, megapascals, micropascals, etc.?

If you only encode base units (e.g. seconds), then we should use exact-precision arithmetic instead of f32 or f64, which is sometimes overkill.

If encoding all the prefixes (kilo/milli/mega etc.), I feel like some units may have name clashes (e.g. "Gy" -- is it giga-years, or gray)?

Should we encode only SI units, or pounds/ounces/pints as well?


(e.g. "Gy" -- is it giga-years, or gray)

In my opinion this is not a real problem, since ideally no-one should use such cryptic abbreviations in code. Just write giga_years or GigaYears or whatever your style is, problem solved, doesn't get any clearer than that.


In defence of Gy, it isn't meaningless in astronomy - it's a very well used unit. Though I do agree that it might be less common in code.


I myself can't remember seeing Gy in astronomical papers, but I've seen Ga (gigaannum).

https://en.wikipedia.org/wiki/Year#SI_prefix_multipliers



Isn’t Gy a bit much even in astronomy? With 15 Gy you have the age of the universe right?


Your comment piqued my curiosity, and I looked at https://en.m.wikipedia.org/wiki/Future_of_an_expanding_unive... and found:

"Stars are expected to form normally for 10^12 to 10^14 (1–100 trillion) years"

So it seems Gy and even Ty units will be a reasonable scale for events during the period of star formation.


Ah, that is a good point. For some reason I was thinking only backwards. I never considered that there’s orders of magnitude more time in front of us.


Does the type system handle equivalent units (dimensional analysis)? eg, N.m = kg.m^2.s^-2.

Does the type system do orientational analysis? If not, it lets you assign a value of work to a value of torque and vice versa, as they both have the above unit.

There are several other similar gotchas with the SI. I think descriptive names are better than everyone attempting to implement an incomplete/broken type system.


The Python package astropy does all these things. There's a graph of equivalencies between units.

0. https://docs.astropy.org/en/stable/units/index.html


Speaking of astropy units. I had a hilarious issue last week, which was quite hard to identify (simplified code to reproduce):

  from astropy import units as u
  a = 3 * u.s
  b = 2 * u.s
  c = 1 * u.s
  d = 4 * u.s
  m = min([a, b, c, d])
  a -= m
  b -= m
  c -= m
  d -= m
  print(a,b,c,d)
Output: 2.0 s 1.0 s 0.0 s 4.0 s

Note the last 4.0, while min value is 1.0

The issue is that a, b, c, d are objects when astropy units are applied, and min (or max) returns not the value but the object with the minimal value. Thus m is c (in this particular case c has the smallest value), so c -= m sets c, and therefore m, to 0, and d remains unchanged. It was very hard to spot, especially when values change and occasionally any one of a, b, c or d has the smallest value.

In-place augmentation of a working code with units may be very tricky and can create unexpected bugs.
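The pitfall is reproducible without astropy: any type whose augmented assignment mutates in place behaves the same way. A toy sketch (class Q is hypothetical, standing in for a unit-carrying quantity):

```python
class Q:
    """Toy mutable quantity; -= mutates in place, like ndarray/Quantity objects."""
    def __init__(self, value):
        self.value = value
    def __lt__(self, other):          # lets min() compare instances
        return self.value < other.value
    def __isub__(self, other):        # in-place subtraction
        self.value -= other.value
        return self

a, b, c, d = Q(3), Q(2), Q(1), Q(4)
m = min([a, b, c, d])  # m IS c: min returns the object, not a copy of its value
a -= m                 # 3 - 1 = 2
b -= m                 # 2 - 1 = 1
c -= m                 # 1 - 1 = 0 ... and m is now 0 too, since m is c
d -= m                 # 4 - 0 = 4: unchanged, the surprising result
print(a.value, b.value, c.value, d.value)  # 2 1 0 4

# One fix: snapshot the minimum's value before the in-place updates:
# m = Q(min([a, b, c, d]).value)
```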


This is really an issue with Python in general (specifically, mutable types).

You'd get the exact same behavior with numpy ndarrays (of which astropy Quantities are a subclass).


> This is really an issue with Python in general (specifically, mutable types).

A unit-aware value type where augmented assignment mutates in place is an odd choice though (normal mutable types do not exhibit this behavior; it's a whole separate behavior which has to be deliberately implemented). It may make sense in the expected use case (and, as you note, reflects the behavior of the underlying type), but more generally it's not what someone wanting unit-aware values would probably expect.


That sounds like a bug in astropy type definitions: did you get a chance to report it as one?

While it can sometimes be undefined behavior (a minimum of incompatible units), in cases like these it should DTRT.


Mutable types are hard.


I see dimensional analysis, but in this table[1], torque and work have the same unit, and that unit is J.

The SI itself states[2]: "...For example, the quantity torque is the cross product of a position vector and a force vector. The SI unit is newton metre. Even though torque has the same dimension as energy (SI unit joule), the joule is never used for expressing torque."

[1]:https://docs.astropy.org/en/stable/units/index.html#module-a... [2]:https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-...


This is the answer from boost units:

https://www.boost.org/doc/libs/1_78_0/doc/html/boost_units/F...

tl;dr: it uses some sort of pseudo-units to make dimensionally similar but incompatible quantities, well, incompatible.


Time units are messy.

Once we get to gigayears, what is the size of a single year? Just 365 days? Or which of Julian, Gregorian, tropical, sidereal? At even the kilo prefix the differences add up. Or would you need to specify it?

Days, weeks and months are also a fun mess to think about.


> Once we get to Gigayears, what is the size of single year?

The appropriate system of units is context-dependent. The astronomical system, for instance, has Days of 86,400 SI seconds and Julian years of exactly 365.25 Days; if you have a general and extensible units library, then this isn't really a difficulty, you just need to make a choice based on requirements.


This is going to depend on the precision that you need.

(Calculations without associated uncertainty calculations are worse than worthless anyway - misleading due to the inherent trust we tend to put in numbers regardless of whether they are garbage.)


You'll want to leave the point floating to floating point numbers, but whenever you interact with legacy APIs or protocols you want a type representing the scale they use natively. You wouldn't want to deal with anything based on TCP for example with only seconds (even if client code holds them in doubles), or with only nanoseconds. But you certainly won't ever miss a type for kiloseconds.


Isn't floating point specifically for dealing with large orders of magnitude?

(Economics of floating vs fixed point chips might distort things though?)

Also, in case you meant this: you might need fractions of even base units for internal calculations.

IIRC banking, which doesn't accept any level of uncertainty, and so uses exclusively fixed precision, uses tenths of cents rather than cents as a base unit?


The standard symbol for a year is "a". Using "y" or "yr" is non-standard and ambiguous, and they should be avoided in situations where precision and clarity matter.


Thanks, didn't know. Although in astronomy they use Gy often. PS: Don't know why people downvote, your comment is useful.


Given a powerful enough type system, you can parameterize your types by the ratio to the unit and any exponent. Then you can allow only the conversions that make sense.


I don't think the parent meant to exclude hecto-pascals from their hypothetical type system.


Why wouldn't the type system be able to take care of that?


In my software project, all measurements exist as engineering units since they all come from various systems (ADCs or DSP boxes). We pass the values around to other pieces of software but are displayed as both the original values and converted units. We have a config file that contains both units and conversion polynomials, ranging from linear to cubic polynomials. Some of the DSP-derived values are at best an approximation so these have special flags that basically mean "for reference only". Having the unit is helpful for these but are effectively meaningless since the numbers are not precise enough, it would be like trying to determine lumens of a desk lamp from a photo taken outside of a building with the shades drawn.


I love when types are used to narrow down primitive values. A users id and a post id are both numbers but it never makes sense to take a user id and pass it to a function expecting a post id. The code will technically function but it’s not something that’s ever correct.


Having units as part of your types improves legibility for whoever's writing code too, not just reviewing. You won't make (as many) silly mistakes like adding a value in meters to another in seconds.


If a value with units lives so long that it's no longer clear what unit it is in or should be in (e.g. it spans more than 20 lines of code), I think you've got a bigger problem with code encapsulation.

I think the practical problem stems from widespread APIs which are not communicating their units, and that's what we should be fixing instead: if APIs are clearer (and also self-documenting more), the risks you talk of rarely exist other than in badly structured code.

Basically, instead of having `sleep(wait_in_seconds)` one should have `sleep_for_seconds(wait_time)` or even `sleep(duration_in_seconds=wait_time)` if your language allows that.

Certainly the use of proper semantic types would be a net win, but they usually lose out due to the inconvenience of typing them out (and sometimes constructing them, if they don't already exist in your language).


The Python package astropy extends numpy arrays to include units.[0]

It can convert between equivalent units (e.g., centimeters and kilometers), and will complain if you try to add quantities that aren't commensurate (e.g., grams and seconds).

The nice thing about this is that you can write functions that expect unit-ful quantities, and all the conversions will be done automatically for you. And if someone passes an incorrect unit, the system will automatically spit out an error.

0. https://docs.astropy.org/en/stable/units/index.html


I think you would enjoy programming in Ada.


When I first started to learn Ada, I found it one of the most verbose languages I’ve ever used. Now I find myself practicing those wordy conventions in other languages too.


Have you come across a system or language that handles units and their combinations / conversions well? I have a project I want to undertake but I feel like every time I start to deal with units in the way I feel is "proper" I end up starting to write a unit of measure library and give up.


> physical quantities should [...] be encoded in the type system.

But types are not just useful to specify the content of a variable, they are also useful to specify the required precision.

So, if there is a type for seconds in the type system, should it be a 32-bit int or a 64-bit float? Only the user can say.


A generic type would be useful then:

type DurationSeconds<Value> where Value: Numeric { ... }

DurationSeconds<UInt32>

DurationSeconds<Float64>
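In Python-ish terms the same idea might look like this (a sketch; `DurationSeconds` is the hypothetical type from the comment above, with the numeric representation left to the user):

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

# the numeric representation stays the user's choice
N = TypeVar("N", int, float)

@dataclass(frozen=True)
class DurationSeconds(Generic[N]):
    value: N

    def __add__(self, other: "DurationSeconds[N]") -> "DurationSeconds[N]":
        return DurationSeconds(self.value + other.value)

coarse: DurationSeconds[int] = DurationSeconds(3)
fine: DurationSeconds[float] = DurationSeconds(0.25)
```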


This is exactly what C++ 11 did with std::chrono [0] except it goes one step further and makes the period generic too.

[0] https://en.cppreference.com/w/cpp/chrono/duration


And let's not forget money. Martin Fowler has been (correctly) banging this drum for years: https://martinfowler.com/eaaCatalog/money.html


You don't need a type system to do this. Generic operators/functions and composite structures are sufficient and more flexible. Some languages let you encode this in type systems as well, but that's an orthogonal feature.


Add to this any type of conversion or relations between units.

Naming constants PIXELS_PER_INCH or BITS_PER_LITER rather than SCREEN_SCALE_FACTOR and VOLUME_RESOLUTION avoids all kinds of mistakes, like inverting ratios, etc.


For everyone using Python and willing to try this:

https://pint.readthedocs.io/en/stable/


I'm sorry, but this whole comment section is in some collective psychosis from 2010. You don't need to mangle variable names or create custom types (and then the insane machinery to have them all interact properly)

Have none of you ever refactored code? Or are all of you writing C?

Your library should just expose an interface/protocol signature. Program to interfaces. Anything that is passed in should just implement the protocol. The sleep function should not be aware or care about the input type AT ALL. All it requires is the input to implement an "as-milliseconds()" function that returns a millisecond value. You can then change any internal representation as your requirements change
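In a language with structural typing that's a small protocol plus one method per caller type; a Python sketch of the idea (names like `as_milliseconds` mirror the comment and are illustrative — the `sleep` here returns the millisecond count it would wait, just for demonstration):

```python
from typing import Protocol

class HasMilliseconds(Protocol):
    def as_milliseconds(self) -> int: ...

class Seconds:
    """One possible caller-side representation; could be swapped freely."""
    def __init__(self, n: float) -> None:
        self.n = n
    def as_milliseconds(self) -> int:
        return int(self.n * 1000)

def sleep(duration: HasMilliseconds) -> int:
    # sleep never sees the caller's internal representation
    return duration.as_milliseconds()
```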


> You don't need to ... create custom types
>
> All it requires is the input to implement an "as-milliseconds()" function

A.k.a. custom type!


That's also a custom type. But for libraries, custom types for everything in interface definitions are problematic. Let's say you use library A, which has some functions that take a time delta, and another library B, which also has functions that take time deltas. Now both libraries would define their own time delta type, you'd have two types for time deltas in your application, and you'd most likely need to add an additional 'meta time delta type' which can be converted to the library-specific time delta types. This explodes very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

If you replace 'time delta' with 'strings' then the problem becomes more apparent. Each library defining its own string type would be a mess.


If you are working with a crappy language then that's probably true. But it doesn't have to be a lot of boilerplate and it can be local in scope. In Clojure it's a one-liner.

    (extend-protocol libA/stringable
      myLib/InputThing
      (as-string [x] (myLib/to-string x)))

    (libA/some-func (myLib/make-a-thing))
And sure it can just be doing something as simple as returning an internal value directly or calling some common stringifying method. You do have a good point that you may have redundant stringifying protocols across libraries - which sucks.

> This will explode very quickly into lots of boilerplate code and is a good reason to use 'primitive types' in interfaces.

I feel you were so close and missed :) The natural conclusion would be to have primitive interfaces - not primitive types. This way libraries only need to agree to adhere to an interface (and not an implementation)


> Have none of you ever refactored code..? Or all of your writing C?

You seem to ignore the much broader context here: the article wasn't just about code. It included configuration files and HTTP requests.

It's also worth noting that depending on the language in question and the usage pattern of the function/method, using interfaces or protocols can cause considerable overhead when simply using a proper descriptive name is free.


> [...] create custom types (and then the insane machinery to have them all interact properly)

That's an argument against languages with type-systems that make this a PITA, not against the idea itself.


'Why do at build time what you could do with bugs at runtime'?


premature abstraction is the root of all evil.


You might find this interesting: https://github.com/SciNim/Unchained


There are usually libraries for doing this. For Python, there are several, for instance “Pint”.


I was excited when I found Pint, but then was disappointed : too much extra overhead for the project I was then working on.

(I settled on units in variable names instead, which was essential when some inputs were literally in different units of the same physical dimension.)

EDIT : more on this, and on astropy (which I was not aware of and/or didn't exist back then) :

https://github.com/astropy/astropy/issues/7438


> Option 2: use strong types


This is just about the best justification I have heard for type calculus.


I think Go has a reasonable approach:

  time.Sleep(1 * time.Second)


Like many things with Go, its approach seems reasonable and simple at first, but allows you to accidentally write code that looks right but is very, very wrong. For example, what do you think this code will do?

    delaySecs := 1 * time.Second
    time.Sleep(delaySecs * time.Second)
Now I insist on using the durationcheck lint to guard against this (https://github.com/charithe/durationcheck). It found a flaw in some exponential-backoff code I had refactored but couldn’t easily fully test that looked right but was wrong, and now I don’t think Go’s approach is reasonable anymore.


Perhaps the function shouldn't accept the unit of sec². Not least because I have no idea what a delay in that unit could signify.


It doesn't actually use units. Everything is in nanoseconds, so time.Second is just another unitless number.

  const (
   Nanosecond  Duration = 1
   Microsecond          = 1000 * Nanosecond
   Millisecond          = 1000 * Microsecond
   Second               = 1000 * Millisecond
   Minute               = 60 * Second
   Hour                 = 60 * Minute
  )


Note that the wonderful Go type system interprets time.Second * time.Second as 277777h46m40s with the type time.Second (not sec^2)


  time.Second * time.Second
The type of this is `time.Duration` (or int64 internally), not `time.Second` (which is a const with a value).

I agree, though, that this is not quite sound, because it can be misused, as shown above with `time.Sleep(delaySecs * time.Second)`.

In Kotlin you can do `1.seconds + 1.minutes` but not `1.seconds * 1.minutes` (compilation error), which I quite like. Here is a playground link: https://pl.kotl.in/YZLu97AY8


Certainly, but for that the type system should be rich enough to support unit designators.

I know how to implement that in Haskell, and that it can be implemented in C++ and Rust. I know how to logically implement that in Java or Typescript, but usability will suck (no infix operators).


Go tends to cover such things by incorporating them directly in the language. But then it tends to not cover them at all because it would "overcomplicate" the language...

For a good example of what it looks like when somebody does bother to do it, see F# units of measure.


This looks to me like the semantics are good but the implementation details are broken. 1 * time.Second * time.Second semantically reads to me as 1 second. If time.Second is some numeric value, that’s obviously wrong everywhere unless the type system reflects and enforces the unit conversion.


> 1 * time.Second * time.Second semantically reads to me as 1 second.

Which is wrong, 1s * 1s = 1s².

For example, gravitational acceleration is expressed in m/s², i.e. m/s per second, a change of velocity per unit of time (where velocity is a change of distance per unit of time).


Okay so do I need to consult Relativity to program 1sec + 2min?


Since 1min could be 61 seconds[1], yes?

But assuming your comment is not a joke. You probably want to convert minutes to seconds in order to work with the same units, then add the scalar parts together.

That's how you deal with different quantities: convert to same unit, add values.

This is analog to fractions: 1/2 + 1/4 = 2/4 + 1/4 = (2+1)/4 = 3/4.

  [1] - https://en.wikipedia.org/wiki/Leap_second
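That recipe in code (a sketch which, like most calendar math, ignores leap seconds):

```python
# scale factors to a common base unit (seconds)
SECONDS_PER = {"s": 1, "min": 60, "h": 3600}

def add_durations(a: float, a_unit: str, b: float, b_unit: str) -> float:
    """Convert both operands to seconds, then add the scalar parts."""
    return a * SECONDS_PER[a_unit] + b * SECONDS_PER[b_unit]

print(add_durations(1, "s", 2, "min"))  # 121 (seconds)
```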


In basic middle school math it’s common to multiply different units as a basic conversion mechanism. Multiplying by the same unit is semantically equivalent to “x times 1 is identity(x)”, and other cross-unit arithmetic implies conversion to ensure like units before processing. A typed unit numeric system would imply that to me. It would not imply I’m multiplying the units, but rather the scalar value of the unit.


> In basic middle school math it’s common to multiply different units as a basic conversion mechanism

EDIT: Yes, you multiply the units `2m * 2s` : you first multiply the units to get: `m.s`. This is what I say: you convert everything to the same units before doing the calculations.

> Multiplying by the same unit is semantically equivalent to “x times 1 is identity(x)”

This is wrong.

1kg * 1kg = 1kg² period.

What you're saying is `2kg * 1 = 2kg`, which is right, because `1` is a scalar while `2kg` is a quantity. This is completely different than multiplying 2 quantities.

> It would not imply I’m multiplying the units, but rather the scalar value of the unit.

That's where you're wrong. When doing arithmetic on quantities, you have 2 equations:

  x = 2kg * 4s
  unit(x) = kg * s = kg.s
  scalar(x) = 2 * 4 = 8
  x = 8 kg.s
Or

  x = 5m / 2s
  x = (5/2) m/s
  x = 2.5 m/s
There is a meaning to units and the operation you do with them. `5m / 2s` is 5 meters in 2 seconds, which is the speed `2.5 m/s`.

`2m + 1s` has no meaning, therefore you can't do anything with the scalar values, and the result remains `2m + 1s`, not `3 (m+s)`.
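Those rules (multiply the scalars, sum the unit exponents, refuse addition of unlike units) fit in a few lines. A toy sketch, not a real library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    scalar: float
    units: tuple  # sorted (unit, exponent) pairs, e.g. (("kg", 1), ("s", 1))

    def __mul__(self, other: "Quantity") -> "Quantity":
        # scalars multiply; unit exponents add
        exps = dict(self.units)
        for unit, exp in other.units:
            exps[unit] = exps.get(unit, 0) + exp
        return Quantity(self.scalar * other.scalar,
                        tuple(sorted((u, e) for u, e in exps.items() if e)))

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # division is multiplication by the inverse (negated exponents)
        inverse = Quantity(1 / other.scalar,
                           tuple((u, -e) for u, e in other.units))
        return self * inverse

    def __add__(self, other: "Quantity") -> "Quantity":
        if self.units != other.units:  # `2m + 1s` stays meaningless
            raise TypeError(f"cannot add {self.units} and {other.units}")
        return Quantity(self.scalar + other.scalar, self.units)

print(Quantity(2, (("kg", 1),)) * Quantity(4, (("s", 1),)))
# Quantity(scalar=8, units=(('kg', 1), ('s', 1)))
```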


All unit conversions are actually multiplications by the dimensionless constant 1, i.e., no-ops.

Let's say that you want to convert `2 min` into seconds. You know that `1 min = 60 s` is true. Dividing this equation by `1 min` on both sides is allowed and brings `1 = (60 s) / (1 min)`. This shows that if we multiply any value in minutes by `(60 s) / (1 min)`, we are not actually changing the value, because this is equivalent to multiplying it by 1. Therefore, `2 min = 2 min * 1 = 2 min * (60 s) / (1 min) = 2 * 60 s * (1 min) / (1 min) = 120 s`. We didn't change the value because we multiplied it by 1, and we didn't change its dimensionality ("type") because we multiplied it by a dimensionless number. We just moved around a dimensionless factor of 60, from the unit to the numerical value.

I think that you misremember, or didn't realize, that to convert minutes into seconds you were not multiplying by `60 s` but by `(60 s) / (1 min)`, which is nothing other than 1.


What is the "scalar value of the unit"?

Units can be expressed in terms of other units, and you can arbitrarily pick one unit as a base and then express the rest in it. But the key word here is "arbitrarily".

If multiplying by the same unit yield the same unit, then how did you compute area or volume in school?


Wait, would you really expect 1m * 1m to be anything other than 1m²? When does it ever happen that you want to multiply two non-unitless[1] measurements and not multiply the units???

[1] would that be unitful?


I expect 1 * m = 1m, and 1 * m * m = 1m because applying a unit doesn’t inherently have a value of that unit associated with it. (1 m) (1 m) obviously equals 1m^2, but ((1 m) m) is not the same expression.


Since when `((1 m) m)` is a valid mathematical expression?

You cannot have a unit on its own without a scalar value. It makes no sense.


If you look upthread, there was a mention of F# unit types. Taking off my programmer hat and returning to my middle school anecdote which also evidently made no sense: expression of a unit without a value is (or should be to my mind, based on my education) a cast, not a computation of N+1 values.

- 1 is unitless

- 1 * m casts the value to a value 1 of unit m = 1m

- 1 * m * m casts the value 1 * m = 1m to 1m then casts 1m to m which = 1m

Admittedly my educational background here might be wildly unconventional but it certainly prepared me for interoperable unit types as a concept without changing values (~precision considerations).


> If you look upthread, there was a mention of F# unit types.

And the syntax is `3<unit>` not `3 * unit`

- 1 is a scalar

- 1m is a quantity

- 2 * 1m "casts" 2 to a meter, but really this is just multiplying a quantity by a scalar

- 2 * 1m * 1m "casts" 2 to meter², multiplying 2 quantities, then by a scalar

I insist, `1 * m` does not make sense. This is not a valid mathematical expression, because a unit can never be on its own without a value.

> expression of a unit without a value is (or should be to my mind, based on my education) a cast

There is no casting in math, mainly because there are no types, only objects with operations. A vector is not a scalar and you can't cast it into a scalar.

A quantity is not a scalar either, and you can't cast one into another.

A quantity is an object, you can multiply 2 quantities together, but you can't add them if they are different. You can multiply a quantity to a scalar, but you still can't add a scalar to a quantity.


> And the syntax is `3<unit>` not `3 * unit`

Well, yeah, F# represents this at the type level. Which I’ve said elsewhere in the discussion is preferable. Not knowing Go, but knowing it only recently gained generics, I read multiplying by `time.Seconds` (which does not have a visible 1 associated with it) as perhaps performing an operator-overloaded type cast to a value/type with the Seconds unit assigned to it. I’ve since learned that Go also does not support operator overloading, so I now know that wouldn’t be the case. But had that been the case, it isn’t inconceivable that unitlessValue * valuelessUnit * valuelessUnit = unitlessValue * valuelessUnit. Because…

> I insist, `1 * m` does not make sense. This is not a valid mathematical expression, because a unit can never be on its own without a value.

Well, if you insist! But you seem to be imposing “mathematical expression” on an expression space where that’s already not the case? Whatever you may think of operator overloading, it is a thing that exists and it is a thing that “makes sense” to people using it idiomatically.

Even in languages without overloading, expressions which look like maths don’t necessarily have a corresponding mathematical representation. An equals infix operator in maths is a statement, establishing an immutable fact. Some languages like Erlang honor this; many (most? I strongly suspect most) don’t! Many (I couldn’t guess how many without researching it) also treat infix = statements as expressions which evaluate to a value.

The syntax of infix operators is generally inspired by mathematical notation, but it’s hardly beholden to that. The syntax of programming languages generally is not beholden to mathematical notation. Reacting as if it’s impossibly absurd that someone might read 1 * time.Seconds * time.Seconds as anything other than 1 * 1s * 1s is just snobbery.

Not knowing Go, I focused on the syntax and the explicit values, and tried to build a syntax tree on top of it. I’m not a fan of infix operators, and I am a fan of lisps, so my mental syntax model was (* (* 1 time.Seconds) time.Seconds), which still doesn’t “make sense” mathematically, but it can make sense if `*` is a polymorphic function which accepts unquantified units.


> Not knowing Go, but knowing it only recently gained generics, I read multiplying by `time.Seconds` (which does not have a visible 1 associated with it) as perhaps performing an operator-overloaded type cast to a value/type with the Seconds unit assigned to it.

This sums up your incomprehension. `time.Seconds` is just a constant: an integer with the value `1_000_000_000`, meaning one billion nanoseconds.

In an expression of the form `a * b` you should always read `a` and `b` as constants. This is true for EVERY programming language.

> it isn’t inconceivable that unitlessValue * valuelessUnit * valuelessUnit = unitlessValue * valuelessUnit.

It is. For example, what would be the meaning of this:

  struct Foo {
    // ...
  }

  2 * Foo
Valueless unit (or any type) is just not a thing, not in math, not in any programming language.

> But you seem to be imposing “mathematical expression” on an expression space where that’s already not the case? Whatever you may think of operator overloading, it is a thing that exists and it is a thing that “makes sense” to people using it idiomatically.

Operator overloading works on typed values, not "valueless" types. In some programming languages (like Python), classes are values too, but why implement `a * MyClass` when you can write `MyClass(a)`, which is 100% clearer on the intent?

Using operator overloading for types to implement casting is just black magic.

> expressions which look like maths don’t necessarily have a corresponding mathematical representation

Programming languages and the whole field of Computer Science are branches of mathematics. They are not natural languages like English or German. They are an extension of maths.

> An equals infix operator in maths is a statement, establishing an immutable fact.

An operator only has meaning within the theory you use it.

For example:

  `Matrix_A * Matrix_B` is not the same `*` as `Number_A * Number_B`
  `1 + 2` is not the same `+` as `1 + 2 + 3 + ...`
  `a = 3` in a math theorem is not the same `=` as `a = 3` in a programming language (and that depends on the programming language)
As long as the theory defines the operators and the rules on how to use them, it does not matter which symbol you use. I can write a language where you have `<-` instead of `=`, and the mathematical rules (precedence, associativity, commutativity, ...) will be the same.

> Reacting as if it’s impossibly absurd that someone might read 1 * time.Seconds * time.Seconds as anything other than 1 * 1s * 1s is just snobbery.

First, that's not what I said. You should read that as `scalar * constant * constant`, because reading that as `scalar * unit * unit` does not make sense, neither in math nor in any programming language.

If caring about readability and consistency is snobbery, then so be it.

> Not knowing Go, I focused on the syntax and the explicit values, and tried to build a syntax tree on top of it.

And the syntax is pretty explicit, because it's the same as math or any programming language: `scalar * constant * constant`. This is why using math as a point of reference is useful, you can easily make sense of what you're reading, no matter the syntax.

> I am a fan of lisps, so my mental syntax model was (* (* 1 time.Seconds) time.Seconds))

I still read this as `(* (* scalar constant) constant)`. And I expect your compiler/interpreter to throw an error if `time.Seconds` is anything without a clear value to evaluate the expression properly.

And I would expect to read `(* (* 1 (seconds 1)) (seconds 1))` as `scalar * quantity * quantity`, and I would expect to get square seconds as an output.

Anything else would not be correct and have little to no use.


You can just do `1 * time.Second + 2 * time.Minute` to do that. Adding times works intuitively. It's multiplying durations that gives you accelerations.


I’m not a Go developer, but I understand that from a type and mathematical theory perspective Go’s time.Duration is extraordinarily awful, because of Go’s simplistic type system.

int64 * Duration → Duration and Duration * int64 → Duration both make sense, but I gather this only works with constants. For other values, I believe Go only gives you Duration * Duration → Duration which is just wrong, wrong, wrong, requiring that one of the two “durations” actually be treated as though unitless, despite being declared as nanoseconds.

In the end, it’s probably still worth it, but it’s a case of Go trying to design in a certain way for ergonomics despite lacking the type system required to do it properly. I have found this to be a very common theme in Go. Also that it’s often still worth it, for they’ve generally chosen their compromises quite well. But I personally don’t like Go very much.


My 2c: I think the issue is simply the name collision between units and constants (i.e., "seconds" and time.Seconds).

In reality most programming languages do not have units whatsoever built into the language (maybe tacked on as a library after the fact). They have int64s: unitless values that just keep track of whole numbers of whatever they semantically mean to the developer. If we want to truly have units in values then we either need first-class language support (including syntactical support) or a rich library that doesn't just "put units in [symbols]". One could probably make it happen with a type that keeps track of units:

    ```
    type unitedVal struct {
        numerator   int64
        denominator int64
        numerUnits  string
        denomUnits  string
    }

    func (united *unitedVal) Multiply(by unitedVal) *unitedVal {
        return &unitedVal{
            numerator:   united.numerator * by.numerator,
            denominator: united.denominator * by.denominator,
            numerUnits:  united.numerUnits + " * " + by.numerUnits,
            denomUnits:  united.denomUnits + " * " + by.denomUnits,
        }
    }
    ```
and then filling out for all the other operations you want to support.


I mean in their defense I see almost nobody doing this right... Like to do full SI you really need 7 rational numbers, maybe a sort of inline “NaN” equivalent for when you add things with incommensurate units... you might also want display hints for preferred SI prefixes (might just be a dedicated format type), and while you're at it you might as well use a Decimal type instead of doubles, oh and probably these numbers should have an uncertainty in them, so probably you want a modular system where you can mixin units or mixin uncertainty and you can start from a base of doubles or decimals or, hell, you start experimenting with continued fractions...Sigh. The real world is complicated.

Every implementation of units is secretly trying to be Moment.js, basically.


All you need is a language that actually incorporates units of measure into the type system - i.e. you can define units orthogonal to types (including relationships between units), and then you can specify both the unit and the underlying numeric type in a declaration.

https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...

This can also be pulled off with somewhat less pleasant syntax on top of a sufficiently flexible parametrized type system - e.g. C++ templates.


And yet people actually do manage meaningful unit systems that don't allow this.


C++11's std::chrono::duration manages just fine without that much scope bloat. operator* for two durations is simply not defined, so it will lead to a compile error.


A company I worked for 20 years ago had a commercial engineering modelling program that did all physical types correctly and did scaling for the user, but it was fairly unique in scope and, like you say, almost all programming languages fall short here; it had its own quirks too.


All that, and yet I don't see a single example of what would be "correct", or an example language that does it better. This just comes off as a poorly thought out rant.

In my opinion, it is well designed. First of all, who is multiplying time.Duration against itself? I've been programming Go for a few years basically every day, and I've only ever seen the package constants used by themselves, or with an untyped constant. I think it's a great syntax, better than any example in the article, as you don't have mystery numbers.


One alternative is Rust's Duration, which makes you spell out all your arithmetic operations. But this is exactly the approach taken by Go's time.Time, so the part about the type system being incapable or whatever is kinda misguided. https://doc.rust-lang.org/std/time/struct.Duration.html

C++ chrono::duration allows arithmetic operations with more sensible overloading. https://en.cppreference.com/w/cpp/chrono/duration


Once you go beyond constants (which in practice means literals, I think, but I’m not conversant enough in Go to be confident), Go requires that you multiply Duration by Duration—you can’t multiply it with a scalar outside of constants. In other words, as soon as a scalar multiple becomes a parameter rather than a constant, you can’t just do `n * duration`, but have to do something more like `Duration(n) * duration`, which is obviously physically wrong for a system of units, because the unit should be time squared, not time.

As for languages doing it better, approximately every single language that has strong static typing and uses dedicated types for times does it better. Rust is the one I’m most familiar with and comfortable with.


You seem to be confusing the Duration type with Duration values. Yes, if you insist on using a typed value, instead of an untyped constant, then that value needs to be type Duration.

But Duration(1) is way different from time.Second. Honestly, it just sounds like you don't know the language and aren't willing to learn it. Go is not Rust. Things are different; that doesn't mean Go sucks.


The entire purpose of the distinction that I’m remarking on is that Duration * Duration → Duration is mathematically utterly incorrect, and especially super misleading when the base quantity for the unit is nanoseconds rather than seconds, yet that is what Go requires, beyond constants, which it special-cases. To be sure, with durations, the distinction doesn’t matter much because the arithmetic performed is with constants, and that’s why I say that Go is probably still better with this wonky unit scheme than with entirely unitless quantities, but there are plenty of situations where you will want to multiply durations by typed numbers, and so you’re forced to do the mathematically-ridiculous `Duration(n) * duration` rather than `n * duration`.


I'm sorry, but I just don't think people use code like you're describing. In your mind, you see this:

    hello := 2
    hello * time.Second // oh no
But people actually use code like this:

    hello := time.Second
    hello *= 2
People working with Go code (who know what they're doing), don't declare an int, only to immediately cast it to something else.


You’re looking at this from the application perspective, where with the specific example of durations constant multiplication is certainly far, far more common. But you’re discounting library concerns, where it would not be out of the ordinary to receive a time.Duration and an int64 from parameters or a struct or similar.

This is also pretty typical of the trade-offs Go makes: it focuses on making things nice for the application writer, mostly pretty successfully, but at the regular cost of pain and with serious typing compromises for the library writer.


What is correct is that duration ± duration = duration, duration * scalar = duration, timestamp ± duration = timestamp, timestamp - timestamp = duration, and anything else doesn't compile.
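That rule table maps directly onto operator overloads. Here's a hedged Python sketch where anything outside the table fails with a TypeError (in a statically typed language it simply wouldn't compile; class names are illustrative):

```python
class Duration:
    def __init__(self, ns: int) -> None:
        self.ns = ns
    def __add__(self, other):
        if isinstance(other, Duration):      # duration + duration = duration
            return Duration(self.ns + other.ns)
        return NotImplemented
    def __mul__(self, k):
        if isinstance(k, (int, float)):      # duration * scalar = duration
            return Duration(int(self.ns * k))
        return NotImplemented                # duration * duration: TypeError
    __rmul__ = __mul__

class Timestamp:
    def __init__(self, ns: int) -> None:
        self.ns = ns
    def __add__(self, other):
        if isinstance(other, Duration):      # timestamp + duration = timestamp
            return Timestamp(self.ns + other.ns)
        return NotImplemented
    def __sub__(self, other):
        if isinstance(other, Timestamp):     # timestamp - timestamp = duration
            return Duration(self.ns - other.ns)
        if isinstance(other, Duration):      # timestamp - duration = timestamp
            return Timestamp(self.ns - other.ns)
        return NotImplemented
```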


To call it out specifically: this does not include `duration * duration = duration`

Go is currently allowing that, which makes `delaySecs * time.Second` a billion times larger than it appears to intend. I've personally run across code that has this kind of flaw in it... at least several dozen times. It's the kind of thing that's only noticed when it misbehaves visibly while someone is watching it.

(I read a lot of other-teams' code, which is in various states of quality and disarray)


And it’s not just that Go allows that, but that that’s actually the only general way of doing it, as Go only allows duration * scalar in a constant context (which is admittedly all most people do with durations, which is why I say it’s still probably better that Go did it this way, given their deliberately limited type system).


Not being able to sum timestamps is a bit beyond the scope of units; it is more of a vector vs. point distinction.


If you exponentiate time.Duration, that's a user error.

What you're suggesting would be like a compiler error for multiplying two int64s


What they're suggesting is that `time.Duration` should not be an int64.

  1 second * 1 second = 1 second²
  1 meter * 1 meter = 1 meter²
  1 meter / 1 second = 1 m/s
Those are not user errors, those are physical values. A physical value is two things:

  - a scalar (int64, float, ...)
  - a unit (meter, second, inches, ...)
If the type system of your programming language does not allow you to define units, this should at least be a structure with a scalar and an enum, and functions to cast from one unit to another (if possible).

Working with units is a common thing in science.


> A physical value is two things:

I would argue it is actually three things: a scalar, a unit, and an indication of error – which is at least another scalar, but there are multiple ways of expressing error, so it might require more than just a single scalar (such as an interval and the probability the actual value lies within that interval.)

> If the type system of your programming language does not allow you to define units, this should at least be a structure with a scalar and an enum

Ideally more than just an enum – Newton = kg*m*s^-2 (equivalently kg^1*m^1*s^-2) – which suggests a set of pairs (unit and exponent).


Yes, thank you for the clarifications; this just makes my point stronger: int64/float are very ill-suited to representing such values.

And it's especially true for time units.

For example, "how many seconds is one month?" does not make sense, but "how many seconds is january/february/march?" does make sense. The unit "month" does not really exist, each calendar month is its own unit.

And "february" is not even a "stable" unit because sometimes it's 28 days, sometimes it's 29 days. Even a minute can some rare times be 61 seconds.

This is why in physics, we use seconds multiplied by powers of 10 and nothing else.

To my knowledge, there is not a single programming language that differentiates between a "scalar" and a "quantity" (scalar, unit, error).


It should be fine to multiply two durations, but it should not return the same time.Duration type


> All that, and yet I don't see a single example of what would be "correct", or an example language that does it better.

F#, Rust, C++?..

> First of all, who is multiplying time.Duration against itself

Physicists do, every time they have to deal with acceleration - m/s^2.


You are right; however, physicists do not multiply duration x duration but time x time. Duration is discrete and time is infinitesimal.


The provided example is pretty rough, and I'm sure it's occurred in the wild. Sure, put the units in the variable name, and it encourages this kind of mistake, because time.Duration is not "seconds", it is a duration. The variable name should match the API of time.Sleep, which takes a duration. The variable should be named delay. A variable named delaySecs is the same kind of maintenance headache as a variable named "two_days_ago = 2.days.ago" in ruby.

The variables used for accepting and parsing input are the ones that should have units in them in this example. Although if you need a delay specified, it's valuable to be explicit and robust in the input and accept a string time.ParseDuration understands. Then you don't have this units problem in your variable naming at all, allows easier input of wider ranges of values by the operator, and makes input validation (if only a subset of durations are allowed) more concise and consistent.


I've seen a lot of code at my last job in Go where all duration variables included their units. It was amazingly bad Go code (it was built in part from the project's PoC by new Go devs), but it doesn't help that, for the most part, it did work.

Time was honestly our biggest source of bugs by far. Although adding time.Time ended up being more problematic than durations which were mostly only constructed like that in tests



Just contrived examples, like gravity * t^2 to get distance to fall and such, probably


Go's time package is famously horrible. First they didn't expose any monotonic clocks, only wall time. Then, after some public outages, like time travelling backwards for Cloudflare, they were forced to act. In the end they managed to fold monotonic clocks into the original type to cling on to the "Go just works" mantra, but added even more edge cases.

https://pkg.go.dev/time#hdr-Monotonic_Clocks

> RRDNS is written in Go and uses Go’s time.Now() function to get the time. Unfortunately, this function does not guarantee monotonicity. Go currently doesn’t offer a monotonic time source (see issue 12914 for discussion).

https://blog.cloudflare.com/how-and-why-the-leap-second-affe...

> time: use monotonic clock to measure elapsed time

https://github.com/golang/go/issues/12914


Rust has a similar-ish API for durations, Duration::from_millis(1000) or Duration::from_secs(1) in the type system, and the method can just take a Duration struct and transform it into whatever internal representation it wants.

There is a Duration::new() constructor that's more ambiguous, but it's your choice as a dev to be ambiguous in this instance, and code review should probably catch that.


Yeah, Rust is one of the few languages that gets it right! And before they had the `Duration` type with `thread::sleep(d: Duration)`, there was `thread::sleep_ms(ms: u32)`, which is also unambiguous.


Or Ruby on Rails:

  sleep(5.seconds)
  sleep(1.minute)
  sleep(2.hours)
etc etc


In Ruby that would be

    sleep 3.seconds
Hard to be more concise than that.


Technically Ruby only accepts seconds. You're thinking of ActiveSupport from Rails.


Here's a fun way to annoy a Rails developer.

    (Time.now + 1.month).to_i == Time.now.to_i + 1.month.to_i
    #=> false


Not sure why that's annoying? The right side doesn't really make sense.

Though it does seem pretty easy for a novice to do thinking it's the same, but what does 1.month.to_i even mean!?


> 1.month.to_i

Duration of a month in seconds? Before you balk at the idea, there exists a definition of a constant month duration for accounting stuff. If you hate dates - and yourself - try accounting; there's mind-boggling stuff that makes the engineer mind recoil in absolute terror.


Actually taken almost verbatim from the report summarizing a real and subtle bug (distributed across multiple files) in code written by definitely-not-novices.


I'm not really familiar with RoR. Is it due to Time.now getting called twice, with each call returning a slightly different value?


Alas, no. It's because ActiveSupport's duration arithmetic is not distributive under conversion.

The expression on the left hand side advances time (as a domain object) by exactly a month, then converts the result to integer unix time. The expression on the right adds 2629746¹ to the current unix time.

The conversion becomes dangerously magical in the presence of shared code that accepts both object and integer representations of time & duration. A consumer from one part of a system can inadvertently obtain different results to another unless they use identical calling conventions.

[1] this is 1/12 of the mean length of a gregorian year²

[2] 365.2425 days i.e. 31,556,952 seconds
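The same asymmetry can be reproduced with Python's stdlib, which deliberately has no "month" timedelta. A hedged sketch (the calendar-month add is hand-rolled here, with day clamping so Jan 31 + 1 month gives Feb 28):

```python
from datetime import datetime, timedelta
import calendar

MEAN_MONTH_SECONDS = 2_629_746  # 1/12 of the mean Gregorian year, ActiveSupport's constant

def add_one_month(dt: datetime) -> datetime:
    """Advance by one calendar month, clamping the day (Jan 31 -> Feb 28)."""
    year = dt.year + (dt.month // 12)
    month = dt.month % 12 + 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

now = datetime(2022, 3, 21)
calendar_add = add_one_month(now)                        # 2022-04-21 00:00:00
fixed_add = now + timedelta(seconds=MEAN_MONTH_SECONDS)  # 2022-04-20 10:29:06
print(calendar_add == fixed_add)  # False -- same "add a month", two answers
```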


Oh wow. This totally makes sense when you think about it, but it's something that would never cross my mind when casually checking the code. I guess this is why Python's timedelta doesn't have a month unit, as the length of a month is highly context dependent.


I would just use `1.month.from_now` instead of the additions anyway


That won’t save you; 1.month.from_now is implemented by addition.


Are we golfing? Because that's identical to

   sleep 3


Are we golfing? This whole discussion is about clarifying units.


I must clarify, that was intended rhetorically, and in the most self-serving fashion; I try never to miss a golfing prompt


Or Dart

    Future.delayed(const Duration(seconds: 2))
    Future.delayed(const Duration(milliseconds: 2000))


That’s similar to Crystal. All numbers have built in methods to convert them to a Time::Span object. So I could have a function that takes a Time::Span instead of an Int, like:

    def sleep(num : Time::Span)
      # do something here
    end
I would call it like:

    sleep 300.seconds


I'm not sure why but I have a visceral, negative response to this. It might be the best solution, but it definitely /feels/ like the worst of all the worlds.


Is it? In most languages you do something like `time.Sleep(1 * 1e6)` instead at which point it could be a second, a few minutes, a day, who really knows?

I'm just not seeing any major downsides of this, keep in mind `time.Second` isn't the only one of its kind, you have millisecond, minute, hour, etc etc.


Having used it extensively, it's actually quite nice.


Overloading the asterisk is always weird because multiplication is expected to be associative etc etc.


I'm not sure I follow.

2 * 3 * time.Second is the same whether you group 2 * (3 * time.Second) or (2 * 3) * time.Second (namely, the implicit grouping under left-associativity).

You wouldn't normally write time.Sleep(time.Second * time.Second) because your units wouldn't work out. (Apparently you can write that in Golang; it's just very sketchy and results in a 30-year sleep.)
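The 30-year figure checks out with back-of-envelope arithmetic (assuming Go's representation of time.Second as 1e9 nanoseconds):

```python
# Go's time.Second is 1_000_000_000 (a Duration counted in nanoseconds).
# Duration * Duration multiplies the raw int64s, so Second * Second becomes
# 10**18 "nanoseconds" -- reinterpreted as a sleep of roughly 31.7 years.
second_ns = 1_000_000_000
product_ns = second_ns * second_ns            # 10**18
years = product_ns / 1e9 / 86_400 / 365.25    # ns -> s -> days -> years
print(round(years, 1))  # 31.7
```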


But from a mathematical point of view, the relationship between a unit and its coefficient is that you're multiplying them together. Why would it be weird to overload the multiplication operator to represent multiplication?


There is no operator overloading here. The type Duration is effectively an int64: https://pkg.go.dev/time#Duration


It's not overloaded. It's a unit. You can just type

  time.Sleep(time.Second)
but this reads nicely

  time.Sleep(3 * time.Minute)


time.Second*1 gets the same value; it's not overloaded. Well, ok, so what actually happens is that time.Second is a time.Duration, and Duration*int yields Duration (and int*Duration yields Duration).

But the value of time.Second is actually a Duration with value 1000000, IIRC -- it's microseconds. It's just the type that's special, and the int handling here is general over a lot of types.

It really is nice in practice.


(As noted elsewhere it's nanoseconds.)


It's not really overloading. it is associative:

    2*time.Hour+1*time.Second == time.Second+time.Hour*2


I believe that's illustrating commutativity, not associativity.


The illustration is wrong, but the claim is correct; multiplication between units and scalars is just as associative as you'd expect. Multiplying one kilowatt by an hour gives you exactly the same result as multiplying 1 by a kilowatt-hour.
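A toy Python sketch of why scalar-times-unit multiplication stays associative (hypothetical Quantity type): scaling a tagged quantity never touches the unit, so the grouping of scalars doesn't matter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str

    def __rmul__(self, scalar: float) -> "Quantity":
        # scalar * quantity scales the magnitude; the unit is untouched
        return Quantity(scalar * self.value, self.unit)

second = Quantity(1.0, "s")

# Both groupings give Quantity(6.0, "s")
print((2 * 3) * second == 2 * (3 * second))  # True
```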


I hope late millennials and generation Z rediscover types soon.


huh? it's us millennials who decided that all dynamic typing was immoral and wrong. Back in the day, Gen-Xers on HN and Slashdot were talking about how great Common Lisp and Ruby were.


Imma get my cane and hit you with Perl and PHP ;)


Except Common Lisp is a typed language (with a more expressive type system than most).


And now millennials seem to be doing the same with JS. sigh


Big fan of this.


And if you're working with an existing system that doesn't accommodate the suggested options, a well-placed comment can go a long way. Provide the unit as well as WHY that value exists.

    # Session inactivity, in seconds
    time.sleep(300)

    # Prevent pileup of unprocessable requests, in seconds
    request_timeout = 10


Comments are misunderstood far too often as some sort of accompanying inner monologue. Instead, I see a lot of noise like

    # Sleeps before continuing
    time.sleep(…)

    # Time before request timeout
    timeout = …


One more tip: If you're ever designing an API like this, instead of:

time.sleep(300)

Put your units in the method name:

time.sleep_seconds(300)

Likewise, if you're making a .length() method where the units are at all ambiguous (like the length of a string), name your units. Bad: str.len(). Good: str.len_chars() / str.len_utf8_bytes(). (I'm looking at you, Rust!)
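A sketch of the sleep half of this suggestion in Python (wrapper names are hypothetical): thin wrappers whose names carry the unit, so no call site is ambiguous.

```python
import time

def sleep_seconds(seconds: float) -> None:
    """The unit is in the name, so every call site is unambiguous."""
    time.sleep(seconds)

def sleep_millis(millis: float) -> None:
    time.sleep(millis / 1000.0)

sleep_millis(300)   # clearly 300 ms
sleep_seconds(0.3)  # the same delay, spelled in seconds
```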


time.sleep_seconds is appealing, but this approach doesn't work when a function takes additional arguments. for example, if your function is

  poll_file_descriptors(read_descriptors, write_desciptors, timeout_in_seconds)
it would be awkward to change the name of the function just because one of the arguments happens to refer to time.


This is where Objective-C brings subtle magic with mid-function-name arguments. Sure, it's crazy verbose, but insanely readable: printFile(file: File) withDelayInSeconds(delaySeconds: int) andPrintColor(printColor: UIColor)


This is just the builder pattern but worse?


It’s actually just an obj-c method call. There’s no temporary builder object (or associated class) that needs to be implemented and instantiated like you’d need for a builder.


I would always make a local variable for things like this, to avoid magic numbers (especially if a function signature has several numeric arguments). Then you can explicitly put the units in the variable name, e.g. duration_s = 300 and pass that. It's fine if the signature tells you the units, but it doesn't help anyone reading the code if you have poll_file_descriptors(read_descriptors, write_desciptors, 300).


But how do you know the unit to use in the variable?


You have to read the documentation. But you need to do that in order to be able to use the API properly anyway.

The point the article is making is not about making it easier to write a line of code, but about making it easier to read a line of code. Given that most lines of code are read many more times than they are written, this is a good thing to focus on.

"Programs must be written for people to read, and only incidentally for machines to execute."

― Abelson and Sussman, Structure and Interpretation of Computer Programs, 1984


Being forced to read the documentation carefully every time you call the method is really bad UX though. And the approach in the article (let ms = 5; sleep(ms);) can be wrong. Maybe the method is actually taking microseconds not milliseconds and this causes a bug. I can also imagine linters (or other humans) fighting this style, since it’s longer, and something you have to actively do every time you call the method.

Putting the units in the method name (when you can) fixes all of this. Sleep_milliseconds(5) is impossible to misuse or leave undocumented.


I'm sorry, but if a language forced me to write str.len_utf8_bytes() every time I wanted the length of a string, I'd just stop using that language.


Though it might encourage some to consider str.len_extended_grapheme_clusters()


Out of interest, which length did you want?


Number of characters


What's a "character"? A codepoint? A glyph? Is "fi" a single character? How about "ß"? "Ȣ"? "IJ"?

When programmers get answers to these questions wrong, the code that they write ends up being broken for someone out there. And they do get it wrong if their native culture instills a wrong kind of "common sense" about scripts. Which is precisely why we need more stuff like str.len_utf8_bytes() - because it forces the person writing it to consider their assumptions.


> Is "fi" a single character? [...] "Ȣ"? "IJ"?

I don't know these, someone who does could properly tell you.

The implementation could fallback to "glyph" or "codepoint" (don't know what those are exactly) if it is unsure. Mostly right is better than nearly always wrong (= returning number of bytes) IMHO.

> How about "ß"?

That's one character.

> When programmers get answers to these questions wrong, the code that they write ends up being broken for someone out there.

So? Returning the number of bytes would also be wrong.

Renaming len to len_glyphs will just result in most programmers typing more and some accidentally using len_utf8_bytes (returns the number of bytes?) where len_glyphs would be less wrong.


Yep. I think about this stuff a lot and I still mess this up all the time. I’ve come to realise that string.length in javascript is a footgun. Basically every time I use it, my code is wrong when I test with non-ascii characters.

I’ve spent the last decade writing javascript and I’ve never once actually cared to know what half the UTF16 byte length of a string is.

The only legitimate uses are internal in javascript (since other parts of JS make that quantity meaningful). Like using it in slice() or iterating - though for the latter we now have Unicode string iteration anyway.
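Python's len counts code points rather than JavaScript's UTF-16 code units, but the same footgun appears the moment bytes or grapheme clusters matter. A sketch of the three common answers for one emoji:

```python
flag = "🇺🇸"  # a regional-indicator pair: one visible "character"

codepoints = len(flag)                            # 2 code points
utf16_units = len(flag.encode("utf-16-le")) // 2  # 4 -- what JS .length reports
utf8_bytes = len(flag.encode("utf-8"))            # 8

print(codepoints, utf16_units, utf8_bytes)  # 2 4 8
# Counting grapheme clusters (the "1" a user sees) needs a third-party
# library such as `grapheme` or `regex`; the stdlib has no len_graphemes().
```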


The solution to this is as old as salt:

  typedef int seconds;
and

  time.sleep(seconds duration);


That doesn't really seem better, because the fact that the unit is seconds is only obvious if you happen to be looking at the function signature in the header or somesuch. This is because C typedefs are type aliases, not newtypes. If seconds were instead a struct with a single int field, or something like that, then that would help a little. To really solve this problem, you need information hiding, which C doesn't have except through questionable hacks.


I once went in to clean up a project that was designed by a committee of people spread all over the world. The unit was large moving equipment where, if something went wrong, people might die. The unit was composed of several different CPU modules communicating on a proprietary bus. Each module's software was written by a different group in a different part of the world.

The operator's requested speed was input in Feet Per Minute. The output to a Variable Frequency Drive was in tenths of Hertz. The tachometer feedback was in RPM, and to top it off all the internal calculations were done in Radians-Per-Second.

The first thing I did to get the project back on track was to adopt a standardized variable naming convention that included the units. For example the Operator Request became operator_request_fpm_u16. You then knew immediately you were dealing with Feet Per Minutes, and that it was a 16 bit unsigned variable.

After the variable name cleanup many of the bugs became self documented, when you saw something like "operator_request_fpm_u16 / vfd_hz_s32" in the code, you knew there was a problem that needed to be fixed...


This is Hungarian notation. A combination of both flavors into one. https://en.wikipedia.org/wiki/Hungarian_notation.

My take is that Hungarian notation exists to work around a deficiency in tooling. I think this is clearer when encoding e.g. u16 into the identifier, since that information is redundant (the declaration or schema already encodes that it is u16).


Worth mentioning Frink.

> Frink is a practical calculating tool and programming language designed to make physical calculations simple, to help ensure that answers come out right, and to make a tool that's really useful in the real world. It tracks units of measure (feet, meters, kilograms, watts, etc.) through all calculations..

https://frinklang.org/


Frink's units definition file is a great read: https://futureboy.us/frinkdata/units.txt

I use Frink very regularly as a calculator. I've got a keybinding in emacs to bring it up in comint-mode, which works very well.

It's also a general-purpose programming language, at least in theory. I actually tried using it for that purpose once, with I think a few thousand lines of Frink code in total. It was not a pleasant experience. It's fine if you want to write a short script that's dimensionally aware, but for modelling of a complex physical system there are much better tools, such as Modelica.


A really great read. I found particularly interesting the long comment about the Hertz inconsistency, I had no idea the S.I. Hertz, when applied to circular motion, was not a full circle per second.


I typically stick to "SI" units.

Then, somebody asks me to code in the temperature of a system. And I have to think: "Is now really the time that I want to teach people the difference between kelvin and celsius?"

So my rule becomes, SI "except" temperature. Sigh...


We also strictly stick to SI however we usually say kilograms in var names to be clear.

Haven't come across temperature however we would probably stick with kelvin.

We use a strict set of units in databases and while processing, conversions are localized if necessary only at the view layer.

We also only use UTC for date/times.

We only use E164 format (without spaces etc) for phone numbers: e.g. +12345678901 for an example number in OH, US. see National format https://libphonenumber.appspot.com/phonenumberparser?number=...

We only use iso3166-1 country codes and iso3166-2 region codes and translate on view.


This, one million times this. Use SI units. Don't measure distance in hotdogs, time in fortnights and speed in hotdogs per fortnight! It is as stupid as it sounds.

If you do, be explicit about it either in the parameter or function name. I'm not going to put you on my shitlist if you name your function `microsleep`, but if I have to go look into the implementation to see that you count your database timeout in microseconds (looking at you, Couchbase, like you ever could return something from a larger dataset in microseconds, lol) or, even worse, cache expiry time in minutes (hello, unknown developer), I am going to go on the internet and complain about you.


This is actually an issue with thermal imaging cameras. Typically you'll get calibrated readings back in Kelvin, not in Celsius. Usually it's well documented by the camera manufacturer, but if you're providing an API to users you need to make them aware what the units are and make a decision on what you're going to return. For example this crops up if the sensor only provides counts which you need to convert into a temperature.

From a hardware perspective it makes sense to use K because you can encode the image directly using unsigned numbers plus some gain to allow for fractional measurements.


But you're still including the units in your identifier names (or encoded in type system), right?


No. Typically SI is implicit. Everything else is explicit.


SI doesn't prescribe that you have to use a single unit for all measurements. Are distance in meters or kilometers? Weights in kilograms or grams?

I assume you always just use the base units? kg, m, s, etc.? (I always think it odd that the kilogram is the base.) I feel like it could get unwieldy for some applications at different scales, where milligrams, millimeters, kilometers, days, etc. could be clearer. And even if you use "standard" units, if you aren't clear about what standard you use and what that makes the units, people won't always guess the correct option.


which is great when you're writing from scratch, but as soon as you have to start calling a library with functions based in non-SI units then you've got some ambiguity.


There should be an exception in every code standard that says SI units are OK where otherwise all lowercase is enforced, for example to distinguish mega (M) from milli (m).


MKS or CGS?


CGS is so the sixties!


I'm confused, because one delta degree Celsius is exactly the same as one delta degree Kelvin. And you can convert with an offset.


In thermodynamic calculations you likely need absolute (Kelvin) values. But in many calculations where only temperature difference is used, either unit works equally well.


That sounds like a case for Kelvin. I still don't see why Celsius is the one exception that isn't SI.


I'll take a survey tomorrow to see how many people know what the kelvin-to-celsius conversion is off the top of their head.

The other issue is when you get to Candelas!


How often are you doing the conversion manually? It seems like the sort of thing that should happen in the presentation layer so people never see the actual Kelvin amount. If you have a system where you always use SI then it's strange the have a single exception for temperature.


Hm, is it wrong to dream of a world where this, along with other basic science, would be considered basic knowledge?

Not blaming anyone who does not know it, but I would argue for more and better science education ..


in the genre of naming things, some related things to explore:

- avoiding naming things if you can help it

- using tooling to autogenerate names

- avoiding too short, likely overloaded names like `id`, `name`, and `url`

- don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

- encoding types to make Wrong Code Look Wrong, a famous Spolsky opinion (lite hungarian notation)

- coming up with grammars for naming, eg React had an exercise to name its lifecycles combinations of THING-VERB-ACTION, like `componentDidMount`, before open sourcing, which helped learnability

pulled from my collection of Naming Opinions here: https://www.swyx.io/how-to-name-things


>- don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

Isn't this just extra noise? If the type of the variable is an Array wouldn't `nameArray` be superfluous? Worse still is if the type changes but the name stays the same.

I get the advice is probably Javascript specific, but even in a Typescript world it doesn't make much sense to me to do this kind of type-in-name encoding.


Even in typed languages `names` and `name` are too similar to slow code reading down.


Exactly this. In many languages the compiler will help you.

But this has bitten me in Ruby, JavaScript and PHP several times. Runtime errors and downtime. Most recent: autocompleted some updatedCartsItems when it had to be UpdatedCartItems. Both were used in the same class. Had they been named sensibly, like CartListWithUpdatedItems and UpdatedItemList or something better, I'd have saved myself hours of WTFing through CI logs.


Disagree, but in that case wouldn't `nameList` and `name` be best?


> avoiding too short, likely overloaded names like `id`, `name`, and `url`

In a small enough context, no way :)

> don't choose lazy pluralization - eg instead of `names/name`, use `nameList/nameItem`

This just seems redundant and is a personal annoyance. Is plural meaning list or sequence not widely understood enough?


Next step, please don't call your variables ..._percent but interpret 0.5 as meaning 50% of the reference value.


I 100% agree.

That said, "percent" is a common and precise word, and I haven't come up with anything quite as good for variables in range [0,1]. I generally use "ratio," but I don't love the word. Is there anything better?


From one perspective it is the "normalized" unit, so not giving it a name makes as much sense as trying to find a word for it: if you were going to call something ProgressPercent, the normalized float/double form is just Progress. This is probably the closest to what I try to stick to, but it can be hard, especially if you have six other things that also want to be called "Progress" that aren't normalized floats/doubles.

(Another name for the "normalized" unit is sometimes the "unit" unit. ProgressUnit sounds dumb, but is an available option in English and shorter than ProgressNormalized.)


I use "andel", but if you don't write your variable names in Danish, then it might look a little strange. "Share" would be the direct translation, but that word usually means something else in programming. In English, I use "ratio" also; in context the meaning is always clear enough.


That's actually not a terrible idea! German is known for having words for concepts other languages lack, and I guess that applies to Danish by extension. Maybe I should ditch the thesaurus and teach myself German.


Maybe "fraction" would be suitable.


Bad naming is a universe of programming flaws, much of which boils down to not thinking about how a different person might read your code the first time.

Unfortunately "Theory of Mind" is weak in people on the spectrum.


How do you mean?


Suppose our application deals with classifying cheese, and has a field for how much fat a given sample has, maybe it needs 20% to be classified as Grade A Ste-Madeleine-de-la-Foo. Some developers like to call this field MinimumFatContentPercent and set it to 0.2.
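The mismatch, sketched in Python (variable names hypothetical): the name says "percent" while the value is a fraction, and a comparison against a real percentage silently goes wrong.

```python
minimum_fat_content_percent = 0.2  # author meant 20%, but the name says percent

sample_fat_percent = 18.0  # a reading that genuinely is in percent

# 18.0 >= 0.2 is True, so an 18%-fat sample wrongly passes the 20% threshold
buggy_check = sample_fat_percent >= minimum_fat_content_percent

# Name the fraction as a fraction and convert explicitly at the comparison
minimum_fat_content_fraction = 0.2
fixed_check = (sample_fat_percent / 100.0) >= minimum_fat_content_fraction

print(buggy_check, fixed_check)  # True False
```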


Not GP, but 0.5% == 0.005, not 0.5.


Or join the ranks of your peers who have seen the light and use types.

Documentation should always be secondary to an obvious, descriptive interface. We've evolved beyond register positions and phonebook-style paper documentation. Use the tools available to you.


That would be the second paragraph in the article


The article addresses this somewhat using Python's type hints. If I give you this signature, can you intuitively tell me what unit is being taken in by the function?

    void addPadding(int height) {

    }
Types are useful, and even as a Python programmer I gravitate towards type hints for all new code, but any application where a programmatic type maps to a real-world type with common conversions (like, say, pixels, em, en, millimetres, centimetres, or inches in the above example) tends to be a victim of the same issue, where the type system isn't expressive enough to clearly describe the real-world type being assumed by the function.


That's because you're not using the type system at all in your example. Change it to "addSomething(Length height)", and have constructors for Length.ofCm(2), Length.ofMeters(5), etc.
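A minimal Python sketch of this pattern (names hypothetical): one Length type with unit-named constructors and a canonical internal unit, so call sites read like the units.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Length:
    meters: float  # canonical internal unit

    @classmethod
    def of_cm(cls, cm: float) -> "Length":
        return cls(float(cm) / 100.0)

    @classmethod
    def of_meters(cls, m: float) -> "Length":
        return cls(float(m))

def add_padding(height: Length) -> None:
    print(f"padding: {height.meters} m")

add_padding(Length.of_cm(2))      # padding: 0.02 m
add_padding(Length.of_meters(5))  # padding: 5.0 m
```

The conversion math lives in one place, and a bare `add_padding(2)` no longer type-checks.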


Sure, but this isn't a solution that's unique to statically-typed languages, and the syntax can get more awkward if you use a language without good OO support.

I also haven't come across a thorough implementation of unit conversions to do stuff like this (not that it's not a pattern that works -- I personally like this); it might be common in domains where dealing with real units is central to the business, but in industries where that's done incidentally, convenience classes like this just aren't around (in my experience).


the article mentions strong types


I can understand the reasoning in this specific case. But having every function designate its unit is a new Hungarian notation. Imagine working with:

setSpeedInKilometersPerSecond vs. setSpeed

I'd rather have it baked into the language ala CSS ("100ms").


First, I'd say that you probably want to invest in your IDE or editor.

But more important: there's middle ground. What about setSpeedMmps() if setSpeedInMillimeterPerSecond is too noisy?

My rule of thumb is that the unit or type designation should not overshout the most important info. Which, in the case of setSpeedInMillimeterPerSecond, is arguably the case. But the solution is almost never binary; there is usually a more succinct version. Everyone understands km, mm, sec, hr (though the win with such is silly), px and so on.


> What about setSpeedMmps

Mmps would be confusing and frustrating the first time, I think.


Do they charge you by written bytes? I really don't see the problem with larger function names...


"It's annoying" is a pretty good reason. If you don't mind function names like that, go become a Java Enterprise programmer, see how you like it after a few years.

In addition: it's the wrong solution to the problem. The units shouldn't be baked in to the function definition, just the type of quantity. You should be able to do both:

    setSpeed(10 m/s)
and

    setSpeed(45 km/h)
Which you can do in many languages. In fact, time functions in C++ work exactly like this. If you want to suspend a thread for 10 seconds you do

    std::this_thread::sleep_for(10s);
but if you want to suspend it for 5 milliseconds, you do

    std::this_thread::sleep_for(5ms);
And if you just supply a unitless number, it's a compiler error.


You are missing one crucial element here: `setSpeed(meterPerSecond: float)`


Are you coding without any form of auto completion?


As silly as this may sound, you may run into issues with certain linters or formatting tools in some languages.

A prominent example is PEP8 in Python, which suggests a line length limit of 79 characters (though 80-100 chars are considered to be "OK" as well). Long variable or function names can lead to distracting "forced" line breaks in such scenarios.


Is anyone these days? That said I, personally, do prefer more concise naming.


IMO it's about readability, not the typing that can be automated.


For C++ you can use boost units. It is basically built up of template magic which only compiles when your units are correct. The disadvantage: when it doesn't, it explodes in a huge template error. But when it does, you can be pretty certain (depending on the actual calculation) that your code is correct. It is also fully verified at compile time, so there is no runtime overhead.

https://www.boost.org/doc/libs/1_78_0/doc/html/boost_units.h...


I prefer using timedelta (or TimeSpan for the .NET crowd) but I’ve run into a somewhat funny resulting problem: often these time values are read in from a config file, so now people need to know/remember the serialized time span format. Like what is “00:50”? Is it fifty minutes? Seconds?


Sure, we got used to the format. I have a gist that among other things lists all of the cases, including the harder ones such as "10ms", or "3 days".

FWIW, in "TimeSpan for the .NET crowd", you would write "00:00:50" for (zero hours, zero minutes and) fifty seconds. Hours, minutes and seconds cases are easy now that we know the format.

Yes, you can omit some of the zeros, but your comment above makes the case that for clarity, you should not.


That's why you use a constructor with the unit names, not a conversion from a unitless string.


You don't have to use the default serialization format. You can have `timeout_seconds = «int»` or `timeout = «int»s` in your config file, store it internally as timedelta, and convert when you read the file.
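One way to sketch that in Python (config key names hypothetical): the key carries the unit, and the program converts to timedelta at the boundary so nothing downstream ever sees a bare number.

```python
from datetime import timedelta

config = {"timeout_seconds": 50}  # unit lives in the key, value is a plain int

def read_timeout(cfg: dict) -> timedelta:
    """Convert at the boundary; everything downstream gets a timedelta."""
    return timedelta(seconds=cfg["timeout_seconds"])

timeout = read_timeout(config)
print(timeout)                  # 0:00:50 -- unambiguously fifty seconds
print(timeout.total_seconds())  # 50.0
```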


Seems to negate the motivation for using a timespan datatype in the first place.


Along these lines: if you are designing an API for pricing data or fintech and think “let’s send things in integer cents of the currency so there aren’t any rounding errors” - setting aside that there are many situations where this is not appropriate, if you do still want to do this, for the love of Pete please name your variables and attributes with `_cents` in the names. No exceptions. It is the biggest unit-related footgun imaginable, and since it’s not a “unit” technically, it’s easy to overlook.
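A sketch of that naming discipline in Python (variable and helper names hypothetical): every integer amount carries `_cents`, and the only dollar/cent conversions live in clearly named helpers.

```python
def dollars_to_cents(dollars: float) -> int:
    # round, don't truncate: 19.99 * 100 is 1998.9999... in binary floats
    return round(dollars * 100)

def cents_to_dollars(cents: int) -> float:
    return cents / 100.0

price_cents = dollars_to_cents(19.99)   # 1999
tax_cents = 160
total_cents = price_cents + tax_cents   # all arithmetic stays in cents

print(total_cents)                      # 2159
print(cents_to_dollars(total_cents))    # 21.59
```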


Assuming all currencies have 2 decimal units and subunits have 1:100 ratio is another major footgun: https://en.m.wikipedia.org/wiki/ISO_4217


Absolutely agreed! And a more nuanced approach is what Stripe does, which is to take the "zero-decimal-currency" as the unit of record, be that USD cents or the non-subdivisible JPY. https://stripe.com/docs/currencies#zero-decimal

But in that case, and especially in that case, in any system that might interact with, say, a UI where a field is in decimal USD, Stripe needs to be careful about which things are price_decimal and price_zerodecimal. Unless you have a near-religious fervor that all variables not annotated with _decimal are assumed to be _zerodecimal, in such a situation one might go so far as to auto-reject any PR that does not suffix all variables explicitly. Because the alternative is hard-to-detect 100x-off bugs that occur when battle-tested systems/libraries are reused in different contexts.


As we all know, naming things is so difficult.

Most currencies do not have "Cent" as a base unit. And there are currencies without a smaller base unit, most notably japanese Yen.

The problem is that there exists no established currency-neutral name to distinguish between the two possible units, which coincide in cases such as the Yen.

I was thinking about the following:

  class AmountOfMoney
    property int FinerAmount
    property decimal CoarserAmount
In the case of yen, (FinerAmount == CoarserAmount) is always true.

Any suggestions for better wordings?


Splitting the integer and the decimal part is always annoying because now you have to deal with non-normalized quantities. An option is to just use a custom float type:

  class AmountOfMoney
    int mantissa
    int exponent # base 10


This is a misunderstanding.

If we have for example USD 10.42, the values in my implementation are:

  FinerAmount = 1042
  CoarserAmount = 10.42
  
The profile of the class was abbreviated. Of course, we also need a currency property for the general case:

  Currency = USDollar
The definition of the class for the currency in pseudo-code:

  class Iso4217CurrencyCode 
    int NumericCode
    string AlphabeticCode 
    int DecimalPlaces   
    
  static USDollar = new Iso4217CurrencyCode(840, "USD", 2);
  static Yen = new Iso4217CurrencyCode(392, "JPY", 0);
The constructor for the amount of money only uses FinerAmount, and CoarserAmount is calculated:

  CoarserAmount => Convert.ToDecimal(FinerAmount / Math.Pow(10, Currency.DecimalPlaces));
So

  foo = AmountOfMoney(FinerAmount: 1042, Currency: USDollar);
  
sets foo.CoarserAmount automatically to 10.42.

But

  bar = AmountOfMoney(FinerAmount: 1042, Currency: Yen);
  
sets bar.CoarserAmount automatically to 1042.
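For what it's worth, the same scheme as a runnable Python sketch (names translated from the pseudo-code above; Decimal keeps the derived amount exact):

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Currency:
    numeric_code: int
    alphabetic_code: str
    decimal_places: int  # ISO 4217 minor-unit exponent

USDollar = Currency(840, "USD", 2)
Yen = Currency(392, "JPY", 0)

@dataclass(frozen=True)
class AmountOfMoney:
    finer_amount: int
    currency: Currency

    @property
    def coarser_amount(self) -> Decimal:
        # Derived, never stored: scale the integer amount by the
        # currency's decimal places. Decimal avoids float rounding.
        return Decimal(self.finer_amount) / (10 ** self.currency.decimal_places)
```

So `AmountOfMoney(1042, USDollar).coarser_amount` is `Decimal('10.42')`, while with `Yen` it stays `Decimal('1042')`.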


In Python you can force the use of a parameter’s name with *

  def my_sleep(*, seconds: int):
     pass

  my_sleep(seconds=3)
This will force those who use your function to use the parameter name. A sort of documentation I suppose.


That's in the article...?


Sometimes the verbosity annoys me a little while writing it, but I do think Rust made the right decision making you write code like the following to use the standard sleep function.

  std::thread::sleep(Duration::from_millis(300));


In Ada you can do something similar:

    delay 0.3;
Or with package Ada.Real_Time:

    delay To_Duration (Milliseconds (300));
Or use `delay until`:

    delay until Clock + Milliseconds (300);
`delay until` is useful in loops because `delay` sleeps at least the given duration, so you get drift. (You call Clock only once before the loop, store it in a variable, and then update it by adding the time span each iteration.)
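The same drift-free pattern can be sketched in Python using absolute deadlines (the function name is made up for illustration):

```python
import time

def run_periodically(task, period_s, iterations):
    # Like Ada's `delay until`: compute each deadline from the previous
    # one, so the task's own runtime does not accumulate as drift.
    next_deadline = time.monotonic() + period_s
    for _ in range(iterations):
        task()
        time.sleep(max(0.0, next_deadline - time.monotonic()))
        next_deadline += period_s
```

A plain `time.sleep(period_s)` at the end of each iteration would instead add the task's runtime to every period.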


In fact std::thread::sleep doesn't exist. There's sleep_for and sleep_until. Time deltas and time points are incompatible at the type level, so you have to do:

   std::this_thread::sleep_until(std::chrono::system_clock::now() + 300ms)
That's true for all functions that take timeouts (At least since C++11).

edit: s/thread/this_thread/


GP's snippet was in Rust (not that I know if it's valid Rust code).

However, in C++ std::thread is a class and it does not have a static member function sleep_until. (However, std::this_thread, which is a namespace, does have such a function).


D'oh! Thanks for the correction! Rust using std:: for its standard library does make some code snippets ambiguous!


Yeah, I agree. :-)


If you're passing around parameters, use a Duration class. Most languages have one:

* https://docs.oracle.com/javase/8/docs/api/java/time/Duration...

* https://docs.microsoft.com/en-us/dotnet/api/system.timespan?...

* https://en.cppreference.com/w/cpp/chrono/duration

If you're putting something in a config file, as the blog says - always put the unit in the key name.


If you're using Javascript or Typescript, there is an Eslint rule called "no-magic-numbers" which enforces a rule where numbers are required to be assigned to a variable: https://eslint.org/docs/rules/no-magic-numbers

You can set ones to ignore, a minimum value for the rule to apply, or require the variable to be a const.


I'm a fan of the middle ground here where you limit magic numbers to known constants and arithmetic. This means some constants don't have magic numbers, but also some inline uses don't necessarily have constants.

For instance 7, 24, 60, 1000 allowed in defining units of time, and any other intervals are defined as 5 minutes = 5 x 60 x 1000.

Still, I've had plenty of experiences with 'is this unit in seconds or milliseconds'. As I recall, Gilad Bracha at one point looked at attaching units to numbers but I think he got wrapped around the axle on automatic conversions - if you divide meters by seconds, this is a meters/second unit. But do you convert to newtons or joules automatically too?


60 minutes or 60 seconds?


This is more about units.

int length = 5

Is quite different than:

int length_in_meters = 5


I've heard of some Ada coding standards that require this in all variable names. It must have made the code awfully clumsy, but as the examples in OP show, it can be made nice. So the general advice is good. I'm slightly disappointed that the Haskell library didn't use a newtype, e.g. threadDelay (Microseconds 300).


Haskell has a couple libraries that would be great here; either the excellent time library, or the very powerful dimensional library with full statically typed SI units. Of course for something as core as threadDelay, a newtype is more appropriate as you say.


Another approach (used somewhat by Python, at least for time units) is use SI units or some other standard or convention, and be willing to use floating point numbers for measurements. That was considered bloat in the old days, but reasonable general purpose computers these days almost always have FPU hardware.


"Native" Ada way would be to declare units in types and require explicit casts (also, you can prevent casting to unsupported types, so no retrieving Int64 from hypothetical Duration_Nanoseconds


Most people are talking about time here, but the one I've seen that has caused the most issues in my experience is deg/rad. UI are pretty much always going to be in degrees, and trigonometric functions in radians, but if you forget to convert it's not always going to be obvious.


Yes, don't use Magic Numbers, but use named constants instead.

http://catb.org/jargon/html/M/magic-number.html

https://en.wikipedia.org/wiki/Magic_number_(programming)

    delay_ms = 300;
    [...]
    Thread.sleep(delay_ms);
Works in any programming language.


Fails in any programming language, too.

    delay_seconds = 300;
    [...]
    Thread.sleep(delay_seconds);
This kind of bug can easily surface if you need the constant for calls taking different units and can be hard to spot.


Yes; still parent solution is better than nothing if you have to deal with an existing API. At least if you make a mistake there is a chance someone else might spot it instead of wondering what the original intent was.


C++ has, in recent years, improved support for this. The best state of affairs is with time units:

    using namespace std::chrono_literals;
    auto lesson = 45min;
    auto day = 24h;
    std::cout << "one lesson is " << lesson.count() << " minutes\n"
              << "one day is " << day.count() << " hours\n";
and you can compare durations specified with different units etc. The mpusz/units library, which may go into the standard at some point, lets you do this:

    using namespace units::isq::si::references;

    // simple numeric operations
    static_assert(10 * km / 2 == 5 * km);

    // unit conversions
    static_assert(1 * h == 3600 * s);
    static_assert(1 * km + 1 * m == 1001 * m);
and also write conceptified functions which "do the right thing", e.g.:

    constexpr Speed auto avg_speed(Length auto d, Time auto t)
    {
        return d / t;
    }
this will normalize appropriately. But - the specific units used will need to be known at compile-time. Choosing units at run-time is a different kettle of fish.


I recently worked on some disk file formats (.vmdk, .vhd), and I always put the units in the name, because I'm always switching byte, sectors, blocks. Same for addresses, is it a LBA on the virtual disk, or an offset in the disk image file.


When I learned Objective-C I remember being flabbergasted at all the exceedingly wordy method and argument names. At some point I read a blog post[0] that said that while this is out of the ordinary compared to most code, having long, descriptive names helps when reading code.

Some people complain that it involves too much typing or takes up too much space. In reality, we spend much more time reading code than we do writing it, so names that provide context, direction, and fore-shadowing are really useful.

I still tend to write method names and important variable names like that to this day in pretty much any language I use. IMHO, it's a great hack.

An extreme example is in the NSBitmapImageRep class [1]

[0]: https://www.cocoawithlove.com/2009/06/method-names-in-object...

[1]: https://developer.apple.com/documentation/appkit/nsbitmapima...


Autocomplete handles the "too much typing" case.


> Don’t design your config file like this:

> request_timeout = 10

> Accept one of these instead:

> request_timeout = 10s

> request_timeout_seconds = 10

What does ”s” imply? Can I write “10m” for 10 minutes, or is that 10 months? Non-standardized syntax is dangerous.

For config files, I use RFC 3339 duration syntax; i.e. “PT5M” for five minutes. It was the most standardized syntax for semi-human readable time periods which I could find.
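A minimal, stdlib-only sketch of parsing the time portion of that syntax in Python (the full grammar also allows dates, weeks, and fractional values, which this deliberately rejects):

```python
import re
from datetime import timedelta

_DURATION_RE = re.compile(
    r"^PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?$"
)

def parse_duration(text: str) -> timedelta:
    """Parse durations like 'PT5M' or 'PT1H30M' into a timedelta."""
    match = _DURATION_RE.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(v) for name, v in match.groupdict().items() if v}
    return timedelta(**parts)
```

One nice property: an ambiguous value like "10m" fails loudly instead of being silently read as minutes or months.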


`m` is not a legal value, should be either `min` or `mo` for minutes or months respectively.

`s` is actually one of the SI base units: https://en.wikipedia.org/wiki/International_System_of_Units#...

edit: actually, month should never be used for the case described anyway... 28-31 days?


If the syntax isn’t obvious without looking at the manual, you might as well use an actual standardized syntax. For months (without context, and where you need an absolute interval), I just interpret it as 4 weeks.


I just interpret it as 4 weeks

Funny, I would interpret a month as 30 days (since the lunar cycle is slightly over 29.5 days).


If the syntax isn't obvious, you should use however much verbosity it takes to make it obvious, because people will still ignore the standard and make mistakes.


Had I seen PT5M I would have never guessed it meant five minutes, so you would need to refer to the RFC then.


Agree with the post.

For JSON, I always do { unit:XXXX, value:XXXX }. It is quite verbose, but it helps a lot when you least expect it.

I also wrote a small utility that transforms between values like { unit:'m', value:5 } and { unit:'km', value:0.005 }, which makes it super easy for me to pipe stuff from one context to another when it's needed.


Do you have a link to this utility?


Oh sorry, I haven't open sourced it, I never thought of it actually until now.

As I now realize, it could be quite useful for others.


Java has a good library that supports a variety of measurement units: javax.measure https://docs.google.com/document/d/12KhosAFriGCczBs6gwtJJDfg... http://unitsofmeasurement.github.io/unit-api/site/apidocs/ja...

The java.time API goes in the same direction, but is obviously limited to things around time.

I think units should be really expressed with the type system, because types give meaning _and_ safety. If used APIs require typed units you wouldn't even need a coding convention to put the unit in a variable name.


An often better solution is to have a type which is not an integer/string as input and have various ways to convert from/to integers in a unit showing way.

Types like e.g. Duration (for e.g. sleep), Instant (for calculating time passed based on monotonic timestamps), Distance etc.

Though at the same time, in statically typed languages, being generic over the unit (e.g. Duration&lt;TimeUnit&gt; with types Duration&lt;Seconds&gt;, Duration&lt;Hours&gt;) is often unnecessary and hinders productivity. Similarly, having unit types (e.g. passing instances of Seconds, Hours to sleep) is also often not necessary and more harmful than good. (Exceptions exist, e.g. where you need high precision and range while keeping storage constraints as small as possible and juggling units, as in some scientific or embedded applications.)


Or use an IDE? Here[0] is a screenshot of the Java example in idea, not only does it show an inlayed parameter name hint, but simply mousing over the method will show you the documentation.

[0]: https://i.imgur.com/8UcEO5N.png


This doesn't work when looking at a PR.

It also doesn't make a unit error stick out, as you have to take the time to specifically hover to check. What I mean is: consider that you've opened a mature code base, are browsing around, and scroll past:

    timeout = 60000;
I'd assume this is milliseconds, and whether I check would depend on what I'm doing and probably also my mood.

Whereas if I see:

    timeoutSeconds = 60000;
It'll draw my attention and no matter what I'm doing I'll either open a bug or just fix it.


Programs should be comprehensible as text, without machine assistance.


Preach! This is a big issue across all projects written in any programming languages.

Without this small info, you have to grep the docs or worse, grep the source code itself. There's nothing wrong with looking at the source code but you shouldn't have to if you just want to use the public interface.


Please do not, unless you are actually implementing unit conversion logic.

Pass around types encoding certain quantities. You do not want a Thread to "sleep for x units", but rather "sleep for this specific Duration", therefore pass around Durations and implement appropriate utilities `.toSeconds(Duration d)`, `.fromTime(Time t1, Time t2)`. The very moment your `.frobnicate(int time_in_seconds)` gets wrapped in `foobar(int duration)` all meaning is lost and you have no control over it.

Type systems are there to encode meaning (sometimes including possible values) behind a value - use it. And if you use highly dynamic prototyping language in production... Well, inability to encode and enforce meaning behind a value is part of the compromise.


From the article:

> Option 2: use strong types, An alternative to putting the unit in the name, is to use stronger types than integers or floats. For example, we might use a duration type


This is not exactly the same thing, but for the love of god, if you have a monitoring or logging product, make the date and time zone visible wherever time is displayed. I cannot tell you how many times I've been sent screenshots of graphs, showing some catastrophic scenario at, say, 08:43, with no idea of what time zone the graphs are using, or what day the graph is showing.


make everything UTC, people. It'll make your life way easier. Try it out.


Internally yes, this is a no brainer. However in log files that are often read directly by humans, who want to see things in local time, it can become a balance and you might need to fight some admins for it. A proper structural logging system solves this, but log files on disk and other ad-hoc diy logs is still extremely common to see.


On-disk logs should use ISO-8601 in UTC, ideally first thing on the line.

That way you can merge them from different sources and sort them to help figure out interactions and causality. There might be clock drift from different sources but that's a much smaller problem than trying to eyeball multiple different logs concurrently, or write ad-hoc scripts to enable merging.


For pure servers, what you say is easy and yes, almost exclusively, the preferred option. The tricky situations usually occur in CI, where you script together a lot of weird tools, most of them originally designed to be run by hand on a developer's workstation, where people want to see local timestamps and nobody ever had time to implement log config. Some days you can consider yourself lucky if the log files have any timestamps at all.


Especially in log files I want to see UTC. When you have servers all around the world, I do not want to figure out what timezone a specific server was running in, nor do I want to convert between timezones when comparing logs of different servers.


Just live in GMT+0. Easy!


I have sometimes put my computer in the Iceland time zone for that reason when dual-booting with Linux, since Linux prefers to run the hardware clock on UTC, and the Iceland time zone is UTC+0 with no daylight saving time. Making Windows believe it is in Iceland makes sure it does not mess up the clock. Though it is possible to set Windows to UTC too, with the same effect.


Only if literally _everything_ is UTC.

If you can’t go that far and still need local time zones in some specific part, just embrace your fate and keep and display timezones everywhere, even when it’s UTC.

TBH, I’ve never seen an organization that could 100% move everything to UTC. Especially with countries following DST and other shenanigans, and you want to quickly be able to compare events at the same local time.


Use timezones as an interface accommodation and not a storage format.

If you're going to do it in storage, keep it outside the ISO 8601 string. Chronology is the most important thing to get right; locality comes after.


Still need the timezone in the timestamp, UTC or not.


Absolutely. I support this 100%. But even if everything is in UTC, please, please make sure that it's clear that the time is UTC. That could mean an ISO 8601 timestamp with 'Z' or '+0' at the end, or a 'UTC' in the corner of your graph, or using unix timestamps.

Unless the whole world agrees to communicate times in UTC, at some point, you will share a screenshot or a log message with someone outside of your company, and they cannot assume what time zone it's in.


This is good advice, but ONLY for timestamps.

When people try to apply this to other time related info, it's always a disaster.


One especially frustrating version of this is a log or the like reporting a time as UTC but it’s not. So close.


Yeah, this also happens in code. The solutions are the same: either use a type that is unambiguous (`timestamptz` in Postgres, `datetime` in Python with `.tzinfo is not None`, `DateTimeOffset` in .NET, `UTCTime` in Haskell etc.), or if you have to use naive datetimes, put the time zone in the name and call it `datetime_utc`.
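In Python, for instance, the ambiguity can also be rejected at the boundary with a small guard (a sketch; the function name is made up):

```python
from datetime import datetime, timezone

def require_utc(dt: datetime) -> datetime:
    # Refuse naive datetimes instead of silently assuming a zone.
    if dt.tzinfo is None:
        raise ValueError("naive datetime; attach a time zone explicitly")
    return dt.astimezone(timezone.utc)

aware = datetime(2022, 3, 21, 8, 43, tzinfo=timezone.utc)
```

`require_utc(aware)` passes through unchanged, while `require_utc(datetime.now())` raises, because `datetime.now()` returns a naive value.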


If it was a Google product, it's implicitly Pacific Standard Time.


I prefer the use of Unix timestamps. Easier to filter with standard Unix tools, and if I really need formatted date+time in my output, I can always use awk.


I used to prefer unix timestamps too. But I Saw The Light.

For one: those standard Unix tools can convert from date/time values to timestamps just as easy. Arguably easier.

And a timestamp is merely an int, so many languages, databases, APIs and so on lack ergonomic methods to convert timestamps to datetimes, but have easy ways to do the reverse: it's almost always easier to convert dates, times or datetimes to timestamp ints (or floats) than to convert an int or float back to a date, datetime or time.

A timestamp.toString() is unfamiliar to humans, a datetime.toString() is not. The list of small benefits in favor of actual date or time types just goes on.
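In Python, at least, both directions are one call each; the easy-to-forget part is passing an explicit tz on the way back:

```python
from datetime import datetime, timezone

dt = datetime(2022, 3, 21, 12, 0, tzinfo=timezone.utc)
ts = dt.timestamp()                                  # aware datetime -> Unix seconds
back = datetime.fromtimestamp(ts, tz=timezone.utc)   # Unix seconds -> aware datetime
```

Without `tz=...`, `fromtimestamp` returns a naive local-time value, which reintroduces exactly the ambiguity being discussed.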


I don't like it. Say I open a database table and search for some events, then look at the created_at column and see a timestamp; now I need to copy-paste timestamps into a webpage and convert them to human-readable time, only to notice the event is 5 years old and not what I am looking for.

What do you do in this case (assuming you've worked with database logs)? Export to CSV and then do it with Unix tools?


Depending on the SQL dialect, you can convert it in SQL, or even add a virtual column with the conversion for your convenience.


This usually bites me when I see a function that takes a parameter named angle. I usually have to read the docs or an inline comment to know whether it's in radians or degrees. In the worst case, I have to go read the implementation to figure it out.


C++ gets this right for time durations, and has packages that do it for other units as well.



Yes, the first being part of the standard and the abseil one a 3P implementation.

Boost units add all sorts of dimensional analysis.


user defined literals

using namespace std::chrono;

std::this_thread::sleep_for(30s);


In a lot of software you can enter units into input fields. For example in Blender I can move an object in a direction by entering something like: `(1m+5inch)/2`. Or I can set the max render time to `20sec` or `2min`.

When the input box is a function, the only unit you need to know is 'distance'. So I don't completely agree with the article. A sleep function does not need the unit in its name but should accept a domain unit: a duration. The function itself can convert it to something it can handle.

`sleep(2s+5ms)` should just sleep 2 seconds and 5 milliseconds.
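A sketch of that kind of sleep in Python, where the domain type carries the units and the function converts internally (the function name is made up):

```python
import time
from datetime import timedelta

def sleep_for(duration: timedelta) -> None:
    # The caller spells out the units; the function converts once, internally.
    time.sleep(duration.total_seconds())

# "sleep 2 seconds and 5 milliseconds":
sleep_for(timedelta(seconds=2, milliseconds=5))
```

The call site stays readable and the ambiguity of a bare `sleep(2005)` never arises.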


Good C++ library for that topic is [0]. You can even go further and combine with something like [1] which is super helpful for kalman filters and other stuff where you have heterogeneous units in one vector.

[0] https://github.com/mpusz/units

[1] "Daniel Withopf - Physical Units for Matrices. How hard can it be? - Meeting C++ 2021" https://m.youtube.com/watch?v=4LmMwhM8ODI


but how can we keep the code readable even for people who haven’t encountered time.sleep before?

Contrarian opinion: You don't. You make people look up and internalise such information; that way they'll actually learn, as otherwise they'll forever stay within the realm of "beginner". Perhaps that's a goal for those who want to make programmers fungible (and I suspect a lot of the "readability" movement is merely an extension of that), but I don't think that's something we should encourage.


How is rote memorization of the arguments the stdlib of a language the difference between a beginner and an expert? I forget those kinds of things all the time, even for the language I use 99% of the time.


How is rote memorization of the positions of the keys on a keyboard the difference between a beginner and an expert?

Or for that matter, the spelling of words and their meaning in your choice of (human) language.

We call those who can't do the latter "illiterate".

Imagine if you had to look up almost every word you speak or write in a dictionary.


Being able to touch type makes you a good typist, and being good at spelling makes you good at spelling. Those two don't make you a good programmer or writer.


People have died because airplane mechanics thought they were working with one unit instead of another. Don't be a macho tough guy who thinks you'll never slip up.


Disagree.

What you say works at a small startup. Absolutely.

My reality and lots of other people at companies with larger code bases: you need to constantly look at code you have never seen before in one of many languages used at your company using one framework or another that is out of date by years or just came around the corner and you just heard the name of for the first time.

This is the reality of many an architect or team lead or principal engineer. Of course you will say: why make everything better for them, works fine for me who I only ever work in language X with framework Y? I agree that makes sense for the you of now. Why care?

Think about the future you. The principal engineer you. The architect you. Heck even just the 9 months from now you when you have moved on to the next framework. Never mind any other changes.


You’d prefer we made our code as hard to read as possible to increase job security?


Perhaps sleep in $YOURLANG is essential/trivial, so there's an argument that you should memorize it. Of course! It's just one bit of info. But there are more APIs than sleep that take a number. Are they guaranteed to be consistent? No. What about `someVendorApi.setTTL(ttl: number)`? Is this also another thing to be looked up and internalized? What if there are hundreds of APIs that take numbers?

I would rather have my team spend time internalizing good design rather than the trivia of what units are associated with every numeric argument in every codebase.


Although there is some merit to learning by repetition and using memorization to keep core concepts top of mind, this is a very bad example of it, since it's literally just trivia. There is nothing at all fundamental about time.sleep being in seconds. People memorize it because they have to and have used it enough times.

If time.sleep took a timedelta it becomes impossible to use incorrectly with a type checker since the caller specifies their own units. There is no merit to worshiping ambiguous design.


Sure, but how does the reader of the code know if the writer of the code did internalize this information or just made a mistake?


Swift is pretty good at this. e.g.:

  try await Task.sleep(nanoseconds: 300_000_000_000)


In Swift I quite like DispatchTimeInterval's approach [1] (see Enumeration Cases). In most of my projects I end up adding a simple extension to TimeInterval to get the same behavior, which makes reasoning about time very simple, e.g.:

  Date().advanced(by: .hours(2) + .seconds(10))
I'm honestly not sure why TimeInterval doesn't include this representation by default.

1. https://developer.apple.com/documentation/dispatch/dispatcht...


or just have multi-dispatch, Julia:

  julia> using Dates

  julia> Millisecond(10)
  10 milliseconds

  julia> sleep(Microsecond(100))

  julia> sleep(Millisecond(10))


Maybe this is just me, but this suggestion seems worse than the problem:

  def frobnicate(timeout: timedelta) -> None:
      ...

  timeout = timedelta(seconds=300)
  frobnicate(timeout)
Now instead of remembering that frobnicate takes an argument of seconds, one now needs to remember it takes a completely different type, the constructor for which now also needs to be memorized. When the main problem presented is one of memorization, this seems obviously worse?


The problem is one of readability, not memorization. A lot of readability amounts to taking what the initial author already knows and expressing it so that it's easy for everybody who comes after to know too.

Making implicit units explicit is a great example of that. Look at Martin Fowler's writings on Money types, for example. Or look at the $125 million failure of a space probe because people used implicit units: https://www.simscale.com/blog/2017/12/nasa-mars-climate-orbi...


> the constructor for which now also needs to be memorized

This seems less like a memorisation problem and more of an IDE one. If I'm presented with a type in my IDE, I can just look through its constructors and methods to find the one I require; no need for memorisation.


I suppose that it varies depending on the project/person, but I agree that using timedelta is less intuitive than just using seconds (as is done everywhere in Python).


Your editor will just tell you the type you need, no?


I remember as a kid, my Physics teacher used to penalize us for not writing the unit in the calculations. It seemed absurdly archaic at that time, but it makes so much sense now.


Thread.sleep (the java example) is ancient, but java is at least consistent and uses millis for everything. Almost all other timing functions in java have a 2-arg setup: `.foo(5, TimeUnit.HOURS)`. Where TimeUnit is an enum that has the usual options (HOURS, MILLIS, SECONDS, etc). It doesn't go beyond 'days' (once you get to months in particular, is that 28 days, 30 days, 31 days, 30.437 days (which is the actual average)....).


PHP offers sleep[1] and usleep[2] which sleep for a period in either seconds or microseconds respectively. I don't know why every other programming language ever hasn't adopted this habit. (There's also time_nanosleep[3] which definitely could be better defined, but it's PHP, sleeping for some amount of time between a nanosecond and a microsecond is pretty hilarious when you're running on an interpreter).

1. https://www.php.net/manual/en/function.sleep.php

2. https://www.php.net/manual/en/function.usleep.php

3. https://www.php.net/manual/en/function.time-nanosleep.php


The Unified Code for Units of Measure (UCUM) is a code system intended to include all units of measures being contemporarily used in international science, engineering, and business. The purpose is to facilitate unambiguous electronic communication of quantities together with their units. The focus is on electronic communication, as opposed to communication between humans. A typical application of The Unified Code for Units of Measure are electronic data interchange (EDI) protocols, but there is nothing that prevents it from being used in other types of machine communication.

http://unitsofmeasure.org

An example of use is in the FHIR (Fast Healthcare Interoperability Resources (hl7.org/fhir)) standard as a valueset.

https://www.hl7.org/fhir/valueset-ucum-units.html

[edit to remove dupe link]


This is one of the reasons why I love F#, baked in support for units of measures. So instead of having to rename the function to include the unit I can just declare it to take the type `int<second>` and the compiler will complain if you try to pass an `int` or an `int<minute>` unless you declare an explicit conversion between the units.


Tangentially related:

Would love HN’s recommendations for blogs/writing on how to do better programming.

A lot of writing that I am exposed to is about the business of software, or engineering leadership, or architecture/tech stacks — which, to be clear, I really like reading, and find value in!

But I would love to read more thoughts on how to actually write good code on a regular basis.


Forever and always. :) If I see a numerical value that has units but doesn't in the name, I add it.

Also, I love F#'s units of measure.


PHP mess detector has a rule for this called "Magic Number Detector" (https://github.com/povils/phpmnd) which will cause an error when it encounters numbers that violate the parameters you set.


On a similar topic: instead of var names, in typed languages you can use domain-specific types for primitives.

https://overcoddicted.com/domain-specific-types-for-primitiv...


I end up writing this function in most projects I work on. Python version:

    @lru_cache(maxsize=None)  # memoize
    def seconds(time_code):
        """number of seconds defined in DNS format: '2m30s' = 2 min 30 sec = 150s. h = hour, d = day, w = week"""
        result = 0
        number = ""
        for char in time_code:
            if char.isdigit():
                number += char
            else:
                secs = {"s": 1, "m": 60, "h": 3600, "d": 24 * 3600, "w": 7 * 24 * 3600}[char]
                result += int(number) * secs
                number = ""
        return result


Passing the time unit as a second parameter is used a lot in Java APIs and seems to work well


Java's Duration is also useful for time units

    void foo(Duration duration);

    var result = foo(Duration.ofHours(2));
You can also do simple calculations easily

    var bar = Duration.ofHours(2).plusMinutes(30);


...and so the tradition of loading up variable names with meta information that should really be in the type system is passed down from Hungarian notation to present day...

It's funny how the article talks about using the type system, but the title does not.


My strong recommendation at work is: _do not_ use configuration values such as "int dataCacheTimeMins" as these are prone to error, inflexible (cannot represent 2 mins 30 seconds) and not used consistently (will be mixed in with "int dataRequestTimeoutSeconds" and "int someOtherTimeMillis" and values named _without_ explicit units).

Instead always use "TimeSpan dataCacheTime" as this is more flexible: can hold values from sub-millisecond up to multiple days, and is easier to use with framework methods that will also expect this type.

In other words, use the appropriate type instead of putting the duration unit name in the var.


Eh, it seems the author never worked for a big organization, and when he finally did, he developed a pet peeve.

Lemme tell you, every single organization that respect itself, it will tell you, during code review, that you do not use magic numbers in code. It's a red flag, so his "Do this: frobnicate(timeout_seconds=300)" will actually fail the code review.

Here is what you would do instead. Declare a constant in some constants only unit, with explanation what it does and then use that constant, like this:

//constant unit

const TIMEOUT_SECONDS = 300;//bla bla why this is 300 seconds, usually from client requirement

//code unit

frobnicate(TIMEOUT_SECONDS);


But that still using a magic number, so obviously you should do something like this:

   const TIMEOUT_SECONDS = ONE * HUNDREDS;
The definition of ONE and HUNDREDS is left as an exercise.

Snark aside, the rule zero of all coding conventions is that they should not be applied if they do not make sense in a specific case.


The "bla bla" explanation will suffice. Usually it will include a link to documentation; no need to split hairs further.


In this case, the units are in the variable name. I think that was what the author was arguing for.


Julia seems to have some unit support (via package):

http://painterqubits.github.io/Unitful.jl/stable/


Is this not what comments are for?

frobnicate(300) // Duration in seconds to do X

Sure, change the variable names or whatever as well, but ... kinda seems we already have a way to make less obvious things obvious?


Problem with this is that you now need to ensure everyone adds that comment everywhere the function is used.

If it’s in the parameter name it’s impossible to use it without doing it correctly, which is almost always preferable.


It helps. Even more so if documented with the function signature so documentation (in your editor or ide) can communicate it.

But one better than documenting comments is needing no documenting comments. Self-documenting code is certainly not always feasible, but in this case it's both possible and easy.


in the same vein, please name associative structures (hashes, maps, dictionaries, etc) as value-type-by-key-type (for example "username_by_id", rather than "username_lookup" or such). Conventions like these don't have an impact one-by-one, but rather when applied to a body of code as a whole. When one has to do one lookup by the result of another, the explicit "x-by-y" naming really helps reassure me the right values are being looked up!
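A tiny Python illustration of why the x-by-y convention pays off when lookups chain (the dict names and data are invented for the example):

```python
# value-type-by-key-type naming makes chained lookups self-checking
username_by_id = {42: "alice"}
email_by_username = {"alice": "alice@example.com"}

user_id = 42
# Reads left to right: the key you feed in matches the "_by_" suffix,
# so a mismatched lookup is visible at a glance.
email = email_by_username[username_by_id[user_id]]
```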


In Elixir you can use an unused variable pattern matching:

    Process.sleep(_milliseconds = 300)
I started using it more and more because it makes the code more readable.


There's a good old article about this from Joel Spolsky

https://www.joelonsoftware.com/2005/05/11/making-wrong-code-...

It also makes the distinction between "apps hungarian" (good, unit-types, eg: meters, seconds) and "system hungarian" (bad, storage-types: float, int)

Sadly I've never worked on a codebase that follows this.


A while back I made a units and dimensions library where I worked. Everything was built into the type system, with a required SI conversion for each unit of a certain dimension.

This did the job really well, but it was incomplete. Being able to create new types dynamically, when multiplying/dividing was something that was always missing. I guess now you could probably do this with macro programming, but back then I don’t think my language had it.


For an example of physical units done right check out the GEANT4 monte-carlo system. It pretty much "just works"

https://geant4.web.cern.ch/sites/default/files/geant4/collab...


Well, the only correct way to do it then is the way Python does it.

The SI unit is seconds. So unless you have some weird aversion to using decimals to represent smaller units, any kind of delay function should take input in seconds.

Not sure if this should be a cautionary tale about using units so much as using the correct units in the first place.

Apparently there's a Java function that lets you specify time units as a secondary input. Use that to make it unambiguous.


Well, no. There are languages whose default unit is milliseconds.

JavaScript, the foundation of the web, is one of them. Sending seconds through an API to a front-end without documenting it is just confusing and hugely ambiguous.

And milliseconds make more sense from the GUI's perspective. What would "paint the next frame after 0.016s" even mean to a reader? It's much easier to read when written as 16ms.


When dealing with vectors, it is also important to establish your frame of reference. Sometimes, though by no means always, there is a well-established convention. When it is not, however, it can get verbose to encode into identifiers. It can also be tedious, as no-one needs to be reminded of it every time they read an expression. In these cases, documentation is the best choice.


When I put together Kal (a compile to JavaScript language, now defunct), I was particular about this and used the sleep syntax:

    pause for 3 seconds


With some other options for units. https://github.com/rzimmerman/kal#asynchronous-pause


This is a very valid argument. Go has the time.Second duration type that is used often, and it helps a lot in understanding things. But things like JS/pure Ruby have just raw numbers, where you have to either check the docs or just remember that sleep takes milliseconds or seconds, etc.

It seems types are useful piece of information. Who would expect?


I have a similar nitpick with boolean arguments. How many times have you seen something like: DoSomething(true, false, false)? It's much more readable if you use enums (in a language that doesn't let you specify the argument name at the call site). E.G. DoSomething(ENABLE_REPORTING, USE_UTC, OUTPUT_VERBOSE).
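A hedged Python sketch of the same idea using `enum.Flag` (the `do_something` function and `Options` class are invented; the flag names come from the comment above):

```python
from enum import Flag, auto

# Named flags instead of bare booleans at the call site
class Options(Flag):
    NONE = 0
    ENABLE_REPORTING = auto()
    USE_UTC = auto()
    OUTPUT_VERBOSE = auto()

def do_something(options: Options) -> bool:
    # returns whether UTC mode was requested, just to have an observable effect
    return bool(options & Options.USE_UTC)

# DoSomething(true, false, false) becomes self-describing:
result = do_something(Options.ENABLE_REPORTING | Options.OUTPUT_VERBOSE)
```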


I prefer to do this with types versus variable/parameter names where possible, but yep runtime names otherwise. It annoys some people but the thing is almost always either autosuggested or showing up in your editor as you type as docs, so… to quote Mitch Hedberg, “we apologize for the convenience”.


And, please, also attach units to numeric input fields in the UI so users know what they're entering.


I wrote about this a long time ago too: https://explog.in/notes/units.html. Units are critical, particularly when crossing language/engineer boundaries.


I thought this was in reference to the HNews post from WashingtonPost about the Eastern Antarctic being “70 degrees” warmer.

https://news.ycombinator.com/item?id=30733387


Does anyone else like to put data types in their variable names? Sometimes in finance I come across dollars and cents presented as an int and sometimes as decimal(18,2). So I don't confuse them I do something like payment_amt_d1802 or payment_amt_int.
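For illustration, a Python sketch of keeping the two representations distinct and making the conversion explicit (variable and function names are invented; `Decimal` plays the role of decimal(18,2)):

```python
from decimal import Decimal, ROUND_HALF_UP

payment_amt_int = 1999              # cents as an integer
payment_amt_dec = Decimal("19.99")  # dollars, decimal(18,2)-style

def cents_to_decimal(cents: int) -> Decimal:
    """Make the int-cents -> two-place-decimal conversion explicit."""
    return (Decimal(cents) / 100).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```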


Though you don't actually need to enforce keyword arguments, any good IDE will give you the (positional) arg names if you mouseover or w/e, so as long as the positional arg is named `timeout_ms` instead of `timeout` it should still be fine.


I think the problem is a bit more nuanced, but I don't have a perfect answer.

This is fine, but relying on an IDE for code reviews really sucks. I'd much prefer to be able to do it on GitHub. And it sucks for any language which infers types :( Scala in particular is awful for reviewing on GitHub


IDE hints like this are useful when you're writing the code, but not so useful when you're reviewing it in online code review tools.

I wish code review tools could provide this hover functionality too.



dimensioned[1] got me interested in Rust. I’m not far enough to recommend it, but the concept seems right.

[1] https://github.com/paholg/dimensioned


Unconventional take: just take the SI base unit any time. For time it's 1 second.


This is good scouting. In Java there is a naming convention, <name>In<Unit>, like TTLInMillis. In C# you often use TimeSpans, like TimeSpan.FromSeconds(123). However, I see this problem more and more often for some reason.


I ended up building a library for elixir a few years back for this kind of thing: https://github.com/meadsteve/unit_fun.


Probably already suggested but I'm a big fan of .Net attributes or other such decorators. A system level Units decorator which was respected in argument passing and maybe even automatically printed would be dope.


Ruby has a very neat syntax that can solve that. For instance, you'd just:

    sleep 5.seconds
or

    temperature = 0.17.kelvin
Not sure that last abuse actually works, but one may try. :-P


Ha. I learned this decades ago. It is handled at the level of my spinal cord.

Same thing with explicit memory allocation. If for whatever weird reason I need to explicitly allocate RAM I first write deallocation.


This is something that I've done forever.

For example, when I create a static value to hold a constant, I usually do it like so:

    static private let _maximumButtonHeightInDisplayUnits = CGFloat(30)


Is there a standard for Durations, like we have for Time? (Like RFC3339)


ISO 8601 specifies durations [1]. For example, Java's Duration type [2] is based upon it.

Depending on your language, and how much type safety you want, you could use something like Haskell's units library [3].

I believe F# also has units [4], possibly built in to the language? (I've never used F#.)

[1] https://en.wikipedia.org/wiki/ISO_8601#Durations

[2] https://docs.oracle.com/en/java/javase/11/docs/api/java.base...

[3] https://hackage.haskell.org/package/units

[4] https://docs.microsoft.com/en-us/dotnet/fsharp/language-refe...
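For illustration, a minimal Python parser for a small subset of ISO 8601 durations (time components only, e.g. "PT2M30S"; a sketch, nowhere near a full implementation of the standard):

```python
import re
from datetime import timedelta

# Handles only the PT...H...M...S time part of ISO 8601 durations
_DURATION_RE = re.compile(r"^PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?$")

def parse_iso_duration(text: str) -> timedelta:
    match = _DURATION_RE.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {k: int(v) for k, v in match.groupdict().items() if v}
    return timedelta(
        hours=parts.get("h", 0),
        minutes=parts.get("m", 0),
        seconds=parts.get("s", 0),
    )
```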


Ahh the cycle continues. We have lost touch with the lore of ages past, and only the old crones know of the Hungarian notation [1]. As we discover the ruins of old we glean their knowledge, and take up their practices because they were from a more civil time. What we don't recognize is that the creators of old didn't solve these problems and they were always in competition with themselves over which path is righteous. This ultimately led to their downfall, and the ruins you have discovered.

Take heed young adventurer: This balm will not solve all your ills. It will be your hard work, compassion, and perseverance that will bring the new golden age and sustain it.

TLDR: We've done this before; it solves some problems and causes others.

1: https://en.wikipedia.org/wiki/Hungarian_notation


*cough* Golang *cough*

    time.Sleep(42 * time.Millisecond)
    time.Sleep(42 * time.Second)
    time.Sleep(42 * time.Nanosecond)
    time.Sleep(42 * time.Hour)


The trouble with Go is that you can't multiply ints by Durations, leading to people doing `time.Sleep(time.Duration(x) * time.Second)`, which is fine by itself, but then I often see people refactor and accidentally remove the second part.

The fact you can go from int to duration without specifying the unit does not give you the strong benefits. It's better than nothing though


> The fact you can go from int to duration without specifying the unit

The unit that is converted in, is in the first sentence of the two-sentence documentation of time.Duration():

    A Duration represents the elapsed time between two instants as an int64 nanosecond count. 
https://pkg.go.dev/time@go1.18#Duration


Yes, in the documentation. But if I'm reading the original code and refactoring, having the docs open might not be my top priority


    time.Sleep(time.Second * time.Second)


So? Non-canonical code can be written in any language. The important thing is: there is a canonical way in the stdlib to specify durations that is easy for the programmer to read.


Ok, I'll bite. Why does time.sleep(secs) not accept keyword arguments, but

    def foo(x):
        print(x)
accept `foo(x='I accept keyword arguments!')`


time.sleep is implemented in C. For Python, adding keyword arguments to functions implemented in C takes a not-insignificant amount of additional boilerplate, so it's often not done.


Ah, of course. Posting before my morning coffee. (:


Julia supports zero-overhead units:

https://github.com/PainterQubits/Unitful.jl



If you’re old you call this the Whole Value pattern.

http://fit.c2.com/wiki.cgi?WholeValue


Naming the method sleepMillis or sleepSeconds would help too.


In my opinion that's too specific as you'd need a dozen methods just to accomplish the common needs. Why not use a type as the argument that's explicit and meets even more use cases? Meaning instead of accepting an int just accept a `timedelta` in python or a `time.Duration` in go, or `2.seconds` (I don't recall the type) in rails?
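A sketch of that duration-typed approach in Python (the wrapper name `sleep_for` is invented for the example):

```python
import time
from datetime import timedelta

def sleep_for(duration: timedelta) -> float:
    """Accept a duration type so call sites can't confuse units.

    Returns the number of seconds slept, purely for observability.
    """
    seconds = duration.total_seconds()
    time.sleep(seconds)
    return seconds

# The unit is explicit at the call site, no docs lookup needed:
slept = sleep_for(timedelta(milliseconds=10))
```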


Meanwhile in Java this is handled by the IDE. The IDE will insert the variable name since you are passing in just a number.

  Thread.sleep(millis: 300)


I don't think a language should require an IDE to be usable.


On a related, but not quite the same, it'd be quite cool to be able to express number literals as: 2K (for 2000), or 2K4 (for 2400), or 1Ki for 1024.
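Pending language support, a hedged Python sketch of parsing such suffixed literals, including the 2K4-style infix form (function name and suffix table are my invention):

```python
# suffix -> multiplier; longer suffixes must be matched first ("Ki" before "K")
_SUFFIXES = {"K": 1000, "M": 1000_000, "Ki": 1024, "Mi": 1024 * 1024}

def parse_suffixed(text: str) -> int:
    """Parse '2K' -> 2000, '2K4' -> 2400, '1Ki' -> 1024."""
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        head, sep, tail = text.partition(suffix)
        if sep:
            scale = _SUFFIXES[suffix]
            value = int(head) * scale
            if tail:  # digits after the suffix fill in lower places: 2K4 = 2400
                value += int(tail) * scale // 10 ** len(tail)
            return value
    return int(text)  # no suffix: a plain integer literal
```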


Discord API is guilty of one of these very examples, they send a retry-after header on 429 responses, but the thing is.. is it in Bananas? Apples? Elephants?


The retry-after header is documented as being in seconds (or a date). https://httpwg.org/specs/rfc7231.html#header.retry-after


Funny because Discord implements it as milliseconds


Unfortunately it's part of http. One thing you can do is to send both the standard Retry-After header for tools that rely on it, and a nonstandard but unambiguous Retry-After-Seconds. That makes responses self-documenting. (One downside is that you now have two numbers, and if you introduce a bug that makes them different, it will be more confusing.)
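A framework-agnostic Python sketch of that dual-header idea (the `Retry-After-Seconds` header name is the commenter's invention, not a standard, and `rate_limit_headers` is a hypothetical helper):

```python
def rate_limit_headers(retry_after_seconds: int) -> dict[str, str]:
    """Build both the standard header and an unambiguous duplicate.

    Deriving both values from one argument avoids the
    two-numbers-drifting-apart bug.
    """
    value = str(retry_after_seconds)
    return {
        "Retry-After": value,          # standard, unit fixed by RFC 7231
        "Retry-After-Seconds": value,  # nonstandard but self-documenting
    }
```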


Clicking on the link I thought, "I wonder if it starts with time.sleep"

It was a pretty safe guess, though, the number of bugs related to this is pretty epic.


C#:

    Thread.Sleep(TimeSpan.FromSeconds(3));
or with its int overloaded method:

    Thread.Sleep(millisecondsTimeout: 3000);


Or just document it. And the programmer should read documentation for the method. time.sleep() docs tells you exactly what units it uses.


Or, unless you need absolute precision, wrap all those primitives in functions that use SI units. So seconds for your sleep wrapper.


F# has units of measure built into the language.

    let time: float<s> = 0.1<s>
    let length: float<m> = 10.0<m>
    let speed: float<m/s> = length / time


My preferred way in Python is:

    time.sleep(timedelta(minutes=5).total_seconds())

Or

    time_to_sleep = timedelta(minutes=5)
    time.sleep(time_to_sleep.total_seconds())


Unless I'm confident that I remember `time.sleep` wants a number of seconds, then I could miss an obvious error there.

For example...

    time.sleep(timedelta(minutes=5).total_milliseconds())
...looks just as plausible. So what does this solve?


It communicates my intent to the reviewer and future readers of the code: 1) I mean to set the duration to 5 minutes, which is more easily parsed by humans than 300 seconds. 2) I'm passing seconds into time.sleep, which makes any mistakes I make more obvious.

I'm not saying time.sleep can't be improved but my method makes it easier to find any mistakes I've made in the future.

As a demonstration, reading your code makes it more obvious that you've passed the wrong unit into time.sleep.


Got it — so it's more about mitigating the badness when it's somebody else's API.

Seems reasonable! :)

When I encounter these sorts of methods (which is surprisingly rare these days) I tend to double-check what units the method takes and additionally add a comment above my call saying "`sleep` takes duration in seconds", just to give future readers more ways to check whether I screwed up.


One more related problem is coordinate systems for positions in any kind of robotics software or even 2d UI applications.


I disagree with this; you should infer the units from context. I don't want this Hungarian notation in my code.


Somewhat unrelated, the typography and use of color on this article are sublime. No cookie popup either. What a joy.


Also, when using Go, use the time.Duration type in your API. It makes things unambiguous but adds no overhead.


imho it's bad practice to not use SI units [0] by default.

Why would you want to enter a percentage of a minute?

[0] https://en.wikipedia.org/wiki/International_System_of_Units


It would be much better to (also) put the units int the function name. Don't just have sleep() but sleepSeconds(). This prevents the person writing code from having to look up or guess what units the parameter has, as well as a person reading it later. This is more an API design issue than an API user issue, although both approaches can be used together.
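A hedged sketch of that API style in Python (the wrapper names are invented; each bakes its unit into the name and returns the seconds slept just so the conversion is observable):

```python
import time

def sleep_seconds(seconds: float) -> float:
    """Sleep for the given number of seconds; unit is in the name."""
    time.sleep(seconds)
    return seconds

def sleep_millis(millis: float) -> float:
    """Sleep for the given number of milliseconds; unit is in the name."""
    seconds = millis / 1000.0
    time.sleep(seconds)
    return seconds
```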


But this is not really necessary if you use an IDE with integrated docs, where you see the parameters of a function on mouseover.


Is there a programming language that supports units (and calculation with units) out of the box?


Units in values > units in names.


I regularly program in Go and Python.

When I hover over time.sleep(delay) in my Python IDE the detailed pop up shows me that this sleeps for delay seconds and that delay is an int.

When I hover over time.Sleep(wait) in my Go IDE, the popup shows me that time.Sleep takes a time.Duration.

Please get better tools and stay the heck away from both my languages and coding style guidelines.


In Golang, I've seen time.Sleep(2 * time.Second). That's pretty explicit, so, it seems someone carried the whole concept forward.


Agreed, always one of my peeves.


Haven’t had a mistype or a case of confusion using the Ada language to date.


just today I wasted some debugging cycles when my problem was using "sleep(500)" vs "usleep(500 * 1000)" (500 seconds vs 500 ms).

"why isn't my code doing anything?!"


> Option 2: use strong types

This should've been Option One


> And don’t design your CLI accounting app like this: > show-transactions --minimum-amount 32

I'd argue that it's fine if the accounting app is never going to handle multiple currencies.


Always missed Pascal’s typed scalar ranges.


what about having a measure class instead. Ex:

    Length object = Length.Parse(string);

or

    Temperature object = Temperature.Parse(string);

this would be much simpler!


Rust got it correct imo.


300 * TIME_IN_SECONDS


In JS/TS I’d do:

await sleep(5 * MINUTES)

Where MINUTE and MINUTES are constants for 60 * SECOND.


three_seconds=3000

...

time.sleep(three_seconds)


Ridiculous. You're supposed to know what you're doing. He should stop coding instead and do something that fits his skills.

When it's already too hard knowing the units of the function you're supposed to know, then why does he think he's supposed to code in the first place?

Bat-shit ridiculous. Of course people with low skills will be all over this, happily embracing it, but it's still absolutely ridiculous.

If you can't even remember this, then you should not be programming in the first place!


Wouldn't it help other to learn though if it's clearer? As you have to label variables, why not include units? How does that hinder you?

Do you label your variables a, b, c,...?


Don't be ridiculous. You can't take my complaint about this and generalize it.

It's a function. You're supposed to know what the function is doing and the parameter it is accepting. Someone who can't even do that and instead requires it to be fully laid out in front of him should consider looking for a different profession.

This just replaces knowledge with looking at things. It's not helpful at all. Imagine every variable in every function ever would be written down based on its type. Insanity!

... and yes. My main variables are q,w,e,r,t (DWords) ... qq,ww,ee,rr,ss (QWords) ... etc ... all sitting neatly in a cacheline. Those I use for loops and whatever else I need them. Variables that really need naming I actually name appropriately. Variables that do not need naming I simply don't.

You'd be amazed how easy it is to read my code. It's important to understand that, just because someone can do something, doesn't mean he's good at it. This problem gets even worse when everyone just keeps trying to make it easier for people who would be better off doing something else.

No, this isn't an "elitist" perspective, it's the perspective of someone who understands that constantly "lowering the bar of entry" aka "making everything more and more accessible" makes everyone dumber in the long run, because with each and every step one removes any need to actually think and understand what he's doing.

Do you understand that?


Variables? Point-free programming is the name of the game!



