This is valid Python syntax (bitecode.dev)
133 points by ankitg12 on June 11, 2023 | 106 comments


I don't find that so bad. They look complicated but behave as expected, and would only be a problem if you inherit a code base from a lone wolf without common sense.

Python has plenty of warts[1], but I don't think the syntax is one of them (yet). Some of the more concerning warts:

    # Mutable defaults in function parameters.
    def fn(arg=[]):
        arg.append(1)
        print(arg)

    fn() # [1]
    fn() # [1, 1]
    fn() # [1, 1, 1]

    # Loop variables are shared across executions.
    lambda_list = [lambda: print(i) for i in range(5)]
    for fn in lambda_list: fn()
    # 4
    # 4
    # 4
    # 4
    # 4

    # Iterating over an exhausted generator yields an empty list.
    numbers = (i for i in range(5))
    assert 1 in numbers
    print(list(numbers))
    # []
    # I hate this one because it could have easily been an error, it's 
    # hard to track down, and lots of stdlib functions return generators.


    # Chained operators can sometimes be very unintuitive.
    should_be_ascending = True
    assert 1 < 2 == should_be_ascending
    # AssertionError because it was interpreted as
    # `(1 < 2) and (2 == should_be_ascending)`
[1]: https://github.com/satwikkansal/wtfpython


First one is annoying, but it makes sense, to some degree. You say the default is this particular instance that gets created when you define the function.

Second is just how things work, the function is a closure that evaluates the non local variable when you run it.

Third one... Iterating over the generator does NOT yield an empty list, it raises StopIteration. As does trying to iterate after a generator is done, so there's no way for list to know what's happening. You can argue that's a bad design, but I'm not so sure, an iterator is something you need to make sure you don't try to iterate over twice...

I don't get the 4th. Comparison operators have the same precedence, so yeah.


> First one is annoying, but it makes sense, to some degree. You say the default is this particular instance that gets created when you define the function.

Could just as easily be an expression that's reevaluated on each call. In fact it's what most other languages do, making it twice as bad a decision. Even Javascript gets it right:

    function fn(arg=[]) { return arg; }
    fn() === fn()
    // false
> Second is just how things work, the function is a closure that evaluates the non local variable when you run it.

That's again a decision made by Python. Some other languages behave as if each iteration had declared a different variable. Here's Javascript again, you can even use `const` for the iterating variable:

    const fns = [];
    for (const i of [1, 2, 3]) {
        fns.push(() => console.log(i));
    };
    fns.forEach(fn => fn());
    // 1
    // 2
    // 3
> Third one... Iterating over the generator does NOT yield an empty list, it raises StopIterator.

Sorry, my phrasing was imprecise. It's not doing 'yield []', but an exhausted generator does behave like an empty generator. Maybe they could have made it so the first exhaustion raises StopIteration, and subsequent ones a different exception? Like reading from a closed channel.

> I don't get the 4th. Comparison operators have the same precedence, so yeah.

It's not precedence, it's Comparison Chaining, the same feature that enables '1 < a <= 10'. If the operators had simple precedence, it would be equivalent to '(1 < a) <= 10' or '1 < (a <= 10)', but Comparison Chaining evaluates it as '(1 < a) and (a <= 10)'. Useful for specifying ranges, but a foot gun in other scenarios, like in my example.
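
To make the difference concrete, here is a small sketch comparing the chained form with an explicitly parenthesised one. The variable names `chained` and `grouped` are only for illustration.

```python
should_be_ascending = True

# Chained form: evaluated as (1 < 2) and (2 == should_be_ascending).
chained = 1 < 2 == should_be_ascending

# Parenthesised form: compares the result of (1 < 2) with the flag.
grouped = (1 < 2) == should_be_ascending

print(chained, grouped)  # False True
```

The chained version fails because `2 == True` is false, even though `1 < 2` is true.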


> That's again a decision made by Python. Some other languages behave as if each iteration had declared a different variable.

Doing that requires block scoping. Python does not have block scoping. The alternative would be to create a dedicated pseudo-scoping for for loops, which is technically possible (e.g. `except` blocks have special casing to avoid circular references) but is a lot of additional complexity, not “just” changing the scoping of iteration variables.

Furthermore (1) happens to provide a pretty good workaround: you can set the iteration variable as a default value of the closure, it’ll be evaluated at the creation of the closure thus “fixing” it.


>That's again a decision made by Python. Some other languages behave as if each iteration had declared a different variable. Here's Javascript again, you can even use `const` for the iterating variable:

But then can you use i after the loop? In Python you can use i after the loop, which can be useful (e.g. if you break out)

    for i in range(10):
        if condition(i):
            break
    else:
        print("didn't break")
    print("broke at", i)
It is also easy to "fix":

    ll = [lambda i=i: print(i) for i in range(5)]
This issue is always going to be an issue for Python because it doesn't have explicit variable declaration. In C++ you have the difference between

    for (int i = 0;; i++)
    // and
    int i;
    for (i = 0;; i++)
while in Python there's no way to write 'declare i here so it will be scoped outside the loop', which if it existed would make the behaviour you desire a reasonable alternative. Python variables are always scoped to the surrounding function.

    for i in range(5):
        if i % 2 == 0:
            q = i
        print(q)  #=> 0 0 2 2 4
>Could just as easily be an expression that's reevaluated on each call.

Not ideal but it is consistent with other parts of the language. This is just one of those things you need to learn and in practice isn't really an issue. The fact it's different from other languages is not a relevant concern at all. 'Being like JS' is not a virtue.

>Sorry, my phrasing was imprecise. It's not doing 'yield []', but an exhausted generator does behave like an empty generator. Maybe they could have made it so the first exhaustion raises StopIterator, and subsequent ones a different exception? Like reading from a closed channel.

Iteration is just calling __next__. It's useful to iterate over things more than once sometimes.

IMO, this:

    y = iter(x)
    for z in y:
        do(z)
        if condition(z):
            break
    for z in y:
        do2(z)
is a bit nicer than this:

    flag = False
    for z in x:
        if not flag:
            do(z)
            if condition(z):
                flag = True
        else:
            do2(z)
>Comparison Chaining

Incredibly useful feature. I've never seen anyone write 'a < b == c' so this seems a bit artificial. I'd hardly describe it as a footgun.


So can somebody explain to me, step by step with the "elements behind the curtain", what is actually going on in these two cases:

   >>> [ x() for x in [lambda: i for i in range(5)] ]
   [4, 4, 4, 4, 4]
Why?

   >>> [ x() for x in [lambda i=i: i for i in range(5)] ]
   [0, 1, 2, 3, 4]
And why?

Starting from the fact that the following give:

    >>> [lambda: i for i in range(5)]
    [<function <listcomp>.<lambda> at 0xC22>, 
     <function <listcomp>.<lambda> at 0xCAE>, 
     <function <listcomp>.<lambda> at 0xC9A>, 
     <function <listcomp>.<lambda> at 0xCD6>, 
     <function <listcomp>.<lambda> at 0xCC2>]

    >>> [lambda i=i: i for i in range(5)]
    [<function <listcomp>.<lambda> at 0xCF4>, 
     <function <listcomp>.<lambda> at 0xCFE>, 
     <function <listcomp>.<lambda> at 0xD08>, 
     <function <listcomp>.<lambda> at 0xD12>, 
     <function <listcomp>.<lambda> at 0xCCC>]


This is your first snippet (expanded to two lines and max index reduced from 5 to 3):

   lambda_list = [lambda: i for i in range(3)]
   results = [x() for x in lambda_list]
It expands to this:

   i = 0
   f0 = lambda: i
   i = 1
   f1 = lambda: i
   i = 2
   f2 = lambda: i
   results = [f0(), f1(), f2()]
The body of those functions f0, f1, f2, each looks in the function's own dictionary of local variables for "i". But it doesn't exist there, so it looks in the enclosing scope (and keeps doing this recursively up to global scope if necessary). At the next scope it does find a variable "i". This happens at the point you call the function (the same as anything else inside the definition of a function), in this case on that last line when you assign to results. By that point, the "i" variable in the enclosing scope has value 2. So results = [2, 2, 2].

[edit: this feature of functions in Python (variables are looked up by name when the function is run) is what lets you define functions foo() and bar() in a module / script that call each other. When the first one is defined, the other one doesn't exist yet, but you're not actually running the code inside the function so it doesn't matter. By the time you come to call it later, the other function does exist in the enclosing (module level) scope so the lookup succeeds.]
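
A minimal sketch of that point, using two hypothetical functions `is_even` and `is_odd` that call each other. Neither name is resolved until the function body actually runs.

```python
def is_even(n):
    # is_odd does not exist yet when this def statement runs; the name
    # is only looked up when is_even is actually called.
    return True if n == 0 else is_odd(n - 1)


def is_odd(n):
    return False if n == 0 else is_even(n - 1)


print(is_even(10), is_odd(7))  # True True
```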

This is your second snippet (I changed one of the variable names from i to j for clarity, but you can rename it back and it would work the same way):

   lambda_list = [lambda j=i: j for i in range(3)]
   results = [x() for x in lambda_list]
it expands like this:

   i = 0
   f0 = lambda j=i: j
   i = 1
   f1 = lambda j=i: j
   i = 2
   f2 = lambda j=i: j
   results = [f0(), f1(), f2()]
But defaults of function arguments are evaluated at the point the function is declared, not when it's called (as noted elsewhere in this thread - hence the def foo(x=[]) gotcha!). You can imagine the second line of code above is like this, just to really drive the point home [edit: I originally used a made up syntax, but I updated to code that actually works]:

   f0 = lambda j: j
   f0.__defaults__ = i,
   f0.__kwdefaults__ = {"j": i}
So, f0 has the specific value 0 baked into it, and similarly f1 and f2 have 1 and 2. Therefore results = [0, 1, 2].


I've tried to see __defaults__ and __kwdefaults__ here:

    i = 7

    lambda_list = [lambda: i for i in range(3)]
    results = [x() for x in lambda_list]

    print( i )
    for k in range( 3 ):
        print( lambda_list[k] )
        print( lambda_list[k].__defaults__ )
        print( lambda_list[k].__kwdefaults__ )

    lambda_list = [lambda j=i: j for i in range(3)]
    results = [x() for x in lambda_list]

    print( i )
    for k in range( 3 ):
        print( lambda_list[k] )
        print( lambda_list[k].__defaults__ )
        print( lambda_list[k].__kwdefaults__ )
    print( i )

The output is:

    7
    <function <listcomp>.<lambda> at 0xB2>
    None
    None
    <function <listcomp>.<lambda> at 0xBC>
    None
    None
    <function <listcomp>.<lambda> at 0xC6>
    None
    None
    7
    <function <listcomp>.<lambda> at 0xD0>
    (0,)
    None
    <function <listcomp>.<lambda> at 0xDA>
    (1,)
    None
    <function <listcomp>.<lambda> at 0xE4>
    (2,)
    None
    7
If I understand it, __kwdefaults__ is always None here, and __defaults__ is populated only in the second case. I've also tried to check whether the "i" leaks out of the comprehension, but the outer "i" remains 7, unlike in this case (as noted by 'remram here):

    i = 9
    y = 1
    for y in range( 1,-1,-1 ):
        try:
            x = 4 / y
        except Exception as i:
            pass
        print( i )

Which gives, for me, fascinating:

    9
    Traceback (most recent call last):
      File "p.py", line 10, in <module>
        print( i )
               ^
    NameError: name 'i' is not defined. Did you mean: 'id'?

I'm learning a lot here.


Oops, `__kwdefaults__` is only for keyword-only arguments

   fn = lambda x, *args, y=3: 0
   print(fn.__defaults__, fn.__kwdefaults__)
   # prints None {'y': 3}

Source: https://stackoverflow.com/questions/17533929/what-is-the-use...


[Edit: quietbritishjim was faster than me, and makes more or less the same points. You can safely ignore this post.]

First let's have a look at this:

   >>> [ x() for x in [lambda: i for i in range(5)] ]
   [4, 4, 4, 4, 4]
Variable i in the function body is a nonlocal or global, taken from the nearest enclosing scope where i is bound. When a function (lambdas are functions just like any other) is executed, a nonlocal or global variable gets the value it has in its scope at the time of the function call. Step by step:

    >>> import inspect

    >>> i = 0
    >>> a = lambda: i
    >>> inspect.getclosurevars(a)
    ClosureVars(nonlocals={}, globals={'i': 0}, builtins={}, unbound=set())
We see that variable i in a refers to the global i, which currently has value 0. Obviously a() returns 0:

    >>> a()
    0
Next we create a new lambda, with the same definition as before, but we first increment i.

    >>> i = 1
    >>> b = lambda: i
    >>> inspect.getclosurevars(b)
    ClosureVars(nonlocals={}, globals={'i': 1}, builtins={}, unbound=set())
So i in b refers to the global i (as before), which now has value 1. Unsurprisingly b() returns 1:

    >>> b()
    1
But variable i in a also refers to the global i which is now equal to 1:

    >>> inspect.getclosurevars(a)
    ClosureVars(nonlocals={}, globals={'i': 1}, builtins={}, unbound=set())
And sure enough, a() now returns 1:

   >>> a()
   1
We can set i to any value, and both a() and b() will then return that value.

    >>> i = 'gotcha'
    >>> a()
    'gotcha'
    >>> b()
    'gotcha'
Going back to your example, the i used in the lambdas refers to a nonlocal instead of a global (the i used in the list comprehension), but that doesn't make a difference for the topic at hand. What does matter is that the current value of i is used, which is 4 after the for loop in the list comprehension.

Now to the second example.

    >>> i = 0
    >>> a = lambda i=i: i
i is now a local, since it's a parameter to the function. It's a bit unclear because i is used in two different meanings here; let's rename one of the uses to j. This is exactly equivalent:

   >>> a = lambda j=i: j
   >>> inspect.getclosurevars(a)
   ClosureVars(nonlocals={}, globals={}, builtins={}, unbound=set())
This time there are no closure variables. j in the function body does not anymore refer to anything outside of the function scope. Instead a is now a function with one parameter, which returns the value of that parameter:

    >>> a(10)
    10
    >>> a('hello')
    'hello'
What's of course more important here is that we have a default value for that parameter. Default parameter values in Python are evaluated when the function is defined, not when it is executed. i was 0 when we defined a, so a will always have 0 as default parameter. We can see that default value:

    >>> a.__defaults__
    (0,)
Add a second function:

    >>> i = 1
    >>> b = lambda i=i: i
    >>> b.__defaults__
    (1,)
    
Now b is also a function taking one parameter, but this time that parameter has default value 1. Meanwhile the default value of function a has not changed:

    >>> a.__defaults__
    (0,)
So now we can predict that a() will return its default parameter value (which is 0), and b() will return its default parameter (which is 1). Let's check.

    >>> a()
    0
    >>> b()
    1
So the big difference is that your first example uses nonlocal/global variables which are evaluated when the function is called, while your second example uses default parameter values which are evaluated when the function is defined.

Note: you can change default parameter values though, if you really really want:

    >>> a = lambda i=5: i
    >>> a()
    5
    >>> a.__defaults__ = ("don't actually do this",)
    >>> a()
    "don't actually do this"


I like your explanation, it fills in more of the missing information and also demonstrates the use of "import inspect", it's definitely not to be ignored!

Did I understand correctly that the non-intuitive part is that the variable does not belong to the closure? Not knowing how Python decided to do it, I'd expect the variable itself to belong to the closure, but instead the closure only holds a reference to the variable outside of it? Which AFAIK doesn't save any memory (the reference still has to be there), so it was simply a design decision?

So in Python the closure just doesn't "enclose" the variables at all? So any use of any variable in it can destroy something used "outside"? How do those using these construct cope with that in practice? Just ignore? Invent some naming schemes? Something else?

Reminds me of an old Fortran which allowed changing the global constants passed to the functions via an assignment to the function parameter inside of the function. :)


I guess it's a design decision, I guess it could have been designed differently, but I'm not really sure. Doing it differently could very well break other things that people do rely on.

Referring to the original variables instead of making copies does save memory though: in your example there were 5 functions that each would have their own copy, instead of only 1 instance shared between them.
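
One way to see the sharing is to look at the closure cells directly. This is only a sketch, with a hypothetical helper `make_lambdas`; it assumes the lambdas are created inside a function, so that `i` is a real closure variable rather than a global.

```python
def make_lambdas():
    fns = []
    for i in range(3):
        fns.append(lambda: i)
    return fns


fns = make_lambdas()

# Every lambda holds the very same cell object for i, not its own copy.
shared = fns[0].__closure__[0] is fns[1].__closure__[0]
last_value = fns[0].__closure__[0].cell_contents
results = [f() for f in fns]

print(shared, last_value, results)  # True 2 [2, 2, 2]
```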


> Incredibly useful feature

For something so "incredibly useful," I've literally never felt the need for something like this in any other language in 20+ years of coding

Elixir's deep pattern-matching, on the other hand, I miss in LITERALLY EVERY OTHER language I have to work in


you assume this is the case but it's not.

matlab has the same pattern for class initialisers for instance


I suspect it isn't so simple to fix the first example in the way you describe, because Python's a 30+ year old language with functions as first-class citizens. If we went back in time and built it from scratch to remove this wart, I'd vastly prefer we require all default kwargs to be immutable types, rather than allocate a new empty list with every function call. But doing that today would be an unacceptable breaking change.

The second one isn't a real issue IMO. It exists as an example of how closures and scoping work in Python, and as a warning to beginners to avoid doing un-Pythonic things with them. Python closures are late-binding. FWIW your JS example behavior works as you expect in Python, if you used a generator instead of a list comprehension, but why not just pass the variable as a parameter to your anonymous function instead? This is a silly, contrived example. Language designers can't fix stupid.
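
A small sketch of the generator point, with illustrative names `lazy` and `eager`. It relies on each lambda being called before the generator advances to the next value; if you materialise the generator first (for example with `list(lazy)`), the late-binding result comes back.

```python
# Generator expression: each lambda is called while i still has "its" value.
lazy = (lambda: i for i in range(3))
lazy_results = [f() for f in lazy]

# List comprehension: all lambdas are created first, then called.
eager = [lambda: i for i in range(3)]
eager_results = [f() for f in eager]

print(lazy_results, eager_results)  # [0, 1, 2] [2, 2, 2]
```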

> subsequent ones a different exception? Like reading from a closed channel.

StopIteration is raised when you call next() on an iterator with no remaining elements. That's how it's defined. https://docs.python.org/3/library/exceptions.html#StopIterat...

Why no `close()`? Let me rephrase: why can't you use `with` on a generator? That's in their Design FAQ. TLDR: because it's an uncommon use case for generators, and if you really want to, there's ways to do it.

https://docs.python.org/3/faq/design.html#why-don-t-generato...

Changing the behavior of `for`, `in`, etc to call close() when the iterator is consumed, might seem simple, but do a little reading on coroutines and `yield from`, and you'll probably be convinced otherwise. Note that it's exceedingly rare that a Python programmer would want to use those two niche language features, but they're at the core of how certain standard library modules are implemented.

As for the last one, this is one nitpick I'm fully on board with. I always make a habit to break up boolean expressions, to avoid having to keep track of every language's boolean evaluation idiosyncrasies. If I rewrote Python, I'd axe that chained boolean nonsense altogether because it's the kind of thing beginners will shoot themselves in the foot with over and over again, but it's another breaking change that would never be accepted.

Basically, don't use language features you don't understand, do a little defensive programming, and these gotchas aren't gonna bite you. JS, on the other hand, has numerous warts that are entirely unavoidable and far more frustrating, but even a broken clock's right twice a day.


> Second is just how things work, the function is a closure that evaluates the non local variable when you run it.

In python, yes. In other languages, not necessarily. There are two ways that this could reasonably not be the result:

1. For i in iterator could create a new variable i (shadowing the old i) on each iteration of the loop. This isn't how python works (which also means you can do things like access i after a for loop terminates) but there are languages that work that way.

2. Capturing an integer could copy that integer, instead of copying a pointer to a value that can change in the future. Again, not how python works, but how some languages work.

The first in particular I think is more intuitive in other languages, and leads to other footguns in python as well, e.g.:

    for i in range(5):
        for i in range(10):
            pass
        print("Finished loop iteration", i) # I meant 0, 1, 2, 3, 4. Instead I got 9 9 9 9 9


I can see how it happened, but the first one does not make sense. Most people would expect argument defaulting to be per execution. If the argument is None, instantiate it with this new value.

I'm curious what making it shared and mutable achieves.
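
For reference, the usual way to get per-call behaviour today is the None sentinel described above. A minimal sketch, with a hypothetical function name `append_one`:

```python
def append_one(arg=None):
    # A fresh list is created on every call where the caller passes nothing.
    if arg is None:
        arg = []
    arg.append(1)
    return arg


first = append_one()
second = append_one()
print(first, second)  # [1] [1]
```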


Another weird syntax quirk is this:

    e = 4
    try:
        b = 4/0
    except Exception as e:
        print("exception raised")
    print(e)  # NameError: name 'e' is not defined
The captured exception only exists in the 'except:' block, but instead of creating a new scope (shadowing the outer 'e'), it deletes it afterwards.
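
If you need the exception after the block, the common approach is to bind it to a second name inside the handler. A sketch, wrapped in a hypothetical function `demo` so it does not depend on module-level state:

```python
def demo():
    e = 4
    try:
        1 / 0
    except Exception as e:
        saved = e  # keep a reference under a different name

    try:
        e
    except NameError:  # inside a function this is UnboundLocalError, a NameError subclass
        still_bound = False
    else:
        still_bound = True
    return still_bound, type(saved).__name__


result = demo()
print(result)  # (False, 'ZeroDivisionError')
```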


On the generator example, it doesn’t yield an empty list. It continues to return an empty iterator. This couldn’t have easily been an error because one aspect of generators is they can be dynamic. There could be an API call in there, where it polls for new results. Raising a “NoMoreForever” error or something would break the interface.


You're right, "empty list" is an imprecise description. But couldn't they have made it behave like Go channels, for example? Read it until exhaustion, then close it, so that subsequent reads (or calls to 'next()') fail with something other than StopIteration.

I know it's not backwards compatible and hence a no-go now, but the original decision puzzles me.
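
Something like that behaviour can be approximated in user code today. The following is only a sketch of the idea, not anything Python provides: `OneShot` is a made-up wrapper class, and the choice of RuntimeError is an assumption.

```python
class OneShot:
    """Hypothetical wrapper: raises RuntimeError when read after exhaustion."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._closed = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._closed:
            raise RuntimeError("iterator already exhausted")
        try:
            return next(self._it)
        except StopIteration:
            self._closed = True  # the first exhaustion still ends iteration normally
            raise


numbers = OneShot(i for i in range(5))
assert 5 not in numbers  # scans to the end, exhausting the wrapper

second_read_error = None
try:
    list(numbers)
except RuntimeError as exc:
    second_read_error = exc
print(repr(second_read_error))
```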


1 and 2 are the scoping rules (since Python has no variable declarations [1], making all blocks scopes would be pretty awkward and require tons of "xy=None # reassigned later" and nonlocals), 3 should result in [2, 3, 4], not [].

[1] It actually does nowadays, ever since non-assigning annotations came around.


> ... making all blocks scopes would be pretty awkward and require tons of "xy=None # reassigned later" and nonlocals) ...

C, C++, Java, C#, JS/TS (through let), Rust, and more all require variable declaration before use. Python is the odd one out here. JavaScript had something similar (variable hoisting), but it's so problematic that we ditched it for let/const.


I agree that functions sharing the references to their arguments is awkward (though useful on occasion), however the alternative is to re-instantiate the arguments by a shallow copy, a deep copy, or full-on re-evaluation, all of which have their problems and are significantly slower (calls are already slow in Python as is!) so it makes sense on balance for many use cases, especially as most defaults in practice are immutables like numbers and strings (so it's really just the usual warts for references to mutables).

The others are a little less concerning to me. Sure, the lack of fine-grained scoping is annoying, but that's just how Python operates; the example confuses the point via the lack of variable capture in lambdas, when fn is called it does a normal locals/globals lookup for i, and the lack of scope means the last value of i spills outside of lambda_list, meaning all evaluate to 4, but this is primarily a problem with how lambda works without capture (again, scoping in Python is what it is). I've even begrudgingly (ab)used Python's lack of scoping on occasion, albeit mindfully aware of when I can and when it is actually defined, because it is very easy to cause problems with and generally best avoided.

Empty generators to empty lists isn't a huge concern in my experience, as it lines up with empty lists, sets, and dicts as the behaviour for comprehensions and how to handle them when there's nothing. Also, your example for it, uh, doesn't work as you say (it prints [2,3,4]).

Operator precedence is always awkward in every language that has it, and operators are still awkward in those that don't, and Python's is no exception, though it's largely better than C's -- in Python, comparisons share precedence and are lower than most other operators. I feel there isn't really a right answer to the problem as is, though ==/!=/in/is maybe shouldn't be part of chaining (but chaining is defined over all comparison operators [1] so that's probably not going to change). As always, when in doubt, use parentheses.

[1]: https://docs.python.org/3/reference/expressions.html#compari...


Correction, the empty iterator example should have been `assert 5 not in numbers`, or another expression that exhausts the generator.
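
A short sketch showing both versions side by side. The names `leftover` and `after_exhaustion` are just for illustration.

```python
numbers = (i for i in range(5))
assert 1 in numbers            # stops at the first match, consuming 0 and 1
leftover = list(numbers)
print(leftover)                # [2, 3, 4]

numbers = (i for i in range(5))
assert 5 not in numbers        # no match, so the whole generator is consumed
after_exhaustion = list(numbers)
print(after_exhaustion)        # []
```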

And to everyone saying that the behaviors are correct because of scoping rules, or iteration protocol: I know. Those are not compiler bugs, but bad design decisions. Every example here has a more intuitive behavior in another language.


Syntax is pretty much one of the downsides of Python. Especially the new additions, including f-strings, match-case, type annotations, async-await, data-classes etc. The largest downside is that they hugely blew up the language complexity for no tangible gains. It's like you had a crappy bicycle before, but after an "upgrade", you have a crappy bicycle that doubles as a crappy ice-cream maker.

This means that anyone who wants to write a parser for Python from scratch is screwed. The language was garbage before, and writing a parser for Python was already very difficult, but after more junk was added to it, it just became unnecessarily more difficult. So, if you might have had a glimmer of hope to have tools that analyze the source code and do something with it, even as trivial as highlighting... well, there's less and less hope of that happening.

So, OP's complaint about f-string is sort of valid. OP didn't really look for a very bad example though, but bad examples are hard to refine into more concise form. Fundamentally, what sucks about f-strings beyond them being completely unnecessary is the situations where one has to deal with multiple incoherent escaping rules, s.a. when one has to interface with, eg. logging module or string.Template or str.format() or % in general. Another vomit-inducing combo is raw strings combined with f-strings (i.e. building regular expressions through interpolation).
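
To illustrate the raw-plus-f-string combination, here is a small sketch (the `word` value and the pattern are made up). The regex quantifier braces have to be doubled so the f-string does not treat them as a replacement field, while the backslashes stay single because of the raw prefix.

```python
import re

word = "a.b"

# rf-string: backslashes are raw, {…} interpolates, {{ and }} are literal braces.
pattern = rf"\b{re.escape(word)}\s{{2,}}"

matches_wide_gap = re.search(pattern, "x a.b   y") is not None
matches_single_space = re.search(pattern, "a.b y") is not None
print(matches_wide_gap, matches_single_space)  # True False
```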

Yet another downside of f-strings is the departure from object-oriented approach. The benefit of str.format() was it being object-oriented, which allowed for extensions in the same way objects can be extended. f-strings are an extension of the language syntax, but Python doesn't have an easy general way to extend its syntax (barring the fact that you can abuse the import system to import altered Python sources s.t. they have desired syntax).

But, again, these disadvantages are hard to present as a one-liner, and for many simple uses of the language, which is the majority of its uses they are inconsequential. Unfortunately, that majority of uses is also transient and contributes nothing back to the language... so, it could be confusing to rely on the majority's opinion when it comes to judging the quality of various language features.


You don’t write a parser from scratch, you use the ast module.
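
A minimal sketch of what that looks like, using only the standard library `ast` module. The source string is a made-up example; the f-string shows up in the tree as a `JoinedStr` node.

```python
import ast

source = 'label = f"{price:.2f}"'
tree = ast.parse(source)

# Collect the class names of every node in the parsed tree.
node_types = [type(node).__name__ for node in ast.walk(tree)]
has_fstring = "JoinedStr" in node_types
print(has_fstring)  # True
```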

The rest of this is excessive negativity trying to sound superior while ignoring history.

(I even agree about the match statement and walrus, but fstring is a very useful compromise.)


I'm using walrus a lot, can't find any negative: it's clear what it does, and the code feels cleaner.


Python already had the “as” and assignment statements so didn’t need three ways to do it. Remember the Zen? (99% of the time walrus is used in “if” or “while” with one expression, which as handles fine.)

Also, along with typing contributes to the excessive “colon blow” of modern Python.


There's no 'as' in Python if or while statements, only with.

    >>> if True as a:
      File "<stdin>", line 1
        if True as a:
                ^^
    SyntaxError: invalid syntax
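
For comparison, the spelling that is accepted in that position is the assignment expression. A small sketch with a made-up input string:

```python
import re

line = "points: 133"

# The walrus binds the match object and tests it in the same expression.
if (m := re.search(r"\d+", line)):
    points = int(m.group())
else:
    points = None

print(points)  # 133
```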


Indeed. Coulda been.


> Remember the Zen?

Like Django's "explicit is better than implicit... no, wait, hold my beer" magic.


Why would I write it in Python? It's the worst language to do that...


Strange article. I thought it was going to show that these features interact in some unexpected way that causes weird bugs, but it all seems to work as expected.

So the conclusion, I guess, is that code doesn't make a whole lot of sense when it uses every available feature for no particular reason and foreign languages for variable names?


Author here.

There is no conclusion to the article, the weird syntax is just a means to an end:

- This makes people curious, so they read.

- This gives an opportunity to show a lot of cool features of the language in a short format. As the post mentions, Python's learning curve is smooth but long, so it's nice to be able to introduce some tricks to coders that would otherwise not be exposed to them.

- This demonstrates those things indeed, as you said, work together. Even if Python is a language on which a lot of things have been bolted on after the fact, it's still decently integrated.

- This allows me to insert a warning about not abusing them because even if Python is a readable language, you can always go too far. It's good to maintain this message in the community.

- It's just fun, really. It's an article I would enjoy reading.


Must admit I wasn't quite sure what the point of the first one was. 0xBEAF is literally just 48815, as is 0x_B_E_A_F, as the numeric literals now ignore interleaved "_", and you just format it as a float after-the-fact as normal in an f-string. So why? There's nothing particularly special in the use of syntax here, so hardly a curious or weird artifact, let alone abuse.
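
A quick check of that claim, as a sketch:

```python
value = 0x_B_E_A_F

# Underscores in the literal are ignored, so all three spellings are the same int.
same_number = value == 0xBEAF == 48815
formatted = f"{value:.2f}"

print(same_number, formatted)  # True 48815.00
```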

> * has a completely different meaning

I'd personally rephrase this, as the idea around splatting is that it symmetrically can "gather" and "spread" multiple values, albeit with the syntactic limitations of starred expressions (likewise with *).

You do seem to confuse syntax for semantics: [][:] = [...] is a semantic trick to avoid binding a sequence, if you did a = [][:] = [1,2] you'd find a == [1,2] because it just passes along the reference to the right hand side, so without another layer it's just skipping binding it, and that's the major trick with the second one, being that it "discards" the first value when destructuring this way (so long as that first value is a sequence). This can be done "better" with _, *_ destructuring, which probably weirds more people out to discover that shadowing in a statement is legal (and the unused/shadowed variable is "better" as it works even when the first value isn't a sequence, making [][:] more fragile). People with knowledge of other languages might get confused by _ being a variable, but that's a known stumbling block.
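
A sketch of what I mean, not the article's exact code. The variable names are made up, and the last part shows why the slice target is more fragile than a throwaway name.

```python
# Chained assignment: the right-hand side is bound to a, and also assigned
# into a slice of a throwaway list, which is then discarded.
a = [][:] = [1, 2]
print(a)  # [1, 2]

# Used as a destructuring target, [][:] swallows the first value.
[][:], second = [1, 2], 3
print(second)  # 3

# It only works when that first value is iterable.
try:
    [][:], second = 5, 3
except TypeError:
    fragile = True
else:
    fragile = False
print(fragile)  # True

# A throwaway name has no such restriction.
_, other = 5, 3
print(other)  # 3
```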

Naturally, everyone has their own journey through discovering Python (or whatever language) and may or may not touch upon many parts of it syntactically and semantically. However, as far as "weird" syntax that has valid semantics goes, I'd say this is far from esoteric or unusual (where's the walrus?), and some parts are just the natural expression (such as with the splatting). Really, the only code smell I find here is the inlining of everything and redundancy (i.e. splatting a generator into a list)!


as a semi plt nerd, i like to push grammars a bit deeper than average application, thanks

and often indentation helps keeping track of structures


This article just reminds me how unexpected JavaScript is, unlike Python.


I learned about Python dev mode in this, thanks!

For anyone else:

https://docs.python.org/3/library/devmode.html


> While not as powerful as destructuring in other languages (I still wish we could unpack dicts as JS does with objects)

Like this?

    In [1]: a = {1: 10, 2: 20}
    
    In [2]: b = {3: 30, **a}

    In [4]: b
    Out[4]: {3: 30, 1: 10, 2: 20}
Or pattern matching?

    In [6]: match b:
       ...:     case {3: somethingelse, 1: foo}:
       ...:         print(somethingelse, foo)
    30 10


I think they mean {a: x, b: y, *c} = d, which is a nice idea now that we have a stable order to dicts. But I've rarely come across times when I needed this and couldn't simply use .items() and such, or cared about more general pattern matching like values for specific keys, where you can just do a, b = d["a"], d["b"] with little loss of clarity. Sure, {"a": a, "b": b} = d is cute and all, and maybe we'll one day see it now that we have match as syntactic precedent, but I don't really feel concerned by its absence.


They mean you can unpack lists:

  a, b, *rest = args
So why not dicts:

  "a": a, "b": b, **rest = kw_args
Also, in JS the destructuring does not fail because of extra elements even if you don't capture them. Further, in JS you don't need the quotes, and a: a can be shortened as simply a. You'll need the dict curly braces though. Thus the same in JS, which you see a lot e.g. for properties of a component in React code:

  { a, b } = props;
It's quite nice with no excessive "line noise" and even makes up for JS functions lacking real keyword arguments.


Also JS allows you to define default values if the key is missing, and Python's pattern matching doesn't.


> if you use "_" in hex, and interpolate it with a 2 decimal precision

...or if you don't. The '_' doesn't change anything or add anything, here.


Biggest footgun in python IMO are mutable default parameters. It amazes me that someone thought this was a good idea.


You are too naive when you think that someone thought about it. Many things in Python happened because that's how they were implemented initially without much thinking or at all.

There was no plan, and, in the initial stages, I'd imagine that the author(s) were surprised that they'd even gotten this far.

Famously, the whole object system in Python was implemented in like a few days. That included the design and testing too... It did see some changes over time, but there's a lot of petrified turd there that's increasingly difficult to get rid of.


Arrogant bullshit, probably the worst non-spam comment I’ve seen on HN—congrats.

https://peps.python.org/pep-0671/

Truth is, early vs late binding is a tradeoff. Python picked one, but not like the other is without edge cases either.


I don't understand what's getting you all bent out of shape about this.

Unexpected emergent behaviors are extremely common in programming language design. Even when carefully planning every language feature, unexpected interactions can and do occur, sometimes these things are not ideal. I am not very informed about Python's history here, but it's not far fetched that this behavior was not planned and just happened due to unexpected interactions of other features.

Why would Python be immune to this? Are Guido and the rest of the Python contributors 1000x devs that can see infinitely far into the future and never make mistakes? All of Python's quirks were actually planned from the very beginning I guess.


It was chosen, I listened to the debate around the pep. GP is talking out their ass.


GP's point is that Python didn’t “pick” one but that it’s probably an accident of how the parser was implemented. It would see the `[]`, allocate a new object, then greedily assign it to the “variable”.

The whole “tradeoff” argument doesn’t make sense when you consider that Python is the odd one out here with early binding. In fact, there’s been so many bugs with it that they had to invent a whole new syntax to fix it.

When one says `intParam += 2`, why should they expect that to modify the default parameter? Yet when the default is a list, all of a sudden it’s fine. Python breaks the principle of least astonishment with default parameters, and it’s frustrating hearing people defend such a thing.


The point is that it wasn't an accident. It was picked for performance reasons. And Python isn't the only one that does it. Don't remember the exact list but it was given once in a python-dev thread regarding the mentioned pep.


What does this trash have to do with my comment?


The "new" object system in Python ("new" more than 21 years ago in Python 2.2 and the only system in Python 3 (2008 = 15 years ago)) was strongly and widely debated and refined over time. Maybe the old object system, which hasn't been relevant for 20 years, may have been implemented in a few days? So what? All of JavaScript was invented in a few days - the entire language. Yet people around here seem to love the language.


I couldn't quickly find the source, but what I recall is this story: the basic and only data type in Python from the very start was the tuple, and then Guido decided to add objects and did it in, like, a weekend sprint of "intense thinking and coding".

All the "discussion" wasn't really a discussion but post-rationalization and adoration of his "genius work".

No sane programmer with a modicum of experience would've designed objects to have both __slots__ and __dict__. The @property thing was also a crutch to deal with bad "design" decisions which were never properly addressed... and, basically, everything that was done afterwards was done in an additive way, so the old garbage stayed forever and became more entrenched.


You've imbibed a very strange, almost slanderous story and now you're repeating it as fact.

But perhaps old-style classes were awful. I don't really remember. I haven't used them for a very long time, because new-style classes, which were definitely not implemented in a weekend, have been around for approximately forever. Old-style classes were obsoleted in 2002 and don't even exist in Python 3, which was first released in 2008. Old-style classes didn't have multiple inheritance or many of the customisation points that new-style classes have.

>No sane programmer with a modicum of experience would've designed objects to have both __slots__ and __dict__.

Python objects don't have both __slots__ and __dict__. __slots__ is an optimisation for those cases where performance is important and you don't need the dynamism of __dict__. There is also no need for such extreme language.
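
A small sketch of the difference, using two throwaway classes of my own:

```python
class WithDict:
    def __init__(self):
        self.x = 1

class WithSlots:
    __slots__ = ("x",)   # fixed set of attributes, no per-instance __dict__

    def __init__(self):
        self.x = 1

d = WithDict()
s = WithSlots()

print(hasattr(d, "__dict__"))   # True
print(hasattr(s, "__dict__"))   # False

d.y = 2                         # fine: stored in d.__dict__
try:
    s.y = 2                     # not listed in __slots__
except AttributeError as exc:
    print("AttributeError:", exc)
```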

>The @property thing was also a crutch to deal with bad "design" decisions which were never properly addressed...

Not at all. @property is just one example of a descriptor. Python attributes are just attributes, but descriptors allow them to be more than just __dict__ accesses. It's a very clever and flexible system that allows for some very expressive metaprogramming. @property isn't special and it has nothing to do with "bad design decisions".
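
To make that concrete, here is a hand-written descriptor doing the kind of job people usually reach for @property to do. Positive and Account are hypothetical names for this sketch:

```python
class Positive:
    """A data descriptor that validates on assignment."""

    def __set_name__(self, owner, name):
        # Remember where to store the real value on the instance.
        self.private = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:          # accessed on the class itself
            return self
        return getattr(obj, self.private)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError("must be positive")
        setattr(obj, self.private, value)


class Account:
    balance = Positive()

    def __init__(self, balance):
        self.balance = balance   # goes through Positive.__set__


acct = Account(10)
print(acct.balance)              # 10

try:
    acct.balance = -5
except ValueError as exc:
    print("rejected:", exc)

# property itself implements the same protocol.
print(hasattr(property, "__get__"), hasattr(property, "__set__"))  # True True
```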

>and, basically, everything that was done afterwards was done in an additive way, so the old garbage stayed forever and became more entrenched.

Python is literally one of the only languages that has got rid of its "old garbage".


You have it all backwards.

No, __slots__ isn't an optimization, it came before __dict__.

Old classes allowed multiple inheritance. And they are not so different from the new classes. The difference is superficial. There were and still are many things in Python that were written w/o any thought or cohesive organizing principle, but, over time, they put some effort to make it look more uniform.

For example, old errors didn't inherit from object, but eventually they were made to do so. Similarly, module type didn't inherit from object, but at some point it was made to do so. A lot of Python builtins were half-arsed reimplementation of object with a bunch of slots missing.

> Not at all. @property is just one example of a descriptor.

This is exactly why it's bad design. If one were to design object system with properties, they wouldn't need this patch. It would've been part of the system from the get go. There's no need for descriptors here or in general. The whole idea of decorators is a half-arsed attempt at macros...

> Python is literally one of the only languages that has got rid of its "old garbage".

No, absolutely not. They renamed things, but kept the implementation the same. Sometimes they added things, like in example with module type, to make it look more uniform. Sometimes they hid some functionality from the interpreted part of Python, but it is still accessible to C API etc.

It's the same story with multiprocessing -- it was a failure from the start, but they never admitted it, and kept incrementing to make it an even bigger failure... while deep inside their hearts they knew it was a failure, so they also tried other things... which also failed (like multiple interpreters).

When you read the C source of Python, the feeling that you get is that you are looking at someone who sucks at C, who struggles to make their code run at least somehow. It's not the code of a person who knows their way around, who can consider the benefits and downsides of doing the same thing in different ways. It's the code of a person who adds asterisks and ampersands until it compiles, who's never heard of const-correctness, who doesn't try to minimize heap allocations... It's code that you'd typically expect from a second- or third-year CS student.


Also a most random note here:

mutable /ˈmjuːtəbl/ adjective liable to change.

I like the emphasis that it’s liability haha


That's not really what's going on. Python has default values where, say Ruby, has default expressions.

    def myfunc(x=[])
      puts x
    end
In Ruby is the same as

    def myfunc(x=lambda:[])
      print(x())
in Python when the default is needed. The benefit is performance, you only have to evaluate the default once at function definition time. I won't say it's not a footgun but it's also I think a sane choice when in Python it's common to have functions with many default parameters.
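
A quick sketch of the difference in plain Python (function names are mine). The first function shares one list across calls; the second uses the usual None sentinel to get a fresh list each time:

```python
def early(x=[]):
    # The default list was created once, when `def` ran.
    x.append(1)
    return x

def late(x=None):
    # Sentinel idiom: build the default at call time instead.
    if x is None:
        x = []
    x.append(1)
    return x

print(early())             # [1]
print(early())             # [1, 1]
print(early.__defaults__)  # ([1, 1],)

print(late())              # [1]
print(late())              # [1]
```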


"The benefit is performance"

Sure, but when you design a language you are striking a balance between many things (which is why some features could take years to finalize and implement). In this case, the gain in performance is likely minimal, but the confusion and surprise it causes does a lot of damage.

(I recently read through the entire series of conversations with Anders Hejlsberg on the design of C#, which is very insightful: https://www.artima.com/intv/anders.html)


For what it's worth, mutable default parameters do have their use.

    def foo(x, cache = {}):
        value = cache.get(x)
        if value is None:
            ...
            cache[x] = value
        return value
This simple example, thankfully, is obviated by the @cache decorator as of py3.9, but I still occasionally find uses for the feature.
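
For reference, a minimal sketch of the decorator version, assuming Python 3.9+ for functools.cache. The function and the call counter are hypothetical:

```python
from functools import cache

calls = 0

@cache
def square(x):
    # Count real invocations so the caching is visible.
    global calls
    calls += 1
    return x * x

print(square(4))   # 16
print(square(4))   # 16, served from the cache
print(calls)       # 1
```

One difference from the dict version above: a result of None is cached like any other value, whereas the `if value is None` check would recompute it every time.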


For lists, use a tuple as default, since it's immutable.

    def foo(allowed=()):
      truthy = ['true', 'True']
      truthy += allowed # most things work
      allowed.append('yes') # error
It's not perfect (`truthy = ['true', 'True'] + allowed` will not work).


while that works, it does not fully document what is expected from caller. And AFAIK using tuples like that is not very common. I'd prefer something like:

  def foo(allowed: Optional[List[ElemType]] = None):
    allowed = allowed or []
    ...


That example with the Chinese characters brings up a point that seems to have gotten lost or actively suppressed, likely by those with other motives: what may look "unreadable" to you may in fact be something that is perfectly normal to billions of others. Yet why are people constantly told to emphasise "readability" in the code they write, which really refers to a lowest-common-denominator dumbing down approach? Shouldn't the onus be on the reader to educate oneself sufficiently in the language to understand code that one thinks is "unreadable"? Moreover, unlike natural languages, programming languages are defined exactly and systematically, so there is no ambiguity whether a piece of code is valid or what it does.

What is "unreadable" to some could be perfectly readable to others; and what is readable to some could still be readable but unnecessarily verbose and tedious to read to others. We shouldn't let how we write code be dictated by those who barely know the language and want to drag us down to their level.


Which is why I think languages need UTF-8 identifiers and arbitrary UTF-8 operators.

Most people working on webdev in the USA should never touch them and should feel free to ban them in their linters.

But people who speak and write foreign languages with other alphabets and domain experts and people who just want to use π or µ or ∆v or whatever because they feel it makes their code clearer should be able to.

Mathematicians who have PhDs should be able to write arbitrarily complicated hieroglyphical operators that I probably won't choose to understand in this lifetime. And that's okay, I don't need a language that attempts to guarantee that I can always read it, that is probably literally impossible and just constrains it unnecessarily.


One case where UTF-8 identifiers are needed is (Danish) tax calculation and legal concepts which do not have an equivalent version in English.

I.e. you have a name/concept which cannot be translated into English. If you enforce "all names must be English" then you end up with a made up word.

If you are an English-only speaker, try to imagine that you were not allowed to use "w" in any variable name as it would be considered a "too weird" letter.


Coq has an interesting compromise, where the actual code passed to the compiler is in ASCII, but one can define sequences which have alternative renderings (usually to attempt to visually match existing math notation). For example, one could write

Theorem demo: forall (x: nat), (exists y, x = S y) -> x <> 0.

In many editors, this will get rendered as

Theorem demo: ∀ (x: ℕ), (∃ y, x = S y) → x ≠ 0.

This looks a lot nicer for mathematicians. (S here is Peano successor.)

One minor nuisance is that, without an additional plugin, you can't actually write source code literally using the Unicode symbols (they're normally display-only, not part of a source file). The plugin to allow one to type them explicitly (with a Unicode-aware input method) does exist, but would make one's source code not portable for other users.

Also, this is really specifically for math, not internationalization, and doesn't provide a helpful model for people wanting to type identifiers in non-ASCII-supported scripts.


Lol, and they say Perl is bad.

https://www.perl.com/pub/2000/01/10PerlMyths.html/#Perl_look...

This isn't meant as flamebait, I'm just pointing out you can write obfuscated code in any language.


> a, b = b, a

> (don't know why this is so famous, I never, ever used that in prod)

This allows you to sort an array in place. It's heavily used in Go:

https://godocs.io/sort#example-package


A variant is useful to update two (or more) dependent variables in one go without having to introduce temporary variables. For example to calculate the greatest common divisor using the Euclidean algorithm:

    def gcd(a, b):
        while b > 0:
            a, b = b, a % b
        return a


Swapping variables around is also really handy for patterns like "alter list a until stable", especially with the walrus := operator allowing you to avoid declaring b outside of the while loop, keeping the intent clearer.


Yeah but in Python list.sort() takes care of that.


True, but every once in a while we want to sort something that isn't a list and/or with a custom sorting algorithm.


You can always write something like {var tmp=a;a=b;b=tmp;} in any language if you want to sort in place.


> in place

that doesn't mean what you think it means.


Do you know what it means? It seems irrelevant to the discussion, introducing a new variable won't make you unable to sort in place. You could even allocate memory on the heap and still sort in place.


> Do you know what it means?

yeah, it means no temporary variable.


No, it means the original data structure gets modified. In this example, the contents of the array you pass to the sort function will be sorted, without copying to a new array. Some overhead is acceptable, if only because you need loop counters.
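
A small sketch to illustrate (my own toy bubble sort, deliberately swapping through a temporary variable):

```python
def bubble_sort_in_place(items):
    """Sort `items` by mutating it; nothing is returned."""
    n = len(items)
    for end in range(n - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                tmp = items[i]            # a temporary, yet still in place
                items[i] = items[i + 1]
                items[i + 1] = tmp

data = [3, 1, 2]
alias = data                  # second reference to the same list
bubble_sort_in_place(data)
print(data, alias is data)    # [1, 2, 3] True

# Contrast: sorted() builds a new list and leaves its input alone.
original = [3, 1, 2]
copy = sorted(original)
print(original, copy)         # [3, 1, 2] [1, 2, 3]
```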


What I find interesting is that you call someone out on not knowing a term, then you're being called out, then you respond claiming you know the term and explain the term… And at no point did it occur to you to actually duckduckgo it and make sure you're correct.


If you're creating a new variable with allocated memory, your process is no longer "in place". Sorry I didn't graduate from the school of duckduckgo like you did, instead I studied the language specification.


Apparently they taught you there to never verify your knowledge not even if multiple other people point out you're wrong. A strong, religious-like conviction is all you need, it's not like you could possibly be in error. Of course a web search engine cannot lead to valuable information, therefore it is reasonable to ridicule anyone who uses it.


The difference here is, a,b=b,a will use one less bytecode, by taking advantage of ROT_TWO
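
If you want to look for yourself, something like this prints the opcodes for both variants. The exact names depend on the CPython version (ROT_TWO only exists in older releases, it was removed in 3.11), so I'm not claiming any particular output:

```python
import dis

def swap_tuple(a, b):
    a, b = b, a
    return a, b

def swap_tmp(a, b):
    tmp = a
    a = b
    b = tmp
    return a, b

# Collect opcode names for each function; results are version-specific.
ops_tuple = [ins.opname for ins in dis.get_instructions(swap_tuple)]
ops_tmp = [ins.opname for ins in dis.get_instructions(swap_tmp)]

print(ops_tuple)
print(ops_tmp)
```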


We had a dev in our team who would love to use these one liner cryptic expressions and challenge the minds of code reviewers.

It was all fun until he left and others had to refactor a bunch of code. One-liners like that often have very little benefit IMHO.


As a Perl guy none of that is even slightly distressing to me


This is neat!

  img = ["r", "g", "b"] * 10_000_000
  iterator= iter(img)
  groups= zip(iterator, iterator, iterator)
I've been writing Python for 15 years and it had never occurred to me that I could do it like that. Honestly I think I'd have reached for more_itertools.chunked.
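
A scaled-down sketch of why it works: zip pulls from the same iterator three times per tuple. I added a trailing partial pixel on purpose to show that it gets dropped:

```python
img = ["r", "g", "b"] * 3 + ["r"]   # 10 items, the last pixel is incomplete
iterator = iter(img)

# Each tuple consumes three consecutive items from the one shared iterator.
groups = list(zip(iterator, iterator, iterator))
print(groups)
# [('r', 'g', 'b'), ('r', 'g', 'b'), ('r', 'g', 'b')]
```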

(FWIW In R I think I used to set an array dimension on the vector and process rows.)


Unpacking used to be more powerful. I was mostly oblivious to the Python community when I started working on a project and discovered that you could unpack tuples in function heads, like Erlang, and started using that frequently.

Then I discovered I was using Python 2 and it was EOL, and that Python 3 had dropped support for that “because no one uses it.”

Maddening.
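
For anyone who never saw it: the feature went away with PEP 3113. A sketch of the old form (as a comment, since it no longer parses) and the usual Python 3 workaround, with a made-up function:

```python
# Python 2 allowed:  def distance((x1, y1), (x2, y2)): ...
# In Python 3, take plain parameters and unpack them in the body.
def distance(p, q):
    (x1, y1), (x2, y2) = p, q
    return ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5

print(distance((0, 0), (3, 4)))   # 5.0
```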


Python is slowly morphing into Perl... :-D


Someone™ should make a Python-to-Python compiler which converted all normal constructs into weirdness like this.


For me, one of chatGPT's most frequent uses is to convert all of these one-liner statements into logical multi-line code that my feeble brain can process.


> Inverting variables:

Non-native speaker here and even after 20+years every once in a while it bites, but I know that as 'swapping variables'. Where does the inverting come from? Related to the 'inverting a tree'?


It's possible they are trying to avoid making people think of memory swapping which is existing jargon and usually means something more complex.


I also know it as “swapping” variables and I’m a native speaker.


Most of these have been there a loooong time. list[:] replacement has been there forever (it was used for reversing a list, back when there wasn't a .reverse method).

But my beloved one, is the sequence of these changes, in time:

ver.134: x = ('a', 2)

some change in requirements made 2nd value unneeded, so it became:

v167: x = ('a',)

which is one day, "optimized" to:

v193: x = ('a')

Funny thing is, x[0] will still deliver 'a' as it did before. Now if that 'a' is not a constant but some parameter, and it's not a 1-character-long string..


The third one is fundamentally different though. You probably know all this, but I thought it'd be a good idea to mention to prevent any confusion for other readers.

Your second example:

    >>> x = ('a',)
    >>> print(type(x), len(x))
    <class 'tuple'> 1
    >>> print(type(x[0]), x[0])
    <class 'str'> a
x is now a tuple with 1 element, which is the string 'a'. (You don't need the parentheses, by the way: x = 'a', works just as well. The comma is necessary though! Some people think tuples are created with parentheses, but parentheses are actually not required, except in the special case of the empty tuple.)

Third example:

    >>> x = ('a')
    >>> print(type(x), len(x))
    <class 'str'> 1
x itself is now the string 'a'. The parentheses didn't change anything; you can further 'optimize' to x = 'a' without any change in meaning. x[0] still works because, as you mention, by accident x is a string with one character. It doesn't work in other cases:

    >>> x = ('gotcha')
    >>> x[0]
    'g'
    >>> x = (3)
    >>> x[0]
    Traceback (most recent call last):
      File "<pyshell#59>", line 1, in <module>
        x[0]
    TypeError: 'int' object is not subscriptable


Oh man, I seriously just retched at the idea of putting an import in a lambda. That was sick, and awesome.


Python has so much potential for doing the most horrible things. I love it. I love seeing what people come up with.


Until you find them in production. Then you git blame and go take a long look in the mirror thinking about what you've done.

In the early aughts the terseness of Python seemed great. Now as I age I find I have less time for it.


I feel the exact same way. Python gave me so much flexibility and escape hatches when I did dumb design things as a novice. Now as a veteran I just want a boring, strong static language with clear boundaries and behavioural expectations


I just always found it nice as a prototyping language. If you have an idea of how you want to do something or solve some problem, create a working prototype in python. Then create a more polished version in another language.


I don't think terseness is the issue. It's the fact that it lets you do anything, even if it's a terrible idea. `__subclasses__` is my favourite example of that.


数字 is numeral. It doesn't make sense as a translation for numbers in that function.


A fun syntax gotcha I saw recently: a YAML file had MAC addresses that were all digits, like 11:22:33:11, which Python's YAML parser was reading as a sexagesimal number, so some libraries were throwing type errors.
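
To show what that means numerically, here is a standalone sketch of the base-60 reading. It does not call any YAML library; yaml11_sexagesimal is a hypothetical helper that only mimics how YAML 1.1 treats an unquoted colon-separated integer:

```python
def yaml11_sexagesimal(text):
    """Interpret 'a:b:c' as a base-60 integer, most significant part first."""
    value = 0
    for part in text.split(":"):
        value = value * 60 + int(part)
    return value

print(yaml11_sexagesimal("11:22:33:11"))   # 2457191
```

Quoting the value in the YAML file keeps it a string and sidesteps the whole thing.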


God, is this going to be what I see in my next python coding interview?


Idea for a new linter; throw a warning if it sees extreme non-intuitive uses of Python. Given the trajectory of Python improvements over the past few years, this linter may not be far off.


If so, then I'd be happy to fail.


Lol, your definition of “valid syntax” is wayyy different from mine. I wouldn’t ever use crap like that, it’s too unreadable to be “valid syntax” even if the interpreter can handle it. Can we all collectively agree to include our own brains in the definition of “valid code?”


Valid code != best (or even good) practices.



