So you think you know pointers?

drblast · on Sept 28, 2010

I wonder if pointers confuse people because C's syntax for pointers is confusing. The confusing examples always involve arrays, where a statically allocated array's behavior is different from one that's dynamically allocated and not necessarily what you would expect as a beginner, or where the vagaries of operator precedence obfuscate the pointer arithmetic.

I think the C++ programming book I first used made it worse by equating pointers and arrays from the start, making it seem like they were interchangeable.

At least for me, no other languages including assembler seemed to generate this amount of confusion over such a simple thing.

cynicalkane · on Sept 28, 2010

C style arrays are the lowest level model for flat arrays that I can imagine. Maybe it's more accurate to say that computers are confusing, and C is just being C.

drblast · on Sept 28, 2010

But computers aren't confusing; the idea of a block of N bytes of memory containing M-byte sized elements is dead-simple, and the same whether it's statically or dynamically allocated.

The semantics associated with accessing those elements in C are anything but, especially for a beginner.

cynicalkane · on Sept 28, 2010

Arrays are "dead simple", but addressing their elements can get very complicated, especially if you might want to address the array itself.

shadowfox · on Sept 28, 2010

Can you explain?

kqr2 · on Sept 28, 2010

Further clarification from the C FAQ:

http://c-faq.com/aryptr/aryptrequiv.html

http://c-faq.com/aryptr/aryvsadr.html

http://c-faq.com/aryptr/aryptr2.html

jaimzob · on Sept 28, 2010

The Butt-Ugly Fish Book has a great section on this: http://www.amazon.com/Expert-Programming-Peter-van-Linden/dp...

In fact it's filled with great sections - go read it if you haven't.

pietrofmaggi · on Sept 28, 2010

If you enjoy this kind of puzzle, "The C Puzzle Book" is full of them, with clear explanations: http://www.amazon.com/Puzzle-Book-Alan-R-Feuer/dp/0201604612

It's a funny little book that can keep you busy during compilation time.

And the "C Reference Manual" (http://www.careferencemanual.com/) is the best, up-to-date reference to understand how this small language can be abused.

caf · on Sept 28, 2010

If it had declared

     struct { int x[4]; } s;

...then &s, &s.x and s.x would all evaluate to the same address, but with three different types.

d0m · on Sept 28, 2010

Damn, I failed the last one.. thinking it would only have been a pointer further.

Robin_Message · on Sept 28, 2010

That seems like an odd feature. I mean, useful in some cases where the size of the array is statically determined, but kind of fiddly and not the same across such unusual things as function calls.

cybernytrix · on Sept 28, 2010

gaaaaaa this made it to HN's frontpage?

CamperBob · on Sept 28, 2010

printf("%p\n", (void*) (&x + 1));

News flash: it's easy to confuse people such as myself who don't know, and don't care, whether the & unary operator is above or below the + binary operator in the precedence hierarchy. Anyone who writes an expression like this without parentheses has been educated beyond their wisdom. Ric has more at 11.

Robin_Message · on Sept 28, 2010

But if you want to read code that says that, you either have to know, or go look it up. If someone writes code like that, I figure they are probably hacking a kernel or something, so they know the precedence rules and expect anyone reading the code to know them too.

Me, I looked it up. He only said not to compile it, not that you couldn't learn something at the same time.

pietrofmaggi · on Sept 28, 2010

Adding some parentheses to C code to clarify the precedence of any operation seems a no-brainer to me.

As usual, you only marginally write code for the compiler (or to impress your boss with how well you understand the C standard). The Very Important Reader of the code is the guy who has to read it in the coming years to improve or fix something (and is usually the same soul who originally wrote the stuff).

So do yourself a favor and write code you'll enjoy reading on loony nights (which is usually when something doesn't work).

zwetan · on Sept 28, 2010

except that here '&' is for the reference operator

and any C/C++ programmer who has spent a few days with the language will never be confused by that

see http://www.cplusplus.com/doc/tutorial/pointers/

---- This reference to a variable can be obtained by preceding the identifier of a variable with an ampersand sign (&), known as reference operator, and which can be literally translated as "address of". ----

drblast · on Sept 28, 2010

"Address of" is confusing, and is what confused me as a beginner, because there is no notion of type associated with an address.

If (&x) returns "the address of x," you'd be inclined to say that ((&x) + 1) is one byte higher than &x, even though the C compiler will calculate the actual answer based on the size of x.

If x is an array as in this example, you'd have to know that C does this arithmetic based on the entire length of the statically defined array, rather than a single element as it would for a dynamically defined array with a pointer to it.

This is completely non-obvious, at least to me.

albertzeyer · on Sept 28, 2010

Well, this is something which is very obvious to me. If you have a pointer to an int and increment it by one, it makes much sense to me that it advances the pointer to the next int and not just by one byte. Otherwise the usage of pointers as iterators would also not work.

Splines · on Sept 28, 2010

Me too. I suppose "+ 1" is ambiguous enough that the compiler is interpreting it differently based on context, but it just feels wrong.

drblast · on Sept 28, 2010

I was taught in CS101, as I think just about everyone is, that the & operator means "the address of."

So parsing &x + 1 into English becomes:

"The address of x plus one."

Except that, more explicitly, we should say:

"The address of x plus one times the size of the type of x."

What you should say is so far removed from what the C code appears to say (add 1 to something) that confusion is inevitable.

That the second term should depend on the implicit type information from the first term, especially when the "address" operator has precedence over the + operator, seems backward to me.

To me an address is just a number, it carries no type information. Translate to s-expressions:

(+ (address-of x) 1)

Does it make sense that the 1 would change value based on the result of "address-of"? Not to me. I think it's bad design. Now, if it were taught differently:

(+ (typed-reference-to x) 1)

Then it makes more sense.

zwetan · on Sept 28, 2010

well sure it is confusing if you never used C (as most other languages do not give access to pointers at all)

but when you see something tagged as 'C', 'pointers', 'challenge' I guess you can fairly assume it is an "advanced subject" for people already experienced in C, no?

psyklic · on Sept 28, 2010

Can you provide an example, from any language, where a unary operator has a lower precedence than a binary operator?

mfukar · on Sept 28, 2010

Precedence does not matter, semantics do: pointer arithmetic is well defined, but taking the address of an arbitrary expression is not (and should not be, ever).

ryanf · on Sept 28, 2010

-3^5 = -(3^5)

Not to defend that guy or anything.

pvg · on Sept 28, 2010

Unary - still has higher precedence in C (over the bitwise XOR operator).

In case you mean exponentiation, these expressions would always be equal regardless of the precedence of the operators.

lepton · on Sept 28, 2010

(-3)^4 != -(3^4) when ^ means exponentiation.

pvg · on Sept 28, 2010

It does for the expression given, silly. And again, in mathematical notation, unary - would take precedence over exponentiation so it's completely unclear what that person's example is supposed to mean and why it has upvotes.

ryanf · on Sept 28, 2010

I was actually originally going to say -3 * 5 = -(3 * 5), as a joke. But from what I could find online, it looks like unary minus traditionally falls below exponentiation in the order of operations for math. See http://mathforum.org/library/drmath/view/53194.html and Wikipedia: "In written or printed mathematics, the expression −3^2 is interpreted to mean −(3^2) = −9."

CamperBob · on Sept 28, 2010

I think we have a QED here. :)

ryanf · on Sept 28, 2010

Obviously there are some cases where you should use parentheses. The reference operator just seems like a strange place to draw the line. I mean, to me it seems like the least confusing part of the quoted line.

CamperBob · on Sept 28, 2010

Yeah, because it's impossible to be a skilled C programmer without knowing this stuff by heart.

/rolls eyes

ryanf · on Sept 28, 2010

There are plenty of confusing order-of-operation things in C, especially around pointers, but is "&x + 1" really one of them? I basically know nothing about this subject and it's still pretty clear that the & would take precedence. Would you balk at that expression if x were an int pointer?

Poiesis · on Sept 28, 2010

I don't like guessing. "Yeah, I think it has high precedence" doesn't cut it any more than "Yeah, I think that is supposed to be non-null" does. I try not to make assumptions.

Does this mean I write code so that it can be read by someone with little experience? Yes, generally. Does this mean I am constantly having to verify my assumptions, as if I don't know anything? Yes, it does. There are some things that I'll take a pass on verifying, but if elevates a problem and everything else checks out, I'm going to start looking at the parts that I "know" are fine.

CamperBob · on Sept 28, 2010

This particular case isn't that interesting because if the expression were equivalent to &(x+1) the compiler would complain about taking the address of a non-lvalue.

But ryanf, while not defending "that guy" (me), made my point for me: if I have to stop and think about whether -3^5 == -(3^5) because some hotshot didn't consider it manly to use parentheses, nobody wins.

If I were writing that expression I'd use (yes, completely unnecessary) parentheses around &a out of habit... the same habit that wouldn't let me write -3^5. Either way, it doesn't burn any more cycles.

zwetan · on Sept 28, 2010

no, but you will certainly never be a skilled C programmer if you don't recognize a reference

even I, a noob in C, can see that it is indeed a reference (as I said I'm a noob =))

silentbicycle · on Sept 28, 2010

Sure. In J:

       i.2+$i.3
    0 1 2 3 4

That's "i. 2 + $ i. 3" (iota 2 plus shape_of iota 3), which parses as i.(2 + $(i. 3)). iota returns an array from 0 to n-1, + adds, shape_of returns the dimensions of an array/matrix/n-dimensional matrix/etc. + is a binary operator ("dyad"), i. and $ are unary operators ("monads", no relation to Haskell).

silentbicycle · on Sept 28, 2010

(Oops. Misread your comment, and missed the edit window. Nevermind.)