I disagree with your point 2 very strongly. A lot of people have a mental block when it comes to 1-line functions, and they really shouldn't, they can often be very helpful for a lot of reasons.
Generally, the most useful tool in refactoring is to use extract method relentlessly, even into 1-line functions. More often than not, a well named function with well named parameters is one of the absolute best and most reliable ways to make your code as close to "self-documenting" as possible. But there are additional advantages as well.
Spinning things off into function calls often makes refactoring easier. Because it discourages bad habits like local variable reuse (using variables as scratch pads) and excess mutable state. It also encourages a more functional style, which has a lot of benefits especially in multi-threading but also in the ability to reason about and determine the correctness of code. It can also make it a lot easier to see when you need to spin off functionality into other classes/modules. For example, if you have a member field and you have a ton of little one line snippets everywhere that anything happens with that field, that usually means you need to extract it out into a class and have those one-line snippets be various methods on that class. When all that code is inline, this can be harder to see.
Additionally, there are lots of mental blocks that happen when you have inline "one liners". There's a bias to avoid "cluttering them up", even when it's necessary. For example, when you need to add error checking and error handling. When you've spun out the code into a method call, you tend not to have that block, you just add the error handling code because it's trivial, and going from a 1-line method to a 5-line method isn't a big deal. But bloating up another function by taking a sweet one-liner and replacing it with 5-lines of code can seem like clutter, so often times you just have an aversion to doing so. Additionally, sometimes those "sweet one-liners" can be more readable when they're broken up into multiple lines, and this is again something that is trivial and straightforward when it's spun out into its own method but will see resistance when the code is inline. This is true even if the code remains a single statement broken up into multiple lines to enhance readability.
A lot of the time these patterns aren't obvious. Because the end result of spinning off one-liners is often not one-line method definitions, it's short methods that are a couple lines with several lines of extra error detection/handling. Or it's separate utility classes to encapsulate handling of special member fields.
The idea of ensuring that you always have "meaty" functions is unhelpful, what's important is thinking about things from a perspective of abstraction, modularity, readability, and correctness.
If you have a one-line function that is used one place (or even a few places) and it doesn't mean it is easier to understand just because you made it look like English. More likely you just obfuscated it: update_string_with_value(string,val) is NOT easier to undestand than string.append(val).
I really and truly hate code where people think they need to take the programming out of programming. This goes all the way back to when they learned it was a good idea to comment their code "next = ++i // increment index then assign".
> Spinning things off into function calls often makes refactoring easier.
You're breaking a rule in keeping things simple. Don't anticipate. You are refactoring ahead of the refactoring, most likely making the wrong decision in the process.
And you're also coding defensively against your fellow developers. This tends to be a problem in low- to average-skill workplaces. Instead why you just follow cheakins and show your junior developers what they are doing wrong? That is how I learned to program from some amazing developers. It wasn't from them coding defensively against me, but from the tap on the shoulder I got asking me why I did something on my last checkin.
And I'm not advocating "meaty" functions. They should be divided on logical pieces of functionality that aren't so small where every line qualifies, but not so large as to span more than a page or more do a single thing. Before they even hit the page mark, most likely other reasons to will dominate though like the single responsibility principle or something else.
>is NOT easier to understand than string.append(val).
But this depends entirely on the context of the code. I like to think of this in terms of "abstraction levels". Mixing abstraction levels harms readability and so if your code is doing some low level operation then string.append(val) should be obvious from the context and the name of the function it belongs in. If however the code is still at an abstract enough level then you want to wrap your implementation detail in a name that is appropriate for that level of abstraction.
As a rough example, in a function that's popping from some data source and pushing into a local structure as a part of some larger unit of work, wrapping your collection.append(val) would be harming readability. But if you're "enrolling" a "student" into a "class" you should be doing class.enroll(student) even if the implementation is just class.students.append(student_id)
> As a rough example, in a function that's popping from some data source and pushing into a local structure as a part of some larger unit of work, wrapping your collection.append(val) would be harming readability. But if you're "enrolling" a "student" into a "class" you should be doing class.enroll(student) even if the implementation is just class.students.append(student_id)
After calling class.enrol(student), I would expect the student to be enrolled, not just added to a list of students.
> After calling class.enrol(student), I would expect the student to be enrolled, not just added to a list of students.
Why do they have to be different? That's really his point. In this situation, enrolling just requires adding their student id to the list of students in the class. But while doing 'class.students.append(student_id)' accomplishes that, it is not clear reading that line that it actually enrolls the student in the class, rather then just being one step of a larger process, or something else all-together.
If they are not different then you're half way to going full Active Record (which I'd consider an anti-pattern). There is likely much more to enrolling a student than adding them to a class, and most of that logic should not go in class.
"is NOT easier to undestand than string.append(val)."
Disagree 100%. But of course as everything else in our industry this is super subjective. I'd much rather a line of combinators which read like English than a collection of function calls with wacky parameter calls. YMMV on this though and I understand that.
I also want tiny functions and those functions to be moved far away. I want the "english" names as close to the Real Work as possible. If I really care what the activity is I can use the IDE to drill in. But again, I understand YMMV on this one.
People who argue this topic as if there's a ground truth are the true problems. Everyone has they're own take.
> "People who argue this topic as if there's a ground truth are the true problems. Everyone has they're own take."
Exactly. It's like design patterns, algorithms, and data structures. There are lots of techniques useful in programming. The trick is having the experience and judgment to know when to use the right one.
I feel like you're arguing against something else. It should be obvious that I'm not saying "ALWAYS USE 1-LINE FUNCTIONS". My point is that a phobia against small functions is both extremely common and typically harmful to code quality, for all the reasons I described.
If it were easy enough to give someone a simple checklist of guidelines and have them produce high quality code, coding would be a lot easier, but it doesn't work that way. Coding well requires years of experience that produces the wisdom and perspective to know when and to use different techniques. It's what tells you "it's unnecessary to comment an i++ line", it's what tells you when and how to use one data structure over another, and so on.
And no, I'm not "refactoring ahead of the refactoring", I'm refactoring where it makes sense, even down to a single line. That's the point. To properly compose things, and to avoid the bias against small functions. Not to atomize a program into infinite function calls, of course.
Most of the code I've come across that is below the quality I'd like tends to have excessively large function definitions, precisely because there is such a widespread antipathy against splitting things up "for no reason" and against small functions. When it improves readability, when it improves the composition and modularity of the system, when it makes it easier to update, modify, improve logging and error handling, etc. those are all good reasons for extracting methods. The rare times I've seen code that required excessive jumping around to figure out what anything did it wasn't because of too many small methods, it was because of overzealous application of bad design patterns (e.g. call a factor method to produce a configuration object then instatiate some other factory and pass the configuration object to it to get a factory that produces what you want, that kind of nonsense). That's indicative of a failure of judgment.
My point is, don't be afraid of small functions when they make sense. There are too many people who are afraid of small functions because, like you, they have these strong biases against them based on prejudice. How do you know when a small function is a good idea? Judgment and experience, of course, same as always.
> If you have a one-line function that is used one place (or even a few places) and it doesn't mean it is easier to understand just because you made it look like English. More likely you just obfuscated it: update_string_with_value(string,val) is NOT easier to understand than string.append(val).
We agree your sample has been obfuscated.
/// <summary>
/// Case Insensitive IRC Name
/// </summary>
public struct CIIrcName {
private string Data;
public static implicit operator CIIrcName( string s ) { return new CIIrcName() { Data=s }; }
public static implicit operator string( CIIrcName n ) { return n.Data; }
public string ToLower() { return (Data??"").ToLowerInvariant().Replace('{','[').Replace('}',']').Replace('|','\\'); } // Silly scandinavians :<
public static bool operator==( CIIrcName lhs, CIIrcName rhs ) { return lhs.ToLower() == rhs.ToLower(); }
public static bool operator!=( CIIrcName lhs, CIIrcName rhs ) { return lhs.ToLower() != rhs.ToLower(); }
public override bool Equals( object obj ) { return obj is CIIrcName && (CIIrcName)obj == this; }
public override int GetHashCode() { return ToLower().GetHashCode(); }
public override string ToString() { return Data; }
}
Every single function of this is a one-liner. Are any of them obfuscated? (IRC servers sometimes talk about the same user or channel in multiple cases, this was a quick drop-in to prevent these from being mistaken as different users/channels by my IRC client. The "Silly scandinavians" comment references the fact that, yes, IRC thinks '[' is lowercase '{'. And yes, I've seen that matter in practice.)
Most of the changes I'd make here have nothing to do with method length:
1) Conversion (at least back to string) should probably be made explicit at this point (was originally implicit for ease-of-use when replacing string s)
2) ToLower could be made private (only used inside the type) and have the comment replaced with the above blurb (instead of being a comment to myself that only I'll properly decode.)
3) Rename CIIrcName to CaseInsensitiveIrcName?
4) Rename Data to OriginalCaseName?
> And you're also coding defensively against your fellow developers. This tends to be a problem in low- to average-skill workplaces.
My experience has been just the opposite. Low/average skill tends to mean less asserts, less static analysis, ignored warnings instead of warnings-as-errors, no unit tests, no thread safety annotations... forget my coworkers - I add these things to code defensively against future-me, and I've found it super effective on larger codebases.
> This goes all the way back to when they learned it was a good idea to comment their code "next = ++i // increment index then assign".
We agree this is bad too. A different perspective though - those are perfectly reasonable comments when you're new enough to programming that you don't remember what "++i" does! But it's something to grow out of and remove.
> "My experience has been just the opposite. Low/average skill tends to mean less asserts, less static analysis, ignored warnings instead of warnings-as-errors, no unit tests, no thread safety annotations... forget my coworkers - I add these things to code defensively against future-me, and I've found it super effective on larger codebases."
Bingo.
And this is another great reason to have more, smaller methods: it's far better for unit testing. When you have big, chunky function defs it can be a chore to figure out how to reach into the method and test one particular aspect of it. When your code is well composed into functions that are logically consistent, you can write unit tests that target each piece individually.
I can't really comment about your specific use case, but I would wonder why you are having to do these conversions and comparisons on the fly so much that you a full class dedicated to them and you don't have the the name normalized ahead of time. But I don't really know.
But like I've said in all my comments. There are times and places for all these maxims and rules to be broken. That's why programming is difficult, especially that is why finding the right abstraction so that it looks easy is difficult. The best programmers I know always seem to find that right balance and you look at their code and go "I could do that" when in reality you couldn't.
> but I would wonder why you are having to do these conversions and comparisons on the fly
I receive a weird mix of normalized and unnormalized input over the network from servers and software I don't control, and want to keep the unnormalized versions for display purposes.
I could normalize on construction - might save a tiny bit of performance? - but then I'd need to store and keep in sync two separate strings (one for display, one for comparison.) Not that I have any real mutation going on that would violate that invariant in practice.
> so much that you a full class dedicated to them and you don't have the the name normalized ahead of time.
The general pattern is: Parse the network message (containing mixed normalization), then immediately lookup (or create) the target channel/user - allowing no duplicates due to mixed normalization.
With this I can simply construct e.g. Dictionary<CIIrcName,...>s and not need to remember to normalize every time I want to query by a given key. Only about 5 members that reference this type but there's a good 20-30 key lookups/equality comparisons using those members, handling different network messages among other things. That list will probably expand.
I could've wrapped dictionary access instead, but this seemed simpler (as dictionaries aren't the only thing I'm using, and none of the existing code wrapped dictionary access.)
I'd normalize on instantiation. It is both a performance win, smaller code, and much simplier. Chanel names don't change so you dont have to keep them in sync in any real sense, and if they do, the setter has a very trival job of setting both the real and normalized name.
I disagree with the use of lots of small 1 line functions, especially when doing algorithmic work.
As the article mentioned, the thing you spend most of your time doing is reading. Using lots of small functions means you have to bounce all over your codebase to figure out whats actually going on. In procedural languages a complex 'meaty' function often needs to be understood in one conceptual bite. Needing to bounce all over the codebase to figure out what those little functions are actually doing is hugely distracting while I'm trying to read.
I often start those big functions with a block comment explaining what I want the computer to do ("// The overall goal is X, which needs Y and Z to be true. There are 4 cases to consider: A, B, C and D through which these invariants need to be true..."). Then I write the code. The code itself will be delineated with comments calling back to the documentation ("// Case B - the wendigo. Note we're being careful here to make sure that x is always < y"). If I find myself duplicating blocks inside that function, I'll pull those shared blocks out. But single lines? When I'm reading the code later I'm going to want to track exactly what happens through the function at a mechanical level. I want to be able to see the invariants, and track exactly how each value flows through. And for that, the less scrolling and bouncing around my codebase I need to do the better.
Often separate cases are cleanly distinct enough that it makes sense to extract them into their own functions, but sometimes they're not. And thats ok too.
In a sense you want to split things up based on how you expect to read it. If I have lots of tiny functions that are only used once or twice, I'm probably going to forget what they actually do and need to read them to understand the code that calls them. In that case, the code should just be inlined.
Sure, but that totally depends on context. A function called `renderFooter()` is great, even if its just 1 line. I know exactly what thats going to do.
But a call to `setImpulse(spaceship, 0, somevar)` might just set member variables ximpulse and yimpulse, or it might make a bunch of calls to the physics engine, or both, or something else entirely. If I'm trying to understand some code that calls setImpulse I'll probably end up looking it up just to be sure. So in that case, if it does just set the member variables I'd prefer to just write `spaceship.ximpulse = 0; spaceship.yimpulse = somevar;`.
Completely agreed. Ultimately its about readability, and replacing code with meaningful names is always a boon for readability. I don't understand people who have an aversion to adding more abstraction--every time you can replace any section of code (even one liners usually have multiple parts to them) with a meaningful name is an increase in readability as your code-specification moves closer to the language-specification of the project.
Of course there are bad abstractions. A simple test is anything that doesn't immediately refine your code into something more closely resembling the language-spec. The goal should be code with just the right amount of abstractions such that they map to this language-spec with the least amount of friction possible.
If one wants to treat all expressions as functions, they're far far better off with a proper functional language built on lambda calculus and the power of full function composition primitives.
Most projects fail badly trying to twist OOP to get there. The code patterns are so predictable, its not even a surprise anymore.
That’s the pro-short functions side of the debate, but there are a few counter-arguments that you glossed over.
More often than not, a well named function with well named parameters is one of the absolute best and most reliable ways to make your code as close to "self-documenting" as possible.
It might make a single function’s implementation nicely self-documenting, but what really matters is how readable and maintainable the overall code base is.
When we’re reading the code calling that one-liner function, in most languages you can’t see the names of the parameters, only the order in which they are provided. Sometimes the meaning will be obvious anyway, or it won’t matter, for example if you have a commutative function with two parameters. Sometimes the meaning could be made obvious but hasn’t been, for example the infamous boolean parameters where the call winds up being
do_something_with(true, false, true);
because no-one defined more specific types/constants to use instead. And sometimes, there is no inherently “natural” order to the parameters, so replacing a one-liner that was unambiguous with a function call where the parameter order is unclear to the reader introduces uncertainty.
Another potential danger with using lots of very small functions is that while each function individually may be nice and clear, the number of relationships between those functions is much greater. This can cause a great deal of frustration to a reader who has to keep jumping to another part of the code to decipher each “clearly named” one-liner function. It can also make it harder to examine the behaviour in context when testing or debugging, for example because log functions don’t have enough information available to produce complete, self-contained messages, or because every time you hit a breakpoint in the debugger you have to walk 6 levels up the call stack to figure out what’s going on at the time.
Finally, I suspect the benefits of one-liners potentially expanding to include error handling are mostly illusory. How often can that one-liner really take any useful recovery action at such a low level, and how often will it just need to return some type of error value or raise some type of exception anyway, so that code further up the call stack with more information or resources available can deal with the problem effectively?
The idea of ensuring that you always have "meaty" functions is unhelpful, what's important is thinking about things from a perspective of abstraction, modularity, readability, and correctness.
OK, but readability and correctness are also global properties, not just local ones, and the benefits of modularity are closely related to how much complexity is encapsulated and hidden away by the extra abstraction, but one-liner functions rarely increase the level of abstraction very much.
Generally, the most useful tool in refactoring is to use extract method relentlessly, even into 1-line functions. More often than not, a well named function with well named parameters is one of the absolute best and most reliable ways to make your code as close to "self-documenting" as possible. But there are additional advantages as well.
Spinning things off into function calls often makes refactoring easier. Because it discourages bad habits like local variable reuse (using variables as scratch pads) and excess mutable state. It also encourages a more functional style, which has a lot of benefits especially in multi-threading but also in the ability to reason about and determine the correctness of code. It can also make it a lot easier to see when you need to spin off functionality into other classes/modules. For example, if you have a member field and you have a ton of little one line snippets everywhere that anything happens with that field, that usually means you need to extract it out into a class and have those one-line snippets be various methods on that class. When all that code is inline, this can be harder to see.
Additionally, there are lots of mental blocks that happen when you have inline "one liners". There's a bias to avoid "cluttering them up", even when it's necessary. For example, when you need to add error checking and error handling. When you've spun out the code into a method call, you tend not to have that block, you just add the error handling code because it's trivial, and going from a 1-line method to a 5-line method isn't a big deal. But bloating up another function by taking a sweet one-liner and replacing it with 5-lines of code can seem like clutter, so often times you just have an aversion to doing so. Additionally, sometimes those "sweet one-liners" can be more readable when they're broken up into multiple lines, and this is again something that is trivial and straightforward when it's spun out into its own method but will see resistance when the code is inline. This is true even if the code remains a single statement broken up into multiple lines to enhance readability.
A lot of the time these patterns aren't obvious. Because the end result of spinning off one-liners is often not one-line method definitions, it's short methods that are a couple lines with several lines of extra error detection/handling. Or it's separate utility classes to encapsulate handling of special member fields.
The idea of ensuring that you always have "meaty" functions is unhelpful, what's important is thinking about things from a perspective of abstraction, modularity, readability, and correctness.