Describing 'throwing a hashmap at the problem' as 'premature optimisation' is a bit reductionist.
It wouldn't even occur to me to go with the naive O(n^2) solution because it has such obvious drawbacks with large inputs.
And it's an interview question... yes, you're getting a No-Hire if you just leave it there. Although I personally would prompt you if you're not interviewing for a senior position.
Yep. I'm surprised anyone hears a problem like this without immediately thinking of maps and sets. I was expecting the naive/"poor candidate" solution to just be memory-inefficient uses of those data structures. The O(n^2) solution is something you might do when you're just starting to learn programming, but anyone ready to apply for jobs should see that it's not only extremely inefficient but, IMO, requires more cognitive effort than just throwing the data into maps and letting the standard library do the heavy lifting (even if that is still not optimal). This is basically an Algo 101 "what not to do" introductory example.
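To make the contrast concrete, here's a sketch in Python, assuming a toy version of the problem (finding duplicate keys in a list; the actual interview question isn't quoted in this thread):

```python
from collections import Counter

def duplicates_quadratic(ids):
    # Naive O(n^2): rescan the rest of the list for every element.
    dupes = []
    for i, x in enumerate(ids):
        if x in ids[i + 1:] and x not in dupes:
            dupes.append(x)
    return dupes

def duplicates_hashed(ids):
    # O(n): one pass, letting the hash table do the heavy lifting.
    return [x for x, n in Counter(ids).items() if n > 1]
```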
> Yep. I'm surprised anyone hears a problem like this without immediately thinking of maps and sets.
If you want to import a day's worth of access logs into a hash map, you need to take into account not only the time but also the space complexity of this sort of solution.
Depending on how much traffic a service handles, and also on the misguided, arbitrary beliefs of a reviewer, importing a day's worth of accesses into a hash map can be deemed a gross mistake and a no-hire with prejudice.
This exercise is even dumber when you consider that importing the access logs into something like sqlite, querying the data, and discarding the database might be just as efficient (if not more so) and takes a handful of lines of code to implement.
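A minimal sketch of that using Python's standard library (the log path and format here are hypothetical):

```python
import sqlite3

def parse(line):
    # Hypothetical log format: "<ip> <path> <timestamp> ..."
    parts = line.split()
    return tuple(parts[:3]) if len(parts) >= 3 else None

conn = sqlite3.connect(":memory:")  # created, queried, then discarded
conn.execute("CREATE TABLE access (ip TEXT, path TEXT, ts TEXT)")

with open("access.log") as f:
    conn.executemany("INSERT INTO access VALUES (?, ?, ?)",
                     (r for r in map(parse, f) if r is not None))

# Example query: IPs that made more than one request.
for ip, n in conn.execute(
        "SELECT ip, COUNT(*) FROM access GROUP BY ip HAVING COUNT(*) > 1"):
    print(ip, n)
conn.close()
```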
Also, I'd personally consider it a sign of incompetence to complain about using a proper database to ingest and query data efficiently while also believing that the time complexity of a task expected to run once per day is a hire/no-hire deal-maker. Computational complexity matters for code running in a hot path. Beyond that, either you're playing Trivial Pursuit or failing to understand the basics of what you're doing.
> If you want to import a day's worth of access logs into a hash map, you need to take into account not only the time but also the space complexity of this sort of solution.
Good thing space complexity is routinely discussed in these interviews.
And the idea that computational complexity only matters for 'the hot path' is way too dismissive.
For instance, a quadratic loop over a 10-million-line log file is just a catastrophe (100 trillion iterations), even for something run once a day. Such a log file would happily fit in memory (it's mere gigabytes), and a hash map would be perfectly adequate.
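Back-of-envelope, under rough assumptions about iteration speed and per-entry memory:

```python
n = 10_000_000                       # lines in the log
quadratic_ops = n * n                # 1e14 iterations for a quadratic scan
print(quadratic_ops / 1e9 / 3600)    # ~27.8 hours at an optimistic 1e9 iters/sec

bytes_per_entry = 100                # assumed average key plus hash-map overhead
print(n * bytes_per_entry / 1e9)     # ~1.0 GB: comfortably fits in memory
```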
> Depending on how much traffic a service handles
Yes, it depends. The interviewee is expected to ask, or at least state what assumptions they're making.
I'm sorry, I thought of sorting as my first approach, and didn't think of maps at all. But what do I know? I only taught Algorithms and Data Structures (CS3) 3 times a year for 10 years.
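For what it's worth, sorting gives a perfectly respectable O(n log n) answer; a sketch, again assuming the task is finding repeated keys:

```python
def duplicates_sorted(ids):
    # O(n log n): sort so equal keys are adjacent, then one linear pass.
    ids = sorted(ids)
    dupes = []
    for prev, cur in zip(ids, ids[1:]):
        if prev == cur and (not dupes or dupes[-1] != cur):
            dupes.append(cur)
    return dupes
```

It also degrades gracefully: if the data outgrows memory, the same idea still works with an external sort.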
Exactly. At least you have to show that you know what you're doing and that it's deliberate. Depending on the seniority, I expect some kind of justification in the comments, like "it's O(n^2), but since the input will be very small, it's ok".
In real life people write a lot of O(n^2) code without realizing it, and usually it's just an unnecessary loop inside another loop. I want to know that you care about these things.
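For instance, in Python the accidental quadratic loop is often hiding in a membership test:

```python
existing_list = list(range(50_000))
incoming = list(range(25_000, 75_000))

# Accidentally quadratic: `in` on a list is a linear scan,
# so this comprehension hides a second loop.
slow = [x for x in incoming if x not in existing_list]

# Same result in linear time: set membership is O(1) on average.
existing = set(existing_list)
fast = [x for x in incoming if x not in existing]
```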
I have grinded leetcode and worked at FAANG, so I understand the common signals interviewers look for. But that's only because I spent a lot of time on FAANG-style interview prep. I don't know why the interviewer had to give that awkward silence. The candidate, who is likely under abnormal pressure, may start thinking about a bunch of other things: "is there a bug in my code? did I miss some edge case?" I am one of those people who become very nervous in a live coding interview and usually can't perform as well as I do in my daily work. There was one interview where I embarrassingly forgot some basic syntax of a language I had used every day for years.
I don't understand why this has to be a red flag. What's wrong with just saying "it works, but can you make it faster"? As an interviewer, I say this a lot and I never judge candidates on this at all.
Not even large inputs. More like medium, or even the larger end of small. A quadratic-time solution can often start to be noticeably slow with just a few thousand items.
A realistic dataset for the problem they described could easily be tens of millions of records, even if you're not a Google-sized site.