
i wonder why he's such an ass about it, and totally adamant that it's impossible when multiple players already do this fast. ego?



I think technically he’s correct (I haven’t worked on media decoding code, but I understand how common video encoding formats work). If you have a long video with only a single key frame at the beginning, then to step back you would need to start from the beginning of the video and decode every frame up to the one you want to land on in order to apply the frame deltas, assuming you also have some sort of frame counter to tell when you’ve reached the target frame. In the worst case this does require a lot of compute, but it is an edge case if you primarily care about common video formats with normal encoding settings. I assume seeking backwards is also painfully slow on videos encoded in this manner, so I don’t fully understand why stepping back 1 frame is out of the question when seeking backwards is not. It must have something to do with precise frame counts being unavailable on some hardware decoders for some formats (and there being no good workaround), so you _may_ not actually go back 1 frame.

I don’t see any reason it couldn’t be supported for a set of formats with reasonable encoding/decoding settings, and provide some error message for other formats if a user attempts to step back, e.g. reverse frame stepping unavailable for current video due to format/encoding/decoding settings.


> video with only a single key frame at the beginning

I've literally never seen a video like that in my life, but I'd still expect it to work. Just decode everything starting from frame 1. My desktop can decode H265 at 1,666 fps. I can wait.

https://docs.nvidia.com/video-technologies/video-codec-sdk/1...


That blows up pretty quickly though? Even a 10 minute video will take in the worst case ~20 seconds to decode at that rate.
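For anyone who wants to check the arithmetic, here is a quick sketch. The 1,666 fps decode rate is the figure quoted above; the source frame rates are my assumption, since the comment doesn't state one, and the function name is made up for illustration.

```python
def worst_case_decode_seconds(duration_s: float, video_fps: float, decode_fps: float) -> float:
    """Seconds needed to decode every frame of a video from the very first one."""
    total_frames = duration_s * video_fps
    return total_frames / decode_fps


# A 10 minute video at a few common (assumed) frame rates, decoded at 1,666 fps.
for video_fps in (24, 30, 60):
    seconds = worst_case_decode_seconds(600, video_fps, 1666)
    print(f"{video_fps} fps source: {seconds:.1f} s")
```

That comes out to roughly 8.6 s, 10.8 s and 21.6 s, so the ~20 second figure matches a 60 fps source; a 24 or 30 fps source would be about half that.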

Not really an excuse not to have it (since most video won't be encoded in such an insane way), but the developers owe no obligation to users to implement it.


Theoretically. Practically, 10+ minute videos with just a single i frame at the beginning do not exist.


> If you have a long video with only a single key frame at the beginning then

...you can't support the scrub bar efficiently either, so no one encodes video that way.

Typically to go to a frame you find the last IDR frame before it (and in reasonable encodings those are frequent enough) and decode forward until you get to the frame of interest. Doing that every time the user presses the single frame back button really doesn't seem that bad, and neither does holding onto some extra reference images, at least for something like 1080p frames. (8k video and such starts getting more expensive, but maybe even then you could start keeping some references after the first press of the frame back button in this GOP or some such.)
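A minimal sketch of that lookup, assuming the player already has a sorted list of keyframe positions (the same kind of index a scrub bar would need). Frame indices stand in for real timestamps, and B-frame reordering and open GOPs are ignored; `frames_to_decode` is a hypothetical helper, not anything from VLC or mpv.

```python
from bisect import bisect_right


def frames_to_decode(keyframes: list[int], target: int) -> range:
    """Frame indices that must be decoded, in order, to display `target`.

    `keyframes` is a sorted list of frame indices that are IDR/key frames.
    """
    i = bisect_right(keyframes, target) - 1  # last keyframe at or before target
    if i < 0:
        raise ValueError("no keyframe at or before the target frame")
    return range(keyframes[i], target + 1)


# Keyframe every 250 frames; the user is on frame 500 and presses "frame back".
work = frames_to_decode([0, 250, 500, 750], 499)
print(f"decode frames {work.start}..{work.stop - 1} ({len(work)} frames)")
```

The cost of one step back is bounded by the keyframe interval, not by the length of the video, which is the whole point.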

It's of course work to do, and I'm not super motivated to send them that patch, and there's the question of if it would be merged and maintained indefinitely, but what folks are asking for is technically possible.

> I don’t see any reason it couldn’t be supported for a set of formats with reasonable encoding/decoding settings, and provide some error message for other formats if a user attempts to step back, e.g. reverse frame stepping unavailable for current video due to format/encoding/decoding settings.

Yeah, this. That's likely more or less what they already do with the scrub bar.


> It's of course work to do, and I'm not super motivated to send them that patch, and there's the question of if it would be merged

That's my issue; he calls for people to send patches, but anyone capable of writing such a patch is also probably going to see that he's not positive on the matter, and that his "patches welcome" is really pretty passive aggressive in this instance. At least, that's how it comes off to me. I would expect that, should I submit such a patch, it would simply be rejected on the basis that "it is not a general solution".


I learned this the hard way.

One other team at my workplace insisted that they can’t make their product compatible with our product, because it would take a team and half a year. I knew that it was a lie, but we convinced them to “allow” us to make it for them. I finished - alone - in four days.

It was never merged. It was purely political. It was never about whether it’s possible or not.


There's also a middle ground: Painstakingly describe the solution first, along with its downside of not being general in the same way as some of the existing features (I guess for example seeking back 10 seconds) are not, and ask whether a patch implementing this solution would be welcome before implementing it.


Oh, I already tried that, and it didn't work.

https://forum.videolan.org/viewtopic.php?f=12&t=103604&p=407...

I wanted to report a bug about VLC's extraordinarily badly designed "Magnification/Zoom" user interface, so first I searched the forum to see if there was any other discussion about it, which there naturally was.

So I painstakingly wrote up an extremely detailed description of a bunch of interrelated bugs related to zooming and how it terribly interacted with other features like rotation, in response to the VLC development team brushing off another user complaining about its terrible "Magnification/Zoom" user interface, and they brushed me off too because they were too lazy to read it.

They told me to just submit a bug report, but I pointed out that I was describing several interrelated bugs, which would require submitting many bug reports, which they would have known if they had actually bothered to read what I painstakingly wrote in great detail with step by step instructions about how to reproduce the bugs and suggestions for improvements, so I obviously wanted to discuss them all first to see if they were even worth my time submitting multiple bug reports about, or if all my efforts reporting bugs and trying to fix them and submit patches would be a waste of time, brushed off and ignored like they did to the other users who described the bugs and usability problems they were experiencing.

Jean-Baptiste Kempf himself replied "If you did shorter posts, maybe people will read them..."

To which I replied "if you did less arrogant responses to long posts, maybe people wouldn't give up on trying to help you."

And of course most of the pathologically terrible bugs I described are still there, a dozen years later. And Jean-Baptiste Kempf still continues to act that way.

More details:

https://news.ycombinator.com/item?id=41281153

HN user KingMob's post perfectly summarized my discouraging experience from a dozen years ago, about a set of bugs and usability problems relating to the horrible "Magnification/Zoom" interface:

https://news.ycombinator.com/item?id=41280375

>KingMob 5 hours ago | on: Mpv – A free, open-source, and cross-platform medi...

>It's because the developer is misconstruing a non-technical decision they made as a technical limitation. The commenters are trying to point this out, which misses the reality that the developer probably isn't going to budge from their requirement of universal support.

>That dev's rationalization also sends a signal to any commenter with the technical chops to submit a PR, that it will probably be rejected for not supporting 100% of the codecs. I have no doubt people who could do it, over the years looked at that thread and concluded it would be a waste of their time.

Jean-Baptiste Kempf still continues to act that way, and still hasn't even admitted to those bugs and usability problems, let alone fixed them or accepted patches from anyone else who did. He just discourages qualified developers from collaborating, and brushes off legitimate requests from users who can't code but fucking well know other video players don't suffer from those problems.


To be fair, as a maintainer I also dread walls of text from super motivated people about details to which I assign very low priority. I’m never an asshole about it, though.


To also be fair, to "Painstakingly describe the solution first" absolutely requires a wall of text to enumerate all the multiple layers of interacting bugs, and give step-by-step instructions for reproducing them.

At least I put in the effort to search the discussion group for an existing thread about the problems I had, and contributed to that thread by supporting other users and validating their complaints, instead of opening yet another redundant thread.

The reason I went into so much detail was that the VLC developers were ALREADY acting like assholes by brushing off other people's shorter less detailed descriptions of the same problems, with glib quips like "The holy grail already exists... built in to OS X."

The zooming built into OS X definitely doesn't solve the problems that they refuse to admit exist with their astoundingly terrible "Magnification/Zoom" interface, so I described the problems for their benefit in the same detail I would appreciate in bug reports on my own open source software, in response to their rudely and curtly brushing off other users with the same problems, who don't all have a background in user interface design and software development and writing bug reports.

If the holy grail already exists and solves the problem, then they should REMOVE the horrible unusable "Magnification/Zoom" feature that breaks even worse when you dare to rotate or flip the video, or better yet they should have never allowed that broken "feature" to be merged into VLC in the first place, because of its ridiculously poor design and implementation quality (like drawing and tracking the gui with gigantic fat pixels in un-scaled, un-rotated video pixel coordinates, instead of full resolution screen overlay coordinates, and ignoring the flip/rotation for mouse tracking so you can't see what you're pointing at, which is negligent and insane).

Ironically, VLC accepting and distributing features like the "Magnification/Zoom" interface certainly undermines their arguments that they don't want to accept other patches because of quality and reliability and usability issues. If they refuse to fix it, they should remove it instead, it's just so bad.

And if I didn't bother going to the effort of describing the problems in detail with step-by-step instructions to reproduce them, I'm afraid that Jean-Baptiste Kempf is so thin skinned and arrogant that he would have brushed off my bug report for that reason too. Just like he CONTINUES to rudely and passive-aggressively brush off and ignore other people's perfectly valid bug reports to this day, 12 years later. He's not going to suddenly change.


that's still a whole lot of yapping that as an end user I don't care about. i can frame scrub forwards and backwards in multiple other apps. right? very weird response from the vlc team in that original thread.


The back and forth in itself feels so weird to me, with so many hurt feelings:

- the devs expressed in no uncertain terms that they don't want to do it (the first answer is just perfect)

- every third comment is about "we know you don't want to do it, but as users why should we care ?"

Well, if you don't care about the devs, on what basis are you asking them to care about your specific problem ?


> Well, if you don't care about the devs, on what basis are you asking them to care about your specific problem ?

Caring about the user's requirements is part of the dev's job description. Caring about the dev's... anything is not in the user's job description. (one advantage commercial software has: it really does help when there's an interface between the dev and the user in the form of customer support. or a commercial incentive to actually work on what the user wants.)


> job description

Money getting involved would indeed simplify the question.

Here no money is changing hands, so coming up with an angle that's motivating enough for the devs is IMHO the only option. Either bring up an aspect they're not considering that changes the equation for them, or come up with a solution that isn't plagued by the issues they are afraid to deal with.

That's where I see listening to the devs and caring about their issues to be the only path forward, short of contributing as a dev oneself.


> Caring about the user's requirements is part of the dev's job description.

For an OSS project, it's better to assume that the user persona for the software is the devs or the maintainers. The dev-user relationship you expect is actually the vendor-client in commercial software.


> I think technically he’s correct (I haven’t worked on media decoding code, but I understand how common video encoding formats work).

He’s technically simply wrong (I have worked on media decoding code, hell I’m working on a related project today). His player supports seeking back by 10 seconds or whatnot, but he insists that somehow to implement precise seeking to the previous frame, you need to seek from the very beginning of the video, no two ways about it:

> There is not a slight technical difficulty. On a logical level, this feature is algorithmically impossible, except for the extreme: You can decode all frames up to the previous one. But this would be far too slow for anything except really short videos.

It’s obvious bullshit, if you encounter one of those pathological videos (that don’t really exist except in his mind or in some test suites) you just give up after a reasonable amount of time, same as how you give up if you can’t seek back ~10s in a reasonable amount of time.

And players do give up seeking all the time already, not just on these hypothetical one-I-frame-per-hour videos, but real world videos with messed up pts/dts with no reasonable way to go back a short interval.


Wouldn’t creating in-memory key frames for every nth frame resolve substantial computations on a frame-by-frame basis?


There is no reason to start caching previous frames until AFTER the user has paused and pressed the "back frame" key. Only THEN does it need to rewind to the previous i-frame and re-render and cache frames. And there is no measurable cost to remembering the timestamp of the last i-frame, so you know where to rewind to.


you'd have to "rebase" all the other frames to be derived from those


You wouldn't have to any more than you need 0 through N frames in memory to calculate frame N+1. Whatever your decoding state completes at frame N can be considered a key frame.


decoding state can be bulkier than a key frame and opaque to the CPU if hardware acceleration is used

I wonder if caching semi-compressed frames would be more efficient in either case (CPU or GPU)


It doesn’t have to be every frame though. Pre calculate to 25%/50%/75%, then as I’m playing, save key frames for more incremental points, and if I start scrubbing, calculate more around that region.

Edit: this doesn’t have to happen synchronously either, it can occur in a background thread or passively.


Or in the less-than-worst case you could cache the created full frame if no native full frame is encountered for X seconds and then in the worst case you don't have to go to the beginning of the video, but to that cached intermediate full frame?
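To make the "cache an intermediate full frame" idea concrete, here is a toy sketch. It is not a real codec: a "picture" is just an integer and each delta frame adds to it, which is my stand-in for applying frame deltas. `CheckpointingDecoder` and its `interval` parameter are hypothetical names. It also sidesteps the objection raised above that real decoder state can be bulky or opaque when hardware decoding is used.

```python
class CheckpointingDecoder:
    """Toy delta codec: frame 0 is a keyframe, every later frame is a delta."""

    def __init__(self, keyframe: int, deltas: list[int], interval: int):
        self.deltas = deltas              # deltas[i - 1] turns frame i-1 into frame i
        self.interval = interval          # cache a synthetic keyframe every N frames
        self.checkpoints = {0: keyframe}  # frame index -> fully decoded picture
        self.work = 0                     # number of deltas applied so far

    def decode(self, target: int) -> int:
        # Start from the nearest cached picture at or before the target.
        start = max(i for i in self.checkpoints if i <= target)
        picture = self.checkpoints[start]
        for i in range(start + 1, target + 1):
            picture += self.deltas[i - 1]
            self.work += 1
            if i % self.interval == 0:
                self.checkpoints[i] = picture  # remember it as an in-memory keyframe
        return picture


decoder = CheckpointingDecoder(keyframe=0, deltas=[1] * 1000, interval=100)
decoder.decode(950)   # first pass walks all 950 deltas and caches every 100th picture
decoder.work = 0
decoder.decode(949)   # stepping back restarts from the checkpoint at frame 900
print(decoder.work)
```

The trade-off is memory: each checkpoint is a full decoded picture, so the interval decides how much you pay to make stepping back cheap.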


all that's correct, but it's beside the point since other players are able to do this.


As outlined in some other thread, mpv is not able to stream to e.g. Chromecast, unlike VLC. Maybe VLC supports certain things that make the previous-frame thing harder. I suspect it is so, but I don't have insight into the detailed architecture of VLC, unlike I assume the VLC developers. Do you?


Why is their architecture making it harder on them relevant to the question if it's possible? Because the root question is if it's possible, and there are multiple existence proofs that it's possible. Maybe the VLC developers are just tired. I don't blame them. They had to do a whole refactor to get Chromecast support working, and they got no thanks for that. Or maybe it just wasn't enough thanks and they don't feel like doing another refactor. Chromecast support is quite tricky, I've dug into the protocol.

Anyway, I'm not in control of their development, I'm just pointing out that seeking backwards is possible.


He mentions in the thread that he had to delete posts offensive to the developers. Maybe that's why?

In fairness to him, he offers the posters an opportunity to propose a technical solution and responds to all the posts that do it. It is interesting that nobody in the thread went to check in the code of mpv, smplayer, etc. to see how it's done there. Surely this would be the best response to his request for technical suggestions?


> It is interesting that nobody in the thread went to check in the code of mpv, smplayer, etc. to see how it's done there. Surely this would be the best response to his request for technical suggestions?

Maybe because these users just can't code?


> Maybe because these users just can't code?

It would be still interesting that the intersection between the set of users who claim on this forum it's possible and the set of users who can code is empty.


The users noticed that other players can do that, so it wasn't hard to deduce this is possible. You don't need to know how to code to notice that someone, somewhere had done something


I haven't looked into any claims or followed these links, so I don't know what the limitation is.

But abstractly, it's absolutely possible to write a program that decodes a frame at a time and displays them slowly.

Now, whether there is some architectural difficulty based on design decisions within their player, or any player, I don't know.

Edit: I guess downvoters have never worked with a video decoder api before? I just read the link and it seems like the rationale is to not seek one frame backwards because you'd have to seek to a key frame and waste some work? That's not the same as it not being possible.


He also accuses other users whose (benign) posts weren't deleted of CoC violations so I'm not going to assume his judgement for deletion was reasonable unless I see the deleted posts.


> He mentions in the thread that he had to delete posts offensive to the developers. Maybe that's why?

Maybe. As those posts have (allegedly) been deleted, it is now impossible to say. It seems probable though. I do find it interesting that he didn't delete the post spewing actual verbal abuse at the people who dared to propose possible solutions in good faith.

> It is interesting that nobody in the thread went to check in the code of mpv, smplayer, etc. to see how it's done there. Surely this would be the best response to his request for technical suggestions

He has flatly ignored and refused to address that these other players can do this at all. He makes only mention of "video editors". Well, and YouTube -- cherry picking the easiest case to attack (on grounds of single file format).

At the end of the day, what he needs is an algorithm, which can then be applied against the VLC codebase. For example:

* track timestamp of latest keyframe

* track nframes since latest keyframe

* optionally, keep some sort of unique id to positively identify this keyframe

- now, scrub back to last keyframe (if time accounting is sloppy for this format, overscrub by some amount, then run forward to the keyframe. If overscrubbing is significant, this is where you could compare the keyframe against the reference, to ensure you aren't way far back and needing to run forward further)

- okay, you've found your keyframe; advance (nframes - 1)

- profit
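The steps above can be sketched in a few lines. This is a toy model, not VLC code: `Frame`, `ToyPlayer` and their fields are hypothetical, "decoding" is just walking a list, and the sloppy-timestamp and overscrub handling is left out. When the current frame is itself a keyframe the sketch simply gives up, in the spirit of "give up where it won't work".

```python
from dataclasses import dataclass


@dataclass
class Frame:
    pts: float
    is_keyframe: bool


class ToyPlayer:
    def __init__(self, frames: list[Frame]):
        self.frames = frames
        self.pos = -1                  # index of the frame currently shown
        self.last_key_pos = None       # where the latest keyframe was seen
        self.frames_since_key = 0      # frames decoded since that keyframe

    def step_forward(self) -> Frame:
        self.pos += 1
        frame = self.frames[self.pos]
        if frame.is_keyframe:
            self.last_key_pos = self.pos
            self.frames_since_key = 0
        else:
            self.frames_since_key += 1
        return frame

    def step_back(self):
        # Not supported here: nothing played yet, or we sit on a keyframe and
        # would need the previous GOP. A real player would seek further back.
        if self.last_key_pos is None or self.frames_since_key == 0:
            return None
        advance = self.frames_since_key - 1
        self.pos = self.last_key_pos - 1   # scrub back to just before the keyframe
        frame = self.step_forward()        # decode the keyframe itself
        for _ in range(advance):           # advance (nframes - 1)
            frame = self.step_forward()
        return frame


# Keyframe every 5 frames at 25 fps; play up to frame 8, then step back once.
player = ToyPlayer([Frame(i / 25, i % 5 == 0) for i in range(10)])
for _ in range(9):
    player.step_forward()
print(player.step_back())
```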

If he comes back and says "that's not fully general", that's true --- but the people asking for this don't care if it's fully general; it's suitable for common use cases and that's what they want. Let it work where it will work. Give up where it won't.

If he comes back and says "sure, that could work, but I don't have time, send a patch", well, okay, that's understandable.

What's actually happening is he's coming back and saying that won't work at all, that it won't support the majority of cases, will take too much compute, etc. and that's just flat out not true. You can do it selectively for the common cases. He might not want to, but that's different from can't.

Like, consider a scenario where you're playing back realtime video over a network connection. You won't necessarily be able to seek forward in that scenario -- you might not have enough video buffered, or hell, the connection could be plain interrupted. Imagine if they just didn't implement forward seek because the solution could not be fully generalized...

And who is going to spend time coding such a thing up, knowing that it is likely to be rejected as "not fully general"?


From what I can see, he's neither an ass nor adamant that it's impossible. He claims it's impossible to universally support the jump within certain technical constraints (which VLC supposedly limits itself to).

The only time I saw him being unpleasant in the thread is when people ignored his explanations and acted very entitled. Correct me if I'm wrong on anything here, but VLC is an independent free software project developed in no small part by volunteers; they have every right to choose their technical direction and compromises, and it seemed the people insulting them were in no position to demand anything from them.


To do this well you need to keep the old frames around...or go back to the previous keyframe and re-render. That might be hard if your design is playback-optimized.


Re-rendering shouldn't be hard, it's just a specialized version of seeking. VLC has seeking.

The claim that you would have to decode all previous frames in the entire video is... completely baffling to see coming from the dev. He's arguing a stupid technicality that a video might not have keyframes. That's not a reason to omit the feature entirely.


The strange thing is that the same argument is true for seeking in general.

Going back from frame 500000 to frame 499999 is in the limiting case as complex as seeking from 1 to 499999, and in most cases far better.

I think the forum thread would be better answered "you do it, I don't need this feature" which is basically the gist of it and is a completely fair answer.


It's not even about doing it, once added you have to maintain it, and then tomorrow a new format arises that makes this more of a hassle, or some memory issue in an existing format is fixed in a way that changes its memory profile and now VLC will crash with an out of memory.

Seriously, I don't get these people that have infinite demands from open source developers and contribute zero.


It's because the developer is misconstruing a non-technical decision they made as a technical limitation.

The commenters are trying to point this out, which misses the reality that the developer probably isn't going to budge from their requirement of universal support.

That dev's rationalization also sends a signal to any commenter with the technical chops to submit a PR, that it will probably be rejected for not supporting 100% of the codecs. I have no doubt people who could do it, over the years looked at that thread and concluded it would be a waste of their time.


If someone really wanted to do it, they would already be contributing continually and would at some point start working on this with the expectation that they are the one maintaining it. And it would be fine because by then developers would already trust that person since they've been reliable.

It's not about "technical chops", it's about being a constantly available, reliable person who shows up to contribute day after day. If you don't do that why should a dev make the scope of their work bigger if they won't be able to keep putting the same quality of work?


> why should a dev make the scope of their work bigger if they won't be able to keep putting the same quality of work?

They shouldn't...but that's not at all what the primary developer said.

The issue is the primary dev seems to require reverse seek functionality to work with all codecs, and since some obscure codecs can't efficiently support it, they're not interested. Their challenge to others to submit a PR is counteracted by all the signs that they might provide a high-to-impossible barrier to approval.

It's not clear that even a trusted contributor would be able to sway this person's mind. Most likely, contributors either agree, or keep quiet on the issue if they disagree.


It's telling that there is no other VLC developer commenting in that thread - neither to support the decision and calm things down nor to go against Remi's decree.


You already have to maintain seeking code, stepping back one frame is just a specialized case for your seeking code.


You already have to maintain code for seeking by time. There's not necessarily any easy way to convert between time and frame count.


VLC knows the timestamp of the current frame. From that information it can seek to a frame that is before the current frame, possibly by just subtracting the inverse of frame rate from the current frame's timestamp and if seeking to that time results in seeking to the same frame, try again a bit older.

I'm relatively sure this can be implemented in terms of timestamp-based seeking. Quite possibly the metadata of the frames is already in the memory, further simplifying the process.
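A rough sketch of that, assuming the only primitives available are "seek to a time" and "what frame comes next". `Stream` is a toy stand-in I made up for a demuxer, not a real API, and `nominal_fps` is only used as a first guess for how far to back up. It includes the retry when the seek lands on the same frame, plus the run-forward check so it still lands on the immediately preceding frame when frame durations vary.

```python
from bisect import bisect_right


class Stream:
    """Toy stand-in for a demuxer that can only seek by time and step forward."""

    def __init__(self, pts: list[float]):
        self._pts = sorted(pts)

    def seek(self, t: float) -> float:
        """PTS of the frame displayed at time t (the first frame if t is too early)."""
        i = max(bisect_right(self._pts, t) - 1, 0)
        return self._pts[i]

    def next_after(self, pts: float):
        """PTS of the frame following the one at `pts`, or None at end of stream."""
        i = bisect_right(self._pts, pts)
        return self._pts[i] if i < len(self._pts) else None


def step_back(stream: Stream, now: float, nominal_fps: float):
    """PTS of the frame just before the one shown at `now`, or None if there is none."""
    current = stream.seek(now)
    delta = 1.0 / nominal_fps
    candidate = stream.seek(now - delta)
    while candidate == current:        # landed on the same frame: try a bit older
        if now - delta < 0:            # ran past the start, so this is the first frame
            return None
        delta *= 2
        candidate = stream.seek(now - delta)
    while True:                        # may have overshot: run forward again
        following = stream.next_after(candidate)
        if following is None or following >= current:
            return candidate
        candidate = following


# Variable frame rate: three quick frames, one held for 4 seconds, then two more.
vfr = Stream([0.0, 0.033, 0.066, 1.0, 5.0, 5.033])
print(step_back(vfr, now=4.5, nominal_fps=30))    # paused in the middle of the long frame
print(step_back(vfr, now=5.033, nominal_fps=30))  # the first seek overshoots, then recovers
```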


> just subtracting the inverse of frame rate from the current frame's timestamp

Variable frame rate (VFR) videos break this approach. It might seem like an esoteric edge case, TV and movies aren't VFR after all, but VFR is extremely common in videos from smartphones.


> TV and movies aren't VFR after all

There are TV shows that have telecined film segments and also interlaced VFX sections. While this wasn't broadcast as VFR the best way (as in highest quality result) to convert to a fully progressive frame sequence for display on modern displays would end up recombining the two fields for telecined segments (keeping the framerate) while doubling the framerate during deinterlacing for the VFX segments.

But VFR is also irrelevant to the problem at hand since it doesn't make it harder to find the next keyframe before the current frame - you need an index for that anyway.


This is why I also suggested a fallback in case it lands up in the same frame, but yes, it also needs checking that the next frame is indeed the original frame.

I expect it to work unless the frame rate is wildly varying, e.g. 60 fps and 60 spf in the same video. I guess one reasonable use case would be video that's triggered by motion, though. It would still work for almost every video.


You misunderstand. The point is you have to go backwards to find a keyframe, then render forwards from that.

Going backwards might be hard because if you structured your code in certain ways you may not be able to go backwards efficiently. You can "seek", but how far back do you want to seek? A second? Two seconds? current-X frames?

key frames may be in a standard cadence, but they may not be. So again, how far back do you seek to go back one frame? And keyframes may be abstracted away from the player itself, since really, the codecs are the ones that deal with that stuff. For example, I believe mjpeg doesn't do frame differencing (I'm probably wrong about that).

The ideal implementation would save the last X frames then re-render once you go back like X/2 frames. But again, it depends.
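Keeping the last X frames is about the simplest part of this. A sketch, assuming decoded frames are plain objects the player can hold onto; `FrameHistory` and `yuv420_frame_bytes` are hypothetical names, and the size estimate assumes 8-bit 4:2:0 frames.

```python
from collections import deque


def yuv420_frame_bytes(width: int, height: int) -> int:
    """Size of one decoded 8-bit 4:2:0 frame: full-res luma plus two quarter-res chroma planes."""
    return width * height * 3 // 2


class FrameHistory:
    """Bounded history of the most recently displayed frames."""

    def __init__(self, capacity: int):
        self._frames = deque(maxlen=capacity)  # oldest frames fall off automatically

    def push(self, frame) -> None:
        self._frames.append(frame)

    def step_back(self):
        """Drop the current frame and return the previous one, or None if history ran out."""
        if len(self._frames) < 2:
            return None  # caller falls back to seeking to a keyframe and re-rendering
        self._frames.pop()
        return self._frames[-1]


history = FrameHistory(capacity=3)
for frame_number in range(10):
    history.push(frame_number)   # frame numbers stand in for decoded pictures
print(history.step_back(), history.step_back(), history.step_back())
print(yuv420_frame_bytes(1920, 1080))
```

At about 3 MB per 1080p frame, a history of a few dozen frames stays around 100 MB, and memory is bounded by the capacity you pick.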


I don't misunderstand. The program already has seeking code, and if it needs to aim back slightly so that it can be more precise that's not a massive change to how it already works.

Efficiency is not as important as having the feature at all. "Go back 5 seconds and then run forward to the right frame" is a sufficient algorithm, as long as it can track and combine multiple presses of the previous-frame key. Improvements can come later. Maybe buffering, maybe tracking keyframes, maybe other things. But this is a big case of letting the perfect be the enemy of the good.

If it fails to find a keyframe, that sucks, but 99% of the time it'll work.


Since I can drag the bar backwards in VLC and have it resume playback apparently this is already implemented? This would be a very narrow use of that.


yeah that's a weird edge case that's not really worth considering. it's obviously something that can be done technically even if some edge cases are not performant.


Again: There is no reason to start caching previous frames until AFTER the user has paused and pressed the "back frame" key. Only THEN does it need to rewind to the previous i-frame and re-render and cache frames. And there is no measurable cost to remembering the timestamp of the last i-frame, so you know where to rewind to.


Just a ridiculous idea, could a reversed version of the video be kept around so that if you hit frame backwards it's frame forwards operation on the reversed video so you can show that frame, if you continue playback the head goes to the closest second of the forward video


This would require completely re-encoding the video or keeping all decoded frames around. This is what he referred to as the "unbounded memory" solutions.


That makes sense, thank you!



