I'd like to first clarify that this is still at a very early stage of development; none of this is shipped or finalized yet. The feature hasn't even requested approval to ship, which is the point at which these kinds of issues would be brought up. We take web-compat very seriously. If this breaks even a small percentage of pages, it won't ship. Part of the shipping process is ensuring we have at least a draft spec (W3C, WHATWG) and at least some support from other vendors.
Sorry, the explainer text came off more dismissive than I intended. I wanted to get something implemented so we could start experimenting and see how this would work in the wild. #targetText= is a first attempt at syntax; any criticisms or data on how this might break would be appreciated.
From my (limited) understanding of how fragments are used today, the way this would break a page is if the page itself were using "targetText=..." and parsing the result. This is something we can and will measure, to see whether we have a naming collision. For pages that use a "#var=name" type fragment, we could append "&targetText=...".
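As a rough sketch of the two cases above (a naming collision, and appending to an existing "#var=name" fragment): this assumes the page treats its fragment as &-separated key=value pairs, which is only one convention among many. The helper names are hypothetical and this is not Chrome's implementation.

```python
from urllib.parse import parse_qsl, quote, urlsplit, urlunsplit


def fragment_params(url):
    """Parse the fragment as &-separated key=value pairs (an assumed convention)."""
    return parse_qsl(urlsplit(url).fragment, keep_blank_values=True)


def has_collision(url, key="targetText"):
    """Return True if the page's own fragment already uses the key."""
    return any(k == key for k, _ in fragment_params(url))


def append_target_text(url, text, key="targetText"):
    """Append key=<percent-encoded text> to the fragment, keeping what is already there."""
    parts = urlsplit(url)
    addition = f"{key}={quote(text)}"
    fragment = f"{parts.fragment}&{addition}" if parts.fragment else addition
    return urlunsplit(parts._replace(fragment=fragment))
```

For example, append_target_text("https://example.com/page#var=name", "hello world") gives "https://example.com/page#var=name&targetText=hello%20world". Whether the page tolerates the extra pair depends entirely on how strict its own parser is.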
> That said, it seems easy to make this backwards compatible: if the existing #blah syntax is a valid link, that should take precedence.
That's indeed how it works. The majority of the compatibility concerns appear to be with apps that are using _custom_ parsing of the fragment to perform app-specific logic, which is a valid concern.
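A minimal sketch of that precedence rule, assuming the set of element IDs in the page is known up front. The function name and the returned tuples are made up for illustration, and the real lookup in a browser is more involved than a set membership test.

```python
from urllib.parse import unquote

PREFIX = "targetText="  # syntax taken from the explainer draft


def resolve_fragment(fragment, element_ids):
    """Decide what a fragment points at, giving existing anchors priority."""
    # An element whose ID matches the whole fragment always wins.
    if fragment in element_ids:
        return ("element", fragment)
    # Only then is the fragment interpreted as a text target.
    if fragment.startswith(PREFIX):
        return ("text", unquote(fragment[len(PREFIX):]))
    return ("none", None)
```

This only covers anchors. A page that reads the fragment in its own script never goes through a check like this, which is the remaining concern.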
> Why is this not being pushed to become part of the web spec first? That seems more reasonable than pushing a feature with the possibility of breaking pages that are spec compliant.
It will. The specification process needs to be informed by implementation and experimentation. When implementing a feature we'll learn all sorts of things and hit bumps that will help guide the design. Once we have a working implementation, it helps to be able to use it to answer things like:
- How does this perform on existing pages
- How does this feel from a user perspective
- How can I write pages that use this
Specification-up-front is very theoretical in the absence of an implementation. IMHO, where it's (rarely) happened in practice, the specification is often unimplemented. The value of a specification is that it allows other browser vendors to add an interoperable implementation.
Again, I'd like to stress, this is in the very early-stage experimental phase. We aren't dropping a feature that'll break the existing web.
Edit: To clarify, "implementation" does not necessarily mean shipped to users.
> Edit: To clarify, "implementation" does not necessarily mean shipped to users.
Literally just today we got an article about Web Bluetooth on the front page of HN, a user-shipped capability in Chrome that is still in the Draft phase of standardization.[0]
Beyond that, we have the Web USB API, which is also in the draft phase, but is of course shipping to users in Chrome, on by default.[1]
Beyond that, we have HTML imports, which have been rejected from the standard but are still shipping to users in Chrome.[2]
I don't doubt your intentions, but if you think that Chrome is going to wait for standardization on this before it ships, you are not paying enough attention to the teams you're working with. And once Chrome ships a feature to users, the web standards body has basically two options: accept Google's vision of the standard wholesale, or change the standard and break websites that are already using Google's implementation.
I would be more confident and trusting of the process you describe if there was some kind of official commitment from the Chrome team that this feature will stay behind a browser flag until the standardization process is completely finished. But I think some of the reason you're getting immediate pushback to an extremely early draft of the spec is because developers don't trust Google not to ship this on-by-default once it reaches a WD stage.
I feel like the actual, underlying issue worth bringing to the standards body is: how do we address arbitrary content on a web page, even when it moves around or changes over time? Aside from these neat Chrome links, this would enable some super interesting features, such as the ability to add persistent and shareable annotations to webpages.
This is my only gripe. Please work with a spec first, so that the feature flows naturally through to other vendors. Especially because it's changing the way URLs work.
You're going to get people generating millions of these links around the web. That's a long-term legacy of hyperlinks generated by the citizens of the web. Although the fallback is obviously pretty harmless, what about a future feature? If this doesn't work out, but some other feature down the line does, suddenly thousands of these links start working in weird ways. Or worse, the feature doesn't happen at all, because of the legacy of broken links. I know that's pessimistic, but URLs are the foundation of the web; changing how they work should be funneled through the spec.
Historically, web specs often start more descriptive than prescriptive. Building a spec without a reference implementation is a great way to build an unimplementable spec.
Thank you very much for replying on this thread. It's absolutely a very useful feature and, when done in a standardized and privacy-conscious way, I think it would absolutely be an enrichment for the web platform. (Can we extend the same for images, too, btw?)
I think the reason this sparked concern is because (by using fragments) this intrudes into a field that was previously under full author control.
I think clear guarantees about which aspects of a webpage are the responsibility of authors and which are under browser control are important - only going by real-world usage and assuming everything not directly used is free for the taking is not enough here.
E.g., I think there are failure modes for SPAs that are not easily found with a usage search. [1] Additionally, this would make it harder for new applications to know which kinds of fragment identifiers are "safe" to use and which are not.
There seem to be some existing specs that deal with the same problem [2].
Maybe those could be a starting point for the feature to move forward without interop/responsibility problems?
I’m not sure what internal resources exist or what access is available to your team, but I would think that Google’s search indices would be the best resources on the planet for analyzing existing URL fragment design patterns. It seems like you could classify various implementations and run tests against each of the major groupings so that you can be confident that xxx.xxx% of sites that currently use fragments will be supported by this design.
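To make the classification idea concrete, here is a small sketch that buckets URLs by the shape of their fragment. The four categories and the rules that separate them are my own assumptions, not an established taxonomy; a real analysis would need far finer groupings.

```python
from collections import Counter
from urllib.parse import urlsplit


def classify_fragment(url):
    """Bucket a URL by the rough shape of its fragment (assumed categories)."""
    fragment = urlsplit(url).fragment
    if not fragment:
        return "empty"
    if fragment.startswith(("!", "/")):
        return "hash-route"  # e.g. #!/inbox or #/users/1
    if "=" in fragment:
        return "key-value"  # e.g. #var=name
    return "plain-id"  # e.g. #section-2


def tally(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify_fragment(u) for u in urls)
```

Running tally over a crawl sample would give the relative size of each grouping, which is where the per-group testing could start.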
It’s an exciting possibility to link more directly to resources. I hope it is implemented in such a way that all browsers can follow those links with parity in the future. If that is the case I have a few hundred thousand outlinks that could be refined for clarity.
There was a time when several SPA frameworks used explicit hash routing rather than URL-like pushState, but I don’t have sources handy. I thought Angular was one of them, but they may have done away with this behavior.
I work in analytics so I’ve seen things like UTM params get couched into the hash, breaking parsing, so at least it’s something to keep an eye out for.
At Microsoft, I worked on a popular Web app that for various reasons had to resort to using the # to enable deep linking. If you put anything in the hash it will break things. We assumed an empty hash unless it was something we set.
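A toy version of that failure mode, purely for illustration: the route table and function below are hypothetical and have nothing to do with the actual app. The point is only that a router which trusts the hash will reject anything it didn't write itself.

```python
# Hypothetical deep-link table: the app only ever sets these fragments itself.
KNOWN_ROUTES = {"": "home", "inbox": "inbox", "settings": "settings"}


def route(fragment):
    """Map a fragment to a view, failing on anything the app did not set."""
    try:
        return KNOWN_ROUTES[fragment]
    except KeyError:
        raise ValueError(f"unrecognized deep link: {fragment!r}")
```

Here route("inbox") works, while route("targetText=foo") raises, because the app never expected a fragment it didn't generate.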
Just to clarify your phrasing: are you saying that it would be OK if this breaks fewer than 20 million webpages? Because that is still a hell of a lot of pages.
I'm not tied to any particular syntax here so if I'm missing why this is a monumentally bad idea, please file a bug on the GitHub repo: https://github.com/bokand/ScrollToTextFragment/issues