One of the reasons I keep coming back to Firefox is that Firefox seems to already be doing this. At least it searches all parts of the URL and the title, so I'm 99% successful at getting the right URL in the awesomebar when I'm looking for something.
Chrome is just horrible when it comes to this, and I can never get back to previous pages when searching via the addressbar.
On launch this was one of our banner features in Chrome -- full text search over your browsing history. I wrote some of the code. I believe we (Google's Chrome team) implemented (and contributed back) SQLite's full text search support exactly to make this feature.
I don't know why it was removed (after my time) but I do know it had a lot of problems.
1) Most users were unaware of the feature, and it's hard to find a way for them to discover it.
2) The index costs a lot of disk space, which makes #1 worse. There's a whole bunch of tradeoffs around how much storage to use vs how much history to keep. Pages that self-reload can cause the index to bloat endlessly (a real bug we had).
3) Having an index of pages locally is not sufficient to make a useful search engine. There's a lot of ranking involved in making Google search good. Similarly we tried to show "snippets" of page text that explained why each result matched, and that itself requires a lot of effort to be useful.
#2 and #3 are just bugs that can be fixed with more effort, but it's hard to motivate that effort in the presence of #1.
(FWIW, the sibling comment about how it was removed to favor google.com searches isn't plausible to me, that's not how the Chrome team works.)
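To make points 2 and 3 above a bit more concrete, here is a minimal sketch of a local history index in Python. It is not Chrome's code (which, per the comment, was built on SQLite's full text search); `HistoryIndex`, `add_visit`, `search` and `snippet` are hypothetical names invented for illustration. It skips re-indexing when a reloaded page has identical text, and it returns a naive snippet around the first matching term, with no ranking at all.

```python
import hashlib
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class HistoryIndex:
    """Toy in-memory full-text index over visited pages (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of URLs containing it
        self.text_by_url = {}             # url -> most recently indexed text
        self.hash_by_url = {}             # url -> digest of that text

    def add_visit(self, url, text):
        """Index a visit. Returns False when the content is unchanged."""
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if self.hash_by_url.get(url) == digest:
            return False  # e.g. a self-reloading page: nothing new to store
        # Remove the stale postings for this URL before re-indexing it.
        for word in set(tokenize(self.text_by_url.get(url, ""))):
            self.postings[word].discard(url)
        for word in set(tokenize(text)):
            self.postings[word].add(url)
        self.text_by_url[url] = text
        self.hash_by_url[url] = digest
        return True

    def search(self, query):
        """Return URLs containing every query word, sorted by URL (unranked)."""
        words = tokenize(query)
        if not words:
            return []
        matches = set.intersection(*(self.postings.get(w, set()) for w in words))
        return sorted(matches)

    def snippet(self, url, query, width=30):
        """Return the text around the first occurrence of any query word."""
        text = self.text_by_url[url]
        lowered = text.lower()
        for word in tokenize(query):
            pos = lowered.find(word)  # substring match, good enough for a sketch
            if pos != -1:
                return text[max(0, pos - width):pos + len(word) + width]
        return text[:2 * width]
```

Even this toy shows the tradeoffs: it keeps the full page text around just to produce snippets, and `search` has no notion of which result is most relevant.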
I'd be happy if it did it just for bookmarked pages. I have often wanted that in bookmark search and it could be incorporated directly into the bookmark searchbar.
Me too! In fact, when I think about it, bookmarks are such a nebulous thing. It's like I want them to do two diametrically opposed jobs.
On the one hand, when I have a bookmark for a site like HN, I want it to give me a quick way to get to the front page and see the latest posts so I can read and discuss them. In this case, the bookmark is serving like a pointer to an address whose contents are subject to change.
On the other hand, I bookmark pages with specific information I want to keep so I can refer back to it later. This is where bookmarks often fail me. Websites are transient by nature and bookmarks are extremely vulnerable to link rot. What I think I'd really want is a bookmark that saves an archive of the page so that I'll always have access to the information in its original form, even if the page changes or the site goes down. In this case, the bookmark is functioning more like a constant than a variable.
I'm aware that browsers such as Safari have the ability to save a page as a web archive file which includes all the data needed to render the page in its original form. The problem is that this feature completely punts on the issue of managing a set of these web archives, delegating that task to the Finder. I want a UI that is more integrated into the browser. If I bookmark a site it should save and manage the archive automatically. Full text search on all my bookmark archived pages should be built into the address bar. Perhaps in order to avoid the conflict between the two uses of bookmarks described above there could be a difference between favourites and bookmarks.
I don't know, does anyone else have any thoughts about this stuff?
We built Memex with that combination of use cases in mind.
So right now you can already full-text search your bookmarks, and filter by time, domain and tags. Already on the mid-term roadmap we plan to enable full-html/text snapshots of visited pages, both locally and on-demand. For the latter we are potentially working with the Internet Archive.
Agree with #3. I built a similar tool with Rails/Elasticsearch/React a couple of years back. It never returned the most relevant results on top. I realized Elasticsearch can only do that if I add back-links for ranking, and to add back-links (I could be wrong) I would have to crawl the entire internet.
One of the other responses here was from a (former?) chrome dev and they said the feature was implemented, had a lot of issues that could be eventually fixed but cost too much dev time to be worthwhile, and nobody used the feature anyways. While the dev said he didn't know why it got dropped, it was likely the lack of use and excess cost.
Well just try it yourself, it’s really that simple.
On the other hand, Chromium (the open-source version of Chrome) does this too so maybe it’s simply a UX decision rather than an evil plot to send more data to Google, though I personally believe that it’s the latter…
Well if you make DuckDuckGo your standard search engine you only get search term suggestion without actually executing the search if you don't choose one of the terms or hit enter with a non-url in the bar - not sure if there is anything negative about this feature
I already have it set to not show any search suggestions since if I wanted that I would use the search box. I want it to not send the data anywhere if I hit enter (by accident, because I pasted what I thought was going to be a domain name but turned out to be something else) and it isn't a possible domain name or IP address. Not super high impact (accidentally pasting a password would, I assume, generate a DNS lookup anyway) but it feels like a completely gratuitous invasion of privacy when I have already indicated in multiple ways that I do not want to search from the address bar.
After searching my history for years using the awesomebar, I am pretty confident Firefox also keeps track of the queries themselves, and the choice you make in relation to the query.
Say there are two websites I visit regularly with similar names. If I usually load one after typing three characters, but load the other one after typing four characters, Firefox will present the former first in the first case, and the latter first in the second case.
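For what it's worth, the behaviour described above could be sketched roughly like this in Python. This is a guess at the general idea, not Firefox's actual implementation; `AdaptiveRanker`, `record_choice` and `rank` are made-up names. It just counts which URL was picked for each exact typed string and sorts candidates by that count.

```python
from collections import defaultdict


class AdaptiveRanker:
    """Remembers which URL the user picked for each typed string (toy model)."""

    def __init__(self):
        # typed string -> url -> number of times that url was chosen
        self.picks = defaultdict(lambda: defaultdict(int))

    def record_choice(self, typed, url):
        """Call this when the user selects `url` after typing `typed`."""
        self.picks[typed.lower()][url] += 1

    def rank(self, typed, candidates):
        """Order candidates by past picks for this input, then alphabetically."""
        counts = self.picks.get(typed.lower(), {})
        return sorted(candidates, key=lambda url: (-counts.get(url, 0), url))


ranker = AdaptiveRanker()
sites = ["https://news.example.com", "https://newsletter.example.com"]
ranker.record_choice("new", "https://news.example.com")
ranker.record_choice("news", "https://newsletter.example.com")
print(ranker.rank("new", sites))   # news.example.com comes first
print(ranker.rank("news", sites))  # newsletter.example.com comes first
```

A real implementation would presumably also decay old picks and blend this signal with visit frequency, but that part is speculation on my side.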
The "show full history" menu option in History menu they mentioned is Mac only. On Windows, you can see the "show full history" option by holding the Go Back button at top left (hover your mouse on it you'll see a tooltip suggesting this).
The sad part is that this option just opens the built-in History page, just like pressing Ctrl + H, which I believe is not what we wanted at all.
I think Chrome just redefined "full" while it regularly deletes history entries older than three months.
The one thing I do miss in Firefox, though, is the chronological listing. I find my memory of the rough time, and sequence of events leading to an item, is usually stronger than my memory of the specific content.
No matter what firefox does well, it isn't going to keep converts as long as their default background is so blindingly white.
I still don't understand why firefox refuses to slightly gray out their background by default. Chrome puts less strain on my eyes and that's far more important than anything firefox does well.
Every time I try to stick with firefox, my eyes get strained and eventually I go back to chrome.
I’ve tried to find a solution to bookmark searching for a while. I’ve never found a product I liked or trust. Lately I’ve been manually adding bookmarks to a custom google search engine. I’m considering building an extension that will add them directly or sync chrome bookmarks. I figure google already knows what I’ve searched, so I feel much less sketchy about it.
Would this extension be interesting to anyone? It would be very simple, open source, and have no middle man. It would send links directly to a google CSE via their API.
I would have privacy concerns, because I'm not using Google all the time for my searches and there is no need to give them even more data.
The only solution I'd accept is one where all data is stored and indexed locally and there are good guarantees that deleted entries are actually deleted and/or wiped from the indices.
I’m not suggesting that you have to use google for the originating search. But you would need to use CSE to index any content you wanted to search for later. Arguably you could do this with a new google account if you’re concerned.
A few years ago I would have been excited about this, but I personally won't be giving Google any more data unless I'm forced to. I'd love a self-hosted and local-first option, but either way I wish you good luck on your potential project.
Yup. I’ve read about the product but haven’t tried it. I’ll likely check it out.
My issue with these services (and maybe this one is different) is that I have to run 3rd party software and extensions and I never know how long these companies/products will stay around.
Yeah as mentioned the tool allows you to do bookmark full-text search.
It's open-source, so it will stay no matter what.
We build for resilience and for us as the WorldBrain.io company not needing to stick around in order for the service to survive. We see ourselves as the stewards of this tool, not the sole benefactor or proprietor.
I get the point of this, but me personally, I prefer to have aspects of me forgotten/gone rather than remembered, stored and searchable in the future. Yes, it's true that, about once every two months, I am looking for something that I swear I came across on the Internet at some point. However, the rest of the time, I'm able to re-find it just by doing another search, whether on search-engine-of-choice, or a search box on particular-website (e.g. socnet, stackoverflow, reddit, github, hacker news...).
OH! Not so fast. After a test drive, I think the Results page needs to have a button to use the same search on the web if no (useful) results are found.
Nice! I hacked something together to do this 12-odd years ago. I had something to save the pages I visited to a folder, and then a desktop search tool, dtSearch, that I bought to index them. It was invaluable when I really needed it, but too awkward to be really useful. Often it ended up being easier to find the page again with Google if I remembered something unique about it.
This extension looks like it could finally make it convenient enough to be more commonly useful, and privacy focused enough that I'm willing to try it. Great work! Are you planning to charge something for this in the future, or for extra features? I would definitely be willing to pay for it.
Oli here, from the team developing Memex.
Memex is open-source, so the browser extension will always stay free to use.
What we will charge for are some of the services that require us to host stuff, like backups, multi-device syncing, API calls etc. We will run it as a completely modular pricing model, where you can upgrade on only those features you need. We don't like the usual 3-tier model, where you have to upgrade to the 'monster mega plan' in order to just get one feature :)
But you can also completely self-host that, as we will make the server software open-source as well.
Hope Memex can be useful to you.
We are running a crowdfund to support its development, where we offer some good discount on the future features in return: worldbrain.io/pricing
The landing page emphasizes the word "focussed", with double S's, which didn't look right to me. Apparently both "focused" and "focussed" are acceptable, with the single S version "focused" being highly preferred [0].
A nice idea, but an open source browser plugin with all local storage would fit my needs better.
I prototyped something roughly like this several years ago. I wrote a simple Firefox plugin that communicated with a locally running server written in Clojure, with a ClojureScript web app for browsing that used the same server backend. I stopped working on it because services like Evernote do a better job, at the loss of some privacy.
Edit: I didn’t intend to imply that Evernote reads or uses user data.
Good news :)
This tool is open-source and runs fully locally (except, as pointed out, the feature that enables you to share quotes, because for that stuff you unfortunately still need a server).
We custom built a search technology on IndexedDB and Dexie.js, which is capable of indexing around 5 years of your personal web-research locally in the browser.
This seems a very interesting project. I'll see how this works in practice.
It starts to get icky when you notice the devs have a business plan to outsource the indexing to the cloud, but there's a commitment to keep the server parts open source too, so that you can self host.
Yeah indeed, having stuff in the cloud is not ideal when it comes to privacy and centralisation.
One of our core values is privacy and data ownership. So we do our best to make our business not dependent on (analysing and selling) your data, and instead provide you with service value you're willing to pay for.
We are built with interoperability in mind, which will allow you to switch providers of Memex and Memex Cloud without friction, in case there are breaches of trust, or simply better service elsewhere.
We follow these values by currently building for offline first usage, where your data is locally indexed and searchable primarily.
With our search technology you'll be able to get up to 5 years of your research done in the browser.
For the cloud part, it is unfortunately not yet possible to do performant search on encrypted data, otherwise it would not be such a big issue to have your index in the cloud. Equally unfortunate is that replicating all your data on all nodes, as opposed to having a central point to query, comes with a lot of drawbacks, especially when we are looking at phone usage.
There it is really not practical, so there is a need to have some sort of cloud. UX is still very important. Most people can't be bothered with the drawbacks of decentralised and distributed systems (yet). We hope to make that switch in multiple smaller steps that guide (non-technical) users through a smooth transition to a Memex system that is as distributed as possible. (Check out Dat https://datproject.org/, a technology we will likely use to make that first step possible)
And as you already noted, this stuff will be self-hostable. We see ourselves as a service provider first and want to serve people who can't/don't want to run their own server.
A bit like the Wordpress model.
You can read more about our approaches to running this business in our vision post: worldbrain.io/vision
I was wondering about mobile usage too, and agree that having a server somewhere available for queries is the best solution.
Since this deals with such private data, having a possibility for self-hosting is the correct solution.
Wouldn't it be a better alternative to use an autosave plugin for the bookmarks and use a local indexer like DocFetcher or OpenSemanticSearch through the browser? Then you can also search other resources and files.
But a normal web search already has this trait, which is presumably how you ended up on the garbage site in the first place. If you eventually found your result, a more narrow search can help you re-find it a few months down the road.
Yeah indeed, just full-text search can let you end up with a lot of garbage.
This is why it is so important, that you can search for various other "vague memories" to narrow down your search.
What you often remember about an article is stuff like: Did I bookmark it, when did I visit it, did I like/share/cite it on social media
You can already filter by time, tags, domains and bookmarks, and soon also by whether you liked/shared it or even saw it in your newsfeed, or on a friend's wall, on Twitter and Facebook.
We gradually expand it so you can search with as much of your associative memories as possible.
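As a rough illustration of the "vague memories" idea above, here is a small Python sketch that narrows a full-text match with metadata filters. It is not Memex's code (which, as mentioned earlier in the thread, is built on IndexedDB and Dexie.js); the `Page` record and the `search` function with its `domain`, `since`, `tag` and `bookmarked_only` parameters are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import urlparse


@dataclass
class Page:
    """One visited page plus the metadata a user might vaguely remember."""
    url: str
    text: str
    visited: datetime
    bookmarked: bool = False
    tags: frozenset = frozenset()


def search(pages, terms, *, domain=None, since=None, tag=None,
           bookmarked_only=False):
    """Return pages containing every term and passing every given filter."""
    words = terms.lower().split()
    hits = []
    for page in pages:
        body = page.text.lower()
        if not all(word in body for word in words):
            continue  # full-text match failed
        if domain and urlparse(page.url).hostname != domain:
            continue
        if since and page.visited < since:
            continue
        if tag and tag not in page.tags:
            continue
        if bookmarked_only and not page.bookmarked:
            continue
        hits.append(page)
    # Most recently visited first.
    return sorted(hits, key=lambda page: page.visited, reverse=True)
```

Each extra filter cuts down the garbage that plain full-text matching lets through, which is the whole point of searching by associations rather than by words alone.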