Atuin replaces your existing shell history with a SQLite database (github.com/ellie)
551 points by thunderbong on May 6, 2023 | hide | past | favorite | 193 comments



I made a script that uses atuin to get previous commands related to your current command line - the latest commands run in the same session, in the same directory, in other sessions, the latest commands for the same executable, etc. It then feeds them into GPT and streams the replies to fzf so you can choose the best autocompletion (or it can fix problems in the line you've already written). On WezTerm and Kitty it can also grab the terminal screen contents, error messages and so on. Because the reply is streamed, the first autocomplete line is ready quite soon after the keyboard shortcut is pressed.

Have been putting off pushing it to GitHub; think I'm gonna do that today.


Well, it's published. It wasn't ready to publish and I didn't have time to clean it up, but I'll get back to it in the coming days; there are also some extra features on the way, currently disabled. Works quite well for me though.

https://github.com/TIAcode/LLMShellAutoComplete

Forgot to add: it needs tiktoken, openai and fzf. If someone knows how to do that command-line query-and-replace on shells other than fish/nushell, please let me know.


Isn't that like... ultra slow?


It's usually ~two seconds or so for the first line to come out of OpenAI. You can see the example video in the GitHub repo, though I think that one was a little slow.

If you have a big atuin file, though, creating indexes for session and cwd is a good idea so the request can go out quickly.

edit: but it's much faster than searching, or than asking an LLM for command-line parameters instead


Copilot is highly usable and I'm pretty sure it works the same way.


Copilot also runs a small model on the user's machine to quickly provide simple predictions.


Interesting, I didn’t know that.


That sounds awesome! We've been meaning to set up a "community showcase" section. I'd love to feature this! If you ever publish it, post a discussion!


Thanks! I'll do that after I've had more time to finish some features and clean the code up in few days. But I've released a crude version at https://github.com/TIAcode/LLMShellAutoComplete


That sounds awesome, if somewhat overengineered.

Though the idea of mistakenly putting credentials in the cli and it ending up in GPT is bothering me a bit.


Probably paranoia, but I’d be uncomfortable feeding it a bunch of domain and server names, along with any other more interesting params that might sneak in.


Does it incorporate the return code of the commands to get an approximate good/bad rating? I wonder what percentage of the CLI mistakes I make return zero anyway because it's a valid command that I simply misused.


It doesn't. I think it should, but I'm not sure how to add it to the prompt yet - I have a feeling that if I just add the codes before or after the command line, GPT will at least occasionally add hallucinated return codes to the autocompletion. Maybe I'll just add the unsuccessful codes or something, but it needs some testing.


Cool. Neat tool idea.


For me one of the most common issues is when I write a regex but I have some of the escapes the wrong way around etc. So I’m giving a valid regex, but it’s not matching what I tried to match. And then I have to change it around a couple of times before it works.

One reason for this is that different programs have different rules for what needs to be escaped when you write a regex. For example, I think grep is a bit different from vim in this regard.


Hm... Now that I think about it, regex tools on the command line haven't been a go-to for me recently. I used sed, awk, perl -e, et al. constantly from the late 90s until maybe 2015, but since then I'm more likely to pop open an ipython repl or whatever and avoid those weird inconsistencies altogether. Also, developing on the old scp->LAMP stack setup required more shell-script glue than the more automated contemporary setups I've been using.

I'd probably suck at it. No matter how frequently I use regex in :ex commands, I always screw up the escapes, substitutions, etc.


Please link, thanks!


I’ve been using atuin for a while and have been thinking about something like this. Please do post to GitHub!


>"...then feeds it into GPT..."

explain this please


It feeds the information from the atuin database as a prompt to OpenAI, like: "Latest calls for the same executable:\ncmd1\ncmd2" (I should work on my prompts; it doesn't actually look that optimal, oh well). Then at the end it gives the current command line and asks for a few options for how to finish or replace the line, with some extra requests for GPT (like don't write anything else except the command line, etc.).
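Roughly, the context-gathering part could look something like this (a simplified sketch, not the actual script; the db path, column use, and prompt wording are illustrative, though the schema matches atuin's history table):

```python
import sqlite3

# atuin's db typically lives at ~/.local/share/atuin/history.db
DB_PATH = "history.db"

def gather_context(db_path, cwd, session, limit=10):
    """Collect recent commands from the same directory and session
    and format them as prompt context for the LLM."""
    con = sqlite3.connect(db_path)
    try:
        same_dir = con.execute(
            "SELECT command FROM history WHERE cwd = ? "
            "ORDER BY timestamp DESC LIMIT ?", (cwd, limit)).fetchall()
        same_session = con.execute(
            "SELECT command FROM history WHERE session = ? "
            "ORDER BY timestamp DESC LIMIT ?", (session, limit)).fetchall()
    finally:
        con.close()
    lines = ["Latest commands in the same directory:"]
    lines += [c for (c,) in same_dir]
    lines += ["Latest commands in the same session:"]
    lines += [c for (c,) in same_session]
    return "\n".join(lines)
```

The resulting string would then be prepended to the current command line in the chat prompt.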


I have been using a SQLite database since 2017 [1]; it has over 100,000 items at this point. The database is almost 0.5GB, but I also use the full-text search capabilities of SQLite. Two years ago I built a Mac application [2] that syncs items via iCloud; it only works on macOS.

I would highly recommend that anyone who spends a lot of time in a terminal improve their shell history with atuin or similar tools. I can't tell you how many times it has helped me find some important information about how I did one thing or another.


- [1] https://www.outcoldman.com/en/archive/2017/07/19/dbhist/

- [2] https://loshadki.app/shellhistory/


At 0.5GB that's 5k per entry -- what are you storing for each?

For comparison, my (non-work) history since 2012 (plain text) is 181k entries, and takes 25MB. I store the command along with when and where you ran it. (https://www.jefftk.com/p/logging-shell-history-in-zsh)


Ah, a fellow packrat! I have every command I ever typed into a shell since around 2005, and my history weighs in at 1 CD or 650MB (as of a couple of years ago)

I'm probably being wasteful of space because I store each session in a separate file. I used to do a lot of data analysis at the shell back in the day, and found it useful to audit sequences of commands afterwards for mistakes, or to turn them into scripts.


This is so insane that I love it. Do you also save your belly button lint since 2005? Or nail clippings? :)


I'm only a digital packrat. Bits are so much cheaper to hoard, even deciding to throw something away is often more work.


This is more like saving your old notebooks and drafts, only that they don't take any meaningful space. Or like having a revision control system.

Do you rebase your git repos regularly to delete commits older than 6 months?


Do you regularly back up your history?


Oh yes. It gets backed up along with everything else.


As a lot of people mentioned, this is the FTS index, so it is definitely more blown up. Plus I save a lot of additional information with each item: pwd, session id, shell used, exit codes, the whole command obviously. And to support iCloud, additional information for the iCloud entity id. Now that you point it out, 5k per entry is a lot of data, but I am OK with that. This information is really important to me.


Not the op, but I'd guess it's the full text search index.


For 100k entries you can grep them instantaneously; there's no need to maintain an index.


Grep only works if you want an exact string match. If you want to find words out of order or support features like stemming, fts is necessary.


Maybe I have some sort of disease, but while reading "find words out of order or support features like stemming" the regexs for that immediately flashed before my eyes, so I think "necessary" is a little strong there.


FTS is not the same as regex.


I don't think I said it was. I was addressing the specific use cases mentioned. If there's another use case you think is important in searching command line history, feel free to describe it.


> feel free to describe it

Didn't they already? eg stemming


Most stemming use cases are trivially solved with a regex. That's the point he was making. The difference between a beginner and expert with regexes is quite a lot.
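For example (a toy illustration; "migrat" is a made-up stem, and a real stemmer would also handle irregular forms this won't):

```python
import re

# A crude regex "stem": match any word starting with "migrat"
# (migrate, migrated, migration, ...).
pattern = re.compile(r"\bmigrat\w*", re.IGNORECASE)

history = [
    "rake db:migrate",
    "alembic upgrade head",
    "grep -r migration docs/",
]
# Keep only history lines containing some form of the stem.
matches = [cmd for cmd in history if pattern.search(cmd)]
```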


Ahhh, interesting point.

"We could learn advanced regexes... or we could just use FTS5".

Hard call. :)


Maybe! Full-text search is great for text. Command lines have some things in common with text, but they definitely aren't normal text. E.g., punctuation is much more significant. Stemming may not be appropriate. Case matters. Word boundaries are different, and many of the significant lumps aren't really words.


Well, I suppose what's trivial for me might be advanced for you :)


For regexes, definitely. ;)


With a small enough corpus, full text search does not require an index to be instantaneous, and 100k entries is easily small enough for that.

Additionally, everything you describe can be phrased as a regular expression.


Sometimes it's nice to not manually write a regexp to find all of the variants of every word or deal with arbitrary ordering of substrings. And if you're using SQLite and fts5 is installed, why not just create a virtual full text search table with one command and use that? With a small enough corpus, it's a meaningless distinction to bikeshed about the implementation: the easiest solution to build is the best. 500MB of disk space for a pet project that gives you convenience is a terrifically small amount of storage. I have videos that I recorded on my phone that take up more than double that.
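For instance, with Python's sqlite3 it really is one command (assuming your SQLite build ships the FTS5 extension, which most do):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for a real history db
# One command to get full-text search over commands.
con.execute("CREATE VIRTUAL TABLE history_fts USING fts5(command)")
con.executemany("INSERT INTO history_fts VALUES (?)", [
    ("git rebase --interactive HEAD~3",),
    ("docker compose up --build",),
])
# Words match out of order, unlike a plain substring grep.
rows = con.execute(
    "SELECT command FROM history_fts WHERE history_fts MATCH 'build docker'"
).fetchall()
```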


Not defending the idea of a db history but no db schema is going to beat plain text's 1 byte 0x0A per line delimiter.


*cough* compressed row data *cough*


*cough* log rotate and gzip *cough* :D


Touché. :)


Wow so it looks like I'm in the minority in having shell command history disabled? I have a small number of commands (20) that go into the live history of login shells for convenience, but nothing gets saved when I log out.

If there's something I do repeatedly, I make an alias or a function for it.


I think you have an unusual take on this. Why do you disable history?


People's history files are a great place to look for passwords and other secrets, mainly. I suppose that risk could be reduced by having the history file encrypted on disk, but I don't know of any shell that does that (can't honestly say I've really looked though).


I get a ton of value out of having my shell history available (both for search but also to try to reconstruct steps of what I did yesterday when my memory is hazy)

I guess you could set up an entropy scanner and flag history lines that have high entropy, but that might not be enough (low-entropy secrets) and might be bothersome (lots of false positives / things that are technically secrets but that you don't care if they're in your shell history).
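A sketch of that scanner idea (the threshold and minimum token length here are guesses and would need tuning, which is exactly where the false positives come from):

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits of entropy per character of s."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def flag_suspicious(command, threshold=4.0, min_len=16):
    """Flag whitespace-separated tokens that look like secrets:
    long and high-entropy. Low-entropy secrets slip through."""
    return [t for t in command.split()
            if len(t) >= min_len and shannon_entropy(t) > threshold]
```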


The problem I see is that if I ssh into 5 different remote hosts in a day, most of the commands are not executed on my host and thus not part of the local (or shared, distributed) history.

I suppose this could be solved with either:

- Some kind of modified ssh that sends back the commands to my host

- Some kind of smart terminal that can analyze commands to build up the history

Any ideas on how to practically solve this problem?


Every terminal has text input, but keylogger might be the simplest solution.


A keylogger doesn't see the results of tab completion.


You could install atuin on all your remote hosts, but that's not always practical


I guess we need a PostgreSQL backend.


Same concern. I will probably use atuin locally because it seems so cool, but the beauty of the shell is that it is by default so universal, portable and small, so I don't like the idea of dependencies for my use of it on remote machines. Mentally I've gotten used to the idea that my local shell env is a very different beast from a "normal" shell.


There's also McFly which does the same thing.

https://github.com/cantino/mcfly

I've only used McFly and found it to be pretty great. My only complaint is the default search mode is SQL strings, so you have to use `%` for wildcards. I wish it was a more forgiving, less exact search.

Has anyone used both and could compare them?


+1 for this request. McFly user, keen to know how/if atuin is better...


+1 for this request. McFly user, keen to know the difference


No one ever addresses the most important problem: how to separate commands that one would like to retain (preferably indefinitely) vs garbage commands (like cd, ls, cat, etc.) that should better be wiped in a few days.


With bash's HISTIGNORE, I can consciously prefix my command with a space to prevent it being added to history.

ls I usually don't care about, but there are directories I regularly cd to, so it would be nice to have those in history.

I can think of a neat heuristic, which is that I often cd to an absolute or home directory, so if the path starts with / or ~ I'll possibly want to cd there again in the future. Changing to a relative path on the other hand, I tend to do more rarely and while doing more ephemeral work.


1. I don't always know beforehand if the command I am about to execute is garbage I'd like not to save.

2. I just don't want to be conscious about that every time I write a command. I'd rather edit history after I've finished some work. But that's just too tedious to do manually, I'd like to have some pre-configured heuristics applied automatically, like "never save cd/ls to history", but provide a way to overrule that rule in rare situations.

3. Absolute/partial/symlinked paths - are another separate problem :'(


* HISTCONTROL=ignoreboth

* prefix with space the commands you don't want to keep

* edit your existing .bash_history by prefixing all commands with space (then reload with history -r)

* after session exit or on next login, edit the new commands at the end (vi ~/.bash_history && history -r)

* use comments on the commands to make it easy to search (use keywords)

* group command lines by category (e.g. package manager, git, ssh, backup, find, dev)


While that’s certainly useful to cull trivial commands, it can also behoove the user to remove commands done wrong.

Coming back several months later to be gifted with several, very similar commands of which only one is right can be frustrating. The history records the failed tries as well as the successes.

Mind, the errors give a place to start, but if it’s far enough removed from the original event it may well be ambiguous enough to send you to a search engine anyway, especially if you have the memory of a goldfish like I do.


> Coming back several months later to be gifted with several, very similar commands of which only one is right can be frustrating. The history records the failed tries as well as the successes.

The correct command is almost always the last one, so as long as your search results are chronological this shouldn't be an issue?


During search, we do remove duplicates. It's not a bad idea though and I'll see how we can support it


This is also sub-optimal, as it causes another problem: some of the commands are part of a bigger sequence (the most important property here is that items inside sequences are ordered - the order of commands matters!), so by blindly deleting duplicates you break sequences.

In SQL there are sessions and transactions. In shell history we have no such entities, and this sucks. One could configure their bash/zsh to save history into separate files, but you can't later teach the shell to source them properly (retaining session awareness).


Oh, we don't actually delete anything - just deduplicate for search.

Sequential context is something we're building very soon. The idea: you search for "cd /project/dir", press TAB and it opens up a new pane in the tui. This will show the command +/- 10 commands. You can then navigate back in time.

This could indeed be useful for managing that one setup command you always have to run in this project dir but never remember the name of


> Oh, we don't actually delete anything - just deduplicate for search.

Good to hear, but the point stands: you deduplicate only for the view, not in the source, and thus the source remains contaminated with duplicates (at the very least they cost some disk space and increase seek time).

As for the view: since you deduplicate the commands, you can't look up the context (commands executed before and after)! Each time the now-deduplicated command was executed in the past, it had its own context.


As the sqlite schema below indicates, each command has a unique id and a timestamp. Whether you want duplicates removed depends on what you want to know. It might be nice for the UI to expose a time context, which would retain duplicates. (Maybe it does! By coincidence, I just installed this yesterday, and I hardly know anything.)

    CREATE TABLE history (
        id text primary key,
        timestamp integer not null,
        duration integer not null,
        exit integer not null,
        command text not null,
        cwd text not null,
        session text not null,
        hostname text not null,
        deleted_at integer,
        unique(timestamp, cwd, command)
    );
    CREATE INDEX idx_history_timestamp on history(timestamp);
    CREATE INDEX idx_history_command on history(command);
    CREATE INDEX idx_history_command_timestamp on history(command, timestamp);
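For example, a dedupe-for-display query against that schema looks like this (a sketch with the table trimmed to the relevant columns; the underlying rows, and thus their contexts, stay untouched):

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for atuin's history.db
con.execute("""CREATE TABLE history (
    id text primary key, timestamp integer not null,
    command text not null, cwd text not null, session text not null)""")
con.executemany("INSERT INTO history VALUES (?,?,?,?,?)", [
    ("a", 1, "ls", "/tmp", "s1"),
    ("b", 2, "make test", "/tmp", "s1"),
    ("c", 3, "ls", "/tmp", "s1"),
])
# Deduplicate in the view only: keep the newest occurrence of each
# command, most recent first.
rows = con.execute(
    "SELECT command, MAX(timestamp) FROM history "
    "GROUP BY command ORDER BY MAX(timestamp) DESC"
).fetchall()
```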


Why put effort into removing them? They use a trivial amount of disk space.


Disk space is the least of the problems garbage entries in command history cause.

It's not just commands that are only relevant to the current session (like ls/cd/cat); it's also incorrect commands, or commands simply not worth retaining, being saved among the useful ones.

The most precious thing is the user's time. When you fzf part of a command against regularly saved history and want to re-execute something important, you first get a long list that you have to filter down to find the command you were seeking.

So to counter your question with another: why store garbage?


A typical command I run is never going to be something I look up again, so I would prefer to optimize for writing instead of reading. Dumping every command to a file adds no friction to my regular work, while attempting to categorize garbage commands would add a lot of friction.

Also, when I do want to reference my deep history I often find that seeing the full list of what I was doing is helpful at getting myself back into the frame of mind I was in when I ran the commands originally, which can be more valuable than seeing exactly which commands I ran.


> A typical command I run is never going to be something I look up again, so I would prefer to optimize for writing instead of reading. Dumping every command to a file adds no friction to my regular work

Then why write history at all? Just discard it.


Because even with the friction of my history not having been pruned or otherwise tidied it's still a really valuable resource when I do need it!


To improve the signal to noise ratio.

Maybe 3 in every 1k of my command lines is noise. Finding anything useful in there takes effort, so I rarely use anything older than a few days.


This is the feature I've been looking for:

"log exit code, cwd, hostname, session, command duration, etc"


you can mostly do that in bash:

    export PROMPT_COMMAND='if [ "$(id -u)" -ne 0 ]; then echo "$(date "+%Y-%m-%d.%H:%M:%S") $(pwd) $(history 1)" >> ~/stuff/logs/bash-history-$(date "+%Y-%m-%d").log; fi'
makes files like

    stuff/logs/bash-history-2023-05-06.log
that contain

    2023-05-06.11:42:38 /home/dv  3737  2023-05-06 11:42:38 cat .bashrc
then you can make some commands to grep through this


Or use the excellent bash-preexec plugin that atuin itself relies on to achieve this in a cleaner way: https://github.com/rcaloras/bash-preexec


Xonsh has this. It’s super cool after using it for a while.


There's a PR opened (that I still need to review...) that adds even more fields such as shell type and opt-in environment variables to this list.

One day, you could search for that AWS cli command run specifically on AWS_ACCOUNT_ID=foo


I always wondered if one could feed history into some abstract parser (or nowadays, maybe a LLM) to extract the most frequent "idioms":

    ls <dir> | grep <pat> | less
vs

    ls ~ | grep .bak | less
    ls .config | grep *rc | less
you get the idea
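A naive non-LLM sketch of that idea (argument positions are just blindly collapsed, so it's only a rough approximation of what a real parser or LLM could do):

```python
from collections import Counter

def template(cmdline):
    """Collapse arguments so different invocations of the same
    pipeline map onto one "idiom": keep the first word of each
    pipe segment, replace everything else with <arg>."""
    parts = []
    for seg in cmdline.split("|"):
        words = seg.split()
        if not words:
            continue
        parts.append(" ".join([words[0]] + ["<arg>"] * (len(words) - 1)))
    return " | ".join(parts)

history = [
    "ls ~ | grep .bak | less",
    "ls .config | grep *rc | less",
    "cat notes.txt",
]
# Count how often each idiom occurs across the history.
idioms = Counter(template(c) for c in history)
```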


this might be a way to improve the default nix toolset in a way worth globalizing, it seems like it has the same philosophy. Discovering missing tools... it's like discovering new primes :)


What's it like with many shells/panes in multiplexers open? I often find my history from one or another either lost or not available across different ones.


With atuin, it's available immediately in other local sessions (panes, windows, tabs, etc). There's also remote sync, so after some configurable amount of time it's even available on other devices.


+ if you don’t like this, you can have it filter by session by default!


Thank you for this awesome tool!


Spectacular, thanks!


Works great! To me the reliable instant history sync between tmux panes is one of the best features of atuin. I tried many things to get this working with vanilla bash and it always seemed flaky.


You can get that with bash with this config:

    # append to .bash_history and reread after each command
    export PROMPT_COMMAND="history -a;$PROMPT_COMMAND"
    # append to .bash_history instead of overwriting
    shopt -s histappend


Except that this doesn't save commands which haven't finished yet, and it never saves the current pending command when the shell is killed (e.g. the user closes the terminal window or logs out or the SSH connection is broken).

Sometimes the rare, long-running commands are the most valuable.

If I set up history software, it should preserve all history, and as early as possible.


I believe it’s a setting. You can choose to save history and either merge history on session exit or immediately.

At least it is a setting for zsh


You’d think this would be even more robust with a sqlish storage backend than the usual text files used for history.


Appending to files should be atomic, so I don't think you'll run into any corruption issues just by appending.


Syncs instantly between them

You can optionally filter by the current window session, current machine, or commands across all machines you have atuin installed on


It looks like it's not possible to export the history to a format the shell can import[0], so if I wanted to try this out, my commands would be locked there.

It looks interesting, so if I'm overlooking a way to do this with fish, I'd experiment with it.

[0]: https://github.com/ellie/atuin/issues/816


You can use sqlite[0] to export the database, or if you want a ui, use datasette[1]. On my mac, the database is stored at ~/.local/share/atuin/history.db

[0] https://sqlite.org/cli.html#export_to_csv [1] https://docs.datasette.io/en/stable/csv_export.html
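Or programmatically with Python's sqlite3 (a sketch assuming atuin's default schema; adjust the db path for your install):

```python
import csv
import sqlite3

def export_history(db_path, csv_path):
    """Dump atuin's history table to CSV.
    db_path is typically ~/.local/share/atuin/history.db."""
    con = sqlite3.connect(db_path)
    rows = con.execute(
        "SELECT timestamp, cwd, exit, command FROM history "
        "ORDER BY timestamp").fetchall()
    con.close()
    with open(csv_path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["timestamp", "cwd", "exit", "command"])
        w.writerows(rows)
```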


We’d like to be able to also export history, but atuin doesn’t actually stop your old shell history from functioning either

So you could uninstall it, and you won’t have lost a thing


Sounds good, thank you! I'll start playing with it.


If you just want to experiment with it for, say, a couple of weeks, does it really matter that the commands from those few weeks disappear if you decide not to continue using it?

Presumably you already have a long history of commands from past years in the original format. And if your command line usage is similar to mine, then most of the commands you will use in the future will be covered by your existing history.

So if a few weeks of experimenting with atuin ends with you deciding not to use atuin, then probably you will be fine going back to the old history files that do not include those weeks of activity.


It’d be pretty trivial to do this yourself though, shell history files are very simple


I like it, if for no other reason than the Terry Pratchett reference.


the turtle moves!


People may be interested in RESH, I use it and love it.

https://github.com/curusarn/resh


I found atuin introduced friction where there wasn't any before, so I stopped using it and reverted to ripgrepping zsh_history.


Do you remember what that friction point was?


I'm not the parent commenter, but the friction point for me is the slowness when typing the first few characters in an interactive search (I have a large history). I think the searches are synchronous with each keystroke, right? It would feel a lot faster if each keystroke could cancel an in progress search instead of waiting for it to finish.

In fzf, there is no noticeable lag when typing.


There is a noticeable delay experienced from when I press the up arrow and the list appears. This was the largest friction point. Then, once the list appears, it needs to be filtered to reach a desired pattern. Narrowing down the selection and experiencing real-time shifting in results distracts from the goal of history finding. I found these hurdles lead the results that still were not as accurate as simply ripgrepping against zsh_history.


I have been using this, which is similar, for a few years:

https://github.com/larkery/zsh-histdb


I always wondered why Linux devs didn’t embrace SQLite wholesale for every single tool. Except maybe the kernel, because I don’t know anything about the kernel.


This is so, so good. Can't believe I hadn't heard of it. Got the same instant "Eureka!" moment I did with ripgrep, fzf and z.


I think Magical shell history is a better title.


While sqlite is pretty cool technology, it doesn't quite deserve the title of "magic".


The joke was because of the Terry Pratchett reference.

https://discworld.fandom.com/wiki/Great_A%27Tuin


It took me way too long to realise that's where the name was from...


Great idea. I’d like to see more full-text indexing of the home directory. Shell commands, configs, code files all deserve to be indexed.

Compared to Windows & macOS, the lack of CLI and GUI full-text indexing is a real setback for Linux/Unix.

apropos, locatedb and this are solid domain-specific attempts. Hopefully this expands to indexing a lot more content.


Is there a way to highlight matches? Also, does someone know how to change the date format to a full date, instead of "5 days ago"? Finally, an observation: I cannot use my Emacs keybindings to kill the line or a word (backwards) when I am in search.


Excellent! I had been thinking of building something along the same lines when I switched from `zsh` to `fish`, as I had been missing `zsh_stats`. Now I don't have to and can focus on my other side project!


It sounds like it might not work as well for bash. https://github.com/ellie/atuin/issues/909


This one is much simpler and to the point, without needing SQLite. https://github.com/curusarn/resh


Since we're setting up a database, has anyone seen a project to store stdout/stderr in tables too?

Seems more useful in the GPT age, when it could provide an opportunity for conversations about the whole task you worked on. (As well as finding commands based on their output.)


I have a problem where my shell history seems to be… inconsistent or different if I have multiple terminal tabs open.

Is there a way to say, “I don’t care about which terminal a command is written to, just log them all chronologically in the same file”?

Would this SQLite approach work that way?


Why would you want syncing? Different machines have different file layouts and different installed executables. Any common commands are either coincidental or part of some fleet operation better managed through an actual remote management system.


I'm a software engineer, not a sysadmin. My use of the shell is to run my software and git, not setup systems. YMMV


Well, my laptop and my desktop are set up pretty similarly to each other. Syncing is useful for me :shrug:


I stopped using Atuin because you need one additional click to go up the history, and the configuration didn't provide an easy way to change that behavior.

Edit: apparently, they made it possible to disable that behavior and the documentation is much better


I use atuin to replace cmd+r history but not up arrow history, wasn't hard at all to configure


Are there ways to migrate existing shell histories with timestamps into atuin?


yep they have docs on importing https://atuin.sh/docs/commands/import


Yep! We support a whole bunch more too (fish, resh, zsh-histdb, etc), need to update the docs


https://github.com/ellie/atuin/blob/main/atuin-client/src/im...

> // we can skip past things like invalid utf8

No, you can't! Thanks to some bizarre escaping that happens when zsh (and bash, I think) dumps commands to the history file, any command with non-latin1 characters will break here and won't be read - silently, moreover! The other possibility is that you'll import wrong characters.

Grep the ZSH source for "unmetafy", you'll see; I have it extracted here: https://github.com/piotrklibert/zsh-merge-hist/blob/master/u...
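For reference, the unmetafy logic is roughly this (my reading of zsh's source; the Meta marker value 0x83 and the XOR-with-32 encoding are from there, so double-check before relying on it):

```python
# zsh stores "special" bytes metafied in the history file: a Meta
# marker byte followed by the original byte XOR 32. Reversing it:
META = 0x83

def unmetafy(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == META and i + 1 < len(data):
            out.append(data[i + 1] ^ 32)  # restore the original byte
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```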


I decided to implement this using bash. I have a working prototype running on one machine. If anyone wants to help me refine it, let me know and I'll post what I've done on GitHub.


Please do post. I'd love to have a smaller bash alternative to this tool.


- Save a copy of your current .bash_history file. You can import its contents into the sqlite "commands" table later.

- Add both files in this gist to $BASH_HOME.

https://gist.github.com/chmaynard/dbcaed11534dc54bdc90856d18...

- Install .bash-preexec.sh in $HOME (see https://github.com/rcaloras/bash-preexec).

- Append this command to your shell startup file (e.g. $BASH_HOME/.bashrc):

    source $BASH_HOME/bash_history.sh
- Open a new bash window. You should see the message "bash-preexec is loaded."

Enjoy! Please share your feedback in the gist comments section.

(Memo to self: Package this up in a repo with a proper README.)


For people who have tried this, how good is the latency from when I press ctrl-r [first letter] to when I start seeing results? If it's not instant it's just going to frustrate me.


I notice 0 latency (except when using the opt-in skim feature - we haven't properly optimised that yet). Latency is very important for us too. Granted, if you don't use an SSD, then you might encounter some startup lag


Maybe ctrl+r stress uses history command.


I'm interested in this - I've just installed.

Is there a way to import all my existing history from `.zsh_history`? It's a real pain to have to start from scratch.


I’m trying to compare this to mcfly, which does that, and also uses a sqlite db

https://github.com/cantino/mcfly


apparently it’s `atuin import`


Is there a Windows/Powershell port coming anytime soon? Honestly this is the one tool missing from my macOS workflow that I can't cross plat yet.


I think it actually works on Windows/powershell for now. We can't guarantee it because we can't test it, but we have a windows user who is always submitting fixes for windows


I tried it out and I couldn't really figure a way to `atuin init` on Powershell, as it requires one of the supported shells as a parameter.

I've also seen a couple of PRs such as [1] that consider Windows support a dead-end as of 2021, so I assumed it to be a dead-end as well. Maybe this user is on WSL? That'd seemingly work as it can get bash/zsh or anything running there.

[1] https://github.com/ellie/atuin/issues/135


Cool tool. I think improvement in this area for non-shell-specific solutions is always good.

One thing I haven't seen yet (correct me if I'm wrong...) is an easy way to get all this stuff to magically appear on a new machine you've ssh'd into for the first time. I've hacked up my own in the past, but that's got issues with tunneling and multi-hops. Anyone know a solution to this? Maybe a feature request?


Not 100% sure how that could work, but definitely very interested in making it happen! I don’t think there’s an issue for it yet, so feel free to open


Has anybody seen any huge performance regressions using atuin?

I installed it and simple commands take noticeably longer. Is there any way to make it faster?


We've heard from some users on hard drives or networked filesystems who hit performance issues. SQLite relies on mmap and random access of pages, which can suffer on higher-latency drives.
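As a rough illustration of the knobs involved (generic SQLite tuning, not atuin's actual settings), WAL mode and a bigger page cache are the usual first things to try on high-latency storage:

```python
import os
import sqlite3
import tempfile

# Hypothetical history database path, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "history.db")
conn = sqlite3.connect(path)

# WAL mode appends to a log instead of rewriting pages in place,
# which cuts down on random writes.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# A larger page cache means fewer random reads hit the disk at all.
conn.execute("PRAGMA cache_size=-20000")   # negative value = size in KiB (~20 MB)

# With WAL, synchronous=NORMAL skips an fsync per transaction but is
# still durable enough for a shell-history workload.
conn.execute("PRAGMA synchronous=NORMAL")

print(mode)  # -> wal
```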


How does this compare to McFly?

https://github.com/cantino/mcfly


I see two major differences:

- how active the development is[1]

- having a distributed option (McFly does not have one)

1. Seeking co-maintainers: I don't have much time to maintain this project these days. If someone would like to jump in and become a co-maintainer, it would be appreciated!


Are you the dev of mcfly or atuin? (I keep wanting to call it Alduin. Gamer here.)


At this point databases and filesystems should just merge to offer complete functionality as a standard.



Probably came too early.


First of all, great name.

Second, I am in awe of how good your documentation is and how well you communicate about atuin to the world at large.

Does Atuin offer any features to toggle the capture of commands into its DB? Being able to opt-in or opt-out of Atuin history on a per-command basis would be pretty useful, especially because there is also the atuin sync feature.

I usually work with sensitive information inside a tmux session because, in the default bash configuration, most commands run in tmux never make it into bash history (I believe the last pane to exit is the only one that does make it). It seems I would have to manually go in and drop rows from the DB if I set up Atuin.

One of my products, bugout, has a command called "bugout trap". Not trying to push bugout here, but thought Atuin might benefit from some of the lessons we learned:

1. Because bugout trap is opt-in (you have to explicitly prefix your command with "bugout trap --"), it also allows users to specify tags to make classifying commands easy. This is really useful for search - e.g. you can use queries like "!#exit:0 #db #migration #prod" to find all unsuccessful database migrations you attempted in your production environment.

2. bugout trap has a --env flag which gives users the option of pushing their environment variables into their history. This is really useful for programs that use a lot of environment variables. The safest way to use this is to first trap commands into your personal knowledge base with --env, then remove or redact any sensitive information, and only then share (in case you want to share with a team).

3. We thought that sharing would be useful for teams to build documentation on top of. Even we ourselves have very little adoption of that use case internally. We use it to keep a record of programs we run in our production environment (especially database migrations).

4. bugout trap also stores data that a program returns from stdout and stderr - this has been INCREDIBLY useful. I do want to add a mode that makes the capture of output optional, though, as currently bugout trap is unusable with things like server start commands which run continuously.

5. In general, I have found that command line history is very personal and private for developers so collaborative features are going to rightly be seen with skepticism.

Hope that helps anyone building similar tools.

    $ bugout trap --help
    Wraps a command, waits for it to complete, and then adds the result to a Bugout journal.

    Specify the wrapped command using "--" followed by the command:
            bugout trap [flags] -- <command>

    Usage:
    bugout trap [flags]

    Flags:
    -e, --env              Set this flag to dump the values of your current environment variables
    -h, --help             help for trap
    -j, --journal string   ID of journal
        --tags strings     Tags to apply to the new entry (as a comma-separated list of strings)
    -T, --title string     Title of new entry
    -t, --token string     Bugout access token to use for the request


How can I import my previous history into atuin?


You can use “atuin import”

The docs need updating as we support far more data sources now!

https://atuin.sh/docs/commands/import


I want this logo on a sticker.


We do have some stickers available! https://notionforms.io/forms/user-stickers


But at what cost?


What kind of cost? Obviously, backups on the author's server are not something I'd ever do, but the "offline only mode" is there, apparently.

TBH I was thinking about doing this for a while now. My shell history, now 3.8MB in size, is one of my competitive advantages as a developer (and a very nice thing to have as a power user). It has accumulated steadily since ~2005, and with fzf, searching the history is much faster than having to google, as long as I did what I want to do now even once in the distant past.

I even wrote a utility to merge history files across multiple hosts[1], so I don't have to think "where did I last use that command"; I have everything on every host. The problem with this, however, is shell startup time. It started being noticeable a few years ago and is slowly getting irritating. The idea of converting the histfile into a SQLite db crossed my mind more than once because of this.

[1] https://github.com/piotrklibert/zsh-merge-hist


This is interesting, thanks, I'm checking it out. I thought this could be a flash-in-the-pan thing that would then be annoying to maintain, but obviously a global history of everything you've done is a super boon to productivity. How do you handle maintaining this as you transition through jobs, machines, etc.? 3.8MB is obviously trivially small, so you could store it on a potato, but what is your workflow around maintaining these one-off ad hoc "developer boost" type tools?


> How do you handle maintaining this as you transition through jobs, machines, etc?

Currently, the tool reads ~/.mergerc, which is a JSON file with a list of SSH hosts to SCP history to and from. As long as the history file is in the same place (it tends to be on hosts that I set up, and otherwise I check the default locations) and the host has an entry in ~/.ssh/config, the tool will work. It's really just a wrapper for a few SCP invocations plus a parser for the (extended) history file format.

Changing servers is just a change in the config file, but it's also helpful for changing jobs, because I can quickly add a bit of filtering before the merging happens. I had to erase some API keys and such a few times, adding `filter` call here: https://github.com/piotrklibert/zsh-merge-hist/blob/master/s... took care of it.

> what is your workflow around maintaining these one off ad hoc "developer boost" type tools?

Good question. I don't have such a workflow at all. When I commit to writing something like this, I try to make sure it has a scope limited enough that it can be "completed" or "done". In this case, the tool builds on SSH/SCP and a file format that hasn't changed in the last 20 years (at least). So, once I had it working, there wasn't much left to do. The only change I had to make recently was changing `+` to `*` in the parser, because somehow (not sure how, actually) an empty command made it into the file. But that's all I had to do in 5 years.

I'm not as extreme, but suckless.org philosophy appears to work well here. Here's another example: https://github.com/piotrklibert/nimlock - it's a port, done because I wanted to do something in Nim, but it worked for me for years and I suspect it still works now (after going full remote I stopped needing it). There's nothing much that could break (well, Wayland would break it, but I don't use it), and so there's not much you need to do in terms of maintenance.

As for language choices - these are basically random. I made the zsh-merge-hist in Scala simply because I was interested in Scala back then. I have little tools written in Nim, OCaml, Racket, Elisp, Raku - and even AWK (pretty nice language actually) and shell. That's another reason why making the tools any more complex than what's absolutely necessary would be a problem: the churn in the ecosystems tends to be too high for me to keep track of, especially since I'd need to track 10 of them.

EDIT: I forgot, but obviously the most important "trick" is not giving a shit if these things work for anyone else but me :D

> I'm checking it out

If you have Java installed, `./gradlew installDist` should give you a `./build/install/bin/zsh-merge-hist` executable to run. The ~/.mergerc (on the host the tool runs on) should look like this:

    {
      "hosts": ["host1"],
      "tempDir": "/tmp/zsh-merge-hist",
      "sourcePath": "mgmnt/zsh_history"
    }
where `sourcePath` is a path to history file relative to the home directory.


One such cost could be database size. I currently have 45k history entries in my database and it sits at roughly 15MB in size due to database indices along with the additional data we store


But it's a cost shared by all open shells, right? Well, even if it was 15MB per shell session, it should still be worth it if the startup and searching is faster.

For comparison: I use extended ZSH history format, which records a timestamp and duration of the call (and nothing else), and I have ~65k entries there, with history file size, as mentioned, 3.8MB. It could be an order of magnitude larger and I still wouldn't care, as long as it loads faster than it takes ZSH to parse its history file.
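For reference, each line of zsh's EXTENDED_HISTORY format is `: <start-timestamp>:<elapsed-seconds>;<command>`, so a minimal parser (Python here, just for illustration; real histfiles also have backslash-continued multiline commands) is a single regex:

```python
import re

# zsh EXTENDED_HISTORY line: ": <start-timestamp>:<duration>;<command>"
EXT_HIST = re.compile(r"^: (\d+):(\d+);(.*)$")

def parse_line(line: str):
    """Return the timestamp, duration, and command of one history line."""
    m = EXT_HIST.match(line)
    if not m:
        return None
    ts, dur, cmd = m.groups()
    return {"timestamp": int(ts), "duration": int(dur), "command": cmd}

entry = parse_line(": 1683400000:2;git status")
print(entry["command"], entry["duration"])  # -> git status 2
```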


Yes, this is 15MB on disk size for your whole machine. We agree that the trade off is worth it for the utility and speed improvements


It's FOSS, with optional free service to host your history (encrypted with local key), I think there's a paid option above a free tier

You can also host your own backend


There's currently no paid option. Obviously one should assume such a free service can't be sustainable so you'd be correct to think we should have one.

Currently we rely on Github sponsors as well as our own additional funding


With the server, is it primarily a gateway to the hosted SQLite databases?

eg receives incoming shell history to store in the backend, and maybe do some searches/retrievals of shell history to pass back? eg for shell completion, etc

If that's the case, then I'm wondering if it could work in with online data stores (eg https://api.dbhub.io <-- my project) that do remote SQLite storage, remote querying, etc.


We currently use postgres. The server is very dumb, verifies user authentication and allows paging through the encrypted entries.

There's a PoC that allows it to work with SQLite too for single user setups - and we are thinking of switching to a distributed object store for our public server since we don't need any relational behaviour.


Interesting. Yeah, we use PostgreSQL as our main data store too. The SQLite databases are just objects that get stored / opened / updated / queried (etc) as their own individual things. :)

One of our developer members mentioned they're learning Rust. I'll point them at your project and see if they want to have a go at trying to integrate stuff.

At the very least, it might result in a Rust based library for our API getting created. :D


> we are thinking of switching to a distributed object store for our public server

As a data point, we're using Minio (https://github.com/minio/minio) for the object store on our backend. It's been very reliable, though we're not at a point where we're pushing things very hard. :)


there's this little gotcha you might want to be aware of: https://github.com/ellie/atuin/issues/752#issuecomment-14518...


Can you tell me if my understanding of this issue is correct?

Let's say I run a command where I've pasted in a credential from my password manager: ` some-cli login username my-secret-password` (note space at beginning)

Normally this would prevent the command from getting saved in any meaningful way in my bash history, so that if I later run a malicious script, it can't collect secrets from my bash history.

With the bug here, it sounds like atuin would prevent that entry from being stored in the sqlite store, but it would still be in my shell history?

If so, this is really significant, and would stop me from using Atuin. Not letting users know about this behaviour is incredibly negligent, and honestly erodes my trust in Atuin to consider user security in general.


correct


It sounds serious, but there's not much info in that issue of what's going wrong, why it's going wrong, etc. (?)


it's not serious for most people I guess, but if you rely on bash's HISTIGNORE and don't disable bash's built-in history mechanism when you adopt Atuin, then this is as serious as you are paranoid


er, s/HISTIGNORE/HISTCONTROL/ above
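For anyone following along, the bash behaviour being discussed is controlled by standard bash variables in `.bashrc` (nothing atuin-specific here):

```shell
# ~/.bashrc
HISTCONTROL=ignorespace    # commands starting with a space are not saved
# or: HISTCONTROL=ignoreboth   (also drops consecutive duplicate commands)

# To disable bash's built-in history file entirely once another tool owns
# your history:
unset HISTFILE
```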


> English | 简体中文

> You may use either the server I host

Right. And isn't this what home dirs are for?


Yeah you could totally sync your shell history if you're using an NFS share or something, but that's going to affect way more than just your .bash_history

Why is our localisation relevant/quoted?


I think the implication is that you're controlled by the Chinese government because your software was translated to Chinese?


Well, all the client side code is open source and compilable, and all history is fully encrypted before being uploaded.

So even if we were being controlled, you still can be confident that we can't do anything with your data - all we can see is how active you are, that is until someone finds a way to quickly break xsalsa20poly1305.


either I'm getting old, or there is a lot of energy being spent on quasi-useless "improvements" these days


It's always fair to be critical of these things. However the energy we spend on this is our concern.

At the end of the day, Ellie and I work on this because these features actually improve our workflows. The directory search feature is probably my favourite, and the sync feature is the key feature Ellie wanted to begin with.


On the getting-old part, there's definitely a point where someone has enough baggage that most additional tools solve a problem they already worked around or solved a different way. For someone new to the field, this tool is on the same footing as the rest and could fit them better.

As an aside, the part I like the most about our field is the ability for one or two devs to build themselves exactly the tools they need, and potentially share them with the community with low friction.


You must be getting old because you're unable to see the irony in spending the time and energy to make a comment decrying the quasi-usefulness of how others choose to spend their energy.


> being spent on quasi-useless "improvements"

i am not a new kid on the block either but I was looking for such a tool for a very long time. I think distributed shell history across all of my servers is a big win.


Have you considered that you might not be the target audience? Atuin’s ability to sync across computers while being able to separate the context the command was used in has been incredibly useful to my team.


> Atuin replaces your existing shell history with a SQLite database

How can that work? Nobody is going to pay any ransom for just a shell history, and there are ways to get it out of a SQLite database. Wouldn't it be simpler to just encrypt the original .bash_history?


I think the advantage of the sqlite database is that you retain more context for any given command (e.g. what the current working directory was, ...) in a structured way (it is a database after all).

That stored context can then be used to query the database (e.g. filter the history to only show commands that were executed in the cwd).

These queries are the point of using SQLite, not anything security-related, as far as I can tell.
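As a toy illustration of that point (hypothetical schema for the example, not atuin's actual one), filtering history by working directory becomes a single query once the context is stored in columns:

```python
import sqlite3

# Illustrative schema only -- atuin's real schema may differ.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE history (command TEXT, cwd TEXT, exit_code INTEGER, timestamp INTEGER)"
)
db.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [
        ("cargo build", "/src/atuin", 0, 1),
        ("ls", "/home/me", 0, 2),
        ("cargo test", "/src/atuin", 101, 3),
    ],
)

# Show only commands that were run in the current working directory,
# most recent first.
rows = db.execute(
    "SELECT command FROM history WHERE cwd = ? ORDER BY timestamp DESC",
    ("/src/atuin",),
).fetchall()
print([r[0] for r in rows])  # -> ['cargo test', 'cargo build']
```

The same shape of query works for "only failed commands" (`WHERE exit_code != 0`) or any other stored context, which is awkward to do against a flat history file.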


I just don't see how sqlite is up to this; the problem clearly requires PostgreSQL.


No you know what it requires? A 100 person company with 20 microservices and a ceph cluster. /s

I am not sure how many writes per second you have on your shell history, but sqlite is not only up to the task, everything else is overkill.

Additionally, not having to run a database server, and instead having your history in a single file, has advantages for this use case as well.


sqlite is fine for this usage.



