
Use `ripgrep-all` / `ripgrep` to improve search in Dolphin

Wednesday, 2 October 2024  |  Jin Liu

In the next release of Dolphin, the search backend (when Baloo indexing is disabled) will be faster and support more file types, by using the external projects ripgrep-all and ripgrep to do the search. Merge Request

What are ripgrep and ripgrep-all?

ripgrep is a fast text search tool that uses various optimizations, including multi-threading (unlike grep and Dolphin's internal search backend, which are single-threaded).

ripgrep-all, to quote its homepage, is "ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.".

How to enable it

Install the ripgrep-all package from your distribution's package manager (which should also install ripgrep). Dolphin will then automatically use it for content search when Baloo is disabled.

If your distribution doesn't provide ripgrep-all, you can install just ripgrep instead. Dolphin will then use it for content search, but without the additional file type support.
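
To check which of the two tools is available on your system, a minimal sketch like the following may help. It assumes the executables are named rga and rg, as in the upstream releases. Dolphin's own detection logic may differ, so treat this only as a quick sanity check.

```shell
# Report which external search tool is found on PATH.
# Assumption: the executables are called `rga` (ripgrep-all) and `rg` (ripgrep).
detect_search_tool() {
    if command -v rga >/dev/null 2>&1; then
        echo "ripgrep-all"
    elif command -v rg >/dev/null 2>&1; then
        echo "ripgrep"
    else
        echo "none"
    fi
}

detect_search_tool
```

If this prints "none", content search presumably stays on the internal backend.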

Limitations

  • It only works in content search mode, and when Baloo content indexing is disabled. File name search still uses the internal backend.

  • It only works in local directories. When searching in remote directories (e.g. Samba, SSH), the internal search backend is used. Although ripgrep can run on remote directories through the kio-fuse plugin, testing showed it can be 3 times slower than the internal backend, so it isn't used there.

  • It doesn't work on Windows. Although both ripgrep and ripgrep-all have releases for Windows, I personally don't have the Windows experience to integrate them. A merge request to enable this on Windows is welcome.

Customization

You can change the command line with which Dolphin calls the external tools. Copy /usr/share/kio_filenamesearch/kio-filenamesearch-grep to ~/.local/share/kio_filenamesearch/, and modify the script there. The script contains comments on the calling convention between Dolphin and it, and explanations on the command line options.

One option you might want to remove is -j2, which limits ripgrep (and ripgrep-all) to 2 threads. Using more threads can make the search much slower on hard disks (HDD). I tried to detect HDDs automatically, but the detection wasn't reliable, so I went with a conservative default. It's still faster than the internal backend, but if you have an SSD, you can remove the option to unlock the full speed of ripgrep.
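
If you know your own setup, you could make the thread limit conditional in your copy of the script. The sketch below uses one common Linux heuristic, the sysfs "rotational" flag. This is not necessarily the detection that was tried for Dolphin, and the flag can be misreported in some setups (for example behind virtual or USB-attached disks). The function name thread_option and the device name sda are assumptions made up for this example.

```shell
# Print "-j2" if the given sysfs flag file reports a rotational disk (HDD).
# Print nothing otherwise, so ripgrep picks its own thread count.
thread_option() {
    flag_file=$1
    if [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]; then
        echo "-j2"
    fi
}

# Hypothetical usage: "sda" is an assumed device name, adjust it to your disk.
thread_option /sys/block/sda/queue/rotational
```

The output can then be spliced into the ripgrep command line in the script in place of the hard-coded -j2.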

You can also use a different external tool, e.g. The Silver Searcher (ag), or a full-text search engine other than Baloo. Just make sure it outputs paths separated by NUL. Usually a -0 option will do that.
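
To make the expected output format concrete, here is a small sketch of NUL-separated paths. The helper emit_paths and the sample paths are made up for illustration. A real replacement tool would produce the list itself, for example with rg -l -0 or ag -l -0.

```shell
# Emit each argument followed by a NUL byte, i.e. NUL-separated paths.
# NUL is used because it cannot occur inside a file name, unlike spaces or newlines.
emit_paths() {
    printf '%s\0' "$@"
}

# For inspection only: turn the NULs into newlines so the list is readable.
emit_paths "notes/todo.txt" "docs/my file.pdf" | tr '\0' '\n'
```

The tr step is only there so that you can read the result in a terminal. The script itself should hand the NUL-separated stream to Dolphin unchanged.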

More customization

You can even modify the script so that you can specify different external tools in the search string. For example, you can insert the following code before the original code that calls ripgrep-all:

...(line 1-33)
    --run)
        if test "$2" = "@git"; then
            exec sh -c 'git status -s -z|cut -c 4- -z'
        fi
...

Then if you search for "@git" in a git directory, it will show you changed files.
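
To see what the cut step in that snippet does, you can feed it sample data shaped like the output of git status -s -z: each record is a two-character status, a space, and the path, terminated by NUL. The sample records and the helper name strip_status_prefix below are made up for illustration, and the -z option assumes GNU cut.

```shell
# Drop the first three characters ("XY ") from every NUL-terminated record.
strip_status_prefix() {
    cut -z -c 4-
}

# Sample records imitating `git status -s -z`, shown with NULs as newlines.
printf ' M src/main.cpp\0?? notes.txt\0' | strip_status_prefix | tr '\0' '\n'
# prints src/main.cpp and notes.txt, one per line
```

One caveat: for renamed files, git status -z emits the old path as an extra record without a status prefix, so this simple cut would truncate it. For a quick "what changed" view that is probably acceptable.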

FAQ

If a malicious app can write to ~/.local/share/kio_filenamesearch/, it can probably just delete all files in your home directory, without involving Dolphin at all. A script executed by Dolphin doesn't have more power than a script executed by the malicious app itself.

Also, there are already a lot of places in your home directory that a malicious app can create a script in, and it will be executed later without you noticing. E.g., ~/.bashrc, ~/.config/systemd/user, ~/.config/autostart, to name a few.

The threat is real, but I believe the solution is to prevent apps from writing to arbitrary places in your home directory without your consent. If your apps are sandboxed (e.g. via Flatpak) so they can't write to ~/.config or ~/.local by default, and you only use trusted apps like Dolphin and Kate to manage files in these places (so you trust them to not modify files behind your back), then the scenario in the question is unlikely to happen.

Future work

There is quite a lot to improve in Dolphin's search (when not using Baloo). The content search should also search in file names. The search string is currently interpreted as a regular expression, but a fuzzy match or shell globbing seems to be a more sensible default (probably with regexp as an option). Hopefully future work will address these issues.
