pdfsearch (0.1.dev8+g7bcad92.d20240904)
PDFSearch
PDFSearch is a small utility that acts mainly as a front end to various search engines for (mostly) PDF files.
Features
The default search engines implemented are
- pdfgrep
- recoll
- ag / grep (ag: The Silver Searcher with a fallback to grep if ag is not installed)
- find
A preview of the selected search result is shown if the file is identified as a PDF, a text-based file, or an image.
Search results are shown in a list with a little bit of context. Using the arrow keys, the next or previous search result can be selected, and the corresponding file is previewed at the page where the match was found (or at the matching position for text files).
It also implements file tagging and adding comments to files (comments apply to entire files and are not PDF annotations).
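
The ag / grep fallback listed above can be illustrated with a short Python sketch. This is not PDFSearch's own code: the function name `pick_text_searcher` and its `which` parameter are hypothetical, and the only assumption is that "installed" means the executable can be found on the PATH.

```python
import shutil


def pick_text_searcher(which=shutil.which):
    """Return 'ag' if it is on the PATH, otherwise 'grep', otherwise None.

    `which` is injectable so the lookup can be tested without touching
    the real system; by default it is shutil.which.
    """
    for candidate in ("ag", "grep"):
        if which(candidate):
            return candidate
    return None


if __name__ == "__main__":
    print("text search tool:", pick_text_searcher())
```

Trying the candidates in order of preference keeps the fallback logic in one place, so adding another tool would only mean extending the tuple.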
Installation
git clone <this repository> pdfsearch
cd pdfsearch
pip install .
Usage
After installing PDFSearch, the pdfsearch command will be available on the command line.
(If not, restart your shell and/or add ~/.local/bin to your PATH variable.)
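
If the command is still not found, a quick check like the following sketch can tell whether ~/.local/bin is on the PATH. The helper name `local_bin_on_path` is made up for this example; it only compares path entries and does not modify anything.

```python
import os
import shutil
from pathlib import Path


def local_bin_on_path(path_var, home):
    """Return True if <home>/.local/bin is one of the entries in path_var."""
    local_bin = Path(home) / ".local" / "bin"
    entries = [entry for entry in path_var.split(os.pathsep) if entry]
    return any(Path(entry) == local_bin for entry in entries)


if __name__ == "__main__":
    on_path = local_bin_on_path(os.environ.get("PATH", ""), Path.home())
    print("~/.local/bin on PATH:", on_path)
    print("pdfsearch found at:", shutil.which("pdfsearch"))
```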
When starting PDFSearch, an empty preview and an empty result list are shown, with a search input at the top left.
Enter the desired search term and press Enter or click the search button. The default search location is ~/Documents, but this can be changed both permanently and temporarily.
To change it temporarily, there are two options: either use the File->Change Directory menu, or use View->Toggle Directory Chooser (or drag the left edge toward the center of the window) to reveal a file tree and select the directory to search. (The directory that is searched is the highlighted one, not the 'deepest' one.)
To permanently change the initial search directory, see Configuration.
It is possible to search with one or multiple search engines. To select the desired engines, use the toolbar at the top of the window. Select or deselect the ones you want.
Configuration
There are some configuration options to customise PDFSearch.
The configuration file is located at ~/.config/pdfsearch.cfg.
The options should be fairly self-explanatory from their names, but the most important ones are listed here:
[Search]
timeout: The timeout for search engines.
After this timeout, the engine's underlying program (e.g. pdfgrep) is killed
and the results found up to that point are shown.
engines: A comma-separated list of the engines that are selected at startup.
path: The initial search path when opening PDFSearch.
[Results]
limit_num: Set this to an integer to limit the number of results to show in the list. (default: 100)
(This does not affect the search engines, which will search the entire directory regardless)
[Optics]
dirtree_shown: Show the directory tree at startup
tags_shown: Show the tag page at startup
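
The timeout behaviour described under [Search] (kill the external program, keep what it has printed so far) can be sketched in Python as follows. This is an illustration under stated assumptions, not PDFSearch's actual implementation: the function name `run_engine` is hypothetical, and the sketch assumes each result is one line on the program's standard output.

```python
import subprocess
import sys


def run_engine(cmd, timeout):
    """Run cmd and return (output_lines, timed_out).

    If the program exceeds `timeout` seconds, subprocess.run kills it and
    raises TimeoutExpired; the output captured up to that point is kept.
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        raw, timed_out = proc.stdout, False
    except subprocess.TimeoutExpired as exc:
        raw, timed_out = exc.stdout, True
    if raw is None:  # nothing was printed before the timeout
        raw = b""
    # A multi-byte character may be cut off at the end, so decode leniently.
    return raw.decode(errors="replace").splitlines(), timed_out


if __name__ == "__main__":
    # Stand-in for a slow search program: prints one line, then hangs.
    slow = [sys.executable, "-c",
            "print('hit', flush=True); import time; time.sleep(30)"]
    print(run_engine(slow, timeout=3))
```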
To get the default configuration file, use
pdfsearch --write-default-config
and check ~/.config/pdfsearch.cfg.
Note: This will fail if your config file already exists and is not empty.
You can create a fresh config file with default values in another location with
pdfsearch -c <path-to-configfile> --write-default-config
which will create the config file at <path-to-configfile>.
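
To inspect such a file programmatically, a sketch along the following lines may help. It assumes the file uses the INI-style layout shown above, so that Python's standard `configparser` can read it. All values in `SAMPLE` are illustrative guesses (including the timeout value and the yes/no spelling of the booleans), and `load_settings` is a hypothetical helper, not part of PDFSearch.

```python
import configparser

# Illustrative content only; the real defaults are whatever
# `pdfsearch --write-default-config` writes.
SAMPLE = """\
[Search]
timeout: 30
engines: pdfgrep, recoll
path: ~/Documents

[Results]
limit_num: 100

[Optics]
dirtree_shown: no
tags_shown: no
"""


def load_settings(text):
    """Parse config text and return the documented options as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    engines = parser.get("Search", "engines").split(",")
    return {
        "timeout": parser.getfloat("Search", "timeout"),
        "engines": [name.strip() for name in engines if name.strip()],
        "path": parser.get("Search", "path"),
        "limit_num": parser.getint("Results", "limit_num", fallback=100),
        "dirtree_shown": parser.getboolean("Optics", "dirtree_shown", fallback=False),
        "tags_shown": parser.getboolean("Optics", "tags_shown", fallback=False),
    }


if __name__ == "__main__":
    print(load_settings(SAMPLE))
```

To read a real file instead of the sample string, `parser.read(path)` can be used in place of `parser.read_string(text)`.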
License
pdfsearch is distributed under the terms of the MIT license.