It is not clear to me exactly what you are asking, but if you want to grep PDF files, have a look at pdfgrep. (On a Debian-like system, you can install it with apt-get install pdfgrep.) - John1024, Jan 17, 2018 at 8:00

I know pdfgrep; it is for other purposes.

Other packages related to pdfgrep (depends, recommends, suggests, enhances). dep: libc6 (>= 2.14) [amd64], GNU C Library: shared libraries; also a virtual package provided by libc6-udeb.

Grep on PDF files
Packages for pdfgrep are available in Debian and Fedora Linux. For testing, I grabbed version 1.2 of the Open Document Format specification. Running pdfgrep on it found the matches for "ruby" in the specification. Adding the -H option prints the filename for each match (just as regular grep does).

grep is one of the most used Linux utilities; it displays the lines that contain the pattern being searched for. That pattern is normally written as a regular expression.

Installing pdfgrep
For Ubuntu/Debian:
    sudo apt-get update -y
    sudo apt-get install -y pdfgrep
For CentOS

Pull Ubuntu 16.04 and run a bash shell: sudo docker run -it ubuntu:16.04 bash. Update and install pdftk from the container prompt: apt update, then apt install pdftk. In a new terminal, run: sudo docker ps -a. Commit the image, using the CONTAINER ID of ubuntu:16.04, to a new image with pdftk installed:

pdfgrep-2.1.2-2.el8.x86_64.rpm
Description: pdfgrep - Tool to search text in PDF files. pdfgrep is a tool that works similarly to grep, but searches text in PDF files. It tries to be compatible with GNU grep, so many of the familiar GNU grep options are supported. pdfgrep can search many PDFs at once, even recursively in directories.

A key distinction between grep and pdfgrep is that pdfgrep operates on pages, whereas grep operates on lines. It also prints a single line multiple times if more than one match is found on that line. Let's look at how exactly to use the tool.
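The page-versus-line distinction described above can be illustrated with a small Python sketch. It is not pdfgrep's implementation: the pages are plain strings standing in for text already extracted from a PDF, and the names grep_lines, grep_pages and PAGES are hypothetical, chosen only for this illustration.

```python
import re
from typing import Iterator, List, Tuple


def grep_lines(text: str, pattern: str) -> Iterator[Tuple[int, str]]:
    """Line-oriented search: yield (line_number, line) once per matching line."""
    regex = re.compile(pattern)
    for line_no, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            yield line_no, line


def grep_pages(pages: List[str], pattern: str) -> Iterator[Tuple[int, str]]:
    """Page-oriented search: yield (page_number, line) once per match,
    so a line containing two matches is reported twice."""
    regex = re.compile(pattern)
    for page_no, page in enumerate(pages, start=1):
        for line in page.splitlines():
            for _ in regex.finditer(line):
                yield page_no, line


# Hypothetical stand-in for the extracted text of a three-page PDF.
PAGES = ["intro text\nruby and more ruby", "nothing here", "ruby again"]

if __name__ == "__main__":
    for line_no, line in grep_lines("\n".join(PAGES), "ruby"):
        print(f"line {line_no}: {line}")
    for page_no, line in grep_pages(PAGES, "ruby"):
        print(f"page {page_no}: {line}")
```

The line-oriented version numbers its hits by line and reports each matching line once; the page-oriented version numbers them by page and repeats a line for every match on it, which mirrors the behaviour the text attributes to pdfgrep.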
Installation. For Ubuntu and other Linux distros based on Ubuntu, it is pretty simple:

INSTALLATION (ripgrep-all)
Linux x64, macOS and Windows binaries are available in GitHub Releases.
Arch Linux: pacman -S ripgrep-all
Nix: nix-env -iA nixpkgs.ripgrep-all
Debian-based: download the rga binary and get the dependencies like this: apt install ripgrep pandoc poppler-utils ffmpeg. If ripgrep is not included in your package sources, get it

Here is the command to install the pdfgrep utility on Linux.
    # Ubuntu/Debian
    $ sudo apt update
    $ sudo apt install pdfgrep
    # Red Hat/Fedora
    $ sudo dnf -y install pdfgrep
Once you have installed pdfgrep, you can search for text with the following command.
    $ pdfgrep pattern pdf_file_path
For example, here is the command to search for

Ubuntu 20.04, amd64, kernel version Linux 5.6.-1018-oem. pdfgrep has an option --unac. But if I install pdfgrep with sudo apt-get install pdfgrep, using --unac reports "pdfgrep: UNAC support disabled at compile time!"

Meet pdfgrep: grep-like regular expression search for PDF files. pdfgrep tries to be compatible with GNU grep where it makes sense. Several of your favorite grep options are supported (such as -r, -i, -n, or -c). You can use it to search for text in the contents of PDF files. Although it does not come preinstalled like grep, it is available in the repositories of most Linux distributions.

Using pdfgrep
The pdfgrep command can be used to search for patterns in PDF files in a single step. However, it may not be available on our Linux distribution by default, and we'll need to install the pdfgrep package to be able to use it. Once we've got everything set up, using it is very easy:
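As a rough sketch of how the options mentioned above (-i, -n, -c, -r, -H) fit around the basic form pdfgrep pattern pdf_file_path, the following Python helper assembles a command line and runs it only if pdfgrep is found on PATH. The function names build_pdfgrep_command and run_pdfgrep, and the file name spec.pdf, are assumptions made for this example and do not come from pdfgrep itself.

```python
import shutil
import subprocess
from typing import List, Optional


def build_pdfgrep_command(pattern: str, path: str, *, ignore_case: bool = False,
                          page_numbers: bool = False, with_filename: bool = False,
                          recursive: bool = False, count: bool = False) -> List[str]:
    """Assemble an argument list of the form: pdfgrep [options] pattern path."""
    cmd = ["pdfgrep"]
    if ignore_case:
        cmd.append("-i")   # case-insensitive matching
    if page_numbers:
        cmd.append("-n")   # prefix each match with its page number
    if with_filename:
        cmd.append("-H")   # prefix each match with the file name
    if recursive:
        cmd.append("-r")   # search PDFs in a directory recursively
    if count:
        cmd.append("-c")   # print match counts instead of matching lines
    cmd += [pattern, path]
    return cmd


def run_pdfgrep(cmd: List[str]) -> Optional[str]:
    """Run the command and return its output, or None if pdfgrep is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    # pdfgrep, like grep, exits non-zero when nothing matches, so no check=True.
    return subprocess.run(cmd, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # "spec.pdf" is a placeholder file name.
    command = build_pdfgrep_command("ruby", "spec.pdf", ignore_case=True, page_numbers=True)
    print(" ".join(command))
    print(run_pdfgrep(command))
```

Building the argument list separately from running it keeps the sketch usable on a machine without pdfgrep: the assembled command can still be printed and inspected.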
© 2025 Created by PML.