Search Inside DOCX Files with ripgrep and pandoc
DOCX files are ZIP archives of XML, so ripgrep (rg) can’t search their text directly. The fix is to pipe each file through pandoc, which converts DOCX to plain text (or Markdown) on the fly, and let rg search the converted output.

Prerequisites

Install both tools if you don’t have them:

    winget install BurntSushi.ripgrep.MSVC
    winget install JohnMacFarlane.Pandoc

Search all DOCX files recursively

    Get-ChildItem *.docx -Recurse | % {
        echo "---"
        echo $_.Name
        pandoc $_.FullName -t markdown | rg -i 'my search string'
    }

Argument              Meaning
*.docx                Glob pattern; matches only .docx files
-Recurse              Walk subdirectories recursively
%                     Alias for ForEach-Object; runs the block once per file
$_.FullName           Full absolute path of the current file, so pandoc can find files in subdirectories
-t markdown           Tell pandoc to output Markdown; this preserves heading structure, so you can see where in the document a match falls
-i                    Case-insensitive search in ripgrep
'my search string'    The search query (use single quotes in PowerShell to prevent string interpolation)

Show context around matches

Use -C with a line count to print that many lines before and after each match:

...
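
As a rough illustration of how the pieces above could fit together, here is a minimal sketch that combines the recursive search with -C and prints a header only for files that actually contain a match. The $pattern and $hits variable names, the context size of 2, and the header format are assumptions made for this example and are not taken from the original.

```powershell
# Sketch only: the pattern and the context size of 2 are placeholder assumptions.
$pattern = 'my search string'

Get-ChildItem *.docx -Recurse | ForEach-Object {
    # Convert the document to Markdown and let rg search the converted text,
    # printing 2 lines of context before and after each match.
    $hits = pandoc $_.FullName -t markdown | rg -i -C 2 $pattern

    # rg prints nothing when there is no match, so $hits is empty
    # and the file is skipped.
    if ($hits) {
        "--- $($_.FullName)"
        $hits
    }
}
```

Capturing rg's output in a variable before printing keeps non-matching files out of the results, which may be easier to scan when many documents are searched at once.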