regex - PowerShell: search for a matching string in a Word document
I have a simple requirement. I need to search for a string in a Word document and, as a result, get the matching line or some of the words around it in the document.
So far, I can search for a string in a folder containing Word documents, but the search only returns true/false depending on whether it finds the search string or not.
# Error reporting
Set-StrictMode -Version Latest

$path = "c:\morlab"
$files = Get-ChildItem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.PSIsContainer) }
$output = "c:\wordfiletry.txt"
$application = New-Object -ComObject word.application
$application.Visible = $false
$findtext = "crhpcd01"

function getstringmatch
{
    # Loop through all *.doc files in the $path directory
    foreach ($file in $files)
    {
        $document = $application.Documents.Open($file.FullName,$false,$true)
        $range = $document.Content
        $wordfound = $range.Find.Execute($findtext)

        if($wordfound)
        {
            "$file.fullname has $wordfound" | Out-File $output -Append
        }
    }
    $document.Close()
    $application.Quit()
}

getstringmatch
# Error reporting
Set-StrictMode -Version Latest

$path = "c:\temp"
$files = Get-ChildItem $path -Include *.docx,*.doc -Recurse | Where-Object { !($_.PSIsContainer) }
$output = "c:\temp\wordfiletry.csv"
$application = New-Object -ComObject word.application
$application.Visible = $false
$findtext = "first"
$charactersaround = 30
# An array, so that result objects can be appended with +=
$results = @()

function getstringmatch
{
    # Loop through all *.doc files in the $path directory
    foreach ($file in $files)
    {
        # Open read-only, without conversion prompts
        $document = $application.Documents.Open($file.FullName,$false,$true)
        $range = $document.Content

        # Escape the search text so regex metacharacters in it are taken literally
        if($range.Text -match ".{$($charactersaround)}$([regex]::Escape($findtext)).{$($charactersaround)}"){
            $properties = @{
                file = $file.FullName
                match = $findtext
                textaround = $matches[0]
            }
            $results += New-Object -TypeName PSCustomObject -Property $properties
        }

        # Close each document before opening the next one
        $document.Close()
    }

    if($results){
        $results | Export-Csv $output -NoTypeInformation
    }

    $application.Quit()
}

getstringmatch

Import-Csv $output
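There are a couple of ways to get what you want. A simple approach: since you already have the text of the document, perform a regex match on it and return the match along with the text surrounding it. This addresses the part about getting some of the words around the match in the document.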
The variable $charactersaround sets the number of characters to match on each side of $findtext. I also thought the output was a better fit for a CSV file, so I used $results to collect objects built from a hashtable of properties, which are written to a CSV file at the end.
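To see how $charactersaround shapes the pattern without opening Word, here is a minimal sketch that runs the same idea against an in-memory string. The sample text is made up, and the {0,N} quantifier is a variation on the script's exact-count {N}, assumed here so that a match closer than N characters to the start or end of the text is still found.

```powershell
# Hypothetical sample text standing in for $document.Content.Text
$text = 'Please review the first draft of the quarterly report before Friday.'
$findtext = 'first'
$charactersaround = 10

# Exact-count pattern, as in the script: needs N characters on both sides
$exact = ".{$charactersaround}$([regex]::Escape($findtext)).{$charactersaround}"

# Variation (assumption): allow 0 to N characters on each side
$pattern = ".{0,$charactersaround}$([regex]::Escape($findtext)).{0,$charactersaround}"

# IgnoreCase mirrors the case-insensitive behavior of -match
$found = [regex]::Matches($text, $pattern, 'IgnoreCase') | ForEach-Object { $_.Value }
$found

# A match at the very start of the text fails the exact-count pattern
$edgeexact = 'first line' -match $exact
$edgeloose = 'first line' -match $pattern
```

Here $found holds the search term with up to ten characters on either side, $edgeexact is False and $edgeloose is True. Note that [regex]::Matches returns non-overlapping matches, so two occurrences closer together than $charactersaround would be reported as one.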
Be sure to change the variables for your own testing. Using regex to locate the matches opens up a world of possibilities.
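One of those possibilities is returning the whole matching line rather than a fixed number of characters. The sketch below assumes that the document text separates paragraphs with a carriage return, which is how Word represents paragraph marks in Range.Text, and uses a made-up string in place of a real document.

```powershell
# Hypothetical sample: three paragraphs separated by carriage returns
$text = "Agenda for Monday.`rPlease review the first draft before Friday.`rThanks."
$findtext = 'first'

# Split the text into paragraphs, then keep those containing the search text
$paragraphs = $text -split "`r"
$hits = @($paragraphs | Where-Object { $_ -match [regex]::Escape($findtext) })
$hits
```

In the script, $text would come from $range.Text, and each entry in $hits could be stored in the textaround property instead of $matches[0].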
Sample output

match textaround                                                    file
----- ----------                                                    ----
first dley air services limited dba first air meets or exceeds term c:\temp\20120315132117214.docx