Reputation: 377
I have a list of more than 10,000 excel files in location A.
I need a free utility (since I need to use windows task scheduler to run this every x minutes) which will scan these files and search for a particular string/content in the file.
The file should also be created/modified after a specific date.
If the file meets these criteria then it must be copied to Location B.
I found a free program called Ultrafile Search
which managed to identify the files but it does not run with parameters etc so i cannot use it in the scheduler. I also tried WinGrep
which seemingly does what I require but it keeps on freezing all the time probably due to the amount of files. A command-line utility or a batch file script would be great. Thanks
Upvotes: 1
Views: 1251
Reputation: 70923
Just a basic skeleton. Adapt as needed.
This is a hybrid batch/jscript file (save as .cmd
). It uses robocopy
to first select the files in the indicated date range. The generated list is passed to the javascript part to filter for content of the excel file. The javascript acts as a filter and echoes only the file names with the indicated string inside. This list is retrieved/processed by a for
command (in the sample it simply echoes the file name to console)
Of course, for it to work, it is necessary that excel is installed.
@if (@This==@IsBatch) @then
@echo off
rem **** batch zone *********************************************************
setlocal enableextensions disabledelayedexpansion
set "sourceFolder=%cd%"
set "dateStart=20140709"
set "searchString=testing"
set "dateFilter=robocopy "%sourceFolder%" "%temp%" *.xls /l /njh /njs /ndl /nc /ns /maxage:%dateStart% "
set "contentFilter=cscript //nologo //e:Javascript "%~f0" /search:"%searchString%" "
for /f "delims=" %%a in (' %dateFilter% ^| %contentFilter% ') do (
echo Selected file : [%%a]
)
endlocal
exit /b
@end
// **** Javascript zone *****************************************************
// retrieve the string to search for
var searchFor = WScript.Arguments.Named('search');
// instantiate needed components
var fso = new ActiveXObject('Scripting.FileSystemObject');
var excel = new ActiveXObject('Excel.Application');
// iterate over the stdin reading file names to process
while ( ! WScript.StdIn.AtEndOfStream ){
var fileName = WScript.StdIn.ReadLine();
fileName = fileName.replace(/^\s*/g,'');
fileName = fso.GetAbsolutePathName(fileName);
if (fso.FileExists( fileName )){
if (excelDataSearch( excel, fileName, searchFor )) {
WScript.StdOut.WriteLine( fileName );
};
};
};
// if not needed, close the instantiated excel application
if ( excel.WorkBooks.Count === 0 ) excel.Quit();
// leave
WScript.Quit(0);
function excelDataSearch( excel, workbook, searchString ){
var returnValue = false ;
try {
// search for workbook application
//var wb = GetObject( workbook );
var wb = excel.Workbooks.Open( workbook, false, true );
// iterate the worksheets of the workbook searching the string in its data
var eWS = new Enumerator(wb.WorkSheets);
for (; !eWS.atEnd() ; eWS.moveNext()){
var ws = eWS.item();
if ( ws.UsedRange.Cells.Find(searchString) ){
returnValue = true ;
break;
};
};
// close workbook
wb.Close( false );
} catch(e){
// some kind of problem with objects
// WScript.Echo( e.description );
};
return returnValue;
};
Upvotes: 0
Reputation: 55682
PowerShell
solution that
str
, and if so copies the file to a new directory (c:\test)
$newpath = "c:\test"
$str = "Test"
$xl = New-Object -comobject Excel.Application
$xl.visible = $false
$excelSheets = Get-ChildItem c:\temp -recurse -include *.xls* | where-object {$_.lastwritetime -gt “1/1/2010}
foreach($excelSheet in $excelSheets)
{
$workbook = $xl.Workbooks.Open($excelSheet)
$ws = $workbook.workSheets.Item(1)
$strxl = $ws.Cells.Item(1,1).Value2
$workbook.close()
if ($strxl -eq $str) {Copy-Item $excelsheet $newpath}
}#end foreach
$xl.quit()
$xl = $null
[gc]::collect()
[gc]::WaitForPendingFinalizers()
Upvotes: 1