Reputation: 1195
Hi, I am trying to find a way to locate a constant in a string and then extract a set number of characters to the left of that constant.
For example, I have a .txt file, and somewhere in that file there are numbers of the form 00nnn.
Examples of the numbers would be 00234, 00765, ...
So I use
@echo off
findstr /i "00" *.txt > Listfile.txt
to find all the lines that contain the constant 00.
Now I have:
00013 Jonas Jonas
2015-12-09 12:36:41 Bell (waterproof)
- Technical Account
00014 Jonas Bell
- Technical Account
00019 Jonas Jonas
2016-09-12 09:11:12 T16032611 Technical Account
00055 - Jonas Jonas
2016-04-29 08:05:14 T16041312 Technical Account
00057 Jonas Jonas
2016-04-04 14:36:50 T15123112 Technical Account
00067 Jonas Jonas
2016-06-24 09:33:35 T15123112 Technical Account
00570 Jonas T16041312 Technical Account
00571 Jonas T16041312 Technical Account
00572 Jonas T16041312 Technical Account
00573 Jonas T16041312 Technical Account
00574 Jonas T16041312 Technical Account
00575 Jonas T16041312 Technical Account
00576 Jonas T16041312 Technical Account
00577 Jonas T16041312 Technical Account
00578 Jonas T16041312 Technical
Next I tried:
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
(
FOR /f "delims=" %%a IN (test.txt) DO (
SET "line=%%a"
SET "digits=5!line:~-0,5!"
FOR /L %%z IN (0,1,5) DO SET "digits=!digits:%%z=!"
IF NOT DEFINED digits ECHO(!line:~0,5!
)
)>newfile.txt
GOTO :EOF
However, my problem is that there are spaces in the strings and the numbers do not start at a fixed position. How would I extract them when some start at "digits=5!line:~-0,5!" and others at "digits=13!line:~-8,13!", for example?
Upvotes: 2
Views: 120
Reputation: 67216
@echo off
setlocal EnableDelayedExpansion
for /F "delims=" %%a in (test.txt) do (
set "line=%%a"
for /F %%b in ("!line:*00=!") do echo 00%%b
)
The input data should have one 00nnn number per line, so I reformatted your example data this way:
00013 Jonas Jonas
2015-12-09 12:36:41 Bell (waterproof) - Technical Account 00014 Jonas Bell
- Technical Account 00019 Jonas Jonas
2016-09-12 09:11:12 T16032611 Technical Account 00055 - Jonas Jonas
2016-04-29 08:05:14 T16041312 Technical Account 00057 Jonas Jonas
2016-04-04 14:36:50 T15123112 Technical Account 00067 Jonas Jonas
2016-06-24 09:33:35 T15123112 Technical Account 00570 Jonas T16041312 Technical Account
00571 Jonas T16041312 Technical Account
00572 Jonas T16041312 Technical Account
00573 Jonas T16041312 Technical Account
00574 Jonas T16041312 Technical Account
00575 Jonas T16041312 Technical Account
00576 Jonas T16041312 Technical Account
00577 Jonas T16041312 Technical Account
00578 Jonas T16041312 Technical
Output example:
00013
00014
00019
00055
00057
00067
00570
00571
00572
00573
00574
00575
00576
00577
00578
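The batch loop above relies on two steps: `!line:*00=!` drops everything up to and including the first `00` in the line, and the inner `for /F` keeps only the first space- or tab-delimited token of what remains. As a rough illustration of that logic (not of the batch parser itself), here is a minimal sketch in plain JavaScript; the function name `idFromLine` is made up for this example, and the handling of edge cases such as an empty remainder is only an approximation of what `for /F` does.

```javascript
// Sketch of the two-step logic, assuming one 00nnn number per line.
// idFromLine is a hypothetical helper name, not part of the answer's code.
function idFromLine(line) {
  // Step 1: like !line:*00=! -> remove everything through the first "00".
  // If "00" is absent, the batch substitution leaves the line unchanged.
  const pos = line.indexOf("00");
  const rest = pos === -1 ? line : line.slice(pos + 2);

  // Step 2: like FOR /F with default delimiters -> first space/tab token.
  const token = rest.replace(/^[ \t]+/, "").split(/[ \t]+/)[0];

  // The batch file echoes "00" followed by that token.
  return "00" + token;
}

console.log(idFromLine("00013 Jonas Jonas"));
console.log(
  idFromLine("2016-04-29 08:05:14 T16041312 Technical Account 00057 Jonas Jonas")
);
```

The sketch also shows why the input needs one number per line: a line without any `00` passes through step 1 unchanged, so its first word would be printed with `00` in front of it.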
EDIT: New method added using JScript
My first answer is a simple way to solve this problem using just a small Batch file. However, now that other answers have suggested regular expressions, you should know that you don't need non-standard utilities (like grep) or PowerShell in order to use a simple regex in a Batch file. You can use a couple of lines of JScript, which comes preinstalled on all Windows versions from XP on:
@if (@CodeSection == @Batch) @then
@echo off
cscript //nologo //E:JScript "%~F0" < test.txt
goto :EOF
@end
var match, search = /00\d{3}/g, file = WScript.StdIn.ReadAll();
while ( match = search.exec(file) ) WScript.Stdout.WriteLine(match[0]);
Copy this code into a Batch file (.bat extension); it runs much faster than the PowerShell solution. You can also get the complete solution to your problem by using the following line in place of the cscript line above; it reviews all *.txt files and extracts the numbers in one operation:
findstr /i "00" *.txt | cscript //nologo //E:JScript "%~F0"
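To experiment with the regex loop without Windows Script Host, the same idea can be sketched in plain JavaScript on an in-memory string. This is only an illustration: `sample` is made-up input modeled on the question's data, and `extractIds` is a hypothetical helper name; the `WScript.StdIn` and `WScript.Stdout` calls of the JScript version are replaced by a string and `console.log`.

```javascript
// Sketch: global-regex loop over a string instead of WScript.StdIn.
// `sample` and `extractIds` are illustrative names, not from the answer.
const sample = [
  "00013 Jonas Jonas",
  "2015-12-09 12:36:41 Bell (waterproof) - Technical Account 00014 Jonas Bell",
  "2016-09-12 09:11:12 T16032611 Technical Account 00055 - Jonas Jonas",
].join("\n");

function extractIds(text) {
  const search = /00\d{3}/g; // "00" followed by three digits, all matches
  const found = [];
  let match;
  // With the g flag, exec() resumes after the previous match on each call
  // and returns null once no further match exists.
  while ((match = search.exec(text)) !== null) {
    found.push(match[0]);
  }
  return found;
}

console.log(extractIds(sample).join("\n"));
// prints 00013, 00014 and 00055, one per line
```

Because the regex scans the whole text rather than line by line, this approach does not depend on having one number per line.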
Upvotes: 3
Reputation: 9266
You can use a regex (from Mark Setchell's answer) by invoking PowerShell and using the Select-String cmdlet to do the same thing as grep.
powershell -c "(sls '00\d{3}' YourFile).matches | select -exp value"
Select-String (sls) uses the regex 00\d{3} to search for all lines containing the characters 00 followed by three digits and matches the whole number. The .matches and select then extract only the part of the line that matches.
00013
00014
00019
00055
00057
00067
00570
00571
00572
00573
00574
00575
00576
00577
00578
PowerShell comes preinstalled on current versions of Windows, so there is no need to install any third-party programs.
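One thing to keep in mind with the pattern 00\d{3} is that it is not anchored, so it can also match inside a longer run of digits. The question's sample data does not trigger this, but a reference such as the made-up `T16100312` below would. A minimal JavaScript sketch of the difference, assuming the numbers of interest always stand alone as separate words; the sample line and the stricter `\b00\d{3}\b` variant are illustrations, not part of any answer here.

```javascript
// Hypothetical input line: the reference T16100312 contains "00312".
const line =
  "2016-10-05 10:00:05 T16100312 Technical Account 00057 Jonas Jonas";

// Unanchored pattern: also picks up the digits inside T16100312.
const loose = line.match(/00\d{3}/g);

// Word-boundary variant: only matches a standalone five-digit 00nnn.
const strict = line.match(/\b00\d{3}\b/g);

console.log(loose);  // [ '00312', '00057' ]
console.log(strict); // [ '00057' ]
```

The same `\b` anchors are understood by the .NET regex engine that Select-String uses, so the stricter pattern could be tried there as well if false positives turn up.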
Upvotes: 2
Reputation: 207465
Install GNU grep for Windows and run:
grep -Eo "00[0-9]{3}" YourFile
to look for "00" followed by exactly 3 digits ([0-9]{3}) and print only (-o) the part of the line that matches. Note that with -E, GNU grep does not treat \d as a digit class, so [0-9] is used instead.
Output
00013
00014
00019
00055
00057
00067
00570
00571
00572
00573
00574
00575
00576
00577
00578
Upvotes: 1
Reputation: 56180
Extracting all numbers that start with 00 (assuming there are only spaces or tabs before them):
for /f %%a in ('type *.txt^|find "00"') do echo %%a
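The one-liner above keeps every line that contains "00" and echoes its first space- or tab-delimited token, which is why the number has to be the first thing on the line. A rough JavaScript sketch of that behavior, assuming the default `for /f` delimiters; `firstTokens` is a hypothetical name, and details of the batch parser such as its default `eol` handling are left out.

```javascript
// Sketch: filter lines containing "00", then keep the first token of each.
// firstTokens is an illustrative helper, not part of the batch answer.
function firstTokens(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.includes("00")) // like: find "00"
    .map((line) => line.replace(/^[ \t]+/, "").split(/[ \t]+/)[0]); // like: for /f
}

const lines = [
  "  00013 Jonas Jonas",
  "- Technical Account",
  "00570 Jonas T16041312 Technical Account",
];

console.log(firstTokens(lines.join("\n"))); // [ '00013', '00570' ]
```

A line where the number appears later, after a date for example, would yield the date instead, so this approach suits input where each 00nnn value leads its line.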
Upvotes: 1