Reputation: 425
I have an awkward text file (hosts.txt) from which I need to extract part of certain lines:
18 Jul 2019 09:30 BST
62.172.169.12
United Kingdom
H82640A745.XGPH82640
3.12.21.0
Remove
18 Jul 2019 09:29 BST
62.172.169.9
United Kingdom
H82640A744.XGPH82640
3.12.21.0
Remove
18 Jul 2019 09:26 BST
62.172.169.18
United Kingdom
H82640A740.XGPH82640
3.12.21.0
Remove
I just need the H********* number in front of .XGPH82640, so from the example I need a list like:
H82640A745
H82640A744
H82640A740
and so on...
I am trying to extract this using tokens and delims in a batch file but I'm not getting anywhere. Skipping a fixed number of lines with skip=X doesn't work because the first H******* number has three lines above it, but each one after that has five.
I have read the SS64 page on tokens and delims, as I would really like to be able to figure this out myself, but I'm not getting it, especially with this text file.
At the minute I am trying to use ":" as the delimiter, but again the token numbers vary from line to line, so this would only work if it was just the first five lines:
For /F "Tokens=4 delims=:" %%A In (hosts.txt) Do echo %%A
Any help would be great - thanks!
Upvotes: 0
Views: 1028
Reputation: 38654
This answer is based upon my comment and your subsequent suggestion that the lines may contain an unknown alphanumeric string separated by a single period, instead of a known one:
From a batch-file:
@Echo Off
If Not Exist "hosts.txt" GoTo :EOF
For /F "Delims=" %%A In (
'""%__AppDir__%findstr.exe" /X "^[A-Z0-9]*\.[A-Z0-9]*$" "hosts.txt""'
) Do Echo %%~nA
Pause
Directly in cmd:
For /F "Delims=" %A In ('""%__AppDir__%findstr.exe" /X "^[A-Z0-9]*\.[A-Z0-9]*$" "hosts.txt" 2>NUL"') Do @Echo %~nA
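The ~n modifier is what strips the part after the period: it treats the matched line as a file name and returns the name without its last "extension". A minimal sketch, assuming one sample line from the question is passed to For /F as a literal string (the ~x line is added only for comparison):

```batch
@Echo Off
Rem Pass one sample line to For /F as a literal string in double quotes.
Rem ~n gives the part before the last period, ~x gives the period and the rest.
For /F "Delims=" %%A In ("H82640A745.XGPH82640") Do (
    Echo Name: %%~nA
    Echo Ext:  %%~xA
)
```

Run as a batch file, this should print H82640A745 for the name and .XGPH82640 for the extension.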
Upvotes: 1
Reputation: 49127
You could use the following command line in your batch file:
for /F "tokens=1,2 delims=." %%I in (hosts.txt) do if "%%J" == "XGPH82640" echo %%I
FOR reads the file hosts.txt line by line, ignoring empty lines.
The string delimiters are changed with delims=. from the defaults, normal space and horizontal tab, to the character . (period).
Of interest for this task are lines which have two dot-delimited substrings where the second substring is XGPH82640. For that reason tokens=1,2 is used to get the first dot-delimited string assigned to loop variable I and the second dot-delimited string assigned to the next loop variable, which is J according to the ASCII table.
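To see how the two tokens come out for each kind of line, here is a small sketch. It assumes three sample lines copied from the question and passes them to FOR /F as strings instead of reading hosts.txt; the outer loop variable L is only a helper for this demonstration.

```batch
@echo off
rem Three sample lines copied from hosts.txt in the question.
rem The outer loop variable L is only a helper for this demonstration.
for %%L in ("18 Jul 2019 09:30 BST" "62.172.169.12" "H82640A745.XGPH82640") do (
    for /F "tokens=1,2 delims=." %%I in ("%%~L") do echo I=[%%I] J=[%%J]
)
```

The date line contains no dot, so J should be empty. The IPv4 line should give 62 and 172, and only the last line should give J=[XGPH82640].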
If the first substring, after removing all leading . characters, started with a semicolon, command FOR would also ignore the line because eol=; is the default end of line character. But it can be assumed that no line with XGPH82640 starts with ; and therefore the default end of line character can be kept as is.
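Should that assumption not hold for some other file, one possible workaround is to set eol to a character which never occurs at the beginning of a line. A sketch, assuming the vertical bar | never appears at the start of a line in hosts.txt:

```batch
rem Assumption: no line in hosts.txt begins with a vertical bar.
for /F "eol=| tokens=1,2 delims=." %%I in (hosts.txt) do if "%%J" == "XGPH82640" echo %%I
```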
The case-sensitive IF condition verifies that the second dot-delimited string is really XGPH82640 and not an empty string, as on the lines with date/time or country, or a decimal number, as on the lines with an IPv4 address or a version number.
On a true IF condition the first dot delimited string is output to console.
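If the list is wanted in a file instead of on the console, the whole loop can be redirected. This is only a sketch: the output file name ids.txt is a hypothetical choice, and the file existence check is an extra safeguard rather than something the task requires.

```batch
@echo off
rem Exit quietly if the input file is missing.
if not exist "hosts.txt" goto :EOF
rem ids.txt is a hypothetical output file name; it is overwritten on each run.
(for /F "tokens=1,2 delims=." %%I in (hosts.txt) do if "%%J" == "XGPH82640" echo %%I) >"ids.txt"
```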
Upvotes: 2