Reputation: 97
I want to select a specific string within a line in a big text file with sed or awk. But I always get the whole line, and each line is 100,000+ characters long.
For example, I have:
</div><div class="follow withFollow" id="user-id-1234567890"> <a href="/app/users/id-1234567890/test/ </div><div class="follow withFollow" id="user-id-0123456789"> <a href="/app/users/id-0123456789/test/" 12345678990 1234877890 1234767890 1245456780 123456790 withFollow" id="user-id-9873456789">
The only things I want are the numbers in:
withFollow" id="user-id-1234567890">, withFollow" id="user-id-0123456789">, withFollow" id="user-id-9873456789">
Desired output:
1234567890
0123456789
9873456789
I tried several things, such as:
sed -n '/**user-id-**/,/**">**/p' FILE
awk '/**user-id-**/,/**">**/p' FILE
awk '/**user-id-**/,/**">**/p' FILE | grep -Eo "[0-9]{1,15}" > output.txt
With the last one I also got the other numbers on the same line, not just the ones inside id="user-id-1234567890">.
Upvotes: 1
Views: 217
Reputation: 33307
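You could use GNU grep. The -o option prints only the matched text, and with -P (Perl-compatible regex) \K drops the user-id- prefix from the output: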
$ grep -oP 'user-id-\K[^"]*' file
1234567890
0123456789
9873456789
Or if you only want to match digits:
grep -oP 'user-id-\K\d*' file
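The -P option depends on grep being built with PCRE support, which is not available everywhere. As a rough alternative sketch (not part of the original answer), the same result can probably be reached by letting grep -o print each user-id-NNN match on its own line and then stripping the prefix with sed. The temporary file and the shortened sample line below are made up for illustration, and this assumes a grep that supports -o, as GNU and BSD grep do:

```shell
# Hypothetical sample file that mimics the structure of the line in the question.
sample=$(mktemp)
printf '%s\n' '</div><div class="follow withFollow" id="user-id-1234567890"> <a href="/app/users/id-1234567890/test/ </div><div class="follow withFollow" id="user-id-0123456789"> 12345678990 withFollow" id="user-id-9873456789">' > "$sample"

# grep -o prints every match on its own line, even when all matches
# sit on one very long input line; sed then removes the fixed prefix.
result=$(grep -o 'user-id-[0-9][0-9]*' "$sample" | sed 's/^user-id-//')
printf '%s\n' "$result"

rm -f "$sample"
```

Because the pattern requires the literal text user-id-, the /app/users/id-... paths and the loose numbers on the same line are not matched.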
Upvotes: 3