somebadhat

Reputation: 779

Parse text using PowerShell and a regex


Find "display_name":"elocin_anagram" and output elocin_anagram. Repeat for every "display_name":"..." entry in the text.

How would you do it?

Note: the code below writes to 1.txt in your temp folder, so make sure no file with that name already exists there.

# Write the sample text to a file (sc is an alias for Set-Content)
sc $env:temp\1.txt '_total":19,"subscriptions":[{"created_at":"2018-06-15T19:34:38Z","_id":"b7c42f6ce857162220e99533d3d6dc1ae11fac8d","sub_plan":"3000","sub_plan_name":"Channel Sub (❤ω❤)♡ ♡ ♡(elocin_anagram)","is_gift":false,"user":{"display_name":"elocin_anagram","type":"user","bio":"Games I like to play are VRChat, Beat Saber, Space Engineers, and Apex. I am also a certified Electronics Technician. You might see me make electronics and 3d model.","created_at":"2015-06-17T05:37:38Z","updated_at":"2020-05-11T05:51:58Z","name":"elocin_anagram","_id":"93742615","logo":"https://static-cdn.jtvnw.net/jtv_user_pictures/d37d128b-59b1-4015-9776-74866feb1d44-profile_image-300x300.png"},"sender":null},{"created_at":"2019-07-10T00:04:45Z","_id":"6a26c5a56b39d142a6e25ad30589a1b923fbc1bb","sub_plan":"1000","sub_plan_name":"Channel Sub(≧◡≦) ♡ (elocin_anagram) ","is_gift":false,"user":{"display_name":"LuckeaterVR","type":"user","bio":"","created_at":"2018-12-08T04:55:48Z","updated_at":"2020-04-24T01:44:56Z","name":"luckeatervr","_id":"00000","logo":"https://static-cdn.jtvnw.net/jtv_user_pictures/322ba52a-655c-42a4-8cc9-7b875debd5dd-profile_image-300x300.png"},"sender":null},{"created_at":"2020-01-16T01:23:17Z","_id":"17704f74767b5592c5fc221eca11a20579a8162c","sub_plan":"3000","sub_plan_name":"Channel Sub (❤ω❤)♡ ♡'
(gc $env:temp\1.txt | select-string '"display_name":"(.*?)"' -AllMatches).Matches.Value | % {$_ -match '"display_name":"(.*?)"' > $null; $matches[1]}
#

Results:

elocin_anagram
LuckeaterVR

Upvotes: 1

Views: 933

Answers (1)

Wiktor Stribiżew

Reputation: 627380

You may greatly simplify the code if you use a regex with a positive lookbehind:

gc $file |select-string '(?<="display_name":")[^"]+' -AllMatches | Select-Object -Expand Matches | %{ $_.Value }

Here, (?<="display_name":")[^"]+ matches a position that is immediately preceded by "display_name":" and then matches and consumes (i.e. adds to the match value) one or more characters other than ". So you do not need to access Group 1; the result is in Group 0, the whole match value.
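To make the Group 0 versus Group 1 difference concrete, here is a minimal sketch that runs both patterns against an in-memory string instead of a file. The $text value is a hypothetical, shortened stand-in for the question's data, and using [regex]::Matches rather than Select-String is only a choice made to keep the example self-contained.

```powershell
# Hypothetical sample, shortened from the data in the question
$text = '"user":{"display_name":"elocin_anagram","type":"user"},"user":{"display_name":"LuckeaterVR","type":"user"}'

# Lookbehind: the prefix is asserted but not consumed,
# so each match value (Group 0) is the name itself
$names = [regex]::Matches($text, '(?<="display_name":")[^"]+') |
    ForEach-Object { $_.Value }

# Capture-group equivalent: the prefix is part of the match,
# so the name has to be read from Group 1
$namesFromGroup = [regex]::Matches($text, '"display_name":"([^"]+)"') |
    ForEach-Object { $_.Groups[1].Value }

$names
```

Both $names and $namesFromGroup should end up holding the same two names; the lookbehind version just avoids the extra step of reaching into a capture group.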

See the regex demo.

Upvotes: 2
