Reputation: 1309
The text file has many lines like the one below. I want to extract the text after /videos/ up to and including .mp4, plus the final number (both shown in bold), and write each filtered line to a separate file.
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/**S4KWZTyt-32313922.mp4**.m3u8?hdnts=exp=1592315851~acl=*/S4KWZTyt-32313922.mp4.m3u8~hmac=83f4674e6bf2576b070c716a3196cb6a30f35737827ee69c8cf7e0c57a196e51 **1**
For example, say the text file contains:
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/JajSfbVN-32313922.mp4.m3u8?hdnts=exp=1592315891~acl=*/JajSfbVN-32313922.mp4.m3u8~hmac=d3ca7bd5b233a531cfe242d17d2ea0c0167b41b90fff6459e433700ffc969d69 19
https://videos-a.jwpsrv.com/content/conversions/7kHOkkQa/videos/Qs3xZqcv-32313922.mp4.m3u8?hdnts=exp=1592315940~acl=*/Qs3xZqcv-32313922.mp4.m3u8~hmac=c30e2082bf748a6b4d1621c1d33a95319baa61798775e9da8856041951cf5233 20
The output should be
JajSfbVN-32313922.mp4 19
Qs3xZqcv-32313922.mp4 20
Upvotes: 0
Views: 270
Reputation: 291
The proposed regex is probably a better solution, but I'll leave a Python solution that writes each filtered line to a separate file. This script works only if every line in the file has the same layout as the sample.
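with open("my_file.txt", "r") as infile:
    for line in infile:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The trailing number is the last space-separated field.
        num = line.split(" ")[-1]
        # Everything after "/videos/" starts with the file name.
        tail = line.split("/videos/")[-1]
        # Keep the name and "mp4"; drop the ".m3u8?..." that follows.
        name, ext = tail.split(".")[0:2]
        # One output file per input line, named after the video ID.
        with open(name, "w") as f:
            f.write(name + "." + ext + " " + num)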
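As a variation on the script above, here is a minimal sketch that moves the string handling into a function, so a line that does not fit the sample layout is skipped instead of raising an error. The name extract is hypothetical, and the sketch assumes the number is always the last space-separated field.

```python
def extract(line):
    """Return (file_name, number) for one input line, or None if it does not fit."""
    # Split off the last space-separated field, assumed to be the number.
    url, _, num = line.strip().rpartition(" ")
    # Keep only what follows the first "/videos/" in the URL.
    _, sep, tail = url.partition("/videos/")
    if not sep or not num.isdigit():
        return None
    # The file name is everything up to the first ".mp4".
    name, found, _ = tail.partition(".mp4")
    if not found:
        return None
    return name + ".mp4", num
```

Calling extract on each line of the input and writing the two returned values, joined by a space, reproduces the output shown in the question.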
Upvotes: 1
Reputation:
You may try the regex below:
.*\/videos\/(.*?mp4).*?(?<= )(\d+)
Explanation of the above regex:
.* - Matches everything before /videos/.
\/videos\/ - Matches /videos/ literally.
(.*?mp4) - First capturing group; lazily matches everything up to and including mp4.
.*? - Lazily matches everything before the digits.
(?<= ) - Lookbehind requiring a space immediately before the digits.
(\d+) - Second capturing group; matches the number at the end, as you require.
You can find the demo of the above regex here.
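To make the parts of the pattern easier to follow, here is a rough Python sketch of the same regex written with re.VERBOSE, so each piece carries its own comment. The name PATTERN is just illustrative; the space in the lookbehind is written as [ ] because verbose mode ignores bare whitespace, and the forward slashes are left unescaped since Python does not need that.

```python
import re

# Same pattern as above, spelled out piece by piece.
PATTERN = re.compile(r"""
    .*            # greedy: skip ahead to the last "/videos/" on the line
    /videos/      # the literal path segment
    (.*?mp4)      # group 1: shortest run ending in "mp4" (the file name)
    .*?           # lazy: skip the rest of the URL
    (?<=[ ])      # lookbehind: the digits must directly follow a space
    (\d+)         # group 2: the first run of digits preceded by a space
""", re.VERBOSE)


def parse(line):
    """Return (file_name, number) if the line matches, else None."""
    m = PATTERN.match(line)
    return m.groups() if m else None
```

The lookbehind is what stops group 2 from grabbing digits inside the URL, such as the exp= timestamp: none of those are preceded by a space.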
Command line implementation in Linux:
cat regea.txt | perl -ne 'print "$1 $2\n" while /.*\/videos\/(.*?mp4).*?(?<= )(\d+)/g;'> out.txt
You can find a sample run of the above command here.
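The command above collects every result in out.txt, while the question asks for a separate file per line. A minimal Python sketch of that variant follows, using the same regex. The function name write_matches, the out_dir parameter, and the ".txt" suffix on the output file names are all assumptions made for illustration, not something the question specifies.

```python
import re
from pathlib import Path

PATTERN = re.compile(r".*/videos/(.*?mp4).*?(?<= )(\d+)")


def write_matches(lines, out_dir):
    """Write "<name>.mp4 <number>" for each matching line into its own file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line in lines:
        m = PATTERN.match(line)
        if m is None:
            continue  # skip lines that do not look like the sample
        name, num = m.groups()
        # Assumed naming scheme: "<name>.mp4.txt", one file per input line.
        target = out_dir / (name + ".txt")
        target.write_text(f"{name} {num}\n")
        written.append(target)
    return written
```

A possible call, assuming the input file is regea.txt as in the command above, would be: open the file and pass the file object as lines, for example write_matches(open("regea.txt"), "out").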
Upvotes: 1