rj487

Reputation: 4634

Ruby extract string via regular expression

I have these strings:

'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'

From these two strings, I want to extract these two file names:

'2017_11/view_mission_join_player_count2017_11'
'2017_11/activily_time2017_11'

I tried a regular expression, but it returns the wrong result:

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/([^\/index.html]+)/, 1] # => "a_r"

Upvotes: 0

Views: 81

Answers (5)

Cary Swoveland

Reputation: 110675

Based on your examples, you may be able to use a very simple regex.

def extract(str)
  str[/\d{4}_\d{2}.+\d{4}_\d{2}/]
end

extract 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
  #=> "2017_11/view_mission_join_player_count2017_11"
extract 'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
  #=> "2017_11/activily_time2017_11"

Upvotes: 0

The fourth bird

Reputation: 163342

If the values you want sit at the end of the string, in the form string/string followed by /filename.extension, you can match them with a positive lookahead for the file name.

\w+\/\w+(?=\/\w+\.\w+$)
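
In Ruby the pattern can be applied with String#[]. This is a minimal sketch, assuming the file name is always the last path component; the method name last_two_dirs is made up for illustration.

```ruby
# Hypothetical helper: returns the two path components that sit right
# before the trailing "name.extension", or nil if the string has no
# such shape.
def last_two_dirs(path)
  path[/\w+\/\w+(?=\/\w+\.\w+$)/]
end

puts last_two_dirs('da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html')
puts last_two_dirs('da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html')
```

Note that in Ruby $ matches at the end of a line, not only at the end of the string; if the input may contain newlines, \z is probably the safer anchor.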

Upvotes: 0

Tim Biegeleisen

Reputation: 521239

This answer assumes that you want to capture beginning with the third component of the path, up to and including the last component of the path before the filename. If so, then we can use the following regex pattern:

(?:[^/]*/){2}(.*)/.*

The second set of parentheses, (.*), is the capture group, i.e. the part you want to extract from the path. The leading (?:...) group is non-capturing and only skips the first two components.

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
puts str[/(?:[^\/]*\/){2}(.*)\/.*/, 1]
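
The same idea can be wrapped in a small method. The sketch below is a variant, not the pattern above verbatim: it adds \A and \z anchors and a named group, and the names middle_of and middle are purely illustrative.

```ruby
# Hypothetical helper: skips the first two path components and returns
# everything up to (but not including) the final component, or nil when
# the path is too short.
def middle_of(path)
  # %r{} is used so the slashes need no escaping.
  pattern = %r{\A(?:[^/]*/){2}(?<middle>.*)/[^/]*\z}
  match = pattern.match(path)
  match && match[:middle]
end

puts middle_of('da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html')
puts middle_of('da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html')
```

Because (.*) is greedy, a path with more directories between the second component and the file name would return all of them, not just two.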

Upvotes: 0

Aleksei Matiushkin

Reputation: 121000

A regular expression is overkill here, and it is prone to errors.

input = [
  "da_report/GY4LFDN6/" \
  "2017_11/view_mission_join_player_count2017_11" \
  "/index.html",
  "da_report/GY4LFDN6/" \
  "2017_11/activily_time2017_11" \
  "/index.html"
]  

input.map { |str| str.split('/')[2..3].join('/') }
#⇒ [
#   [0] "2017_11/view_mission_join_player_count2017_11",
#   [1] "2017_11/activily_time2017_11"
# ]

Or, more elegantly (this relies on the wanted components containing "2017_"):

input.map { |str| str.split('/').grep(/2017_/).join('/') }
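
Both one-liners depend on the layout of these particular paths: the first on fixed positions counted from the start, the second on the literal year. A sketch that counts from the end of the path instead, using File.dirname from the core library; the method name parent_pair is hypothetical, and it assumes the last component is always the file name.

```ruby
# Hypothetical helper: drops the file name, then keeps the last two
# directory components.
def parent_pair(path)
  File.dirname(path).split('/').last(2).join('/')
end

paths = [
  'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html',
  'da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html'
]

paths.each { |path| puts parent_pair(path) }
```

This is a different technique from the index-based split above, so it behaves differently when the path has more or fewer leading directories.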

Upvotes: 1

Abdullah

Reputation: 2111

Use a lookbehind and a lookahead, with the dot in index.html escaped so that it only matches a literal dot: /(?<=GY4LFDN6\/)(.*)(?=\/index\.html)/

str = 'da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html'
str[/(?<=GY4LFDN6\/)(.*)(?=\/index\.html)/]
# => "2017_11/view_mission_join_player_count2017_11"

live demo: http://rubular.com/r/Ued6UOXWDf
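
The pattern above hard-codes the directory name. Here is a sketch that takes it as an argument; the method name extract_after and the report_id parameter are assumptions for illustration, not something given in the question.

```ruby
# Hypothetical helper: returns whatever sits between "<report_id>/" and
# a trailing "/index.html", or nil when the path does not have that shape.
def extract_after(path, report_id)
  # Regexp.escape guards against regex metacharacters in report_id.
  path[/(?<=#{Regexp.escape(report_id)}\/).*(?=\/index\.html\z)/]
end

puts extract_after('da_report/GY4LFDN6/2017_11/view_mission_join_player_count2017_11/index.html', 'GY4LFDN6')
puts extract_after('da_report/GY4LFDN6/2017_11/activily_time2017_11/index.html', 'GY4LFDN6')
```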

Upvotes: 0
