Reputation: 898
I have this markdown as string:
# section 1\n\n
any type of valid markdown text. /notations here\n
Sample text for testing:
abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789 _+-.,!@$%^&*();\/|<>"'
12345 -98.7 3.141 .6180 9,000 +42
555.123.4567 +1-(800)-555-2468
[email protected] [email protected]
www.demo.com http://foo.co.uk/
http://regexr.com/foo.html?q=bar
https://mediatemple.net
- list 1
- list 2
[www.asdf.com](some description)
## sec 1.1\n blah\n
# header 2\n\n
## 2.1\n\n
### 2.2\n
# some_section\n
## 3.1\n\n
I would like to split the string by section, e.g. the output should be a list of 3 strings. The first entry should be '# section 1\n\n ## 1.1\n blah\n'.
The regex I'm using is /[^#]# [\s\S]+?(?=#)/ . How do I match a section without the trailing ' #'? Also, my regex matches the whole string instead of the output I need.
Sample at http://regexr.com/3ev83 . Thanks.
Upvotes: 0
Views: 512
Reputation: 54293
You can use slice_before instead of a big regex:
markup = "# section 1\n\n
## 1.1\n
blah\n
# section 2\n\n
## 2.1\n\n
### 2.2\n
# section 3\n
## 3.1\n\n "
p markup.each_line.slice_before(/# section \d+/).map(&:join)
#=> ["# section 1\n\n\n## 1.1\n\nblah\n\n", "# section 2\n\n\n## 2.1\n\n\n### 2.2\n\n", "# section 3\n\n## 3.1\n\n "]
If you want to generalise the method to any top-level header, whatever its title, you can just use:
p markup.each_line.slice_before(/^# /).map(&:join)
If you want to iterate over every line in each section, you can remove join:
markup.each_line.slice_before(/^# /).each do |section|
  section.each do |line|
    # do something with line
  end
end
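As an illustration of the per-line iteration above, here is a minimal sketch that maps each top-level header to the non-blank lines under it. The sample string and the names sections and outline are made up for this example, not taken from the question.

```ruby
# Hypothetical sample; the heading names are illustrative only.
markup = "# section 1\n\n## 1.1\nblah\n# header 2\n\n## 2.1\n\n### 2.2\n# some_section\n## 3.1\n"

# Start a new group of lines at every top-level heading ("# " at line start).
sections = markup.each_line.slice_before(/^# /).to_a

# Map each top-level heading to the non-blank lines that follow it.
outline = sections.map do |lines|
  title = lines.first.chomp
  body  = lines.drop(1).map(&:chomp).reject(&:empty?)
  [title, body]
end.to_h

p sections.size             #=> 3
p outline["# section 1"]    #=> ["## 1.1", "blah"]
```

Note that if the text starts with lines before the first header, the first group begins with those lines instead of a header.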
Upvotes: 1
Reputation: 27803
Try this,
string.split(/(?=^# )/)
And if you want to split at any heading level, from # through ### (the #+ matches one or more # characters):
string.split(/(?=^#+ )/)
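A small sketch of both splits, run on a shortened sample based on the question (the variable names top_level and any_level are only for illustration):

```ruby
# Hypothetical shortened sample modelled on the question's markdown.
string = "# section 1\n\n## sec 1.1\n blah\n# header 2\n\n## 2.1\n\n### 2.2\n# some_section\n## 3.1\n\n"

# Split before every top-level heading; the lookahead consumes nothing,
# so each heading stays at the start of its own piece.
top_level = string.split(/(?=^# )/)

# Split before a heading of any depth.
any_level = string.split(/(?=^#+ )/)

p top_level.size    #=> 3
p top_level.first   #=> "# section 1\n\n## sec 1.1\n blah\n"
p any_level.size    #=> 7
```

Because nothing is consumed by the split, top_level.join gives back the original string.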
How does this work?
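^ matches the beginning of a line.
(?=...) is a lookahead: it matches a position without consuming any characters, so the heading itself stays in the resulting piece.
Upvotes: 2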