Reputation: 912
I'm trying to extract headers from a Markdown file using regex. I currently have a nested loop that looks at each line of the Markdown and then loops through each header level. However, my code fails when a header-like string is found in a code block.
If the specified level is 1, it will retrieve H1s. If the specified level is 2, it will retrieve H1s and H2s. If the specified level is 3, it will retrieve H1s, H2s, and H3s.
    val content = inputData
    val thePages = mutableListOf<String>()
    val search = "#".repeat(level) + " "
    for (line in content.lines()) {
        for (i in 0 until search.length) {
            if (line.startsWith(search.substring(i))) {
                thePages.add(line)
            }
        }
    }
I've been trying to do this with regex without luck. A sample Markdown string is here: https://pastebin.com/c28bt8F3
Upvotes: 0
Views: 454
Reputation: 1261
What about:
    fun main() {
        val content = listOf("# Header 11", "## Header 12", "### Header 13", "#### Header 14") // sample data
        println(SearchExample().search(1, content)) // print level 1 only
        println(SearchExample().search(2, content)) // print up to level 2
        println(SearchExample().search(3, content)) // print up to level 3
    }

    class SearchExample {
        private val regex = "^([#]+).*".toRegex()

        fun search(level: Int, content: List<String>): List<String> {
            return content.map { it to determineHeader(it) } // pair each line with its header level
                .filter { it.second in 0..level } // drop headers deeper than level and non-matching lines (-1)
                .map { it.first } // get the original line back
        }

        private fun determineHeader(line: String): Int {
            val result = regex.matchEntire(line) ?: return -1 // not a header: return -1
            return result.groupValues[1].length // the number of leading # chars is the header level
        }
    }
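Note that the class above does not yet deal with the code-block case from the question. One possible way to handle it is to track whether the current line sits inside a fenced block and skip it if so. The sketch below is only an illustration: `extractHeaders`, `headerRegex` and `fenceRegex` are made-up names, and it assumes fences are lines starting with at least three backticks or tildes, and that a header needs whitespace after the `#` run (so `#hashtag` is ignored). It simply toggles on every fence line, so it does not check that the closing fence matches the opening one, and it ignores indented code blocks.

```kotlin
// Hypothetical helper (not from the original answer): collects headers up to
// `level` while skipping lines inside fenced code blocks.
private val headerRegex = Regex("^(#{1,6})\\s+.*")            // 1-6 '#' followed by whitespace
private val fenceRegex = Regex("^\\s{0,3}(`{3,}|~{3,}).*")    // line that opens or closes a fence

fun extractHeaders(level: Int, markdown: String): List<String> {
    val headers = mutableListOf<String>()
    var inFence = false
    for (line in markdown.lines()) {
        if (fenceRegex.matches(line)) {
            inFence = !inFence // simplification: every fence line toggles the state
            continue
        }
        if (inFence) continue // ignore header-like lines inside code
        val match = headerRegex.matchEntire(line) ?: continue
        if (match.groupValues[1].length <= level) headers.add(line)
    }
    return headers
}

fun main() {
    val sample = listOf(
        "# Title",
        "~~~",
        "# not a header, just a comment in code",
        "~~~",
        "## Section",
        "### Subsection",
        "#hashtag"
    ).joinToString("\n")
    println(extractHeaders(2, sample)) // [# Title, ## Section]
}
```

If the real input uses other code-block styles, a proper Markdown parser is probably a safer choice than line-based regex.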
Upvotes: 1