kevbonham

Reputation: 1040

Convert string argument to regular expression

I'm trying to get into Julia after learning Python, and I'm stumbling over some seemingly easy things. I'd like to have a function that takes strings as arguments, but uses one of those arguments as a regular expression to go searching for something. So:

function patterncount(string::ASCIIString, kmer::ASCIIString)
    numpatterns = eachmatch(kmer, string, true)
    count(numpatterns)
end

There are a couple of problems with this. First, eachmatch expects a Regex object as the first argument, and I can't figure out how to convert a string to one. In Python I'd do r"{0}".format(kmer) - is there something similar?

Second, I clearly don't understand how the count function works (from the docs):

count(p, itr) → Integer

Count the number of elements in itr for which predicate p returns true.

But I can't seem to figure out what the predicate is for just counting how many things are in an iterator. I can make a simple counter loop, but I figure that has to be built in. I just can't find it (tried the docs, tried searching SO... no luck).

Edit: I also tried numpatterns = eachmatch(r"$kmer", string, true) - no go.

Upvotes: 3

Views: 1589

Answers (1)

spencerlyon2

Reputation: 9676

To convert a string to a regex, call the Regex function on the string.
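As a small sketch of that conversion (assuming Julia 1.x for occursin; the variable names and sample strings are made up for illustration): Regex builds the pattern at run time from an ordinary string, whereas the r"..." string macro takes its text literally and does not interpolate $kmer, which is why the attempt in the question's edit does not work.

```julia
kmer = "ATA"          # hypothetical example pattern
re = Regex(kmer)      # pattern built from the run-time value of kmer
literal = r"$kmer"    # no interpolation: the pattern text is literally $kmer

re.pattern                       # "ATA"
literal.pattern                  # "\$kmer"
occursin(re, "GATATATC")         # true
occursin(literal, "GATATATC")    # false
```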

Typically, to get the length of an iterator you can use the length function. However, that won't work in this case: eachmatch returns an object of type Base.RegexMatchIterator, which doesn't have a length method. So you can use count, as you thought. The first argument (the predicate) should be a one-argument function that returns true or false depending on whether you would like to count a particular item in your iterator. In this case that function can simply be the anonymous function x->true, because we want to count every x in the RegexMatchIterator.
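To illustrate how the predicate works on something simpler than a match iterator, here is a minimal sketch using a plain range (the range 1:10 is just an illustrative input):

```julia
evens = count(iseven, 1:10)      # counts only elements where iseven returns true
total = count(x -> true, 1:10)   # predicate is always true, so every element is counted
```

Here evens is 5 and total is 10; the always-true predicate turns count into a plain "how many items" counter.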

So, given that info, I would write your function like this:

patterncount(s::ASCIIString, kmer::ASCIIString) = 
    count(x->true, eachmatch(Regex(kmer), s, true))

EDIT: I also changed the name of the first argument to s instead of string, because string is a Julia function. Nothing terrible would have happened if we had left that argument name the same in this example, but it is usually good practice not to give a variable the same name as a built-in function.
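For readers on Julia 1.x, a hedged sketch of the same idea: it assumes a version where ASCIIString has been merged into String and where eachmatch takes overlap as a keyword argument rather than a third positional one. The sample inputs are made up for illustration.

```julia
# Assumes Julia 1.x: `overlap` is a keyword argument and ASCIIString no longer exists.
patterncount(s::AbstractString, kmer::AbstractString) =
    count(x -> true, eachmatch(Regex(kmer), s; overlap=true))

n = patterncount("GATATATC", "ATA")   # overlapping matches start at positions 2 and 4
```

With overlap=true, n is 2 here; without it, only the first match would be found, because the search resumes after the end of the previous match.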

Upvotes: 6
