Reputation:
I'm trying to get the value after matched strings:
Empregados/Avulsos 2.912,30
Empregados/Avulsos 7.310,06
Sometimes there is a string in the value.
Is this possible with regular expressions?
I was trying like this:
var match = data.replace(/\s\s+/g, ' ');
var match_two = match.match([\n\r][ \t]*Retenção Lei 9.711/98[ \t]*([^\n\r]*));
console.log(match_two);
First I collapse every run of whitespace into a single space. Then I try to get the value after 'Retenção Lei 9.711/98'. But the output is '2'.
I want to make a regular expression which will always get the next word or number in these examples:
Hour: get 12:12
Data: get 24/08
Solicitação get 2.912,30
Empregados/Avulsos get 1.452,00
Palavras separadas get 2.912,30
Words:
'Solicitação',
'Retention xxx 9.999/99'
'Compensation'
'TET':
'VALUE - SOCIAL PREVÎ',
'VALUE - OTHERS',
'TOTAL TO GET',
'TABLES',
'COD GPX:',
'FXGE:',
'ALIX DC:',
'RXG AJUST',
'DATA:',
'HOUR:',
Upvotes: 1
Views: 2161
Reputation: 12239
I'll address the following problem. You have a piece of text containing words and various numbers. Given an arbitrary substring, you want to find the first occurrence of that substring and extract the first number that follows it.
For example, if the substring were 'Total', you would want to use this regular expression:
/Total.*?(\d\S*)/
Let me break it down:
Total is the substring you're looking for
.* means that you're looking for any character, zero or more times
? means that you want to match as few characters as possible
( opens the capturing group: these are the characters you want to extract
\d matches a digit
\S* matches anything except a whitespace character, zero or more times
) closes the capturing group
Note that . matches any character except line-ending characters like \n and \r. If your text includes such characters, you'll want to replace them with the visible space character (' ') before applying the above regular expression. If your text is assigned to the variable text, you can do the following to replace all whitespace characters (including line-ending characters) with visible spaces:
text = text.replace(/\s/g, ' ');
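To see why that replacement matters, here is a minimal sketch. The sample string and the variable names are made up for illustration; only the regular expression comes from the answer above.

```javascript
// Made-up sample: the label and its value sit on different lines.
var raw = 'Total to pay\n863,37';
var pattern = /Total.*?(\d\S*)/;

// The dot cannot cross the line break, so exec finds nothing.
var before = pattern.exec(raw);

// After turning every whitespace character into a plain space,
// the same pattern reaches the number.
var flattened = raw.replace(/\s/g, ' ');
var after = pattern.exec(flattened);
var value = after === null ? '' : after[1];

console.log(before); // null
console.log(value);  // 863,37
```

In engines that support the ES2018 s (dotAll) flag, writing the pattern as /Total.*?(\d\S*)/s should have the same effect without modifying the text.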
One more important point is that /Total.*?(\d\S*)/ is a fixed regular expression. If you want to make a regular expression for any given substring, you'll have to compile it with the RegExp constructor:
var re = new RegExp(substring + '.*?(\\d\\S*)');
Note that we're passing a string to the constructor, so we have to escape the backslashes in specifying the regular expression. Where we wrote \d\S in the literal regular expression, we have to write \\d\\S in the string.
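One caveat when building a pattern from an arbitrary substring: characters such as . or ( in the substring are interpreted as regex syntax. The question's own label 'Retenção Lei 9.711/98' contains a dot, for instance. A minimal sketch of one way to handle this follows; escapeRegExp and buildPattern are hypothetical helper names, not part of the answer above, and the sample text is invented.

```javascript
// Hypothetical helper: put a backslash in front of every character
// that has a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: same pattern as above, but with the substring escaped.
function buildPattern(substring) {
  return new RegExp(escapeRegExp(substring) + '.*?(\\d\\S*)');
}

// Invented sample: the first label only looks like the one we want.
var sample = 'Retenção Lei 9X711/98 1,00 Retenção Lei 9.711/98 2.912,30';
var label = 'Retenção Lei 9.711/98';

// Unescaped, the dot in the label matches the X as well.
var loose = new RegExp(label + '.*?(\\d\\S*)').exec(sample);
// Escaped, only the literal label matches.
var strict = buildPattern(label).exec(sample);

console.log(loose[1]);  // 1,00
console.log(strict[1]); // 2.912,30
```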
We can execute the compiled regular expression on a string with the exec method, test the result to see if it's null, and finally get the contents of the capturing group:
var match = re.exec(text);
if (match === null) {
return '';
}
return match[1];
The snippet below implements this process in a function called getNumberAfterSubstring(substring, text) and runs it on a piece of sample text with some sample substrings.
function print(s) {
  document.write(s + '<br />');
}
function getNumberAfterSubstring(substring, text) {
  var re = new RegExp(substring + '.*?(\\d\\S*)'),
      match = re.exec(text);
  if (match === null) {
    return ''; // If no match is found, return empty string.
  }
  return match[1]; // Otherwise return first parenthesized group.
}
var text = "Tabela 25 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut ultricies ultricies auctor. Donec sodales pharetra ante, vitae suscipit metus mollis quis. Lorem ipsum dolor sit amet, Retention Law 0.000/00 consectetur adipiscing elit. Nunc nisl dui, Compension 00,00 ullamcorper eget posuere et, faucibus ut leo. Ut tellus nisi, lobortis eget nibh id, laoreet tincidunt lacus. Integer eget libero Value - Social prevî: 715,86 ut nulla vestibulum viverra eget sit Value - Others: 715,86 amet nisi. Suspendisse potenti.\nCurabitur ligula felis, Data: 02/02/2011 scelerisque in consequat et, tempor non ipsum. Donec euismod, turpis ut accumsan lobortis, lectus felis ullamcorper nibh, et pretium lectus nisl at enim. Total to pay 863,37 Nullam faucibus massa vitae nulla ultrices, eu sollicitudin justo imperdiet. Phasellus at est scelerisque, egestas diam et, rutrum dui. Hour: 15:44:58 Nunc sagittis hendrerit dui, sit amet congue arcu efficitur eu. Praesent hendrerit ut nibh vel vehicula. Morbi mollis enim ex, at mollis libero pellentesque quis. Etiam sed bibendum nisi. COD GPS: 2100 In hac habitasse platea dictumst. Morbi ac condimentum eros, in egestas tellus.";
text = text.replace(/\s/g, ' '); // Replace line-ending characters.
text = text.toLocaleLowerCase();
var substrings = ['Retention Law', 'Compension', 'VALUE - SOCIAL PREVî',
                  'Total', 'Tabela', 'Hour', 'Data'];
for (var i = 0; i < substrings.length; ++i) {
  var substring = substrings[i].toLocaleLowerCase();
  print(substring + ': ' + getNumberAfterSubstring(substring, text));
}
Upvotes: 3
Reputation: 1334
Your RegEx capture groups don't include decimals or commas, but the numeric values have those.
var res = /Empregados\/Avulsos ([\d\.,:\/]+)/.exec(str);
if (res !== null) { // exec returns null when there is no match
  var values = res[1].split(",");
}
In a regex, \d matches a single digit, not a whole number. The numbers in your example are made up of several digits plus decimal points, commas, colons, and slashes.
Make sure the character class in your regex includes those characters.
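As a rough sketch of how that character class could be applied to several of the labels from the question: valueAfter is a hypothetical helper, the sample string is invented, and the optional colon after the label is an assumption based on labels like 'Hour:' and 'Data:'.

```javascript
// Hypothetical helper: return the run of digits and separators
// that directly follows the label, or '' if there is none.
function valueAfter(label, str) {
  // Escape regex metacharacters in the label so it is matched literally.
  var escaped = label.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Optional colon, optional whitespace, then digits . , : and / only.
  var re = new RegExp(escaped + ':?\\s*([\\d.,:/]+)');
  var res = re.exec(str);
  return res === null ? '' : res[1];
}

// Invented sample built from the examples in the question.
var sample = 'Hour: 12:12 Data: 24/08 Empregados/Avulsos 1.452,00';

console.log(valueAfter('Hour', sample));               // 12:12
console.log(valueAfter('Data', sample));               // 24/08
console.log(valueAfter('Empregados/Avulsos', sample)); // 1.452,00
console.log(valueAfter('Missing', sample));            // (empty string)
```

Because the class also accepts punctuation, a value followed directly by a sentence-ending period or comma would include that trailing character, so some trimming may be needed on real data.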
Upvotes: 0