Reputation: 16671
I have data like the following:
12 x ATG 370 g, 12 x 720 ml, 1 Glas = 0.97, 1 kg = 2.03
versch. Sorten, 2 x 250 g, 1 Packung = 1.-, 100 g = 0.40
2 x 950 g, 1 Packung = 4.98, 1 kg = 4.47, tiefgekühlt
versch. Sorten, 2 x 500 g, 1 Packung = 0.65, 1 kg = 1.-
3,5 % Fett, 3 x 1 Liter, 1 Packung = 0.76, 1 Liter = 0.60
Krönung Balance gemahlen oder Krönung Aroma ganze Kaffeebohnen, 500 g, 1 kg = 6.44
versch. Sorten, 400 g, 1 kg = 5.60
400 g, versch. Sorten, 1 kg = 5.60
Expected Outcome
12 x 720 ml => { pack: 12, weight:720 , unit: ml }
2 x 250 g => { pack: 2, weight:250 , unit: g }
2 x 950 g => { pack: 2, weight:950 , unit: g }
2 x 500 g => { pack: 2, weight:500 , unit: g }
3 x 1 Liter => { pack: 3, weight:1 , unit: Liter }
500 g => { pack: 1, weight:500 , unit: g }
400 g => { pack: 1, weight:400 , unit: g }
400 g => { pack: 1, weight:400 , unit: g }
I tried the following code:
const re = /^(\d+x)?([\d,]+)([a-z]+)/gm;
str.split(",").forEach(v => {
const value = v.replace(/\s/g, "")
let arr = [...value.matchAll(re)];
console.log(arr[0]);
})
Results for the input strings using the above code:
12 x ATG 370 g, 12 x 720 ml, 1 Glas = 0.97, 1 kg = 2.03
["12x", undefined, "12", "x"] ["12x720ml", "12x", "720", "ml"] undefined ["1kg", undefined, "1", "kg"]
versch. Sorten, 2 x 250 g, 1 Packung = 1.-, 100 g = 0.40
undefined ["2x250g", "2x", "250", "g"] undefined ["100g", undefined, "100", "g"]
and so on...
I can't figure out how to extract the desired data, or whether this is even possible, since the required data does not appear at a consistent position in the string.
EDIT ( NEW )
Wiktor Stribiżew's solution works perfectly for the above cases.
New requirement:
12 x ATG 370 g, 12 x 720 ml, 1 Glas = 0.97, 1 kg = 2.03
versch. Sorten, 2 x 250 g, 1 Packung = 1.-, 100 g = 0.40
2 x 950 g, 1 Packung = 4.98, 1 kg = 4.47, tiefgekühlt
versch. Sorten, 2 x 500 g, 1 Packung = 0.65, 1 kg = 1.-
3,5 % Fett, 3 x 1 Liter, 1 Packung = 0.76, 1 Liter = 0.60
Krönung Balance gemahlen oder Krönung Aroma ganze Kaffeebohnen, 400 - 500 g, 1 kg = 6.44 ( Range )
versch. Sorten, 400 g, 1 kg = 5.60
100 - 400 g, versch. Sorten, 1 kg = 5.60 ( Range )
Expected Outcome
12 x 720 ml => { pack: 12, minweight:720 , maxweight: 0, unit: ml }
2 x 250 g => { pack: 2, minweight:250 , maxweight: 0, unit: g }
2 x 950 g => { pack: 2, minweight:950 , maxweight: 0, unit: g }
2 x 500 g => { pack: 2, minweight:500 , maxweight: 0, unit: g }
3 x 1 Liter => { pack: 3, minweight:1 , maxweight: 0, unit: Liter }
400 - 500 g => { pack: 1, minweight:400 , maxweight: 500, unit: g }
400 g => { pack: 1, minweight:400 , maxweight: 0, unit: g }
100 - 400 g => { pack: 1, minweight:100 , maxweight: 400, unit: g }
Upvotes: 2
Views: 101
Reputation: 627083
You can use
const arr = ['12 x ATG 370 g, 12 x 720 ml, 1 Glas = 0.97, 1 kg = 2.03','versch. Sorten, 2 x 250 g, 1 Packung = 1.-, 100 g = 0.40','2 x 950 g, 1 Packung = 4.98, 1 kg = 4.47, tiefgekühlt','versch. Sorten, 2 x 500 g, 1 Packung = 0.65, 1 kg = 1.-','3,5 % Fett, 3 x 1 Liter, 1 Packung = 0.76, 1 Liter = 0.60','Krönung Balance gemahlen oder Krönung Aroma ganze Kaffeebohnen, 400 - 500 g, 1 kg = 6.44','versch. Sorten, 400 g, 1 kg = 5.60','100 - 400 g, versch. Sorten, 1 kg = 5.60'];
const re = /(?:,\s*|^)(?:(\d+)\s*x\s*)?(\d+(?:\s*-\s*\d+)?)\s*([a-zA-Z]+)(?:$|,)/;
arr.forEach( str => {
let [_, pack, weight, unit] = str.match(re);
pack = pack || 1;
console.log(str, {'pack': pack, 'weight': weight, 'unit': unit});
})
The regex matches:
(?:,\s*|^) - either a comma followed by zero or more whitespace characters, or the start of the string
(?:(\d+)\s*x\s*)? - an optional sequence of:
  (\d+) - Capturing group 1 (pack): one or more digits
  \s*x\s* - x enclosed in zero or more whitespace characters
(\d+(?:\s*-\s*\d+)?) - Capturing group 2 (weight): one or more digits, optionally followed by - enclosed in optional whitespace and then one or more digits
\s* - zero or more whitespace characters
([a-zA-Z]+) - Capturing group 3 (unit): one or more letters
(?:$|,) - either the end of the string or a comma
See the regex demo.
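The code above returns a range as a single string such as "400 - 500". To get the separate minweight and maxweight fields from the updated question, one option is to give each end of the range its own capturing group. The following is only a sketch under that assumption: rangeRe and parseOffer are hypothetical names, maxweight falls back to 0 as in the question's expected outcome, and a line without a matching quantity yields null instead of throwing.

```javascript
// Same pattern as above, except the range is split into two capturing groups
// (an adaptation for illustration, not part of the original regex).
const rangeRe = /(?:,\s*|^)(?:(\d+)\s*x\s*)?(\d+)(?:\s*-\s*(\d+))?\s*([a-zA-Z]+)(?:$|,)/;

// parseOffer is a hypothetical helper name.
function parseOffer(str) {
  const m = str.match(rangeRe);
  if (m === null) return null; // no "<pack> x <weight> <unit>" segment found
  const [, pack, min, max, unit] = m;
  return {
    pack: Number(pack || 1),     // pack defaults to 1 when "N x" is absent
    minweight: Number(min),
    maxweight: Number(max || 0), // 0 when the segment is not a range
    unit,
  };
}

console.log(parseOffer('100 - 400 g, versch. Sorten, 1 kg = 5.60'));
// { pack: 1, minweight: 100, maxweight: 400, unit: 'g' }
console.log(parseOffer('3,5 % Fett, 3 x 1 Liter, 1 Packung = 0.76, 1 Liter = 0.60'));
// { pack: 3, minweight: 1, maxweight: 0, unit: 'Liter' }
```

Converting with Number is a design choice here; drop it if the values should stay strings, as in the original snippet.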
Upvotes: 1