Vinicius Segalin
Vinicius Segalin

Reputation: 45

Non-greedy match in vim

I've seen some questions about it here in Stack Overflow, but still couldn't solve my problem. I'm almost there, but need some help. The pattern I'm trying to find is:

/SELECT.*\ncoalesce.*\(\n.*\)*) x

Which means: SELECT, anything, line break, coalesce, anything, (line break, anything)*, ) x

The thing is, the pattern '(line break, anything)' should occur 0 or more times, but as few as possible, so I've tried replacing * for {-}, turning the search into

/SELECT.*\ncoalesce.*\(\n.*\)\{-}) x

But it didn't work. It keeps selecting one big block instead of several blocks as I need. If I use a number instead of - (e.g. {10}) it works for blocks with 10 or less lines, but as this number may vary, I can't do this. Does anyone have a solution? Below is an example of file I'm using:

INSERT INTO lattes.palavras_chave (palavra_chave_1,palavra_chave_2,palavra_chave_3,palavra_chave_4,palavra_chave_5,palavra_chave_6)
SELECT palavra_chave_1,palavra_chave_2,palavra_chave_3,palavra_chave_4,palavra_chave_5,palavra_chave_6 from (
SELECT
coalesce(T2.Loc.value('@PALAVRA-CHAVE-1','varchar(max)'),'') as palavra_chave_1 ,
coalesce(T2.Loc.value('@PALAVRA-CHAVE-2','varchar(max)'),'') as palavra_chave_2 ,
coalesce(T2.Loc.value('@PALAVRA-CHAVE-3','varchar(max)'),'') as palavra_chave_3 ,
coalesce(T2.Loc.value('@PALAVRA-CHAVE-4','varchar(max)'),'') as palavra_chave_4 ,
coalesce(T2.Loc.value('@PALAVRA-CHAVE-5','varchar(max)'),'') as palavra_chave_5 ,
coalesce(T2.Loc.value('@PALAVRA-CHAVE-6','varchar(max)'),'') as palavra_chave_6
,tx.data_modificacao as data_modificacao_xml
FROM lattes.curriculos_xml tx
CROSS APPLY arquivo.nodes('//PALAVRAS-CHAVE') as T2(Loc)
) x
group by 
palavra_chave_1,
palavra_chave_2,
palavra_chave_3,
palavra_chave_4,
palavra_chave_5,
palavra_chave_6;

INSERT INTO lattes.area_do_conhecimento_1 (nome_grande_area_do_conhecimento,nome_da_area_do_conhecimento,nome_da_sub_area_do_conhecimento,nome_da_especialidade)
SELECT nome_grande_area_do_conhecimento,nome_da_area_do_conhecimento,nome_da_sub_area_do_conhecimento,nome_da_especialidade from (
SELECT
coalesce(T2.Loc.value('@NOME-GRANDE-AREA-DO-CONHECIMENTO','varchar(max)'),'') as nome_grande_area_do_conhecimento ,
coalesce(T2.Loc.value('@NOME-DA-AREA-DO-CONHECIMENTO','varchar(max)'),'') as nome_da_area_do_conhecimento ,
coalesce(T2.Loc.value('@NOME-DA-SUB-AREA-DO-CONHECIMENTO','varchar(max)'),'') as nome_da_sub_area_do_conhecimento ,
coalesce(T2.Loc.value('@NOME-DA-ESPECIALIDADE','varchar(max)'),'') as nome_da_especialidade
,tx.data_modificacao as data_modificacao_xml
FROM lattes.curriculos_xml tx
CROSS APPLY arquivo.nodes('//AREA-DO-CONHECIMENTO-1') as T2(Loc)
) x
group by 
nome_grande_area_do_conhecimento,
nome_da_area_do_conhecimento,
nome_da_sub_area_do_conhecimento,
nome_da_especialidade;

Thank you.

Upvotes: 1

Views: 639

Answers (2)

Peter Rincker
Peter Rincker

Reputation: 45117

The issue is your \n.*. The .* is greedy inside your capture group. You can overcome this by switching to match your \n at the end instead like so:

/SELECT.*\ncoalesce.*\n\(.*\n\)\{-}) x

Or use two \{-}:

/SELECT.*\ncoalesce.*\(\n.\{-}\)\{-}) x

Or simplify your capture group by replacing it with \_. instead (My recommendation):

/SELECT.*\ncoalesce\_.\{-}) x

For more help see:

:h /\_.
:h /\{-

Upvotes: 2

Happy Green Kid Naps
Happy Green Kid Naps

Reputation: 1673

Note that even with \{-\}, you still have the greedy .* within your group.

Try select.*\ncoalesce\_.\{-\}\n) x

Upvotes: 2

Related Questions