PyPerl
PyPerl

Reputation: 75

Extract specific pattern from string in MySQL:

I would like to extract specific pattern from string in MySQL.

The column contains specific string like xxx-atg168d and xxx-atg444-6x. From these string, I want to extract atg168 and atg444 only. How can I perform this in MySQL?

**Input_column**  
xxx-atg168d  
xxx-atg444-6x  
xxx-atg1689d   
xxx-atg16507d  
xxx-atg444d-6x 
xxx-atg444c-6x  

**Output_column**  
atg168  
atg444  
atg1689  
atg16507  
atg444
atg444

Upvotes: 0

Views: 738

Answers (3)

spencer7593
spencer7593

Reputation: 108380

Something like this may meet your specification:

SUBSTRING_INDEX(SUBSTR( t.col ,INSTR( t.col ,'-')+1),'-',1)

This assumes that you want to return the portion of the string following the first dash character, up to the next dash character (if present). If no dash characters exist within the string, the entire string will be returned.

EDIT

Ooops. That expression also includes the trailing "d". If it's just a trailing "d" character that needs to be removed...

TRIM(TRAILING 'd' FROM SUBSTRING_INDEX(SUBSTR( t.col ,INSTR( t.col ,'-')+1),'-',1))

In the more general case, to remove any "non-digit" character from the end (not just a "d"), things get pretty ugly. We need to check the rightmost character, and see if it matches a character we wnt to keep. If it's not, we shorten the string by one character.

IF( INSTR('0123456789',RIGHT(
       @t := SUBSTRING_INDEX(SUBSTR( t.col ,INSTR( t.col ,'-')+1),'-',1)
    ,1))
, @t
, SUBSTRING( @t, 1, CHAR_LENGTH( @t )-1)
)

I made use of a user-defined variable here to avoid repeating the same expression multiple times. It's not required that we do that. The @t :=assignment can be removed, and other occurrences of @t can be replaced with the expression that was assigned to @t.

The literal '0123456789' in that expression is the set of characters that we don't want to remove from the end of the string.

Upvotes: 1

PyPerl
PyPerl

Reputation: 75

Thanks spencer for your suggestions. I edited your code to get the solution for my query. Here is the update query,

left(substring_index (substr(subid,instr(subid,'-')+1),'-',1) , char_length(substring_index (substr(subid,instr(subid,'-')+1),'-',1))-1)

Upvotes: 0

Sagar Joon
Sagar Joon

Reputation: 1417

Use SUBSTRING function , like this :

select SUBSTRING(column_name ,5,6) from table_name;

Here 5 is starting position and 6 is length of sub-string getting extracted from string.

Upvotes: 0

Related Questions