Sayandip Ghatak
Sayandip Ghatak

Reputation: 303

Put comma after every nth characters , but it should not be inside a word

I have a long string, I have to put comma after every 300 characters.

I am using this and it is working fine:

select regexp_replace(column_name,'(.{300})','\1,') from table_name ;

But the condition is we cannot put any comma inside a word. So we have to check if the 300th character is a space, then put a comma, else find the previous space and put comma there.

A similar scenario: (I am taking 10 instead of 300)

After running the query

select regexp_replace(column_name,'(.{10})','\1,') from table_name ;

Sample Input: (i.e. value of column_name):

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras viverra consectetur magna, vel vulputate sapien convallis ut.

Sample Output:

Lorem ipsu,m dolor si,t amet, co,nsectetur ,adipiscing, elit. Cra,s viverra ,consectetu,r magna, ,vel ,vulputate sapien ,convallis ,ut.

But this is unexpected.

The output should be like this:

Lorem ,ipsum dolor ,sit amet, ,consectetur ,adipiscing, elit. ,Cras viverra ,,consectetur magna,, vel ,vulputate sapien ,convallis ,ut.

Can anyone please help me on this?

Upvotes: 1

Views: 1568

Answers (2)

MT0
MT0

Reputation: 167877

I've used | rather than a comma since the sample data you've provided uses commas:

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE table_name( data ) AS
SELECT 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras viverra consectetur magna, vel vulputate sapien convallis ut.' FROM DUAL;

Query 1:

WITH words ( id, data, word, wordno, numwords, len, grpstart, grp ) AS (
  SELECT ROWNUM,
         data, 
         REGEXP_SUBSTR( data, '^\S+', 1, 1 ),
         1,
         REGEXP_COUNT( data, '(^|\s)\S+' ),
         LENGTH( REGEXP_SUBSTR( data, '^\S+', 1, 1 ) ),
         0,
         0
  FROM   table_name
UNION ALL
  SELECT id,
         data, 
         REGEXP_SUBSTR( data, '\s\S+', 1, wordno + 1 ),
         wordno + 1,
         numwords,
         len + LENGTH( REGEXP_SUBSTR( data, '\s\S+', 1, wordno + 1 ) ),
         CASE
         WHEN len + LENGTH( REGEXP_SUBSTR( data, '\s\S+', 1, wordno + 1 ) )
              > grpstart + 10
         THEN len
         ELSE grpstart
         END,
         CASE
         WHEN len + LENGTH( REGEXP_SUBSTR( data, '\s\S+', 1, wordno + 1 ) )
              > grpstart + 10
         THEN grp+1
         ELSE grp
         END
  FROM   words
  WHERE  wordno < numwords
)
SELECT LISTAGG( words, '|' ) WITHIN GROUP ( ORDER BY grp )
         AS sentence
FROM   (
  SELECT id,
         grp,
         LISTAGG( word, '' ) WITHIN GROUP ( ORDER BY len ) AS words
  FROM   words
  GROUP BY id, grp
)
GROUP BY id

Results:

|                                                                                                                            SENTENCE |
|-------------------------------------------------------------------------------------------------------------------------------------|
| Lorem| dolor sit| amet,| consectetur| adipiscing| elit.| Cras| viverra| consectetur| magna,| vel| vulputate| sapien| convallis| ut. |

Upvotes: 3

p4n1
p4n1

Reputation: 117

You could try adding something like [^a-z] in your regular expression to exclude all characters from a-z.

Upvotes: 0

Related Questions