Dewan159

Reputation: 3074

Detect quotes within quotes using RegEx

I am looking for a way to detect and remove quotes nested within quotes, for example: something "something "something something" something" something.

In the example above, the inner something something is itself wrapped in double quotes. I want to strip those inner quotes and keep only the outer pair.

So the expression should look for a quoted string that contains another quoted string, and then drop the quotes around the inner one.

This is my current code (PHP):

    preg_match_all('/".*(".*").*"/', $text, $matches);
    if(is_array($matches[0])){
        foreach($matches[0] as $match){
            $text = str_replace($match, '"' . str_replace('"', '', $match) . '"', $text);
        }
    }

Upvotes: 1

Views: 95

Answers (2)

Jan

Reputation: 43169

You could use strpos() with its third parameter (offset) to find the position of every quote, then remove all of them except the first and the last:

<?php

$data = <<<DATA
something "something "something something" something" something
DATA;

# set up the needed variables
$needle = '"';
$lastPos = 0;
$positions = array();

# find all quotes
while (($lastPos = strpos($data, $needle, $lastPos)) !== false) {
    $positions[] = $lastPos;
    $lastPos = $lastPos + strlen($needle);
}

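# remove every quote except the first and the last
# (PHP 7.1+ rejects assigning "" to a string offset, so use substr_replace;
#  iterate backwards so the earlier offsets stay valid)
for ($i = count($positions) - 2; $i >= 1; $i--) {
    $data = substr_replace($data, '', $positions[$i], strlen($needle));
}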

# check the result
echo $data;
?>

This yields

something "something something something something" something


You could even hide it in a class:

class unquote {
    # working state for the current cleanData() call
    private $data = "";
    private $needle = "";
    private $positions = array();

    public function cleanData($string = "", $needle = '"') {
        $this->data = $string;
        $this->needle = $needle;
        $this->positions = array(); # reset so the object can be reused
        $this->searchPositions();
        $this->replace();
        return $this->data;
    }

    private function searchPositions() {
        $lastPos = 0;
        # find all quotes
        while (($lastPos = strpos($this->data, $this->needle, $lastPos)) !== false) {
            $this->positions[] = $lastPos;
            $lastPos = $lastPos + strlen($this->needle);
        }
    }

    private function replace() {
        # remove every quote except the first and the last,
        # going backwards so the earlier offsets stay valid
        for ($i = count($this->positions) - 2; $i >= 1; $i--) {
            $this->data = substr_replace($this->data, '', $this->positions[$i], strlen($this->needle));
        }
    }
}

And call it with

$q = new unquote();
$data = $q->cleanData($data);
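
The same first-and-last idea can also be sketched as a single function that never writes to string offsets. This is only an illustration: `keepOuterQuotes` is a hypothetical name, and it assumes the outer pair is simply the first and the last occurrence of the needle.

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function keepOuterQuotes(string $text, string $needle = '"'): string
{
    $first = strpos($text, $needle);
    $last  = strrpos($text, $needle);

    // No quote at all, or only a single one: nothing to strip.
    if ($first === false || $first === $last) {
        return $text;
    }

    // Everything strictly between the first and the last quote.
    $start = $first + strlen($needle);
    $inner = substr($text, $start, $last - $start);

    // Keep the text up to and including the first quote, drop the quotes
    // in between, then append the last quote and the rest of the text.
    return substr($text, 0, $start)
        . str_replace($needle, '', $inner)
        . substr($text, $last);
}

echo keepOuterQuotes('something "something "something something" something" something');
// something "something something something something" something
```

A string with exactly two quotes comes back unchanged, because there is nothing between them to remove.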

Upvotes: 1

The fourth bird

Reputation: 163362

If the string starts with a " and the double quotes inside it are always balanced, you might use:

^"(*SKIP)(*F)|"([^"]*)"

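The first alternative matches the double quote at the start of the string and then discards that match using (*SKIP)(*F), so the opening quote is never treated as the start of a pair. The second alternative matches a ", captures whatever is between the quotes in group 1, and then matches the closing ".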
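
To see which parts the pattern actually picks up, here is a small sketch using preg_match_all on a shortened sample string (the sample is made up for illustration):

```php
<?php
$pattern = '/^"(*SKIP)(*F)|"([^"]*)"/';
$str = '"a "b" c "d" e"';

preg_match_all($pattern, $str, $matches);

// Full matches: only the inner pairs, quotes included.
print_r($matches[0]); // "b" and "d"

// Group 1: the text between each inner pair.
print_r($matches[1]); // b and d
```

The leading quote is skipped, and the trailing quote has no partner left to pair with, so both outer quotes survive a replacement.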

In the replacement, use capturing group 1 ($1):

$pattern = '/^"(*SKIP)(*F)|"([^"]*)"/';
$str = '"something "something something" and then "something" something"';
echo preg_replace($pattern, '$1', $str);

"something something something and then something something"
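
When the string does not start with a quote, as in the question's example, the pattern above does not apply directly. A rough sketch of a regex alternative follows, under the assumption that the first and the last double quote on the line form the outer pair; `stripInnerQuotes` is a hypothetical helper name, not something from the answers here.

```php
<?php
// Hypothetical helper; assumes the first and last double quote
// on a single line are the outer pair.
function stripInnerQuotes(string $text): string
{
    $result = preg_replace_callback(
        // Greedy: matches from the first quote to the last one.
        // "." does not cross newlines because the s modifier is not set.
        '/"(.*)"/',
        function (array $m): string {
            // $m[1] is everything between the outermost quotes;
            // drop any quotes inside it and re-wrap the result.
            return '"' . str_replace('"', '', $m[1]) . '"';
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

echo stripInnerQuotes('something "something "something something" something" something');
// something "something something something something" something
```

Because the match is greedy, several independently quoted phrases on one line would be merged into a single quoted span, so this only fits input with one outer pair per line.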


Upvotes: 1
