Luke Collins

Reputation: 1463

Using Regex to remove leading/trailing whitespaces, except for quotes

I am trying to write a regular expression that strips leading and trailing spaces from a user input string and collapses runs of spaces into one, except between quotation marks ("..."). For example, if the user enters

     #load     "my   folder/my  files/    program.prog"     ;

I want my regex substitution to transform this into

#load "my   folder/my  files/    program.prog" ;

So far I've implemented the following:

#include <iostream> 
#include <string>
#include <regex>

int main(){
  // Variables for user input
  std::string input_line;
  std::string program;

  // User prompt
  std::cout << ">>> ";
  std::getline(std::cin, input_line);

  // Remove leading/trailing spaces and collapse runs of spaces to one
  input_line = std::regex_replace(input_line, std::regex("^ +| +$|( ) +"), "$1");

  // Check result
  std::cout << input_line << std::endl;

  return 0;
}

But this also collapses the spaces between quotes. Is there any way I can make the regex leave spaces between quotes alone?

Upvotes: 2

Views: 366

Answers (1)

Wiktor Stribiżew

Reputation: 627103

You may add another alternative that matches and captures double-quoted string literals, and re-insert them into the result with a second backreference:

input_line = std::regex_replace(
      input_line, 
      std::regex(R"(^ +| +$|(\"[^\"\\]*(?:\\[\s\S][^\"\\]*)*\")|( ) +)"),
      "$1$2");


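The \"[^\"\\]*(?:\\[\s\S][^\"\\]*)*\" part matches a ", then 0+ chars other than \ and ", then 0 or more occurrences of an escaped char (a \ followed by any char, matched with [\s\S]) followed by 0+ chars other than \ and ", and finally the closing ". Since the regex scans left to right, a quoted literal is consumed as a single match starting at its opening " and re-inserted unchanged via $1, so the spaces inside it never reach the ( ) + alternative. The space kept from a run now lands in group 2, hence the "$1$2" replacement.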
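To see what the quoted-literal subpattern accepts on its own, here is a minimal sketch that tests it with std::regex_match. The helper name is_quoted_literal is hypothetical and only serves the illustration; the pattern is the one from the answer.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): true if `s` is,
// in its entirety, exactly one double-quoted literal.
bool is_quoted_literal(const std::string& s) {
    // Opening quote, then runs of ordinary chars interleaved with
    // backslash-escaped chars, then the closing quote.
    static const std::regex quoted(R"(\"[^\"\\]*(?:\\[\s\S][^\"\\]*)*\")");
    return std::regex_match(s, quoted);
}
```

A literal containing an escaped quote, such as "a \" b", is accepted as one unit, while an unterminated literal or two adjacent literals are rejected.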

Note that I used a raw string literal, R"(...)", so the regex backslashes do not have to be doubled (R"([\s\S])" = "[\\s\\S]").
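Putting the pieces together, the substitution could be wrapped in a small function like the sketch below. The name normalize_spaces is an assumption made for illustration; the pattern and the "$1$2" replacement are taken from the answer as written.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (the name is not from the original post).
// Strips leading/trailing spaces and collapses runs of spaces to one,
// leaving double-quoted literals untouched.
std::string normalize_spaces(const std::string& line) {
    // Raw string literal: backslashes reach the regex engine as written.
    static const std::regex re(
        R"(^ +| +$|(\"[^\"\\]*(?:\\[\s\S][^\"\\]*)*\")|( ) +)");
    // $1 = a whole quoted literal, $2 = the single space kept from a run;
    // a group that did not take part in the match expands to nothing.
    return std::regex_replace(line, re, "$1$2");
}
```

Applied to the sample line from the question, this should give #load "my   folder/my  files/    program.prog" ;. A quote with no closing partner is not matched by the literal alternative, so the spaces after it are collapsed like any others.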

Upvotes: 1
