Reputation: 31
I've been using sed to attempt this. Basically, I have a text file that contains a fair number of lines in the same format, e.g.
4444 username "some information" "someotherinformation" "even more information"
I need to replace the spaces inside the quotes with underscores, so that it looks like this:
4444 username "some_information" "someotherinformation" "even_more_information"
Currently, I have only been able to separate out the quoted information:
sed 's/"\([^"]*\)"/_/g' myfile.txt
Advice on how to proceed?
Upvotes: 3
Views: 2404
Reputation: 58430
This might work for you:
echo '4444 username "some information" "someotherinformation" "even more information"' |
sed 's/"[^"]*"/\n&/g;:a;s/\(\n"[^"]*\) /\1_/g;ta;s/\n//g'
4444 username "some_information" "someotherinformation" "even_more_information"
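Explanation:
- Prepend a newline (\n) to quoted strings: s/"[^"]*"/\n&/g
- Replace the spaces in those marked strings with an underscore (_), looping until none are left: :a;s/\(\n"[^"]*\) /\1_/g;ta
- Remove the newline markers: s/\n//g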
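To see the newline markers the explanation talks about, here is a small sketch that stops after each stage and rewrites the marker as a visible <NL>. The function names stage_mark and stage_fill and the <NL> placeholder are made up for illustration, and this assumes GNU sed, since \n in the replacement text is not portable to every sed.

```shell
# Illustration only: stage_mark, stage_fill and <NL> are hypothetical names.
# Assumes GNU sed (\n in the replacement text is a GNU extension).

# Stage 1: put a newline marker in front of every quoted string,
# then show the marker as <NL> so it stays on one line.
stage_mark() {
    sed 's/"[^"]*"/\n&/g; s/\n/<NL>/g'
}

# Stages 1 and 2: mark the quoted strings, then loop, turning spaces that
# follow a marker and an opening quote into underscores until none remain.
stage_fill() {
    sed 's/"[^"]*"/\n&/g;:a;s/\(\n"[^"]*\) /\1_/g;ta;s/\n/<NL>/g'
}
```

Piping the sample line through stage_mark should show a <NL> in front of each of the three quoted strings; stage_fill shows the same line with the underscores filled in, just before the final s/\n//g would strip the markers.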
Upvotes: 1
Reputation: 28029
EDITED
The previous version would add unwanted spaces. This version does exactly what the OP wants.
This is probably the easiest way to get what you want.
awk -F'"' '
BEGIN {
    OFS = "\""
}
{
    for (i = 2; i < NF; i += 2) {
        gsub(/[ \t]+/, "_", $i)
    }
    print $0
}
' file > outputFile
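In case it is not obvious why the loop only touches even-numbered fields: with the double quote as the field separator, the text inside quotes always lands in fields 2, 4, 6 and so on. A quick sketch to show the split (show_quote_fields is just a name I made up for this illustration):

```shell
# Illustration only: show_quote_fields is a hypothetical helper.
# Print every field awk sees when the double quote is the field separator.
show_quote_fields() {
    awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }'
}
```

For the sample line this lists seven fields, with the quoted text in fields 2, 4 and 6 and an empty field 7 after the final quote, which is why the loop condition is i < NF. Note that the gsub in the answer uses [ \t]+, so a run of several spaces or tabs collapses into a single underscore.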
Upvotes: 3
Reputation: 360105
sed -r ':a; s/^((([^"]*"){2})*[^"]*"[^" ]*) /\1_/;ta'
4444 username "some_information" "someotherinformation" "even_more_information"
or
sed ':a; s/^\(\(\([^"]*"\)\{2\}\)*[^"]*"[^" ]*\) /\1_/;ta'
4444 username "some_information" "someotherinformation" "even_more_information"
:a - label "a" for the loop
s/// - perform a substitution
^( - anchor the whole search string at the beginning of the line
(([^"]*"){2})* - capture (in group 1) two sets of zero or more non-quotes followed by a quote (zero or more times)
[^"]*" - followed by zero or more non-quotes followed by a quote
[^" ]* - followed by zero or more characters that are not spaces or quotes
) - end the anchored sequence and look for a required space to replace
\1 - substitute the captured group and an underscore for the matched sequence
ta - branch (transfer execution) to label :a if a successful substitution has been done (continue to the next instruction if not, which in this case is to end processing for this line and read the next, starting a new round of processing)

This finds the first space in the last quoted string that has any spaces and replaces it. Then the next, if any, until that quoted string is finished. And so on for any additional spaces.
Then the next previous quoted string that contains a space, and so on.
This is what the pattern space looks like at each step through the :a ... ta loop:
4444 username "some information" "someotherinformation" "even_more information"
4444 username "some information" "someotherinformation" "even_more_information"
4444 username "some_information" "someotherinformation" "even_more_information"
Then it would step through a couple more times to look for any matches at the beginning of the line.
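If you want to reproduce that trace yourself, here is a rough sketch that applies the substitution one step at a time in a shell loop instead of inside sed, printing the line after each change. The function name trace_underscores is made up for illustration; the expression is the second (basic regex) form from above, minus the label and branch.

```shell
# Illustration only: trace_underscores is a hypothetical helper.
# Apply the single-space substitution repeatedly, printing the line after
# every change, and stop as soon as a pass changes nothing.
trace_underscores() {
    line=$1
    while :; do
        next=$(printf '%s\n' "$line" |
            sed 's/^\(\(\([^"]*"\)\{2\}\)*[^"]*"[^" ]*\) /\1_/')
        [ "$next" = "$line" ] && break
        line=$next
        printf '%s\n' "$line"
    done
}
```

Called with the sample line as its argument, this should print the three intermediate lines shown above. Running sed once per step is slow, so this is only meant for watching the loop work, not for processing a real file.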
Upvotes: 6
Reputation: 140619
I'd actually do this in C, which makes it easier to do a character-by-character state machine than most higher-level languages.
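#include <stdio.h>

int main(void)
{
    int inside_quotes = 0;  /* nonzero while between a pair of double quotes */
    int backslash = 0;      /* nonzero while the current character is escaped */
    int c;

    while ((c = getchar()) != EOF) {
        switch (c) {
        case ' ':
            if (inside_quotes)
                c = '_';
            break;
        case '"':
            /* an escaped quote does not open or close a string */
            if (!backslash)
                inside_quotes = !inside_quotes;
            break;
        case '\\':
            /* set to 2 so the decrement below leaves 1 for the next character */
            if (!backslash)
                backslash = 2;
            break;
        default:
            break;
        }
        if (backslash > 0) backslash--;
        putchar(c);
    }
    return 0;
}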
Not tested or even compiled. Backslash handling, in particular, may very well be buggy.
Upvotes: 0