Reputation: 28159
I have some data containing spaces () and hyphens (
-
) that I'd like to convert to underscore characters (_
). In other languages (ie R) I can write something like this:
var1 <- gsub(var1, "s+|\\-", "_")
which will look for multiple characters and convert them all to some other character.
Is there a way to do this in U-SQL?
EDIT:
I tried this and it ran without error but did not change the data:
@t2 = SELECT var1,
var2,
var3.Replace("s+|\\'|\\-","_") AS var3
FROM @t1;
Upvotes: 0
Views: 1353
Reputation: 657
@someData =
SELECT * FROM
( VALUES
("tic tac-toe")
) AS T(col1);
DECLARE @pattern string = "\\s|-";
@result =
SELECT col1 AS original,
Regex.Replace(col1, "\\s", "_") AS regex_replaceSpace,
Regex.Replace(col1, "-", "_") AS regex_replaceHypen,
Regex.Replace(col1, "\\055", "_") AS regex_replaceHypenDecimal,
Regex.Replace(col1, "\\s|-", "_") AS regex_replaceBoth,
Regex.Replace(col1, @"\s|-", "_") AS regex_replaceBoth_verbatim,
Regex.Replace(col1, @pattern, "_") AS regex_replaceBoth_pattern,
col1.Replace(" ", "_") AS string_replaceSpace,
col1.Replace("-", "_") AS string_replaceHypen,
col1.Replace("-", "_").Replace(" ", "_") AS string_replaceBoth
FROM @someData;
OUTPUT @result
TO "/Replace.csv"
USING Outputters.Csv(outputHeader: true);
Upvotes: 0
Reputation: 29780
You are almost there but you are using the System.String.Replace
instead of the one for a regular expression. So change
@t2 = SELECT var1,
var2,
var3.Replace("s+|\\'|\\-","_") AS var3
FROM @t1;
to
@t2 = SELECT var1,
var2,
Regex.Replace(var3, "s+|\\'|\\-", "_") AS var3
FROM @t1;
Edit: I am not an expert on regular expression so I did not validate the expression itself.
Upvotes: 3