Reputation: 17388
Looking at this and this code:
data Work.SomeDataset;
set Work.SomeDataset;
SomeColumn = compress(SomeColumn, byte(34));
SomeColumn = compress(SomeColumn, byte(38));
run;
Am I right in thinking that this removes the characters "
(byte(34)) and &
(byte(38)) from SomeColumn? If so, is there a more efficient way of achieving this?
Upvotes: 2
Views: 161
Reputation: 63424
Yes, that removes those ascii characters. You can do this in one COMPRESS call, by concatenating, and of course you could just pass '&"'
to the function but I assume you have a reason for not doing so.
You might also want to use COMPBL to remove the extra spaces that might be created by this process, if that is a problem.
data Work.want;
set Work.have;
someColumn = compress(someColumn,byte(34)||byte(38));
put SomeColumn=;
run;
You could also use UPDATE (SQL) or MODIFY (data step) to avoid rewriting the dataset...
proc sql;
update have
set someColumn = compress(someColumn,byte(34)||byte(38));
quit;
Perhaps more efficient, perhaps not; ultimately probably no different if you cannot use a WHERE statement to filter out cases that don't need updating, but if you can (if it's rare to have this for example) it might be more efficient.
Edit: Modify looks like it's not the answer. In a quick test with 1e7 rows, MODIFY is strictly slower in real time AND has far more CPU time.
SQL UPDATE also is pretty slow, so I wouldn't use either of those unless you have a very rare update scenario (but test in your situation).
Upvotes: 1
Reputation: 12465
You can put the characters into a single call.
data have;
format a $12.;
a = "A" || byte(34) || "B";
output;
a = "A" || byte(34) || "B" || byte(36) || "C";
output;
run;
data want;
set have;
b = compress(a,byte(34)||byte(36));
run;
Upvotes: 1
Reputation: 9569
This is already fairly close to optimal, but for a slight improvement I'd suggest:
data Work.SomeDataset;
set Work.SomeDataset;
SomeColumn = compress(SomeColumn,'&"');
run;
This makes it clearer which characters are being removed and removes both in the same pass.
Upvotes: 3