Reputation: 397
After hours of searching for what should be a simple task, I need help.
What I want to do: ensure that all strings are padded to the same length of 26 characters.
Dataset:
library(stringr)
names <-
structure(list(
names = c(
"A",
"ABC",
"ABCDEFG",
"ABCDEFGHIJKLMNOP",
"AB",
"ABCDEFGHI",
"ABCDEFGHIJKLMNOPQRSTUVWXYZ",
"ABCDEFGHIJKL",
"ABCDEFGHIJKLMNOPQR",
"ABCDEFGHIJKLMNOP",
"ABCDEFGHIJKLMNO"
)
),
class = "data.frame",
row.names = c(NA,-11L))
Step 1: Find max character length and the number of spaces to pad:
max <- as.numeric(max(nchar(names$names)))
max
n <- as.numeric(nchar(names$names))
n
pad <- max - n
pad
#add columns to the dataset to check how many characters are to be padded for each name
names$max <- as.numeric(max(nchar(names$names)))
names$n <- as.numeric(nchar(names$names))
names$pad <- as.numeric(max - n)
Step 2: Pad
names$names <-
str_pad(names$names,
pad,
side = "right",
pad = "0")
But this approach doesn't appear to be working for me. Can someone point me in the right direction? I am getting different length strings:
names max n pad
1 A000000000000000000000000 26 1 25
2 ABC00000000000000000000 26 3 23
3 ABCDEFG000000000000 26 7 19
4 ABCDEFGHIJKLMNOP 26 16 10
5 AB0000000000000000000000 26 2 24
6 ABCDEFGHI00000000 26 9 17
7 ABCDEFGHIJKLMNOPQRSTUVWXYZ 26 26 0
8 ABCDEFGHIJKL00 26 12 14
9 ABCDEFGHIJKLMNOPQR 26 18 8
10 ABCDEFGHIJKLMNOP 26 16 10
11 ABCDEFGHIJKLMNO 26 15 11
Help would be greatly appreciated.
Upvotes: 3
Views: 1928
Reputation: 9865
Using rep and paste(..., collapse="") (similar to Python's join for a vector of strings), together with Vectorize() and closing over pad (meaning the inner function simply picks up pad from the enclosing function's argument list), one can quickly create a pad-string generator, reps.
Using paste0, one can then join the character vectors element-wise.
pad_strings <- function(char_vec, max_len=NULL, pad="0") {
reps <- Vectorize(function(n) paste(rep(pad, n), collapse=""))
lengths <- nchar(char_vec)
if (is.null(max_len)) max_len <- max(lengths)
diffs <- max_len - lengths
paste0(char_vec, reps(diffs))
}
> pad_strings(names$names)
[1] "A0000000000000000000000000" "ABC00000000000000000000000"
[3] "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
[5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000"
[7] "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
[9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000"
[11] "ABCDEFGHIJKLMNO00000000000"
If no argument is given for max_len, the strings are padded to the length of the longest string. Otherwise they are padded to max_len.
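As a possible simplification of the same idea, base R's strrep() is already vectorised over its times argument, so the Vectorize() wrapper may not be needed. The following is only a sketch; pad_right is a hypothetical helper name, not a function from any package.

```r
# Base-R sketch: pad every string on the right to a common width.
# pad_right is a hypothetical helper name used for illustration only.
pad_right <- function(x, width = max(nchar(x)), pad = "0") {
  # Number of pad characters each element still needs (never negative).
  n_missing <- pmax(0, width - nchar(x))
  # strrep() repeats `pad` a different number of times for each element.
  paste0(x, strrep(pad, n_missing))
}

x <- c("A", "ABC", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
padded <- pad_right(x)
nchar(padded)  # 26 26 26
```

The pmax(0, ...) guard means strings that are already longer than the requested width are returned unchanged instead of raising an error.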
Upvotes: 0
Reputation: 1624
I think you want the format function. You set the width and then justify left, right or center:
format(names, width = 26, justify = "left")
# Name
# 1 A
# 2 ABC
# 3 ABCDEFG
# 4 ABCDEFGHIJKLMNOP
# 5 AB
# 6 ABCDEFGHI
# 7 ABCDEFGHIJKLMNOPQRSTUVWXYZ
# 8 ABCDEFGHIJKL
# 9 ABCDEFGHIJKLMNOPQR
# 10 ABCDEFGHIJKLMNOP
# 11 ABCDEFGHIJKLMNO
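Note that format pads with spaces, which are easy to miss in printed output, rather than with "0" as in the question. A quick way to check the result is to apply it to a character vector and inspect nchar(); the vector x below is a shortened stand-in for the question's data.

```r
# Shortened stand-in for the question's data.
x <- c("A", "ABC", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# Left-justify and pad with trailing spaces up to a width of 26.
padded <- format(x, width = 26, justify = "left")

nchar(padded)             # 26 26 26
endsWith(padded[1], " ")  # TRUE: the padding character is a space
```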
Upvotes: 3
Reputation: 887058
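Here we just need to pass the maximum length as the width, because the width argument of str_pad is the total target width of the result, not the number of characters to add:
library(stringr)
mx <- max(nchar(names$names))
names$names <- str_pad(names$names, mx, side = "right", pad = "0")
names$names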
-output
#[1] "A0000000000000000000000000" "ABC00000000000000000000000" "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
#[5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
#[9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000" "ABCDEFGHIJKLMNO00000000000"
NOTE: It is better not to name objects after functions or function arguments (here, names, max and pad).
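To illustrate why the attempt in the question produced strings of different lengths, here is a small sketch, assuming stringr is installed. The variables short and long are made-up examples. When the number of missing characters is passed as the width, short strings end up too short and long strings are returned unchanged.

```r
library(stringr)

short <- "AB"                # 2 characters
long  <- "ABCDEFGHIJKLMNOP"  # 16 characters

# As in the question: the number of missing characters is passed as width.
wrong_short <- str_pad(short, 26 - nchar(short), side = "right", pad = "0")
wrong_long  <- str_pad(long, 26 - nchar(long), side = "right", pad = "0")
nchar(wrong_short)  # 24: padded to a total width of 24, not 26
nchar(wrong_long)   # 16: width 10 is below the current length, so nothing is added

# Passing the target width itself gives equal lengths.
right <- str_pad(c(short, long), 26, side = "right", pad = "0")
nchar(right)        # 26 26
```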
Upvotes: 4