Keelin

Reputation: 397

R: padding strings to the same length

After hours of searching for what should be a simple solution, I need help.

What I want to do: ensure that all strings are padded to the same length of 26 characters.

Dataset:

  library(stringr)

  names <-
  structure(list(
    names = c(
      "A",
      "ABC",
      "ABCDEFG",
      "ABCDEFGHIJKLMNOP",
      "AB",
      "ABCDEFGHI",
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
      "ABCDEFGHIJKL",
      "ABCDEFGHIJKLMNOPQR",
      "ABCDEFGHIJKLMNOP",
      "ABCDEFGHIJKLMNO"
    )
  ),
  class = "data.frame",
  row.names = c(NA,-11L))

Step 1: Find max character length and the number of spaces to pad:

max <- as.numeric(max(nchar(names$names)))
max

n <- as.numeric(nchar(names$names))
n

pad <- max - n
pad


#add columns to the dataset to check how many characters are to be padded for each name

names$max <- as.numeric(max(nchar(names$names)))
names$n <- as.numeric(nchar(names$names))
names$pad <- as.numeric(max - n)

Step 2: Pad

  names$names <-
  str_pad(names$names,
          pad,
          side = "right",
          pad = "0")

But this approach doesn't appear to be working for me. Can someone point me in the right direction? I am getting strings of different lengths:

                        names max  n pad
1   A000000000000000000000000  26  1  25
2     ABC00000000000000000000  26  3  23
3         ABCDEFG000000000000  26  7  19
4            ABCDEFGHIJKLMNOP  26 16  10
5    AB0000000000000000000000  26  2  24
6           ABCDEFGHI00000000  26  9  17
7  ABCDEFGHIJKLMNOPQRSTUVWXYZ  26 26   0
8              ABCDEFGHIJKL00  26 12  14
9          ABCDEFGHIJKLMNOPQR  26 18   8
10           ABCDEFGHIJKLMNOP  26 16  10
11            ABCDEFGHIJKLMNO  26 15  11

Help would be greatly appreciated.

Upvotes: 3

Views: 1928

Answers (3)

Gwang-Jin Kim

Reputation: 9865

Using rep and paste(..., collapse="") (similar to Python's join for a vector of strings), Vectorize(), and a closure over pad (that is, simply picking up pad from the enclosing function's argument list), one can quickly create a pad-string generator, reps. paste0 then joins the character vectors element-wise.

pad_strings <- function(char_vec, max_len=NULL, pad="0") {
  reps <- Vectorize(function(n) paste(rep(pad, n), collapse=""))
  lengths <- nchar(char_vec)
  if (is.null(max_len)) max_len <- max(lengths)
  diffs <- max_len - lengths
  paste0(char_vec, reps(diffs))
}

> pad_strings(names$names)
 [1] "A0000000000000000000000000" "ABC00000000000000000000000"
 [3] "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
 [5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000"
 [7] "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
 [9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000"
[11] "ABCDEFGHIJKLMNO00000000000"

If no argument is given for max_len, the strings are padded to the length of the longest string. Otherwise each string is padded up to max_len.
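As a possible simplification, base R's strrep() (available since R 3.3.0) is already vectorized over its times argument, so the Vectorize() wrapper is not strictly required. The following is a minimal sketch rather than a drop-in replacement; the helper name pad_right is hypothetical, and strings already at or beyond the target width are returned unchanged.

```r
# Hypothetical helper: right-pad each string to a common width using base R only.
pad_right <- function(x, width = max(nchar(x)), pad = "0") {
  # Number of pad characters each element still needs (never negative).
  n_missing <- pmax(width - nchar(x), 0)
  # strrep() is vectorized over `times`, so no Vectorize() wrapper is needed.
  paste0(x, strrep(pad, n_missing))
}

padded <- pad_right(c("A", "ABC", "ABCDEFG"), width = 10)
padded
nchar(padded)  # every element should have the same number of characters
```

Leaving width at its default pads to the longest string in the input, mirroring the max_len = NULL behaviour above.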

Upvotes: 0

David J. Bosak

Reputation: 1624

I think you want the format function. You set the width and then justify left, right or center:


format(names, width = 26, justify = "left")

# Name
# 1  A                         
# 2  ABC                       
# 3  ABCDEFG                   
# 4  ABCDEFGHIJKLMNOP          
# 5  AB                        
# 6  ABCDEFGHI                 
# 7  ABCDEFGHIJKLMNOPQRSTUVWXYZ
# 8  ABCDEFGHIJKL              
# 9  ABCDEFGHIJKLMNOPQR        
# 10 ABCDEFGHIJKLMNOP          
# 11 ABCDEFGHIJKLMNO           
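For a quick check outside the data frame, the same idea can be applied to a bare character vector. This is a small sketch, assuming that padding with spaces is acceptable (format() has no argument for choosing a different pad character); sprintf() with a "%-26s" specifier is shown as a base R alternative. The sample vector x is illustrative only.

```r
x <- c("A", "ABC", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# format() pads character vectors with trailing spaces when justify = "left".
padded_format <- format(x, width = 26, justify = "left")

# sprintf() alternative: "%-26s" left-justifies in a field of 26 characters.
padded_sprintf <- sprintf("%-26s", x)

nchar(padded_format)                      # width of each padded element
identical(padded_format, padded_sprintf)  # compare the two approaches
```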

Upvotes: 3

akrun

Reputation: 887058

Here we just need to pass the maximum length as the width:

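library(stringr)  # str_pad() comes from stringr, not dplyr
# the column is called `names` in the question's dataset
mx <- max(nchar(names$names))
# width is the target total length, not the number of characters to add
names$names <- str_pad(names$names, mx, side = "right", pad = "0")
names$names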

-output

#[1] "A0000000000000000000000000" "ABC00000000000000000000000" "ABCDEFG0000000000000000000" "ABCDEFGHIJKLMNOP0000000000"
#[5] "AB000000000000000000000000" "ABCDEFGHI00000000000000000" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" "ABCDEFGHIJKL00000000000000"
#[9] "ABCDEFGHIJKLMNOPQR00000000" "ABCDEFGHIJKLMNOP0000000000" "ABCDEFGHIJKLMNO00000000000"

NOTE: It is better not to give objects names that are also function names or argument names (here names, max and pad), because this masks the originals and makes the code harder to read.
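The ragged output in the question appears to come from the meaning of the second argument of str_pad(): it is the target total width, not the number of characters to add, and strings already longer than that width are returned unchanged. The short sketch below reproduces two rows of the question's output under that reading; the variable name padded is illustrative only.

```r
library(stringr)

# Passing the number of pad characters (14) as width only pads up to 14 in total.
str_pad("ABCDEFGHIJKL", width = 14, side = "right", pad = "0")

# A width smaller than the string's length leaves the string untouched.
str_pad("ABCDEFGHIJKLMNOP", width = 10, side = "right", pad = "0")

# Passing the target length (26) pads every string to the same width.
padded <- str_pad(c("A", "ABCDEFGHIJKL"), width = 26, side = "right", pad = "0")
nchar(padded)
```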

Upvotes: 4
