user2300940

Reputation: 2385

Replace colnames to substring of colname

How can I replace the colnames of my data frame with the unique substring of each original colname?

> colnames(df.iso)
 [1] "../trimmed/100G.tally.fasta" "../trimmed/100R.tally.fasta" "../trimmed/106G.tally.fasta"
 [4] "../trimmed/106R.tally.fasta" "../trimmed/122G.tally.fasta" "../trimmed/122R.tally.fasta"
 [7] "../trimmed/124G.tally.fasta" "../trimmed/124R.tally.fasta" "../trimmed/126G.tally.fasta"
[10] "../trimmed/126R.tally.fasta" "../trimmed/134G.tally.fasta" "../trimmed/134R.tally.fasta"

Upvotes: 3

Views: 976

Answers (2)

akrun

Reputation: 887028

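We can use sub together with basename (see ?basename) to extract the substring from the column names: basename strips the directory part of each path, and sub removes everything from the first dot onward. Assign the result back to the column names to apply the change.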

colnames(df.iso) <- sub("\\..*", '', basename(colnames(df.iso)))
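To make the two steps visible, here is a minimal sketch that runs them separately on a character vector. The vector cols is a hypothetical stand-in holding two of the names from the question; with the real data you would work on colnames(df.iso) directly.

```r
# Hypothetical stand-in for colnames(df.iso), using two names from the question
cols <- c("../trimmed/100G.tally.fasta", "../trimmed/100R.tally.fasta")

# Step 1: basename() drops the directory part, leaving e.g. "100G.tally.fasta"
base <- basename(cols)

# Step 2: sub() deletes the first literal dot and everything after it
ids <- sub("\\..*", "", base)

print(base)
print(ids)
```

The dot is escaped ("\\.") so that it matches a literal dot rather than any character.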

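If we don't want to use basename, sub can also do the whole job by itself: the pattern skips the two leading directory components and captures the text up to the first dot.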

colnames(df.iso) <- sub("([^/]+/){2}([^.]+).*", "\\2", colnames(df.iso))
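A short sketch of how that pattern behaves, under the assumption that every name has exactly two directory components ("../" and "trimmed/") in front of the file name. The vectors cols and deep and the alternative pattern any_depth are illustrative and not part of the original answer.

```r
# Hypothetical inputs: one name from the question, one with an extra directory level
cols <- "../trimmed/100G.tally.fasta"
deep <- "a/b/c/100G.tally.fasta"

# ([^/]+/){2}  consumes two "name/" components
# ([^.]+)      captures everything up to the first dot (group 2)
# .*           consumes the rest, so the whole string is replaced by group 2
fixed_depth <- "([^/]+/){2}([^.]+).*"
res_cols <- sub(fixed_depth, "\\2", cols)
res_deep <- sub(fixed_depth, "\\2", deep)   # a third level leaks into the result

# Illustrative alternative: greedy ".*/" runs to the last slash, whatever the depth
any_depth <- ".*/([^.]+).*"
alt_deep <- sub(any_depth, "\\1", deep)

print(c(res_cols, res_deep, alt_deep))
```

If the directory depth can vary, the depth-independent pattern or the basename approach is probably the safer choice.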

Upvotes: 2

lmo

Reputation: 38500

Similar to @akrun's second approach, this should also do the trick:

colnames(df.iso) <- sub("[^0-9]+([0-9]+[A-Z])\\.tal.*", "\\1", colnames(df.iso))

His first method is likely faster, though the difference probably won't matter here.
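As a quick sanity check, the sketch below rebuilds the twelve names shown in the question and confirms that the three patterns from this thread agree on them. The vector cols is reconstructed by hand here (an assumption); with the real data you would start from colnames(df.iso).

```r
# Rebuild the 12 column names shown in the question (hand-made stand-in)
nums <- c(100, 106, 122, 124, 126, 134)
cols <- paste0("../trimmed/", rep(nums, each = 2), c("G", "R"), ".tally.fasta")

# The three approaches from the answers
via_basename <- sub("\\..*", "", basename(cols))
via_groups   <- sub("([^/]+/){2}([^.]+).*", "\\2", cols)
via_digits   <- sub("[^0-9]+([0-9]+[A-Z])\\.tal.*", "\\1", cols)

# All three should agree, and the new names should be unique
stopifnot(identical(via_basename, via_groups),
          identical(via_groups, via_digits),
          anyDuplicated(via_basename) == 0)

print(via_basename)
```

Checking anyDuplicated before assigning is worthwhile because duplicated column names make later column selection ambiguous.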

Upvotes: 1
