Bananach

Reputation: 2311

How to change values in unnamed first column

How do I change the entries of the first column in the matrix returned by read_csv if it doesn't have a header?

My variable currently looks like this:

                     PostFC       C1Mean
WBGene00001816 2.475268e-01   415.694457
WBGene00001817 4.808575e+00  2451.018711

and I'd like to rename WBGene0000XXXX to XXXX.

Upvotes: 0

Views: 594

Answers (3)

jay.sf

Reputation: 72919

The entries addressed are actually row names. We can access them with rownames(.).

rownames(df1)
# [1] "WBGene00001816" "WBGene00001817" "WBGene00001818" "WBGene00001819"
# [5] "WBGene00001820" "WBGene00001821" "WBGene00001822"

R also provides the replacement function rownames<-, so we can assign new row names with rownames(.) <- c(.).

In your case it looks like you want to keep just the last four digits. We can use substring here and tell it the position from which to start extracting. Here that is the 11th character through to the end, so we do:

rownames(df1) <- substring(rownames(df1), 11)
df1
#           PostFC     C1Mean
# 1816  0.36250598  2.1073145
# 1817  0.51068402  0.4186838
# 1818 -0.96837330 -0.7239156
# 1819  0.02331745 -0.5902216
# 1820 -0.56927945  1.7540356
# 1821 -0.51252943  0.1343385
# 1822  0.47263180  1.4366233

Note that duplicated row names are not allowed in a data frame, so this method will throw an error if the shortened names contain duplicates.
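To illustrate the duplicate problem, here is a minimal sketch that checks the shortened names before assigning them. The IDs and values are made up for the example (the second ID is hypothetical and chosen so that it collides with the first), and falling back to make.unique is just one possible way to resolve a clash.

```r
# Hypothetical IDs: the first two share the same last four characters
ids <- c("WBGene00001816", "WBGene00011816", "WBGene00001817")
df <- data.frame(PostFC = c(0.1, 0.2, 0.3), row.names = ids)

short <- substring(rownames(df), 11)
short
# "1816" "1816" "1817"

# anyDuplicated() returns 0 when all values are unique,
# otherwise the index of the first duplicate
if (anyDuplicated(short) == 0) {
  rownames(df) <- short
} else {
  # Assumption: appending a numeric suffix is acceptable here
  rownames(df) <- make.unique(short)
}
rownames(df)
# "1816" "1816.1" "1817"
```

Whether a suffix like .1 is acceptable depends on what the IDs are used for later; keeping the full original IDs may be the safer choice when collisions occur.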

Data used

df1 <- structure(list(PostFC = c(0.362505982864934, 0.510684020059692, 
-0.968373302351162, 0.0233174467410604, -0.56927945273647, -0.512529427359891, 
0.472631804850333), C1Mean = c(2.10731450148575, 0.418683823183885, 
-0.723915648073638, -0.590221641040516, 1.75403562218217, 0.134338480077884, 
1.43662329542089)), class = "data.frame", row.names = c("WBGene00001816", 
"WBGene00001817", "WBGene00001818", "WBGene00001819", "WBGene00001820", 
"WBGene00001821", "WBGene00001822"))

Upvotes: 1

If I understand your question correctly, the first "unnamed" column you describe consists of row names, which are not actually a column in your data.frame.

# Example data 
df = data.frame(PostFC = c(2.475268e-01, 4.808575e+00), C1Mean = c(415.694457, 2451.018711) )
rownames(df) = c("WBGene00001816", "WBGene00001817")
df
#                   PostFC    C1Mean
# WBGene00001816 0.2475268  415.6945
# WBGene00001817 4.8085750 2451.0187

# change rownames
rownames(df) = c("rowname1", "rowname2")
df
#             PostFC    C1Mean
# rowname1 0.2475268  415.6945
# rowname2 4.8085750 2451.0187

Upvotes: 2

Justin Landis

Reputation: 2071

If the first column is actually the row names, do the following:

rownames(data) <- gsub(pattern = "WBGene0000", replacement = "", x = rownames(data))

If the prefix isn't consistent, you may want to consider base R's substr function, or str_sub from the stringr package.

But if it is actually a column with no header, I do not know how to reference it without knowing the structure of the data.
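When the number of leading zeros varies, a regular expression is another option. The sketch below assumes every ID starts with "WBGene" followed by zero padding; the second ID is hypothetical and only there to show a longer number.

```r
# Hypothetical IDs with different amounts of zero padding
ids <- c("WBGene00001816", "WBGene00012345")

# "^" anchors the match at the start; "0*" also removes the leading zeros
short <- sub("^WBGene0*", "", ids)
short
# "1816" "12345"

# The same call applied to row names
df <- data.frame(PostFC = c(0.25, 4.81), row.names = ids)
rownames(df) <- sub("^WBGene0*", "", rownames(df))
```

Note that an ID consisting only of zeros after the prefix would be reduced to an empty string by this pattern.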

Run the str function on the data set and see what it returns. Or do the following as a test:

 colnames(data)[1] <- "test" 

I can't really help until we know how you ended up with a "zero-length" variable name.
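For the case where the IDs really are a column rather than row names, here is a rough sketch. It assumes the file is read with base R's read.csv and that the header has an empty first field; the inline CSV text and the name "ID" are made up for illustration. With those assumptions read.csv names the column "X", whereas readr's read_csv uses a different placeholder name, so check names(data) first.

```r
# Inline CSV standing in for the real file; note the empty first header field
csv_text <- ',PostFC,C1Mean
WBGene00001816,0.2475268,415.694457
WBGene00001817,4.808575,2451.018711'

data <- read.csv(text = csv_text, stringsAsFactors = FALSE)
names(data)
# "X" "PostFC" "C1Mean"

# Give the column a real name (hypothetical: "ID"), then strip the prefix
names(data)[1] <- "ID"
data$ID <- sub("^WBGene0000", "", data$ID)
data$ID
# "1816" "1817"
```

Referring to the column by position, as in names(data)[1], avoids depending on whatever placeholder name the reader function chose.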

Upvotes: 2
