Lucien S.

Reputation: 5345

How to deal with spaces in variables when using factor()?

I'm trying to melt and factor a CSV table which looks like this:

Name        L.i  A.g.a.o.p  E.NGO  S.g.a.o.p  A   I
L i         29   7          19     5          1   21
A g a o p   7    5          5      3          0   1
E NGO       19   5          15     3          0   10
S g a o p   5    3          3      19         5   18
A           1    0          0      5          0   3
I           21   1          10     18         3   12

With this code:

mylevels <- table$Name
table.m <- melt(table)
table.m$Name <- factor(table.m$Name,levels=mylevels)
table.m$variable <- factor(table.m$variable, levels=mylevels)

The last factoring produces this:

  Name      variable value
1 L i           <NA>    15
2 A g a o p     <NA>     3
3 E NGO         <NA>     6
4 S g a o p     <NA>   -11
5 A             <NA>    -4
6 I             <NA>    -2
7 L i           <NA>     3
8 A g a o p     <NA>     4
9 E NGO         <NA>     1
10 S g a o p    <NA>    -2
11 A            <NA>    -1
12 I            <NA>    -4
13 L i          <NA>     6
14 A g a o p    <NA>     1
15 E NGO        <NA>    10
16 S g a o p    <NA>    -8
17 A            <NA>    -4
18 I            <NA>   -10
19 L i          <NA>   -11
20 A g a o p    <NA>    -2
21 E NGO        <NA>    -8
22 S g a o p    <NA>    15
23 A            <NA>    -2
24 I            <NA>     6
25 L i          A       -4
26 A g a o p    A       -1
27 E NGO        A       -4
28 S g a o p    A       -2
29 A            A        0
30 I            A        0
31 L i          I       -2
32 A g a o p    I       -4
33 E NGO        I      -10
34 S g a o p    I        6
35 A            I        0
36 I            I        8

I'm guessing factor() didn't like the spaces in the names, since spaces in the column names are replaced with a dot (.). How should I deal with spaces in variable names in this kind of scenario?

Upvotes: 0

Views: 583

Answers (1)

Ben Bolker

Reputation: 226627

The values of table.m$variable are being assigned from the column names, which have had dots substituted for spaces (and other characters that are illegal in variable names). You can convert the dots back to spaces via

table.m$variable <- gsub("\\."," ",as.character(table.m$variable))

before your last line; then everything looks fine. Alternatively, you might try reading your data in with check.names=FALSE in your read.table/read.csv call ...
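
For illustration, here is a minimal base-R sketch of both options. It assumes the file is comma-separated and uses only the first three rows and columns of the table in the question; csv_text, mangled, kept and restored are made-up names. It only looks at the column names, since those are what melt() copies into variable.

```r
# Hypothetical inline copy of part of the question's table (assumed comma-separated).
csv_text <- "Name,L i,A g a o p,E NGO
L i,29,7,19
A g a o p,7,5,5
E NGO,19,5,15"

# Default behaviour: check.names = TRUE runs the headers through make.names(),
# so spaces become dots.
mangled <- read.csv(text = csv_text)
names(mangled)
# "Name" "L.i" "A.g.a.o.p" "E.NGO"

# Option 1: turn the dots back into spaces before calling factor().
restored <- gsub(".", " ", names(mangled)[-1], fixed = TRUE)

# Option 2: keep the headers exactly as written in the file.
kept <- read.csv(text = csv_text, check.names = FALSE)
names(kept)
# "Name" "L i" "A g a o p" "E NGO"

# Either way the names now match the levels taken from the Name column,
# so factor() no longer produces NA.
mylevels <- as.character(kept$Name)
anyNA(factor(restored, levels = mylevels))        # FALSE
anyNA(factor(names(kept)[-1], levels = mylevels)) # FALSE
```

Note that the gsub() route would also change any dot that was in a header to begin with, so check.names=FALSE is probably the safer choice if your real column names can contain dots.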

Upvotes: 2
