Sarah Decker

Reputation: 113

Data separation

I have a text file that I want to read into R. It looks like this:

NAMEOFTHESTUDENT1 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3
NAMEOFTHESTUDENT2 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3

I want to turn it into a data frame with one row per student, class and note:

NAMEOFTHESTUDENT1 CLASS1 NOTE1
NAMEOFTHESTUDENT1 CLASS1 NOTE2
NAMEOFTHESTUDENT1 CLASS1 NOTE3

NAMEOFTHESTUDENT1 CLASS2 NOTE1
NAMEOFTHESTUDENT1 CLASS2 NOTE2
NAMEOFTHESTUDENT1 CLASS2 NOTE3

NAMEOFTHESTUDENT1 CLASS3 NOTE1
NAMEOFTHESTUDENT1 CLASS3 NOTE2
NAMEOFTHESTUDENT1 CLASS3 NOTE3

NAMEOFTHESTUDENT2 CLASS1 NOTE1
NAMEOFTHESTUDENT2 CLASS1 NOTE2
NAMEOFTHESTUDENT2 CLASS1 NOTE3

NAMEOFTHESTUDENT2 CLASS2 NOTE1
NAMEOFTHESTUDENT2 CLASS2 NOTE2
NAMEOFTHESTUDENT2 CLASS2 NOTE3

NAMEOFTHESTUDENT2 CLASS3 NOTE1
NAMEOFTHESTUDENT2 CLASS3 NOTE2
NAMEOFTHESTUDENT2 CLASS3 NOTE3

Can someone help me do this? I tried a 'for' loop, but the NAMEOFTHESTUDENT value ends up shifted out of line with its classes and notes every time.

Upvotes: 1

Views: 112

Answers (1)

Jaap

Reputation: 83215

Another option is a nested cSplit approach from the splitstackshape package, splitting on one delimiter at a time:

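library(splitstackshape)
# Read from the inside out: split on ';', then on '-', then on ','
cSplit(
  cSplit(
    cSplit(
      dat, 'V2', sep = ';', direction = 'long'   # one row per CLASS block
    ), 
    'V2', sep = '-', direction = 'wide'          # V2_1 = class, V2_2 = notes
  ), 
  'V2_2', sep = ',', direction = 'long'          # one row per note
)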

which gives:

                   V1   V2_1  V2_2
 1: NAMEOFTHESTUDENT1 CLASS1 NOTE1
 2: NAMEOFTHESTUDENT1 CLASS1 NOTE2
 3: NAMEOFTHESTUDENT1 CLASS1 NOTE3
 4: NAMEOFTHESTUDENT1 CLASS2 NOTE1
 5: NAMEOFTHESTUDENT1 CLASS2 NOTE2
 6: NAMEOFTHESTUDENT1 CLASS2 NOTE3
 7: NAMEOFTHESTUDENT1 CLASS3 NOTE1
 8: NAMEOFTHESTUDENT1 CLASS3 NOTE2
 9: NAMEOFTHESTUDENT1 CLASS3 NOTE3
10: NAMEOFTHESTUDENT2 CLASS1 NOTE1
11: NAMEOFTHESTUDENT2 CLASS1 NOTE2
12: NAMEOFTHESTUDENT2 CLASS1 NOTE3
13: NAMEOFTHESTUDENT2 CLASS2 NOTE1
14: NAMEOFTHESTUDENT2 CLASS2 NOTE2
15: NAMEOFTHESTUDENT2 CLASS2 NOTE3
16: NAMEOFTHESTUDENT2 CLASS3 NOTE1
17: NAMEOFTHESTUDENT2 CLASS3 NOTE2
18: NAMEOFTHESTUDENT2 CLASS3 NOTE3

Used data:

dat <- read.table(text = "NAMEOFTHESTUDENT1 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3
NAMEOFTHESTUDENT2 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3", header = FALSE, as.is = TRUE)
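
If you would rather not add a package, the same three splits can be sketched in base R with strsplit. This is only an illustration under the assumption that every line follows the format shown in the question. The helper name expand_row and the column names student, class and note are made up for this example and are not part of the original answer.

```r
# Same sample data as above: V1 holds the student, V2 the classes and notes
dat <- read.table(text = "NAMEOFTHESTUDENT1 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3
NAMEOFTHESTUDENT2 CLASS1-NOTE1,NOTE2,NOTE3;CLASS2-NOTE1,NOTE2,NOTE3;CLASS3-NOTE1,NOTE2,NOTE3", header = FALSE, as.is = TRUE)

# Hypothetical helper: expand one input line into one row per (class, note) pair
expand_row <- function(student, spec) {
  blocks  <- strsplit(spec, ";", fixed = TRUE)[[1]]   # "CLASS1-NOTE1,NOTE2,NOTE3", ...
  parts   <- strsplit(blocks, "-", fixed = TRUE)      # list of c(class, notes string)
  classes <- vapply(parts, `[`, character(1), 1)
  notes   <- strsplit(vapply(parts, `[`, character(1), 2), ",", fixed = TRUE)
  data.frame(
    student = student,                        # recycled to every row
    class   = rep(classes, lengths(notes)),   # repeat each class once per note
    note    = unlist(notes),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every line and stack the pieces
result <- do.call(rbind, unname(Map(expand_row, dat$V1, dat$V2)))
rownames(result) <- NULL
result
```

Because the student name is attached inside the helper, it cannot drift out of line with its classes and notes, which is likely what went wrong in the loop version. For a real file, the text = argument would be replaced by the file path, for example read.table("students.txt", header = FALSE, as.is = TRUE), where students.txt is a placeholder name.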

Upvotes: 1
