psysky

Getting data from an external csv file in R

This post is similar to my earlier post, "matching dataset with data in csv file in R", but here the external source and the structure of the external csv file are different, and there are three groups, so I have a problem.

I have a csv file which has only one column:

,"x"
1,"11202 3322 2018"
2,"11271 3322 2018"
3,"11353 2261 2018"
4,"11353 3322 2018"
5,"11353 3380 2018"
6,"11418 2247 2018"
7,"11418 2261 2018"
8,"11418 2316 2018"
9,"11418 3322 2018"
10,"11418 3740 2018"
11,"11511 979 2018"
12,"11514 196 2017"
13,"11514 377 2017"

The three groups are separated by spaces. That means:

group1,group2,group3
11202,  3322,  2018 

This format comes from an external source and I can't change it.

Here is my data:
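One way to turn that single column into three key columns, as a minimal base R sketch. The inline `csv_text` is a shortened, hypothetical stand-in for the external file, and the names `ext` and `keys` are chosen here for illustration only.

```r
# Hypothetical stand-in for the external file; with a real file this would be
# something like read.csv("external.csv", row.names = 1, stringsAsFactors = FALSE)
csv_text <- ',"x"
1,"11202 3322 2018"
2,"11271 3322 2018"
3,"11353 2261 2018"'

# The unnamed first column holds row numbers, so use it as row names
ext <- read.csv(text = csv_text, row.names = 1, stringsAsFactors = FALSE)

# Re-parse the strings in x as whitespace-separated fields
keys <- read.table(text = ext$x,
                   col.names = c("group1", "group2", "group3"))
str(keys)
```

Since `read.table` guesses column types, the three groups should come back as integers, which matches the types in `dataset`.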

dataset=structure(list(group1 = c(11202L, 11271L, 11353L, 11353L, 11353L, 
11418L, 11418L, 11418L, 11418L, 11222L, 11223L, 11224L, 11225L, 
11226L, 11227L, 11228L), group2 = c(3322L, 3322L, 2261L, 3322L, 
3380L, 2247L, 2261L, 2316L, 3322L, 222L, 222L, 222L, 222L, 222L, 
222L, 222L), group3 = c(2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 
2018L, 2018L, 2018L, 111L, 111L, 111L, 111L, 111L, 111L, 111L
), x1 = 1:16), .Names = c("group1", "group2", "group3", "x1"), class = "data.frame", row.names = c(NA, 
-16L))

These group combinations are present in the external csv file:

group1  group2  group3  x1
11202   3322    2018    1
11271   3322    2018    2
11353   2261    2018    3
11353   3322    2018    4
11353   3380    2018    5
11418   2247    2018    6
11418   2261    2018    7
11418   2316    2018    8
11418   3322    2018    9

I don't need to work with them. I must work only with the new groups, so the output dataset should be:

group1  group2  group3  x1
11222   222     111     10
11223   222     111     11
11224   222     111     12
11225   222     111     13
11226   222     111     14
11227   222     111     15
11228   222     111     16

How can I perform such a match? There are three key columns here.
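A minimal sketch of one possible approach: rebuild the same space-separated key from the three columns and keep the rows whose key does not occur in the external file. The data below is a shortened copy of the question's data, and `ext_keys`, `dataset_keys` and `result` are illustrative names. It assumes the external strings use exactly one space between values and have no padding.

```r
# Shortened copy of the question's dataset (assumed here for brevity)
dataset <- data.frame(
  group1 = c(11202L, 11271L, 11353L, 11222L, 11223L),
  group2 = c(3322L, 3322L, 2261L, 222L, 222L),
  group3 = c(2018L, 2018L, 2018L, 111L, 111L),
  x1 = 1:5
)

# Keys as they appear in the single column of the external file
ext_keys <- c("11202 3322 2018", "11271 3322 2018",
              "11353 2261 2018", "11511 979 2018")

# Build the same "group1 group2 group3" string for every row of dataset
dataset_keys <- paste(dataset$group1, dataset$group2, dataset$group3)

# Keep only the rows whose key is absent from the external file
result <- dataset[!dataset_keys %in% ext_keys, ]
print(result)
```

Comparing pasted strings avoids splitting the external column at all. If the external values might carry extra whitespace, normalising them first (for example with `trimws`) would be needed before the comparison.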

edit

dim(dataset) [1] 16 4

Answers (1)

rahul

I am assuming that you have two columns, and that the first column is just a sequence (you confirmed this in your reply under the question, so I take the assumption to be correct). If you have only one column, do the same operation shown below using "," as the pattern and then discard the first column of the resulting data frame.

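library(stringr)  # provides str_split

data <- data.frame(col = c("1 2 3", "5 6 7"))
#     col
# 1 1 2 3
# 2 5 6 7

# split each string on spaces and stack the pieces into a character matrix
out <- do.call('rbind', str_split(data$col, pattern = " "))
colnames(out) <- c('group1', 'group2', 'group3')

print(out)
#      group1 group2 group3
# [1,] "1"    "2"    "3"
# [2,] "5"    "6"    "7"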
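Note that the result above is a character matrix, while the group columns in the question's `dataset` are integers. A minimal sketch of converting it to a data frame with integer columns follows. It swaps in base R `strsplit` for `str_split` so that it runs without extra packages, and `out_df` is an illustrative name.

```r
data <- data.frame(col = c("1 2 3", "5 6 7"), stringsAsFactors = FALSE)

# Base R strsplit is used here in place of stringr::str_split
out <- do.call(rbind, strsplit(data$col, " ", fixed = TRUE))
colnames(out) <- c("group1", "group2", "group3")

# Convert the character matrix to a data frame, then each column to integer
out_df <- as.data.frame(out, stringsAsFactors = FALSE)
out_df[] <- lapply(out_df, as.integer)
str(out_df)
```

With integer columns, the split keys can be compared against `group1`, `group2` and `group3` of the original data without type mismatches.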
