Marika

Reputation: 1

How to extract data from the UN Comtrade API via a loop?

I am very new to RStudio and I am trying to write a loop to import data from UN Comtrade, since each call returns only a limited number of observations.

I use the following function call to get my data one year at a time (in this case 2006):

dt06 <- get.Comtrade(r="all", p="0", ps="2006", 
                    rg=2, cc="0,1,24,25,5,54,63,64,67,69,71,72,73,74,75,76,764", fmt="csv")
dt06df<- as.data.frame(do.call(rbind, dt06))

I want to do the same import for each of the years 2000-2020, which currently means repeating the code:

dt07 <- get.Comtrade(r="all", p="0", ps="2007", 
                    rg=2, cc="0,1,24,25,5,54,63,64,67,69,71,72,73,74,75,76,764", fmt="csv")
dt07df <- as.data.frame(do.call(rbind, dt07))

dt20 <- get.Comtrade(r="all", p="0", ps="2020", 
                     rg=2, cc="0,1,24,25,5,54,63,64,67,69,71,72,73,74,75,76,764", fmt="csv")
dt20df <- as.data.frame(do.call(rbind, dt20))

And then combine my data into one data frame:

data <- rbind(dt07df,dt06df, dt20df)

I have never written a loop in R before and was wondering whether this is possible, and whether a repeat or a for loop would be the more efficient choice.

Could someone help me get started?


Upvotes: 0

Views: 1018

Answers (1)

stefan

Reputation: 125143

You could achieve your desired result in either of two ways:

  1. The first approach uses a for loop. I first initialise an empty list dt and then add the data for each year to it. Note: get.Comtrade itself returns a list, in which the data is stored in an element called data. Instead of saving the whole list, I extract the data via [["data"]].

  2. The second approach uses lapply, which is a more elegant and perhaps more R-ish way to write the loop.

In either case you end up with a list of data frames, which you can then bind together using do.call(rbind, dt).

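library(rjson)  # provides fromJSON(), which get.Comtrade (defined below) uses for fmt = "json"

years <- 2006:2007

# via a for loop: fill a list with one data frame per year
dt <- list()
for (i in seq_along(years)) {
  dt[[i]] <- get.Comtrade(r="all", p="0", ps = years[i], 
               rg=2, cc="0,1,24,25,5,54,63,64,67,69,71,72,73,74,75,76,764", fmt="csv")[["data"]]
}
# stack the yearly data frames into a single data frame
dt_bind <- do.call(rbind, dt)

# via lapply: returns the same list of data frames without an explicit loop

dt <- lapply(years, function(x) {
  get.Comtrade(r="all", p="0", ps = x, 
               rg=2, cc="0,1,24,25,5,54,63,64,67,69,71,72,73,74,75,76,764", fmt="csv")[["data"]]
})
dt_bind <- do.call(rbind, dt)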

# Result
head(dt_bind, n = 5)
#>   Classification Year Period Period.Desc. Aggregate.Level Is.Leaf.Code
#> 1             H2 2006   2006         2006               2            0
#> 2             H2 2006   2006         2006               2            0
#> 3             H2 2006   2006         2006               2            0
#> 4             H2 2006   2006         2006               2            0
#> 5             H2 2006   2006         2006               2            0
#>   Trade.Flow.Code Trade.Flow Reporter.Code   Reporter Reporter.ISO Partner.Code
#> 1               2     Export             8    Albania          ALB            0
#> 2               2     Export            12    Algeria          DZA            0
#> 3               2     Export            20    Andorra          AND            0
#> 4               2     Export            31 Azerbaijan          AZE            0
#> 5               2     Export            32  Argentina          ARG            0
#>   Partner Partner.ISO X2nd.Partner.Code X2nd.Partner X2nd.Partner.ISO
#> 1   World         WLD                NA           NA               NA
#> 2   World         WLD                NA           NA               NA
#> 3   World         WLD                NA           NA               NA
#> 4   World         WLD                NA           NA               NA
#> 5   World         WLD                NA           NA               NA
#>   Customs.Proc..Code Customs Mode.of.Transport.Code Mode.of.Transport
#> 1                 NA      NA                     NA                NA
#> 2                 NA      NA                     NA                NA
#> 3                 NA      NA                     NA                NA
#> 4                 NA      NA                     NA                NA
#> 5                 NA      NA                     NA                NA
#>   Commodity.Code                                    Commodity Qty.Unit.Code
#> 1             24 Tobacco and manufactured tobacco substitutes             1
#> 2             24 Tobacco and manufactured tobacco substitutes             1
#> 3             24 Tobacco and manufactured tobacco substitutes             1
#> 4             24 Tobacco and manufactured tobacco substitutes             1
#> 5             24 Tobacco and manufactured tobacco substitutes             1
#>      Qty.Unit Qty Alt.Qty.Unit.Code Alt.Qty.Unit Alt.Qty Netweight..kg.
#> 1 No Quantity  NA                NA           NA      NA             NA
#> 2 No Quantity  NA                NA           NA      NA             NA
#> 3 No Quantity  NA                NA           NA      NA             NA
#> 4 No Quantity  NA                NA           NA      NA             NA
#> 5 No Quantity  NA                NA           NA      NA             NA
#>   Gross.weight..kg. Trade.Value..US.. CIF.Trade.Value..US..
#> 1                NA           4166452                    NA
#> 2                NA            214243                    NA
#> 3                NA            261794                    NA
#> 4                NA          19868803                    NA
#> 5                NA         253420990                    NA
#>   FOB.Trade.Value..US.. Flag
#> 1                    NA    0
#> 2                    NA    0
#> 3                    NA    0
#> 4                    NA    0
#> 5                    NA    0
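
Since each year is a separate request, one failed or throttled call would otherwise abort the whole import. As a minimal sketch (not part of the Comtrade code), the lapply approach could be wrapped in a hypothetical helper, here called fetch_years, that pauses between calls and skips years whose request fails. The helper name, its arguments and the one-second default pause are assumptions for illustration, not documented API limits. A stand-in function simulates the download so the example runs offline.

```r
# Hypothetical helper: calls fetch_one() once per year, waits `pause`
# seconds after each call, and drops years whose request raised an error.
fetch_years <- function(years, fetch_one, pause = 1) {
  pieces <- lapply(years, function(y) {
    res <- tryCatch(
      fetch_one(y),
      error = function(e) {
        warning("Year ", y, " failed: ", conditionMessage(e), call. = FALSE)
        NULL  # marks this year as missing
      }
    )
    Sys.sleep(pause)
    res
  })
  # remove failed years, then stack the remaining data frames
  pieces <- Filter(Negate(is.null), pieces)
  do.call(rbind, pieces)
}

# Offline stand-in for a download; the 2007 request is made to fail.
fake_fetch <- function(y) {
  if (y == 2007) stop("simulated failure")
  data.frame(Year = y, Value = y * 10)
}

res <- suppressWarnings(fetch_years(2006:2008, fake_fetch, pause = 0))
```

With the real function, fetch_one would be something like function(y) get.Comtrade(r="all", p="0", ps = y, rg=2, cc="0,1,24", fmt="csv")[["data"]], and years would be 2000:2020.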

get.Comtrade

Source: Comtrade

get.Comtrade <- function(url="http://comtrade.un.org/api/get?"
                         ,maxrec=50000
                         ,type="C"
                         ,freq="A"
                         ,px="HS"
                         ,ps="now"
                         ,r
                         ,p
                         ,rg="all"
                         ,cc="TOTAL"
                         ,fmt="json"
)
{
  string<- paste(url
                 ,"max=",maxrec,"&" #maximum no. of records returned
                 ,"type=",type,"&" #type of trade (c=commodities)
                 ,"freq=",freq,"&" #frequency
                 ,"px=",px,"&" #classification
                 ,"ps=",ps,"&" #time period
                 ,"r=",r,"&" #reporting area
                 ,"p=",p,"&" #partner country
                 ,"rg=",rg,"&" #trade flow
                 ,"cc=",cc,"&" #classification code
                 ,"fmt=",fmt        #Format
                 ,sep = ""
  )
  
  if(fmt == "csv") {
    raw.data<- read.csv(string,header=TRUE)
    return(list(validation=NULL, data=raw.data))
  } else {
    if(fmt == "json" ) {
      raw.data<- fromJSON(file=string)
      data<- raw.data$dataset
      validation<- unlist(raw.data$validation, recursive=TRUE)
      ndata<- NULL
      if(length(data)> 0) {
        var.names<- names(data[[1]])
        data<- as.data.frame(t( sapply(data,rbind)))
        ndata<- NULL
        for(i in 1:ncol(data)){
          data[sapply(data[,i],is.null),i]<- NA
          ndata<- cbind(ndata, unlist(data[,i]))
        }
        ndata<- as.data.frame(ndata)
        colnames(ndata)<- var.names
      }
      return(list(validation=validation,data =ndata))
    }
  }
}

Upvotes: 1
