Joseph Gagnon
Joseph Gagnon

Reputation: 2115

Convert JSON array of arrays to map of objects (Java)

I have data that was pulled from an Excel file into a JSON structure using Java, Apache POI and Jackson. The resulting JSON data structure looks like the following:

{
  "fileName" : "C:\\Users\\jgagnon\\sample_data\\PDM_BOM.xlsx",
  "sheets" : [ {
    "name" : "PDM_BOM",
    "data" : [ [ "BRANCH", "PARENT ITEM NUMBER", "2ND ITEM NUMBER", "QUANTITY REQUIRED", "UNIT OF MEASURE", "ISSUE TYPE CODE", "LINE TYPE", "STOCK TYPE", "TYPE BOM", "LINE NUMBER", "OPERATING SEQUENCE", "EFFECTIVE FROM DATE", "EFFECTIVE THRU DATE", "DRAWING NUMBER", "UNIT COST", "SCRAP PERCENT" ],
        [ "B20", "208E8840040", "5P884LPFSR2", 0.32031, "LB", "I", "S", "M", "M", 1.0, 10.0, "09/11/13", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840168", "5P884LPFSR2", 1.36, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840172", "5P884LPFSR2", 1.3924, "LB", "I", "S", "M", "M", 1.0, 10.0, "02/26/08", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "208E8840180", "5P884LPFSR2", 1.4565, "LB", "I", "S", "P", "M", 1.0, 10.0, "03/04/09", "12/31/40", null, 0.0, 0.0 ],
        [ "B20", "21PPH150166", "8P315TPMRG", 1.39629, "LB", "I", "S", "M", "M", 1.0, 10.0, "03/05/18", "12/31/40", null, 0.0, 0.0 ] ],
    "maxCols" : 16,
    "maxRows" : 14996
  } ]
}

In the data element, there is essentially an array of arrays, representing all of the rows from a worksheet. The first row array is the column headers for the data rows that follow.

I'd like to be able to restructure the data so that it's in the form of a map of objects, where each key is (for this example) the value in the PARENT ITEM NUMBER column. The map value associated with the key would be a JSON object consisting of the key/value pairs of column header and column value for the current data row.

So, using the example above, I'd end up with something like this: (I may have the JSON syntax/structuring incorrect)

{
  "data": {
    "208E8840040": {
      "BRANCH": "B20",
      "PARENT ITEM NUMBER": "208E8840040",
      "2ND ITEM NUMBER": "5P884LPFSR2",
      "QUANTITY REQUIRED": 0.32031,
      "UNIT OF MEASURE": "LB",
      "ISSUE TYPE CODE": "I",
      "LINE TYPE": "S",
      "STOCK TYPE": "M",
      "TYPE BOM": "M", 
      "LINE NUMBER": 1.0, 
      "OPERATING SEQUENCE": 10.0, 
      "EFFECTIVE FROM DATE": "09/11/13", 
      "EFFECTIVE THRU DATE": "12/31/40", 
      "DRAWING NUMBER": null, 
      "UNIT COST": 0.0, 
      "SCRAP PERCENT": 0.0
    },
    "208E8840168": {
      "BRANCH": "B20",
      "PARENT ITEM NUMBER": "208E8840168",
      "2ND ITEM NUMBER": "5P884LPFSR2",
      "QUANTITY REQUIRED": 1.36,
      ...
    },
    ...
  }
}

I'm looking for a way to convert the former into the latter.

UPDATE:

I just realized I left out an important detail.

The data in this sheet (table), is essentially keyed by the PARENT ITEM NUMBER column. However, although that column is the primary identifier, it is not always unique in the table.

There are often cases, where there are multiple rows with the same value for PARENT ITEM NUMBER. Each of these rows contains information about elements that make up the "parent" (think of them as child items). These child items are identified by the 2ND ITEM NUMBER column.

Moreover, many child items will have their own row in the table where the PARENT ITEM NUMBER is the child item number identified by 2ND ITEM NUMBER for the parent. These child items, as you may have guessed, can have their own child items, and so on.

Basically, this is a tabular representation of multiple hierarchies of related data. Some child items will appear (are re-used) for multiple parent items.

I don't know in what ways this complicates what I'm trying to do.

UPDATE:

Thanks to https://stackoverflow.com/users/51591/micha%c5%82-ziober for the initial idea. I adapted it slightly to yield a map of lists of child items. The modified code follows:

  public String convertToJson(File jsonFile) throws IOException {
    ArrayNode arrayNode = readDataArray(jsonFile);

    List<Map<String, JsonNode>> rowMaps = convertArrayToMaps(arrayNode);

    Map<Object, List<Map<String, JsonNode>>> dataMap = rowMaps.stream()
        .collect(Collectors.groupingBy(map -> map.get("PARENT ITEM NUMBER").textValue()));
    return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
  }

Here's an example of the output:

{
  "data" : {
    "MTDMN97PJ1A9" : [ {    <- 1 child
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "MTDMN97PJ1A9",
      "2ND ITEM NUMBER" : "MTDMN970144XO",
      "QUANTITY REQUIRED" : 12.0,
      "UNIT OF MEASURE" : "SY",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "01/18/19",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    } ],
    "ZCP723A1152" : [ {     <- 4 children
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "5P587UMFSD2",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 3.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8P550ZPPOOLE",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8P906WPPA3077",
      "QUANTITY REQUIRED" : 4.64444,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 2.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "05/15/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }, {
      "BRANCH" : "B70",
      "PARENT ITEM NUMBER" : "ZCP723A1152",
      "2ND ITEM NUMBER" : "8U910LKSHBL",
      "QUANTITY REQUIRED" : 2.32222,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 4.01,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "12/13/17",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    } ],
    ... many more entries
}

Upvotes: 1

Views: 1653

Answers (1)

Michał Ziober
Michał Ziober

Reputation: 38625

You can read whole JSON payload as JsonNode and retrieve data property which is an array. If we can assume that always first element is array of names and all other elements are equal in size you can create Map manually and serialise it back to JSON. One tricky part is how to generate unique key value for the result JSON. Example code could look like below:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.node.ArrayNode;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Excel2JonsApp {
    public static void main(String[] args) throws IOException {
        File jsonFile = new File("./resource/test.json").getAbsoluteFile();

        Excel2JsonMapper mapper = new Excel2JsonMapper();
        String json = mapper.convertToJson(jsonFile);
        System.out.println(json);
    }
}

class Excel2JsonMapper {
    private final ObjectMapper jsonMapper;

    Excel2JsonMapper() {
        this.jsonMapper = createJsonMapper();
    }

    private ObjectMapper createJsonMapper() {
        ObjectMapper mapper = new ObjectMapper();
        mapper.enable(SerializationFeature.INDENT_OUTPUT);

        return mapper;
    }

    String convertToJson(File jsonFile) throws IOException {
        ArrayNode arrayNode = readDataArray(jsonFile);
        Map<Object, Map<String, JsonNode>> dataMap = convertArrayToMaps(arrayNode)
                .stream()
                .collect(Collectors.toMap(
                        map -> map.get("PARENT ITEM NUMBER").textValue(), //key generator function
                        Function.identity()));

        return jsonMapper.writeValueAsString(Collections.singletonMap("data", dataMap));
    }

    private List<Map<String, JsonNode>> convertArrayToMaps(ArrayNode arrayNode) {
        ArrayNode names = (ArrayNode) arrayNode.get(0);

        Iterator<JsonNode> iterator = arrayNode.iterator();
        iterator.next();// skip names

        List<Map<String, JsonNode>> list = new ArrayList<>();
        while (iterator.hasNext()) {
            ArrayNode values = (ArrayNode) iterator.next();
            Map<String, JsonNode> map = new LinkedHashMap<>();
            for (int i = 0; i < names.size(); i++) {
                map.put(names.get(i).textValue(), values.get(i));
            }
            list.add(map);
        }
        return list;
    }

    private ArrayNode readDataArray(File jsonFile) throws IOException {
        JsonNode root = jsonMapper.readTree(jsonFile);
        return (ArrayNode) root.at("/sheets/0/data");
    }
}

Above code prints:

{
  "data" : {
    "208E8840168" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840168",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.36,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "02/26/08",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "21PPH150166" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "21PPH150166",
      "2ND ITEM NUMBER" : "8P315TPMRG",
      "QUANTITY REQUIRED" : 1.39629,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "03/05/18",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840180" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840180",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.4565,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "P",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "03/04/09",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840172" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840172",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 1.3924,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "02/26/08",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    },
    "208E8840040" : {
      "BRANCH" : "B20",
      "PARENT ITEM NUMBER" : "208E8840040",
      "2ND ITEM NUMBER" : "5P884LPFSR2",
      "QUANTITY REQUIRED" : 0.32031,
      "UNIT OF MEASURE" : "LB",
      "ISSUE TYPE CODE" : "I",
      "LINE TYPE" : "S",
      "STOCK TYPE" : "M",
      "TYPE BOM" : "M",
      "LINE NUMBER" : 1.0,
      "OPERATING SEQUENCE" : 10.0,
      "EFFECTIVE FROM DATE" : "09/11/13",
      "EFFECTIVE THRU DATE" : "12/31/40",
      "DRAWING NUMBER" : null,
      "UNIT COST" : 0.0,
      "SCRAP PERCENT" : 0.0
    }
  }
}

Upvotes: 1

Related Questions