Gerald Oakham

Reputation: 173

Reading more than one JSON file into a memory stream

I have some C# code that happily reads the contents of a .gz file (which contains one .json file) into a memory stream, finds the first instance of "[" and takes the data after that. However, depending on how much data I sent in the request that produces the .gz response, it may contain more than one .json file. How can I change my code so that it either removes the header and the "starter" of each following JSON file, or (probably the better option) reads each file in turn into a memory stream and extracts the data needed?

Current code snippet is

string responseBodyUrl = getResponseObject.response_body_url;

// get the tar.gz file and expand
WebClient webClient = new WebClient();
Stream stream = webClient.OpenRead(responseBodyUrl);
MemoryStream memoryStream = new MemoryStream();
GZipStream gzipStream = new GZipStream(stream, CompressionMode.Decompress);
gzipStream.CopyTo(memoryStream);
gzipStream.Close();
stream.Close();
memoryStream.Position = 0;
StreamReader reader = new StreamReader(memoryStream);
string memstreamjson = reader.ReadToEnd();
reader.Close();
memoryStream.Close();

// find the index of the first '[' character
int index = memstreamjson.IndexOf('[');
System.IO.File.AppendAllText(@"memstreamjson.log", memstreamjson.TrimEnd() + Environment.NewLine);

// if found
if (index != -1)
 {
  // get the substring from that index to the end
  string indexedmemstreamjson = memstreamjson.Substring(index);

  // parse the JSON string as an array
  JArray arr = JArray.Parse(indexedmemstreamjson);
  
  // loop through each element of the array
  foreach (JObject obj in arr)
   {
    // get the status_code value of the JObject
    string status_code = (string)obj["status_code"];
etc

The decompressed .gz input with multiple (2) files looks like:

./ 0000755 0000000 0000000 00000000000 14512730710 007711 5 ustar root root ./0q6hnae8uz.json 0000666 0000000 0000000 00016772531 14512730710 012547 0 ustar root root [{"status_code":200,"operation_id":null,"response":"{\"id\":\"19159244156b70211d3e26b64ce38fc2\",\"email_address\":\"[email protected]\"}] ./46kbba2abt.json 0000666 0000000 0000000 00000007550 14512730710 012441 0 ustar root root [{"status_code":200,"operation_id":null,"response":"{\"id\":\"0ec0b4c07a27c60069735d321875fa78\",\"email_address\":\"sammy:sammymail.com\"}]

Many Thanks in advance

I tried the above code, which works when there is only one JSON file in the .gz, but fails when there are several. I'm expecting to find the beginning of each JSON file and then write its content to the console.
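For reference, here is a minimal sketch of the "read each file in turn" option. The "ustar" headers in the sample above suggest that the decompressed payload is a tar archive, so the sketch walks the archive entry by entry instead of searching for "[". It assumes .NET 7 or later, where System.Formats.Tar is available; the names TarGzJson and ReadJsonEntries are hypothetical, and it has not been run against the real response.

```csharp
using System;
using System.Collections.Generic;
using System.Formats.Tar;      // assumption: .NET 7 or later
using System.IO;
using System.IO.Compression;
using System.Text;

public static class TarGzJson
{
    // Hypothetical helper: returns (entry name, JSON text) for every
    // .json file found in a gzip-compressed tar stream.
    public static List<(string Name, string Json)> ReadJsonEntries(Stream gzippedTar)
    {
        var results = new List<(string Name, string Json)>();

        using var gzip = new GZipStream(gzippedTar, CompressionMode.Decompress);
        using var tar = new TarReader(gzip);

        // GetNextEntry returns null once the end of the archive is reached.
        for (var entry = tar.GetNextEntry(); entry != null; entry = tar.GetNextEntry())
        {
            // Skip directory entries (such as "./") and anything that is not a .json file.
            if (entry.DataStream == null ||
                !entry.Name.EndsWith(".json", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            // leaveOpen: true so that the TarReader keeps ownership of the entry's data stream.
            using var reader = new StreamReader(
                entry.DataStream, Encoding.UTF8,
                detectEncodingFromByteOrderMarks: true, bufferSize: 4096, leaveOpen: true);
            results.Add((entry.Name, reader.ReadToEnd()));
        }

        return results;
    }
}
```

With something like this, the stream from webClient.OpenRead(responseBodyUrl) could be passed to TarGzJson.ReadJsonEntries, and each returned Json string handed to JArray.Parse on its own, so the IndexOf('[') step would no longer be needed.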

Upvotes: 0

Views: 156
