Bustikiller

Reputation: 2508

Find line number of key in JSON

Given a JSON file and the path to a key, I need to get the line number where the value is stored. Values that span multiple lines (for example, arrays) are out of scope for now, until I sort out how to deal with the simplest case. For example, in the following JSON:

{                        # Line 1
    "foo": {             # Line 2
        "bar": [
            "value1",
            "value2"
        ],
        "bar2": 2        # Line 7
    },
    "bar": {
        "bar": [
            "value1",
            "value2"
        ],
        "bar2": 5
    }
}

I should get 7 when looking for the key path foo.bar2.

I have a working solution assuming that the original file is formatted with JSON.pretty_generate:

parsed_json = JSON.parse(File.read(file))
random_string = SecureRandom.uuid

parsed_json.bury(*path.split('.'), random_string)

JSON.pretty_generate(parsed_json)
    .split("\n")
    .take_while { |s| s.exclude? random_string }
    .count + 1

What I am doing here is parsing the JSON file, replacing the existing value with a random string (a UUID in this case), formatting the hash into pretty-printed JSON, and finding the line where the random string is present. The Hash.bury method used here works as defined in https://bugs.ruby-lang.org/issues/11747.

This solution works fine (though it is not heavily tested yet), but I am struggling to make it work when the original file is not pretty-printed JSON. For example, the following file is equivalent to the one above:

{                                     # Line 1
    "foo": {                          # Line 2
        "bar": ["value1","value2"],  
        "bar2": 2                     # Line 4
    },
    "bar": {
        "bar": [
            "value1",
            "value2"
        ],
        "bar2": 5
    }
}

but the algorithm above would still return 7 as the line where foo.bar2 is located, although it is now on line 4.

Is there any way to reliably get the line number where a key is placed inside the JSON file?

Upvotes: 0

Views: 1795

Answers (2)

yauhenininjia

Reputation: 308

Here is the easiest way I found without building your own JSON parser: replace every occurrence of each key in the path with a unique UUID (an alias), then build all combinations of aliases and find the one for which the #dig call returns data.

keys = path.split('.')
file_content = File.read(file_path).gsub('null', '1111')
aliases = {}

keys.each do |key|
  pattern = "\"#{key}\":"

  file_content.scan(pattern).each do
    alias_key = SecureRandom.uuid
    file_content.sub!(pattern, "\"#{alias_key}\":")

    aliases[key] ||= []
    aliases[key] << alias_key
  end
end

winner = aliases.values.flatten.combination(keys.size).find do |alias_keys|
  # nulls were gsubbed above to make this check work in edge case when path value is null
  JSON.parse(file_content).dig(*alias_keys).present?
end

file_content.split("\n").take_while { |line| line.exclude?(winner.last) }.count + 1

UPD: The snippet above will not work if the JSON value at your foo.bar2 path is false. You should gsub it as well, or make the snippet smarter.
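One possible way to make the check smarter is to test whether the keys exist instead of whether the value is truthy. The following is only a sketch in plain Ruby; `path_exists?` is a hypothetical helper name, not something from the snippet above or from any library:

```ruby
require 'json'

# Hypothetical helper: returns true when every key in `keys` exists along the
# path, even if the final value is nil or false.
def path_exists?(data, keys)
  keys.each do |key|
    # Stop as soon as we hit a non-hash or a missing key.
    return false unless data.is_a?(Hash) && data.key?(key)
    data = data[key]
  end
  true
end
```

Inside the `find` block, `path_exists?(JSON.parse(file_content), alias_keys)` could stand in for `.dig(*alias_keys).present?`. Assuming nothing else relies on it, the `null` substitution should then no longer be needed.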

Upvotes: 2

joel1di1

Reputation: 773

I had an idea: use the same UUID trick (which is brilliant, BTW) and compare your generated JSON with the file content, ignoring all whitespace.

The first line of the file where the characters differ should be the line you want.

I wrote a program that seems to work, not heavily tested ;) (I copied the bury definition):

require 'json'
require 'securerandom'

class Hash
  def bury *args
    if args.count < 2
      raise ArgumentError.new("2 or more arguments required")
    elsif args.count == 2
      self[args[0]] = args[1]
    else
      arg = args.shift
      self[arg] = {} unless self[arg]
      self[arg].bury(*args) unless args.empty?
    end
    self
  end
end

file_path = 'your/file/path.json'
path = 'foo.bar2'

file_content = File.read(file_path)
parsed_json = JSON.parse(file_content)
uuid = SecureRandom.uuid

parsed_json.bury(*path.split('.'), uuid)

compacted_json = JSON.pretty_generate(parsed_json).gsub(/\s/, '')
@file_lines = File.readlines(file_path)

index = 0
loop do
  compact_line = @file_lines[index].gsub(/\s/, '')
  break if !compacted_json.start_with?(compact_line)
  index += 1
  compacted_json = compacted_json[compact_line.length..-1]
end

puts "line is #{index+1}"
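As a rough sketch, the same comparison could also be wrapped in a method that skips the Hash patch and returns nil when the path is missing or no line differs. The name `line_of_value` is hypothetical. This version assumes the file keeps keys in parse order and writes numbers and strings the same way `JSON.generate` would; otherwise the comparison can stop too early:

```ruby
require 'json'
require 'securerandom'

# Hypothetical helper: 1-based line number of the value at `path`, or nil.
def line_of_value(file_content, path)
  data = JSON.parse(file_content)
  *parents, last = path.split('.')
  marker = SecureRandom.uuid

  # Find the hash that holds the target key and overwrite its value.
  parent = parents.empty? ? data : data.dig(*parents)
  return nil unless parent.is_a?(Hash) && parent.key?(last)
  parent[last] = marker

  # Compare the file and the regenerated JSON with all whitespace removed.
  remaining = JSON.generate(data).gsub(/\s/, '')
  file_content.each_line.with_index(1) do |line, number|
    compact = line.gsub(/\s/, '')
    # The first line that no longer matches holds the replaced value.
    return number unless remaining.start_with?(compact)
    remaining = remaining[compact.length..-1]
  end
  nil
end
```

For example, `line_of_value(File.read(file_path), 'foo.bar2')` would be the equivalent of the script above.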

Upvotes: 0
