Reputation: 845
For a current research project, I am planning to count the unique words of different objects in a JSON file. Ideally, the output file should show separate word count summaries (counting the occurrences of unique words) for the texts in "Text Main", "Text Pro" and "Text Con". Is there any smart tweak to make this happen?
At the moment, I am receiving the following error message:
File "index.py", line 10, in <module>
text = data["Text_Main"]
TypeError: list indices must be integers or slices, not str
The JSON file has the following structure:
[
{"Stock Symbol":"A",
"Date":"05/11/2017",
"Text Main":"Text sample 1",
"Text Pro":"Text sample 2",
"Text Con":"Text sample 3"}
]
And the corresponding code looks like this:
# Import relevant libraries
import string
import json
import csv
import textblob
# Open JSON file and slice by object
file = open("Glassdoor_A.json", "r")
data = json.load(file)
text = data["Text_Main"]
# Create an empty dictionary
d = dict()
# Loop through each line of the file
for line in text:
    # Remove the leading spaces and newline character
    line = line.strip()
    # Convert the characters in line to
    # lowercase to avoid case mismatch
    line = line.lower()
    # Remove the punctuation marks from the line
    line = line.translate(line.maketrans("", "", string.punctuation))
    # Split the line into words
    words = line.split(" ")
    # Iterate over each word in line
    for word in words:
        # Check if the word is already in dictionary
        if word in d:
            # Increment count of word by 1
            d[word] = d[word] + 1
        else:
            # Add the word to dictionary with count 1
            d[word] = 1
# Print the contents of dictionary
for key in list(d.keys()):
    print(key, ":", d[key])
# Save results as CSV
with open('Glassdoor_A.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(["Word", "Occurences", "Percentage"])
    # Write one [word, count] row per dictionary entry
    writer.writerows([key, d[key]] for key in d)
Upvotes: 0
Views: 1359
Reputation: 127
Your JSON file has an object inside a list. In order to access the content you want, first you have to access the object via data[0]. Then you can access the string field. I would change the code to:
# Open JSON file and slice by object
file = open("Glassdoor_A.json", "r")
data = json.load(file)
json_obj = data[0]
text = json_obj["Text Main"]
or you can access that field in a single line with text = data[0]["Text Main"], as quamrana stated.
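If the file may hold more than one object, the same idea extends to a loop over the list. A minimal sketch, assuming every object carries a "Text Main" key; the inline `raw` string only stands in for the contents of Glassdoor_A.json, and its second record is made up for illustration:

```python
import json

# Stand-in for the file contents (assumption: same structure as in the
# question; the second record is invented for illustration).
raw = """
[
  {"Stock Symbol": "A", "Date": "05/11/2017", "Text Main": "Text sample 1",
   "Text Pro": "Text sample 2", "Text Con": "Text sample 3"},
  {"Stock Symbol": "A", "Date": "06/11/2017", "Text Main": "Text sample 4",
   "Text Pro": "Text sample 5", "Text Con": "Text sample 6"}
]
"""

# The top level is a list, which is why data["..."] raises the TypeError
data = json.loads(raw)

# Index the list first, then the dict
first_text = data[0]["Text Main"]

# Or collect the field from every object in the list
main_texts = [record["Text Main"] for record in data]

print(first_text)
print(main_texts)
```

With a real file, `json.load(file)` would take the place of `json.loads(raw)`.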
Upvotes: 1
Reputation: 39354
Well, firstly the key should be "Text Main", and secondly you need to access the first dict in the list. So just extract the text variable like this:
text = data[0]["Text Main"]
This should fix the error message.
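Note that once the lookup works, text is a single string, so `for line in text` would walk over characters rather than lines. For the separate summaries asked about in the question, one possible approach is a `collections.Counter` per field. A minimal sketch, assuming each field holds plain text; `count_words` and `count_fields` are hypothetical helper names, and the sample record is made up to stand in for the loaded JSON:

```python
import string
from collections import Counter

FIELDS = ["Text Main", "Text Pro", "Text Con"]


def count_words(text):
    """Lowercase the text, strip punctuation, split on whitespace, count words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return Counter(cleaned.split())


def count_fields(records, fields=FIELDS):
    """Return one Counter per field, summed over all records in the list."""
    counts = {field: Counter() for field in fields}
    for record in records:
        for field in fields:
            # A missing field is treated as empty text (assumption)
            counts[field] += count_words(record.get(field, ""))
    return counts


# Invented sample data with the same keys as the JSON in the question
records = [
    {"Text Main": "Good pay, good people.",
     "Text Pro": "Flexible hours",
     "Text Con": "Long hours"},
]

summaries = count_fields(records)
for field, counter in summaries.items():
    total = sum(counter.values())
    for word, n in counter.most_common():
        # Field name, word, count, and share of all words in that field
        print(field, word, n, f"{100 * n / total:.1f}%")
```

Each field's rows could then be written with `csv.writer.writerows`, for example as (word, count, percentage) tuples, to fill the three columns of the CSV header.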
Upvotes: 1