Malte Susen

Reputation: 845

Pandas: TypeError: string indices must be integers

For a current research project, I am planning to read the JSON field "Text Main" within a pre-defined date range using Python/Pandas. However, the code raises the error TypeError: string indices must be integers on the line line = row["Text Main"].

I have already gone through pages addressing the same issue but have not found a solution yet. Is there any helpful tweak to make this work?

The JSON file has the following structure:

[
{"No":"121","Stock Symbol":"A","Date":"05/11/2017","Text Main":"Sample text"}
]

And the corresponding code section looks like this:

import string
import json
import csv

import pandas as pd
import datetime

import numpy as np


# Loading and reading dataset
file = open("Glassdoor_A.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df['Date'] = pd.to_datetime(df['Date'])


# Create an empty dictionary
d = dict()


# Filtering by date
start_date = "01/01/2009"
end_date = "01/01/2015"

after_start_date = df["Date"] >= start_date
before_end_date = df["Date"] <= end_date

between_two_dates = after_start_date & before_end_date
filtered_dates = df.loc[between_two_dates]

print(filtered_dates)


# Processing
for row in filtered_dates:
    line = row["Text Main"]
    # Remove the leading spaces and newline character
    line = line.strip()

Upvotes: 0

Views: 3197

Answers (1)

Anshul

Reputation: 1413

If the requirement is to collect all the values of the 'Text Main' column, this is what we can do:

line = list(filtered_dates['Text Main'])

We can then also apply strip to each value:

line = [val.strip() for val in line]
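
As for why the original loop fails: iterating directly over a DataFrame yields its column labels (plain strings), not its rows, so row["Text Main"] ends up indexing a string with a string. The following is a minimal sketch of that behaviour and two possible alternatives, assuming a small made-up dataset shaped like the JSON in the question. The sample values and the names labels, stripped_loop and stripped_vec are only for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the structure shown in the question
data = [
    {"No": "121", "Stock Symbol": "A", "Date": "05/11/2017", "Text Main": "  Sample text \n"},
    {"No": "122", "Stock Symbol": "A", "Date": "03/02/2012", "Text Main": " Another sample "},
]
df = pd.DataFrame(data)

# Iterating over a DataFrame yields its column labels (strings), not rows
labels = [row for row in df]
print(labels)  # ['No', 'Stock Symbol', 'Date', 'Text Main']

# Row-wise alternative: iterrows() yields (index, Series) pairs
stripped_loop = [row["Text Main"].strip() for _, row in df.iterrows()]

# Vectorized alternative: the .str accessor strips every value at once
stripped_vec = df["Text Main"].str.strip().tolist()
```

For larger frames the .str.strip() variant is usually preferable to an explicit loop, since it avoids building a Series object for every row.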

Upvotes: 1
