Jeremy.O

Reputation: 21

How do I read a single value from a DataFrame in Python?

I am trying to find a way to read just one value from a big dataframe in Python. I have 2 data tables in my project.

One looks like this:

Company ID  Company  201512  201511  ...  199402  199401
1234        abc      1.1     0.8     ...  2.1     -0.9
.
.
.
4321        cba      2.1     -0.4    ...  0.3     -0.1

There are about 260 months and 10,000 companies. I need to check the monthly returns one by one and see whether each data point has 36 valid data points behind it, meaning none of them is 0 or NaN. If there are 36 valid data points, I need to run a regression of those 36 data points against 7 factors, which are listed in another table.

The other table looks like this:

Month    Factor1     Factor2     ...     Factor6     Factor7  
201512   -0.4        1.1         ...     2.1         1.2
.
.
.
199401   0.1         0.2         ...     0.3         0.4

My problem is that I can't find a way to load just one value at a time from table 1 and build a loop around it. Can someone please advise?

Upvotes: 0

Views: 667

Answers (2)

acushner

Reputation: 9946

You don't want a for loop for this.

Assuming 0 is a valid monthly return and that you have only 36 columns after Company, you can easily find all companies with valid monthly return data:

df = df[df.notnull().all(axis=1)]

If, for some reason, you also want to get rid of 0s, you can do a replace first:

import numpy as np
df = df[df.replace(0, np.nan).notnull().all(axis=1)]
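
To make the two filters above concrete, here is a minimal sketch on made-up toy data (the three companies and their values are invented for illustration, and only the return columns are checked rather than the whole frame):

```python
import numpy as np
import pandas as pd

# Made-up toy data: three companies, three monthly-return columns.
df = pd.DataFrame(
    {
        "Company": ["abc", "cba", "xyz"],
        "201512": [1.1, 2.1, 0.0],
        "201511": [0.8, np.nan, 0.5],
        "201510": [0.3, -0.4, 0.2],
    }
)

returns = df[["201512", "201511", "201510"]]

# Keep rows with no NaN in any return column.
no_nan = df[returns.notnull().all(axis=1)]

# Treat 0 as missing too, then apply the same row filter.
no_nan_no_zero = df[returns.replace(0, np.nan).notnull().all(axis=1)]
```

With this data, the first filter drops only the company with a NaN, while the second also drops the one with a 0 return.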

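Edit, in response to the comment:

You could do something like this (get_first_return_col and run_regression are placeholders you would write yourself):

cols = df.columns
first_col = get_first_return_col(df)  # position of the first monthly-return column
# Stop early enough that every window has a full 36 columns.
for i in range(first_col, len(cols) - 35):
    window = cols[i : i + 36]
    # Filter into a new variable so df keeps every company for the next window.
    valid = df[df[window].notnull().all(axis=1)]
    run_regression(valid[window])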
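
As a rough illustration of what the regression step could look like, here is a sketch using ordinary least squares via numpy.linalg.lstsq. It is only one possible approach, under these assumptions: `returns` holds nothing but the month columns (indexed by company), `factors` is indexed by the same month labels, and an intercept is included. The function names `regress_window` and `rolling_regressions` are made up for this example.

```python
import numpy as np
import pandas as pd


def regress_window(returns_row, factors, months):
    """OLS of one company's returns on the factors over the given months.

    Returns the coefficient vector (intercept first), or None if any
    return in the window is NaN or 0.
    """
    y = returns_row[months].to_numpy(dtype=float)
    if np.isnan(y).any() or (y == 0).any():
        return None
    X = factors.loc[months].to_numpy(dtype=float)
    X = np.column_stack([np.ones(len(months)), X])  # add an intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def rolling_regressions(returns, factors, window=36):
    """Yield (company, first_month, coefficients) for every valid window."""
    months = list(returns.columns)
    for start in range(len(months) - window + 1):
        span = months[start : start + window]
        for company, row in returns.iterrows():
            coef = regress_window(row, factors, span)
            if coef is not None:
                yield company, span[0], coef
```

The row loop is kept for readability; with 10,000 companies it may be worth filtering each window with a boolean mask first, as in the snippet above, and looping only over the surviving rows.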

Upvotes: 0

user8658280

Reputation:

You can iterate over rows with the following code:

for index, row in df.iterrows():
    print(index, row["Company"])  # any per-row work goes here

Here index is the row's index label, and you can access a column's value with, for example, row["Company"].
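
For the literal question of reading one value, pandas also offers the scalar accessors .at (by label) and .iat (by position), alongside .loc. A small sketch on made-up data in the question's layout, assuming Company ID is used as the index:

```python
import pandas as pd

# Made-up toy table in the question's layout.
df = pd.DataFrame(
    {
        "Company ID": [1234, 4321],
        "Company": ["abc", "cba"],
        "201512": [1.1, 2.1],
        "201511": [0.8, -0.4],
    }
).set_index("Company ID")

by_label = df.at[1234, "201512"]    # label-based scalar access
by_position = df.iat[1, 2]          # second row, third column, by integer position
via_loc = df.loc[4321, "Company"]   # .loc also returns a scalar for a single cell

# The same kind of access inside a row loop.
companies = [row["Company"] for index, row in df.iterrows()]
```

Note that iterrows builds a new Series for every row, so for a table of this size it is likely to be much slower than the vectorized filtering in the other answer.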

Upvotes: 1
