dataframes have different sizes when referencing same dataframe?

Question

I have a strange problem. I have a dataset from which I'm trying to select the rows for each unique value and save them into a dataframe. After that, I want one dataframe with all the data and another containing just a numerical value. Both dataframes should be the same size.

Here's my loop; I'm not sure why it's not working:

for uniqueFundName in HoldingCompanies['fund_ticker'].unique():
  print(uniqueFundName)
  modelY = (HoldingCompanies.loc[HoldingCompanies['fund_ticker'] == uniqueFundName]) 
  modelX = modelY['fund_ticker']
  modelX['fund_ticker'] = 1

  del modelY['fund_ticker']
  print(modelX.shape)
  print(modelY.shape)

This is my output:

GBRE.LSE
(234,)
(233, 174)
MACEX.US
(35,)
(34, 174)
ANFVX.US
(43,)
(42, 174)
LQGH.LSE
(11,)
(10, 174)
HAC.TO
(39,)
(38, 174)
JSAYX.US
(26,)
(25, 174)

modelX always has one more row than modelY. This is confusing because I'm creating modelX from a column of modelY.

What am I doing wrong?

Ofek Glick · Accepted Answer

modelY['fund_ticker'] is a pandas Series, not a DataFrame. When you then run modelX['fund_ticker'] = 1, you are not overwriting a column: you are adding a new entry to that same Series, with the index label 'fund_ticker' and the value 1. So the size of modelX increases by 1, because all you did was add another row to the Series.
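
The behavior described above can be reproduced on a small frame. This is only a sketch: the `holdings` data and the `weight` column are made up for illustration, and the last two lines show one possible way to get same-length objects, not something the question or answer prescribes.

```python
import pandas as pd

# Hypothetical stand-in for HoldingCompanies; the 'weight' column and the
# values are invented for this example. dtype=object is set explicitly so
# the ticker column can also hold the integer appended below.
holdings = pd.DataFrame({
    "fund_ticker": pd.Series(["GBRE.LSE", "GBRE.LSE", "MACEX.US"], dtype=object),
    "weight": [0.5, 0.5, 1.0],
})

subset = holdings.loc[holdings["fund_ticker"] == "GBRE.LSE"]

# Selecting a single column returns a Series (copied here so the
# original frame is left alone).
ticker_series = subset["fund_ticker"].copy()
rows_before = ticker_series.shape[0]

# On a Series, assigning to a label that does not exist appends a new
# entry with that label instead of overwriting the existing values.
ticker_series["fund_ticker"] = 1
rows_after = ticker_series.shape[0]
print(rows_before, rows_after)

# One possible alternative (an assumption about the intent): build the
# constant column on the subset's own index, then drop the ticker column.
model_x = pd.Series(1, index=subset.index, name="fund_ticker")
model_y = subset.drop(columns="fund_ticker")
print(model_x.shape, model_y.shape)
```

Building `model_x` from `subset.index` ties its length to the filtered rows, so the two shapes agree on the row count regardless of how many rows each ticker has.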
