DOBS
DOBS

Reputation: 89

joining a table to another table in pandas

I am trying to grab the data from, https://www.espn.com/nhl/standings

When I try to grab it, it is putting Florida Panthers one row to high and messing up the data. All the team names need to be shifted down a row. I have tried to mutate the data and tried,

dataset_one = dataset_one.shift(1)

and then joining with the stats table but I am getting NaN.

The docs seem to show a lot of ways of joining and merging data with similar columns headers but not sure the best solution here without a similar column header to join with.

Code:

import pandas as pd

page = pd.read_html('https://www.espn.com/nhl/standings')

dataset_one = page[0]   # Team Names
dataset_two = page[1]   # Stats

combined_data = dataset_one.join(dataset_two)
print(combined_data)

Output:

 FLAFlorida Panthers  GP  W  L  OTL  ...  GF  GA  DIFF    L10 STRK
0  CBJColumbus Blue Jackets   6  5  0    1  ...  22  16     6  5-0-1   W2
1    CARCarolina Hurricanes  10  4  3    3  ...  24  28    -4  4-3-3   L1
2           DALDallas Stars   6  5  1    0  ...  18  10     8  5-1-0   W4
3     TBTampa Bay Lightning   6  4  1    1  ...  23  14     9  4-1-1   L2
4     CHIChicago Blackhawks   6  4  1    1  ...  19  14     5  4-1-1   W1
5    NSHNashville Predators  10  3  4    3  ...  26  31    -5  3-4-3   W1
6      DETDetroit Red Wings   8  4  4    0  ...  20  24    -4  4-4-0   L1

Desired:

                             GP  W  L  OTL  ...  GF  GA  DIFF    L10 STRK
    0    FLAFlorida Panthers   6  5  0    1  ...  22  16     6  5-0-1   W2
    1    CBJColumbus Blue Jackets  10  4  3    3  ...  24  28    -4  4-3-3   L1
    2    CARCarolina Hurricanes         6  5  1    0  ...  18  10     8  5-1-0   W4
    3    DALDallas Stars    6  4  1    1  ...  23  14     9  4-1-1   L2
    4    TBTampa Bay Lightning  6  4  1    1  ...  19  14     5  4-1-1   W1
    5    CHIChicago Blackhawks  10  3  4    3  ...  26  31    -5  3-4-3   W1
    6    NSHNashville Predators  8  4  4    0  ...  20  24    -4  4-4-0   L1
    7    DETDetriot Red Wings 10  2  6  2   6   ...  20  35   -15  2-6-2   L6

Upvotes: 1

Views: 157

Answers (2)

AnkurSaxena
AnkurSaxena

Reputation: 825

Providing an alternative approach to @Noah's answer. You can first add an extra row, shift the df down by a row and then assign the header col as index 0 value.

import pandas as pd

page = pd.read_html('https://www.espn.com/nhl/standings')
dataset_one = page[0]  # Team Names
dataset_two = page[1]  # Stats

# Shifting down by one row
dataset_one.loc[max(dataset_one.index) + 1, :] = None
dataset_one = dataset_one.shift(1)
dataset_one.iloc[0] = dataset_one.columns
dataset_one.columns = ['team']

combined_data = dataset_one.join(dataset_two)

Upvotes: 1

noah
noah

Reputation: 2776

Just create the df slightly differently so it knows what is the proper header

dataset_one = pd.DataFrame(page[0], columns=["Team Name"])

Then when you join it should be aligned properly.

Another alternative is to do the following:

dataset_one = page[0].to_frame(name='Team Name')

Upvotes: 0

Related Questions