Reputation: 159
I have two data frames that I want to combine into one. I want to combine two columns together.
0
data_1_circulating_supply 17584875
data_2_circulating_supply 1.05209e+08
data_3_circulating_supply 41432141931
data_4_circulating_supply 6.08515e+07
data_5_circulating_supply 9.06245e+08
data_6_circulating_supply 17668725
data_7_circulating_supply 1.41175e+08
data_8_circulating_supply 1.99636e+09
data_9_circulating_supply 1.92156e+10
data_10_circulating_supply 6.66821e+10
0
data_1_symbol BTC
data_2_symbol ETH
data_3_symbol XRP
data_4_symbol LTC
data_5_symbol EOS
data_6_symbol BCH
data_7_symbol BNB
data_8_symbol USDT
data_9_symbol XLM
data_10_symbol TRX
I want the new dataframe to look like this:
BTC 17584875
ETH 1.05209e+08
XRP 41432141931
LTC 6.08515e+07
EOS 9.06245e+08
BCH 17668725
BNB 1.41175e+08
USDT 1.99636e+09
XLM 1.92156e+10
TRX 6.66821e+10
I cannot make it to work.
Upvotes: 2
Views: 104
Reputation: 765
df2.sql.select("*,regexp_extract(index,'(.*_\d{1,2})_.*',1) col22").join(df1.sql.select("col1 col3,regexp_extract(index,'(.*_\d{1,2})_.*',1) col21"),condition="col21=col22",how="left").select("col1,col3")
┌─────────┬───────────────┐
│ col1 │ col3 │
│ varchar │ double │
├─────────┼───────────────┤
│ BTC │ 17584875.0 │
│ ETH │ 105209000.0 │
│ XRP │ 41432141931.0 │
│ LTC │ 60851500.0 │
│ EOS │ 906245000.0 │
│ BCH │ 17668725.0 │
│ BNB │ 141175000.0 │
│ USDT │ 1996360000.0 │
│ XLM │ 19215600000.0 │
│ TRX │ 66682100000.0 │
├─────────┴───────────────┤
│ 10 rows 2 columns │
└─────────────────────────┘
Upvotes: 0
Reputation: 1722
seems that your two dataframes have perfect incremental index. So if the same rule apply for rows bigger than 10th, you can just apply:
# [df1, df2]: your two dataframes
# axis=1: horizontally
# ignore_index=True: we don't need your previous indexes as row matches each other anyway
result = pd.concat([df1, df2], axis=1, ignore_index=True, sort=False)
Upvotes: 0
Reputation: 3184
Reshape each index so that there is only the numbers left:
df1.index = df1.index.str.split('_', n=2).str[1]
df2.index = df2.index.str.split('_', n=2).str[1]
Then put the two frames together.
pd.concat([df2, df1], axis=1)
0 0
1 BTC 1.758488e+07
2 ETH 1.052090e+08
3 XRP 4.143214e+10
4 LTC 6.085150e+07
5 EOS 9.062450e+08
6 BCH 1.766872e+07
7 BNB 1.411750e+08
8 USDT 1.996360e+09
9 XLM 1.921560e+10
10 TRX 6.668210e+10
Upvotes: 2