Reputation: 27
I want to scrape the coin names from this page: https://www.coingecko.com/en/coins/recently_added?page=1
This is what I have come up with so far:
import pandas as pd
import requests

url1 = "https://www.coingecko.com/en/coins/recently_added?page=1"
# read_html returns a list of DataFrames, one per <table> found in the HTML
tables = pd.read_html(requests.get(url1).text, flavor="bs4")
df = pd.concat(tables).drop(["Unnamed: 0"], axis=1)
df1 = df["Coin"]
print(df1)
The code prints this:
0 Corgi Inu CORGI CORGI
1 Bistroo BIST BIST
2 FireBall FIRE FIRE
3 Neko Network NEKO NEKO
4 LatteSwap LATTE LATTE
...
I only want the coin name from each row (for example "Corgi Inu"), without the ticker symbol that is repeated after it. How do I do that?
Upvotes: 1
Views: 48
Reputation: 1055
You could split each row on whitespace and keep everything except the last two tokens, using a list comprehension:
# the [:-2] will omit the last two "columns"
coin_names = [" ".join(s.split()[:-2]) for s in list(df1)]
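To try the same idea without hitting the website, here is a minimal sketch that runs on a few values copied from the output in the question. The `rows` list and the `strip_ticker` helper are illustrative names, not part of the original code, and the sketch assumes every cell ends with exactly two ticker tokens.

```python
# Sample cells copied from the output shown in the question (no network needed).
rows = [
    "Corgi Inu CORGI CORGI",
    "Bistroo BIST BIST",
    "Neko Network NEKO NEKO",
]


def strip_ticker(cell: str) -> str:
    """Drop the last two whitespace-separated tokens (the repeated ticker).

    Hypothetical helper: assumes each cell ends with exactly two ticker tokens.
    """
    return " ".join(cell.split()[:-2])


coin_names = [strip_ticker(cell) for cell in rows]
print(coin_names)
```

Because the slice is taken from the end, multi-word names such as "Neko Network" stay intact.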
Upvotes: 0
Reputation: 195573
You can use .str.rsplit, which splits from the right; with n=2 it separates the two trailing ticker tokens from the name:
names = df["Coin"].str.rsplit(n=2).str[0]
print(names)
Prints:
0 Corgi Inu
1 Bistroo
2 FireBall
3 Neko Network
4 LatteSwap
5 Voltbit
6 Paddycoin
7 Bezoge Earth
8 Anonymous BSC
...and so on.
If you want it in a list form:
print(names.tolist())
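For anyone who wants to check this offline, below is a small self-contained sketch using a hand-built Series in place of the scraped column. The sample values come from the question's output; the `coins` and `parts` variables and the column labels `name`, `symbol`, and `symbol_dup` are made up for illustration. It also shows `expand=True`, in case you want the ticker as its own column, assuming every row has the "name TICKER TICKER" shape.

```python
import pandas as pd

# Stand-in for df["Coin"], built from values shown in the question.
coins = pd.Series(
    ["Corgi Inu CORGI CORGI", "Bistroo BIST BIST", "Neko Network NEKO NEKO"],
    name="Coin",
)

# Split at most twice, starting from the right; element 0 is the name.
names = coins.str.rsplit(n=2).str[0]
print(names.tolist())

# expand=True returns a DataFrame with one column per piece.
parts = coins.str.rsplit(n=2, expand=True)
parts.columns = ["name", "symbol", "symbol_dup"]  # illustrative labels
print(parts)
```

Splitting from the right matters here: a plain left-to-right split would break "Corgi Inu" apart, while the ticker tokens at the end are always single words.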
Upvotes: 1