Reputation: 381
I'm trying to use a single value (string or number) to search a DB (close to 60,000 rows and 20 columns) and retrieve the name of the column that contains that value. How do I do this without performing a complete search on the entire DB?
If the table is as follows:
Country | State | Population
----------------------------------
USA | Texas | 100000
India | Maharashtra | 200000
Using just the value 'Maharashtra', I want my program to understand that the column for 'Maharashtra' is 'State', not 'Country' or 'Population'.
The data is stored inside a pandas data frame.
Upvotes: 1
Views: 374
Reputation: 43
Try this code; it worked for me:
df.columns[(df.values=='Maharashtra').any(0)].tolist()
Here df is your DB as a pandas.DataFrame.
It returns a list containing the name(s) of the matching column(s).
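For reference, a quick run of this one-liner against a toy frame built from the question's example table (column names assumed from that example):

```python
import pandas as pd

# Toy frame mirroring the question's table
df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})

# Compare every cell against the value, then keep columns with any match
cols = df.columns[(df.values == "Maharashtra").any(0)].tolist()
print(cols)  # ['State']
```

Note this still compares every cell, so it is a full scan of the frame on each query.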
Upvotes: 0
Reputation: 2135
An inverse index might be the most performant solution if the data does not change often:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})
# Create the inverse index - must be done only once:
parts = []
for idx, column in enumerate(df.columns):
    # Map every value in the column to the column's positional index
    parts.append(pd.Series(np.repeat(idx, len(df[column])), index=df[column]))
inverse_map = pd.concat(parts)  # Series.append was removed in pandas 2.0
# This should be fast - even for many queries:
df.columns[inverse_map.loc["Maharashtra"]]
# Output: 'State'
I store column indices in the inverse_map instead of column names to save memory.
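One caveat worth noting: since the inverse_map index is not unique, .loc returns a Series when a value occurs in more than one column. A small wrapper (the helper name column_of is my own) can normalize both cases; a self-contained sketch that rebuilds the toy frame and index:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})

# Rebuild the inverse index (pd.concat, since Series.append is gone in pandas 2.0)
parts = [pd.Series(np.repeat(i, len(df[col])), index=df[col])
         for i, col in enumerate(df.columns)]
inverse_map = pd.concat(parts)

def column_of(value):
    """Return the column name(s) holding value, via the inverse index."""
    hit = inverse_map.loc[value]
    if isinstance(hit, pd.Series):                # value occurs several times
        return df.columns[hit.unique()].tolist()  # all matching column names
    return df.columns[hit]                        # single occurrence: one name

print(column_of("Maharashtra"))  # State
print(column_of(100_000))        # Population
```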
Upvotes: 0
Reputation: 791
Pandas: filter rows matching a search term in any column:
search = 'search term'
df.loc[df.isin([search]).any(axis=1)]
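The same isin mask can also answer the original question's column lookup, so a single pass gives both the matching rows and the column name (toy data assumed from the question's example):

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["USA", "India"],
    "State": ["Texas", "Maharashtra"],
    "Population": [100_000, 200_000],
})

search = "Maharashtra"
mask = df.isin([search])                       # True where a cell equals search
rows = df.loc[mask.any(axis=1)]                # rows containing the value
cols = df.columns[mask.any(axis=0)].tolist()   # columns containing the value
print(cols)  # ['State']
```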
Upvotes: 1