Reputation: 81
Hello everyone, I have data like this:
0 | 1 | 2 |
---|---|---|
-- state: US state (by number) - not counted a... | but if considered | should be consided nominal (nominal) |
-- county: numeric code for county - not predi... | and many missing values (numeric) | NaN |
... | ... | ... |
But I would like to transform it into this:
0 | 1 | 2 |
---|---|---|
state | US state (by number) - not counted a ... but if considered | should be consided nominal (nominal) |
county | numeric code for county - not predi ... and many missing values (numeric) | NaN |
... | ..... | .... |
Or simply:
0 |
---|
state |
county |
.... |
I wrote this code, but I wonder whether there is a quicker way to do it:
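import pandas as pd
variable_names = pd.read_csv("path", header=None)
df = variable_names[0]
# Keep only the text before the first ': '.
df = df.str.split(': ', expand=True)
df = df[0]
# Split off the leading '-- '; column 1 then holds the bare name.
df = df.str.split('-- ', expand=True)
df = df[1]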
Upvotes: 0
Views: 63
Reputation: 1066
One way to solve this is:
import pandas as pd
# read_csv returns a pandas DataFrame.
variable_names = pd.read_csv("path", header=None)
# apply runs the function on every value in column 0.
# Keep the text before the first ': ', then drop the leading '-- '.
variable_names[0] = variable_names[0].apply(lambda x: x.split(': ')[0])
variable_names[0] = variable_names[0].apply(lambda x: x.split('-- ')[1])
Please check if this works for you.
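If you want to avoid the two passes, a single vectorized regex may also work. This is only a minimal sketch: the sample rows below are hypothetical stand-ins for column 0 of your file (copied from the truncated preview in the question), and the pattern assumes every row starts with "-- " and has a ":" after the name.

```python
import pandas as pd

# Hypothetical sample rows standing in for column 0 of the CSV.
variable_names = pd.DataFrame({
    0: [
        "-- state: US state (by number) - not counted a...",
        "-- county: numeric code for county - not predi...",
    ]
})

# Capture the text between the leading "-- " and the first ":".
# With one capture group and expand=False, str.extract returns a Series.
names = variable_names[0].str.extract(r"^--\s*([^:]+):", expand=False)

print(names.tolist())  # ['state', 'county']
```

Rows that do not match the pattern come back as NaN rather than raising, so it is worth checking `names.isna().sum()` on the real data.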
Upvotes: 1