Reputation: 2643
I have a text table as a string like this and it's quite long. It can also be written to a file,
+--------------+----------+---------+------------+
| Endpoint | Table | Request | Is Updated |
+--------------+----------+---------+------------+
| /api/test1 | test1 | True | True |
+--------------+----------+---------+------------+
| /api/test2 | test2 | False | False |
+--------------+----------+---------+------------+
| /api/test3 | test3 | False | True |
+--------------+----------+---------+------------+
I want to convert this to a pandas dataframe. Here is my expected output:
>>> import pandas as pd
>>> df = pd.DataFrame(
{'Endpoint': ['/api/test1', '/api/test2', '/api/test3'],
'Table': ['test1', 'test2', 'test3'],
'Request': [True, False, False],
'Is Updated': [True, False, True]},
)... ... ... ... ...
>>> df
Endpoint Table Request Is Updated
0 /api/test1 test1 True True
1 /api/test2 test2 False False
2 /api/test3 test3 False True
Thanks in advance.
Upvotes: 3
Views: 3910
Reputation: 18647
IIUC, use re.sub
to replace regex and io.StringIO
to read:
import re
from io import StringIO
text = """
+--------------+----------+---------+------------+
| Endpoint | Table | Request | Is Updated |
+--------------+----------+---------+------------+
| /api/test1 | test1 | True | True |
+--------------+----------+---------+------------+
| /api/test2 | test2 | False | False |
+--------------+----------+---------+------------+
| /api/test3 | test3 | False | True |
+--------------+----------+---------+------------+
"""
df = pd.read_csv(StringIO(re.sub(r'[-+|]', '', text)), sep='\s{2,}', engine='python')
print(df)
Output:
Endpoint Table Request Is Updated
0 /api/test1 test1 True True
1 /api/test2 test2 False False
2 /api/test3 test3 False True
Upvotes: 9