Reputation: 243
Suppose a question is asked in Stackoverflow and example data is given, what is the best way to import the example data
In order to explain this question which I am asking, let me take to a real StackOverflow post.
This post has given the following data to solve the query.
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
Now what will be the best way to import the data into Python
The methodology should be in such a way that the import is done into the python environment in a single step.
I am aware that, I can copy-paste the data into a spreadsheet -> then save it as CSV file and then import it into Python. But I need a more elegant way.
Other solution as suggested by one developer,
import pandas as pd ;
df = pd.read_clipboard()
pd.read_clipboard(), is not a universal solution across all operating system. My system (Ubuntu 19.10), I did a clipboard copying and then the result came like this. The column names were wrongly imported.
This post deals with R solution, but I am interested in a solution for Python
Upvotes: 1
Views: 300
Reputation: 169284
Edited answer: this is slightly more work, but it's another viable option in my opinion.
from io import StringIO
import pandas as pd
s = StringIO(""" Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10""")
df = pd.read_fwf(s,index_col=0,infer_nrows=10)
Result:
In [175]: df
Out[175]:
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
Upvotes: 1
Reputation: 25924
As others have pointed out, The best solution seems to be pd.read_clipboard()
In order not to get messed up columns, just replace the spaces in the column names with _
and then to get the actual columns simply do:
df.columns = [name.replace('_',' ') for name in df.columns.values]
So copy the input when you replaced spaces with _
:
input:
Access_Type host/IP_address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
run the code:
import pandas as pd
df = pd.read_clipboard()
df.columns = [name.replace('_',' ') for name in df.columns.values]
and you'll get:
Output:
Access Type host/IP address Count
0 Authorized 206.196.21.129 23
1 Authorized 207.30.238.8 46
2 Authorized 208.62.55.75 23
3 Authorized 216.12.111.241 23
4 Authorized 63.197.98.106 23
5 Authorized 67.95.49.172 23
6 Unauthorized 207.243.167.114 23
7 Unauthorized 209.152.168.249 10
8 Unauthorized 65.166.159.14 10
9 Unauthorized 68.143.156.89 10
Upvotes: 1
Reputation: 1032
Try to use pandas function read_clipboard()
This method is as simple as you want: it reads copy-pasted tabular data and parses it into a Data Frame.
By the documentation this method read text from clipboard and pass to read_csv.
You can find more information in this blog post
EDIT: I think that your problem didn't cause by your OS, it could be because of the white spaces in the first line of the data.
Upvotes: 0