Reputation: 35
I have a text file, data.txt, that contains several columns. How can I read this file in Python and plot only two chosen columns? For example:
10 -22.82215289 0.11s
12 -22.81978265 0.14s
15 -22.82359691 0.14s
20 -22.82464363 0.16s
25 -22.82615348 0.17s
30 -22.82641815 0.19s
35 -22.82649347 0.21s
40 -22.82655376 0.22s
50 -22.82661407 0.28s
60 -22.82663535 0.34s
70 -22.82664864 0.42s
80 -22.82665962 0.46s
90 -22.82666308 0.51s
100 -22.82666662 0.56s
I need to plot only the first and second columns. Note the space before the first column.
Edit: I used the following code:
import matplotlib.pyplot as plt
from matplotlib import rcParamsDefault
import numpy as np
plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
x, y = np.loadtxt('./calc.dat', delimiter=' ')
plt.plot(x, y, "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")
However, for this to work I have to edit my data so that it contains only the two columns I need to plot, with no spaces before the first column, as follows:
10 -22.82215289
12 -22.81978265
15 -22.82359691
20 -22.82464363
25 -22.82615348
30 -22.82641815
35 -22.82649347
40 -22.82655376
50 -22.82661407
60 -22.82663535
70 -22.82664864
80 -22.82665962
90 -22.82666308
100 -22.82666662
So, how can I modify the code so that I do not have to edit the data every time?
Upvotes: 2
Views: 1234
Reputation: 1750
You can create a DataFrame from a text file using pandas read_csv, which also simplifies any further processing of the data beyond plotting.
In this case, the tricky part is the whitespace, which can be handled by setting the optional parameter sep to r'\s+' (a raw string, so that the backslash is not treated as an escape character):
import pandas as pd
df = pd.read_csv('data.txt', sep=r'\s+', header=None, names=['foo', 'bar', 'baz'])
>>>df
index | foo | bar | baz |
---|---|---|---|
0 | 10 | -22.82215289 | 0.11s |
1 | 12 | -22.81978265 | 0.14s |
2 | 15 | -22.82359691 | 0.14s |
3 | 20 | -22.82464363 | 0.16s |
4 | 25 | -22.82615348 | 0.17s |
5 | 30 | -22.82641815 | 0.19s |
6 | 35 | -22.82649347 | 0.21s |
7 | 40 | -22.82655376 | 0.22s |
8 | 50 | -22.82661407 | 0.28s |
9 | 60 | -22.82663535 | 0.34s |
10 | 70 | -22.82664864 | 0.42s |
11 | 80 | -22.82665962 | 0.46s |
12 | 90 | -22.82666308 | 0.51s |
13 | 100 | -22.82666662 | 0.56s |
And then just use your code:
plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
plt.plot(df['foo'], df['bar'], "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")
I set the column names to arbitrary strings. You can skip the names argument and simply refer to the columns as df[0] and df[1].
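To illustrate the positional variant, here is a minimal sketch. It assumes the same whitespace-separated layout as in the question; the `sample` string is a hypothetical in-memory stand-in for data.txt, and `usecols` is added here only to keep the two columns of interest.

```python
import io
import pandas as pd

# Hypothetical stand-in for data.txt: the first three rows of the question's data,
# each with a leading space.
sample = " 10 -22.82215289 0.11s\n 12 -22.81978265 0.14s\n 15 -22.82359691 0.14s\n"

# header=None gives integer column labels; usecols keeps only the first two columns.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, usecols=[0, 1])

x = df[0]  # first column (ecut)
y = df[1]  # second column (Etot)
```

With a real file, pass 'data.txt' instead of the StringIO object, then hand x and y to plt.plot as in the code above.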
Upvotes: 1
Reputation: 3005
You could first read your file data.txt and preprocess it by stripping the whitespace on the left of each line, save the preprocessed data to data_processed.txt, then load it with pd.read_csv and plot the two columns of choice, col1 and col2, against each other with plt.plot, as follows:
import pandas as pd
import matplotlib.pyplot as plt
s = """ 10 -22.82215289 0.11s
12 -22.81978265 0.14s
15 -22.82359691 0.14s
20 -22.82464363 0.16s
25 -22.82615348 0.17s
30 -22.82641815 0.19s
35 -22.82649347 0.21s
40 -22.82655376 0.22s
50 -22.82661407 0.28s
60 -22.82663535 0.34s
70 -22.82664864 0.42s
80 -22.82665962 0.46s
90 -22.82666308 0.51s
100 -22.82666662 0.56s"""
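# Write the sample data to data.txt (only needed for this demo)
with open('data.txt', 'w') as f:
    f.write(s)
# Read the file back and strip the leading whitespace from each line
with open('data.txt', 'r') as f:
    data = f.read()
data_processed = '\n'.join([l.lstrip() for l in data.split('\n')])
with open('data_processed.txt', 'w') as f:
    f.write(data_processed)
# Load the cleaned file and plot the two chosen columns
df = pd.read_csv('data_processed.txt', sep=' ', header=None)
col1 = 0
col2 = 1
plt.plot(df[col1], df[col2])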
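If you would rather not write an intermediate file, the same stripping step can probably be done in memory. The following is only a sketch of that idea: the `raw` string is a hypothetical stand-in for the contents of data.txt, and io.StringIO is used here in place of data_processed.txt.

```python
import io
import pandas as pd

# Hypothetical stand-in for the text read from data.txt (leading space on each line).
raw = " 10 -22.82215289 0.11s\n 12 -22.81978265 0.14s\n 15 -22.82359691 0.14s"

# Strip the leading whitespace from every line, but keep the result in memory.
cleaned = "\n".join(line.lstrip() for line in raw.splitlines())

# read_csv accepts a file-like object, so no second file is needed.
df = pd.read_csv(io.StringIO(cleaned), sep=" ", header=None)
```

In real use, raw would come from reading data.txt, and df[0] and df[1] can be passed to plt.plot exactly as before.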
Upvotes: 0