plot specific columns from a text file

Question

I have a text file, data.txt, which contains several columns. How can I read this file in Python and plot only two chosen columns? For example:

  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
  15 -22.82359691 0.14s
  20 -22.82464363 0.16s
  25 -22.82615348 0.17s
  30 -22.82641815 0.19s
  35 -22.82649347 0.21s
  40 -22.82655376 0.22s
  50 -22.82661407 0.28s
  60 -22.82663535 0.34s
  70 -22.82664864 0.42s
  80 -22.82665962 0.46s
  90 -22.82666308 0.51s
 100 -22.82666662 0.56s

and I need to plot only the first and second columns. Note the space before the first column.

Edit: I used the following code:

import matplotlib.pyplot as plt
from matplotlib import rcParamsDefault
import numpy as np
plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
x, y = np.loadtxt('./calc.dat', delimiter=' ')
plt.plot(x, y, "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")

but for it to work I have to edit my data so that it contains only the two columns I need to plot, without any spaces before the first column, as follows:

 10 -22.82215289  
 12 -22.81978265  
 15 -22.82359691  
 20 -22.82464363  
 25 -22.82615348  
 30 -22.82641815  
 35 -22.82649347  
 40 -22.82655376  
 50 -22.82661407  
 60 -22.82663535  
 70 -22.82664864  
 80 -22.82665962  
 90 -22.82666308  
100 -22.82666662

So, how can I modify the code so that I do not have to edit the data every time?

Ignatius Reilly · Accepted Answer

You can create a DataFrame from a text file using pandas read_csv, which can simplify future processing of the data, besides plotting it.

In this case, the tricky part is the whitespace, which can be handled by setting the optional parameter sep to the regular expression '\s+' (one or more whitespace characters):

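import pandas as pd

# Raw string avoids an invalid-escape warning; header=None because the file has no header row
df = pd.read_csv('data.txt', sep=r'\s+', header=None, names=['foo', 'bar', 'baz'])
>>> df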

index	foo	bar	baz
0	10	-22.82215289	0.11s
1	12	-22.81978265	0.14s
2	15	-22.82359691	0.14s
3	20	-22.82464363	0.16s
4	25	-22.82615348	0.17s
5	30	-22.82641815	0.19s
6	35	-22.82649347	0.21s
7	40	-22.82655376	0.22s
8	50	-22.82661407	0.28s
9	60	-22.82663535	0.34s
10	70	-22.82664864	0.42s
11	80	-22.82665962	0.46s
12	90	-22.82666308	0.51s
13	100	-22.82666662	0.56s
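As a quick check that the leading spaces are not a problem, here is a minimal sketch that parses a few rows copied from the question. It reads from an in-memory string through io.StringIO instead of data.txt, which is only an assumption made to keep the example self-contained.

```python
import io
import pandas as pd

# A few rows copied from the question, leading spaces included.
sample = """\
  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
 100 -22.82666662 0.56s
"""

# sep=r'\s+' splits on any run of whitespace, so the leading spaces and
# the uneven column widths do not create empty fields.
df = pd.read_csv(io.StringIO(sample), sep=r'\s+', header=None,
                 names=['foo', 'bar', 'baz'])

# foo and bar are parsed as numbers; baz stays text because of the trailing "s".
print(df.dtypes)
print(df['foo'].tolist())
```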

And then just your code:

import matplotlib.pyplot as plt

plt.rcParams["figure.dpi"]=150
plt.rcParams["figure.facecolor"]="white"
# Plot the first column against the second
plt.plot(df['foo'], df['bar'], "o-", markersize=5, label='Etot')
plt.xlabel('ecut')
plt.ylabel('Etot')
plt.legend(frameon=False)
plt.savefig("fig.png")

I set the names of the columns to arbitrary strings. You can avoid that by omitting names (keeping header=None), and then just refer to the columns as df[0], df[1].
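The sketch below shows two possible variants on a few rows from the question: pandas with the default integer column labels, and np.loadtxt for anyone who prefers to stay with NumPy as in the question. The in-memory sample and the variable names x_pd, y_pd, x_np, y_np are assumptions for illustration. In both cases usecols restricts reading to the first two columns.

```python
import io
import numpy as np
import pandas as pd

# A few rows copied from the question, leading spaces included.
sample = """\
  10 -22.82215289 0.11s
  12 -22.81978265 0.14s
 100 -22.82666662 0.56s
"""

# pandas: with header=None and no names, the columns are labelled 0, 1, 2.
# usecols=[0, 1] keeps only the first two columns.
df = pd.read_csv(io.StringIO(sample), sep=r'\s+', header=None, usecols=[0, 1])
x_pd, y_pd = df[0], df[1]

# NumPy: the default delimiter is any whitespace, usecols skips the
# non-numeric third column, and unpack=True returns one array per column.
x_np, y_np = np.loadtxt(io.StringIO(sample), usecols=(0, 1), unpack=True)
```

Either pair can be passed to plt.plot in place of df['foo'] and df['bar']. With the real file, replace io.StringIO(sample) with 'data.txt'.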
