Reputation: 13
I have two .txt files that contain the x and y pixel coordinates of thousands of stars in an image. These two different coordinate lists were the products of different data processing methods, which result in slightly different x and y values for the same object.
File 1 *the id_out is arbitrary
id_out x_out y_out m_out 0 803.6550 907.0910 -8.301 1 700.4570 246.7670 -8.333 2 802.2900 894.2130 -8.344 3 894.6710 780.0040 -8.387
File 2
xcen ycen mag merr 31.662 37.089 22.759 0.387 355.899 37.465 19.969 0.550 103.079 37.000 20.839 0.847 113.500 38.628 20.966 0.796
The objects listed in the .txt files are not organized in a way that allows me to identify the same object in both files. So, I thought for every object in file 1, which has fewer objects than file 2, I would impose a test to find the star match between file 1 and file 2. For every star in file 1, I want to find the star in file 2 that has the closest match to x-y coordinates using the distance formula: distance= sqrt((x1-x2)^2 + (y1-y2)^2) within some tolerance of distance that I can change. Then print onto a master list the x1, y1, x2, y2, m_out, mag, and merr parameters in the file.
Here is the code I have so far, but I am not sure how to arrive at a working solution.
#/usr/bin/python
import pandas
import numpy as np
xcen_1 = np.genfromtxt('file1.txt', dtype=float, usecols=1)
ycen_1 = np.genfromtxt('file1.txt', dtype=float, usecols=2)
mag1 = np.genfromtxt('file1.txt', dtype=float, usecols=3)
xcen_2 = np.genfromtxt('file2.txt', dtype=float, usecols=0)
ycen_2 = np.genfromtxt('file2.txt', dtype=float, usecols=1)
mag2 = np.genfromtxt('file2.txt', dtype=float, usecols=2)
merr2 = np.genfromtxt('file2.txt', dtype=float, usecols=3)
tolerance=10.0
i=0
file=open('results.txt', 'w+')
file.write("column names")
for i in len(xcen_1):
dist=np.sqrt((xcen_1[i]-xcen_2[*])^2+(ycen_1[i]-ycen_2[*]^2))
if dist < tolerance:
f.write(i, xcen_1, ycen_1, xcen_2, ycen_2, mag1, mag2, merr2)
else:
pass
i=i+1
file.close
The code doesn't work as I don't know how to implement that every star in file 2 must be run through the test, as indicated by the * index (which comes from idl, in which I am more versed). Is there a solution for this logic, as opposed to the thinking in this case:
Thanks in advance!
Upvotes: 1
Views: 328
Reputation: 2602
You can use pandas Dataframes
. Here's how:
import pandas as pd
# files containing the x and y pixel coordinates and other information
df_file1=pd.read_csv('file1.txt',sep='\s+')
df_file2=pd.read_csv('file2.txt',sep='\s+')
join=[]
for i in range(len(df_file1)):
for j in range(len(df_file2)):
dis=((df_file1['x_out'][i]-df_file2['xcen'][j])**2+(df_file1['y_out'][i]-df_file2['ycen'][j])**2)**0.5
if dis<10:
join.append({'id_out': df_file1['id_out'][i], 'x_out': df_file1['x_out'][i], 'y_out':df_file1['y_out'][i],
'm_out':df_file1['m_out'][i],'xcen':df_file2['xcen'][j],'ycen':df_file2['ycen'][j],
'mag':df_file2['mag'][j],'merr':df_file2['merr'][j]})
df_join=pd.DataFrame(join)
df_join.to_csv('results.txt', sep='\t')
Upvotes: 1