y_1234

Reputation: 61

Pandas Columns Division by a Value

I have a CSV file with 29 columns, like the following:

[Image: screenshot of the CSV file]

Also, I have a folder which contains a bunch of images.

This CSV contains a header row and the ground-truth points for every image in the images folder. The first column is the filename and the next 28 columns are the x and y coordinates of 14 facial keypoints.

My code:

import os

import cv2
import pandas as pd

def load_imgs_and_keypoints(dirname='facial-keypoints', image_size=(100, 100)):

  images = []
  points = []
  base_path = '/content/facial-keypoints/data/images'

  # Get all the images
  for file in os.listdir(base_path):

    # Set the path and read the image
    full_img_path = os.path.join(base_path, file)
    image = cv2.imread(full_img_path)

    # Get the rows and cols of image
    rows = image.shape[0]
    cols = image.shape[1]
    channels = image.shape[2]

    rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    rgb_image = cv2.resize(rgb_image, image_size)
    images.append(rgb_image)

  # Read the csv file which contains facial keypoints
  csv_path = '/content/facial-keypoints/data/gt.csv'
  csv_file = pd.read_csv(csv_path)
  print(csv_file.head())

  print('------------------------------------------------------------------------')

  # Scale the coordinates
  all_X_cols = ['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9', 'x10', 
                'x11', 'x12', 'x13','x14']

  all_Y_cols = ['y1', 'y2', 'y3', 'y4', 'y5', 'y6', 'y7', 'y8', 'y9', 'y10', 
                'y11', 'y12', 'y13','y14']

What I want to do:

Divide all x values by the number of columns of the image (cols), divide all y values by the number of rows of the image (rows), and subtract 0.5 from all values.

I am not sure how to select all the x and y columns from the CSV and apply this.

How can I achieve this task?

Thank You.

Upvotes: 0

Views: 112

Answers (2)

LTheriault

Reputation: 1230

The answer is relatively simple. Essentially, you just loop through the columns and apply the operation you want to each one. You don't need any special functions for this: dividing a Series by a scalar, or subtracting a scalar from it, is a vectorized operation that is applied to every element. You can change how the columns are counted if you want to include the filename column.

all_X_cols = [i for i in csv_file.columns if "x" in i]
all_Y_cols = [i for i in csv_file.columns if "y" in i]

# Divisors: number of CSV columns (excluding filename) and number of CSV rows
num_cols = len([i for i in csv_file.columns if i != "filename"])
num_rows = len(csv_file)

for x in all_X_cols:
    csv_file[x] = (csv_file[x]/num_cols)-.5

for y in all_Y_cols:
    csv_file[y] = (csv_file[y]/num_rows)-.5
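
If the divisors are meant to be each image's own width and height, as the question describes, rather than the shape of the CSV, the same vectorized idea could be sketched as below. This is only a sketch under assumptions: the coordinate columns are named as in the question (`x1`..`x14`, `y1`..`y14`), and the original image sizes have been collected into a mapping from filename to `(rows, cols)`. The `normalize_keypoints` helper and the `sizes` dictionary are hypothetical names, not part of the original code.

```python
import pandas as pd


def normalize_keypoints(df, sizes):
    """Return a copy of df with x/cols - 0.5 and y/rows - 0.5 per image.

    df    : DataFrame with a 'filename' column plus x*/y* coordinate columns.
    sizes : dict mapping filename -> (rows, cols) of the ORIGINAL image
            (hypothetical input, e.g. filled from image.shape[:2]).
    """
    out = df.copy()
    x_cols = [c for c in out.columns if c.startswith("x")]
    y_cols = [c for c in out.columns if c.startswith("y")]

    # Per-row divisors, aligned with the DataFrame index.
    rows = out["filename"].map(lambda f: sizes[f][0])
    cols = out["filename"].map(lambda f: sizes[f][1])

    # div(..., axis=0) divides each selected column row-wise by the Series.
    out[x_cols] = out[x_cols].div(cols, axis=0) - 0.5
    out[y_cols] = out[y_cols].div(rows, axis=0) - 0.5
    return out


# Tiny made-up example (values are illustrative, not from the real gt.csv).
demo = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg"],
    "x1": [50, 20], "y1": [100, 10],
})
demo_sizes = {"a.jpg": (200, 100), "b.jpg": (40, 80)}  # (rows, cols)
result = normalize_keypoints(demo, demo_sizes)
print(result)
```

Here the sizes would have to be recorded before the images are resized, since the ground-truth coordinates refer to the original image dimensions.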

Upvotes: 1

gtomer

Reputation: 6564

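Try a different approach: inside your loop over the image files, update only the rows that belong to the current image.

# Runs inside the loop over files, where file, rows and cols are set.
# df is the DataFrame read from gt.csv; np is numpy (import numpy as np).
for col in df.columns:
    if col[0] == 'x':
        df[col] = np.where(df['filename'] == file, df[col] / cols - 0.5, df[col])
    elif col[0] == 'y':
        # elif skips the filename column, which cannot be divided
        df[col] = np.where(df['filename'] == file, df[col] / rows - 0.5, df[col])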
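
The same per-image update can also be written with a boolean mask and `.loc`, which handles all x columns (and all y columns) in one assignment each. The following is a minimal sketch, assuming the column names from the question; `scale_rows_for_image` is a hypothetical helper name, and the demo values are made up.

```python
import pandas as pd


def scale_rows_for_image(df, filename, rows, cols):
    """Scale, in place, the keypoints of the rows whose 'filename' matches."""
    x_cols = [c for c in df.columns if c.startswith("x")]
    y_cols = [c for c in df.columns if c.startswith("y")]

    # Make sure the coordinate columns are floats before writing fractions.
    df[x_cols + y_cols] = df[x_cols + y_cols].astype(float)

    mask = df["filename"] == filename
    df.loc[mask, x_cols] = df.loc[mask, x_cols] / cols - 0.5
    df.loc[mask, y_cols] = df.loc[mask, y_cols] / rows - 0.5


demo = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg"],
    "x1": [50, 20], "y1": [100, 10],
})
# Only the row for a.jpg is rescaled; b.jpg keeps its pixel coordinates.
scale_rows_for_image(demo, "a.jpg", rows=200, cols=100)
print(demo)
```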

Upvotes: 0
