Add new columns and new column names in Python

Question

I have a CSV file in the following format:

Date,Time,Open,High,Low,Close,Volume
09/22/2003,00:00,1024.5,1025.25,1015.75,1022.0,720382.0
09/23/2003,00:00,1022.0,1035.5,1019.25,1022.0,22441.0
10/22/2003,00:00,1035.0,1036.75,1024.25,1024.5,663229.0

I would like to add 20 new columns to this file. The values in each new column are synthetic, generated simply by drawing random numbers.

It would be something like this:

import pandas as pd
from random import randrange

df = pd.read_csv('dataset.csv')

print(len(df))
input()

for i in range(len(df)):

    # Data that already exists

    date = df.values[i][0]
    time = df.values[i][1]
    open_value= df.values[i][2]
    high_value=df.values[i][3]
    low_value=df.values[i][4]
    close_value=df.values[i][5]
    volume=df.values[i][6]

    #This is the new data
    prediction_1=randrange(3)
    prediction_2=randrange(3)
    prediction_3=randrange(3)
    prediction_4=randrange(3)
    prediction_5=randrange(3)
    prediction_6=randrange(3)
    prediction_7=randrange(3)
    prediction_8=randrange(3)
    prediction_9=randrange(3)
    prediction_10=randrange(3)
    prediction_11=randrange(3)
    prediction_12=randrange(3)
    prediction_13=randrange(3)
    prediction_14=randrange(3)
    prediction_15=randrange(3)
    prediction_16=randrange(3)
    prediction_17=randrange(3)
    prediction_18=randrange(3)
    prediction_19=randrange(3)
    prediction_20=randrange(3)
    
    #How to concatenate these data row by row in a matrix?
    #How to add new column names and save the file?

I would like to concatenate them (old+synthetic data) and, after that, I would like to add 20 new columns named 'synthetic1', 'synthetic2', ..., 'synthetic20', to the existing column names and then save the resulting new dataset in a new text file.

I could do that easily with NumPy, but not all of the data here is numeric, so I don't know how to do it (or whether it is possible at all). Is it possible to do this with pandas or another library?

YOLO · Accepted Answer

Here's one way to do it:

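import numpy as np
import pandas as pd

# n_row must match the number of rows in the existing df
n_row = len(df)
n_col = 20
f = pd.DataFrame(np.random.randint(100, size=(n_row, n_col)), columns=['synthetic' + str(x) for x in range(1, n_col + 1)])

# axis=1 places the new columns next to the existing ones;
# the default axis=0 would stack them underneath as extra rows
df = pd.concat([df, f], axis=1)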
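
For completeness, here is a minimal end-to-end sketch that also covers the saving step from the question. It is one possible approach, not the only one. The helper name `add_synthetic_columns`, its `high` and `seed` parameters, and the `SAMPLE_CSV` constant are hypothetical names introduced for illustration. The sample rows are the three shown in the question, loaded from a string so the snippet runs without a file on disk, and the values are drawn from 0 to 2 to mirror the `randrange(3)` calls in the question.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question; with a real file you would call
# pd.read_csv('dataset.csv') instead.
SAMPLE_CSV = """Date,Time,Open,High,Low,Close,Volume
09/22/2003,00:00,1024.5,1025.25,1015.75,1022.0,720382.0
09/23/2003,00:00,1022.0,1035.5,1019.25,1022.0,22441.0
10/22/2003,00:00,1035.0,1036.75,1024.25,1024.5,663229.0
"""


def add_synthetic_columns(df, n_col=20, high=3, seed=None):
    """Return a new frame: df plus n_col integer columns drawn from [0, high)."""
    rng = np.random.default_rng(seed)
    names = [f"synthetic{i}" for i in range(1, n_col + 1)]
    synthetic = pd.DataFrame(
        rng.integers(0, high, size=(len(df), n_col)),
        columns=names,
        index=df.index,  # share df's index so concat lines the rows up
    )
    # axis=1 appends columns; the original df is left unchanged.
    return pd.concat([df, synthetic], axis=1)


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
result = add_synthetic_columns(df, seed=0)

# With no path argument, to_csv returns the CSV text; pass a filename
# (any name you like) to write the new dataset to disk instead.
csv_text = result.to_csv(index=False)
print(result.shape)  # (3, 27)
```

Passing `index=df.index` matters when the existing frame does not have the default 0..n-1 index (for example after filtering rows), because `pd.concat` with `axis=1` aligns rows by index label rather than by position.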
