Sourav

Reputation: 866

Arranging numpy array

I have more than 20 satellite images of an agricultural field from 3 different satellites. Each file name contains the data collection date and the satellite name: the first two digits are the month, the next two digits are the day, and the last part is the satellite name. Six images are used in this example.

Each image is passed through a loop in which it is converted into a numpy array. The code is:

import arcpy
import numpy

image_list = ["D:/06.10.SkySat.tif", "D:/06.30.SkySat.tif", "D:/06.06.RapidEye.tif",
              "D:/06.16.RapidEye.tif", "D:/06.26.PlanetScope.tif", "D:/06.30.PlanetScope.tif"]

for image in image_list:

    # convert the raster image to a numpy array
    array = arcpy.RasterToNumPyArray(image, nodata_to_value=9999)
    # mask out the no-data value and flatten to a one-dimensional array
    marray = numpy.ma.masked_values(array, 9999)
    new_array = marray.flatten()

    # extract the date and satellite name from the file name
    date = image[3:8]
    satellite = image[9:-4]

For each image this gives me a one-dimensional array, one date, and one string (the satellite name). For further use I want them in the format shown below. The data will have three columns: the first holds all pixel values from the array, the second holds the date, and the last holds the satellite name.

Value       Date       Satellite
0.05825     6/15/2018   SkySat
0.07967976  6/15/2018   SkySat
0.09638854  6/15/2018   SkySat
0.12477265  6/15/2018   SkySat
0.13941683  6/15/2018   SkySat
0.13072205  6/15/2018   SkySat
0.12254229  6/15/2018   SkySat
0.13378483  6/15/2018   SkySat
0.13875392  6/15/2018   SkySat
0.14010076  6/10/2018   PlanetScope
0.1371166   6/10/2018   PlanetScope
0.13878246  6/10/2018   PlanetScope
0.1351179   6/10/2018   PlanetScope
0.16816537  6/10/2018   PlanetScope
0.16348109  6/10/2018   PlanetScope
0.15997969  6/10/2018   PlanetScope
0.16568226  6/10/2018   PlanetScope
0.190534599 6/12/2018   RapidEye
0.219114789 6/12/2018   RapidEye
0.251982007 6/12/2018   RapidEye
0.289779308 6/12/2018   RapidEye
0.333246204 6/12/2018   RapidEye

Is there any way to arrange the data in this format and then write it to a CSV or text file?

Upvotes: 0

Views: 227

Answers (2)

b-fg

Reputation: 4137

Create a pandas.DataFrame with columns=['Value', 'Date', 'Satellite'], and for each image append the new data by concatenating the current dataframe with a new one built from that image. In the dataframe you generate for each image, the date and satellite name have to be repeated once per pixel value. You can also convert the dates to the pandas date format with pd.to_datetime. It should look something like this:

import arcpy
import numpy
import pandas as pd

image_list = ["D:/06.10.SkySat.tif", "D:/06.30.SkySat.tif", "D:/06.06.RapidEye.tif",
              "D:/06.16.RapidEye.tif", "D:/06.26.PlanetScope.tif", "D:/06.30.PlanetScope.tif"]

df = pd.DataFrame(columns=['Value', 'Date', 'Satellite'])

for image in image_list:

    # convert the raster image to a numpy array
    array = arcpy.RasterToNumPyArray(image, nodata_to_value=9999)
    # mask out the no-data value and flatten to a one-dimensional array
    marray = numpy.ma.masked_values(array, 9999)
    new_array = marray.flatten()

    # extract the date and satellite name from the file name;
    # '%m.%d' carries no year, so the parsed date gets the default year 1900
    date = pd.to_datetime(image[3:8], format='%m.%d', errors='coerce')
    satellite = image[9:-4]

    # repeat the date and satellite name once per pixel value
    df2 = pd.DataFrame({'Value': new_array, 'Date': [date]*new_array.size, 'Satellite': [satellite]*new_array.size})

    df = pd.concat([df, df2], ignore_index=True)

print(df) # Should output your expected columns
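A variant of the same idea, as a rough sketch rather than a tested arcpy workflow: collect one dataframe per image in a list, concatenate once at the end, and write the result out as CSV. Everything named fake_rasters here is a hypothetical stand-in for the arrays that arcpy.RasterToNumPyArray would return, and the year 2018 is only assumed from the dates in the question's example table. Note that compressed() drops the masked no-data pixels, whereas flatten() keeps them as masked entries.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-ins for arcpy.RasterToNumPyArray output; 9999 marks no-data.
fake_rasters = {
    "D:/06.10.SkySat.tif": np.array([[0.05, 9999.0], [0.07, 0.09]]),
    "D:/06.26.PlanetScope.tif": np.array([[0.14, 0.13], [9999.0, 0.16]]),
}

frames = []
for image, array in fake_rasters.items():
    marray = np.ma.masked_values(array, 9999)
    values = marray.compressed()  # 1-D array holding only the unmasked pixels

    # The year 2018 is an assumption taken from the question's example table.
    date = pd.to_datetime("2018." + image[3:8], format="%Y.%m.%d")
    satellite = image[9:-4]

    # Scalar values are broadcast to the length of the 'Value' column.
    frames.append(pd.DataFrame({"Value": values, "Date": date, "Satellite": satellite}))

# Concatenate once, after the loop, instead of growing df on every iteration.
df = pd.concat(frames, ignore_index=True)

# Written to an in-memory buffer here; pass a file path to write to disk instead.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing index=False leaves the row index out of the file, so the CSV has only the three columns asked for.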

Upvotes: 0

gauravtolani

Reputation: 130

Welcome to Stack Overflow, Sourav!

As I see it, you just want to repeat the 'date' and 'satellite name' values for each element of the corresponding 1-d 'value' array.

Consider the example below:

value1 = [1,2,3]
date1 = '1 sep'
satellite_name1 = 'sauravyan'

You can use numpy's 'repeat' function:

import numpy as np

date1 = np.repeat(date1, len(value1))
satellite_name1 = np.repeat(satellite_name1, len(value1))

This makes an array with the date repeated any number of times; in your case, once for each element of the values array.
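To make the result concrete, here is the same toy example as a self-contained snippet. The names are the ones from the example above, nothing here comes from your real data.

```python
import numpy as np

value1 = [1, 2, 3]

# np.repeat on a scalar string returns a 1-D array with that string repeated.
date1 = np.repeat("1 sep", len(value1))
satellite_name1 = np.repeat("sauravyan", len(value1))

print(date1.tolist())            # ['1 sep', '1 sep', '1 sep']
print(satellite_name1.tolist())  # ['sauravyan', 'sauravyan', 'sauravyan']
```

All three sequences now have the same length, which is what a dataframe needs for its columns.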

To finally write everything to a CSV, I think the best approach is:

(i) Push everything into a dictionary:

d['values'].extend(value1)

d['dates'].extend(date1)

d['satellites'].extend(satellite_name1)

*Remember to create the dictionary, with 'values', 'dates' and 'satellites' as keys each mapped to an empty list, before the 'for' loop.

(ii) Convert your dictionary 'd' into a dataframe:

data = pd.DataFrame(d)

(iii) And finally convert your dataframe to a csv:

data.to_csv('filepath/filename.csv')  # replace with your own path and file name

Applied to your code:

Just change these lines in the 'for' loop:

date = np.repeat(image[3:8], len(new_array))
#similarly for the satellite name

Push all three variables into the dictionary.

After the for loop ends, convert your dictionary to a dataframe and then write it to a CSV.
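Put together, the whole thing might look roughly like the sketch below. Since arcpy is not available everywhere, pixel_arrays is a hypothetical stand-in for the flattened arrays your loop produces, and the output file name pixels.csv is made up for the example.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the flattened arrays produced inside the loop.
pixel_arrays = {
    "D:/06.30.SkySat.tif": np.array([0.058, 0.080, 0.096]),
    "D:/06.06.RapidEye.tif": np.array([0.191, 0.219]),
}

# Create the dictionary with its three keys before the loop.
d = {"values": [], "dates": [], "satellites": []}

for image, new_array in pixel_arrays.items():
    d["values"].extend(new_array)
    # Repeat the date and satellite name once per pixel value.
    d["dates"].extend(np.repeat(image[3:8], len(new_array)))
    d["satellites"].extend(np.repeat(image[9:-4], len(new_array)))

# After the loop: dictionary -> dataframe -> CSV.
data = pd.DataFrame(d)

# Written to a temporary directory here; use your own path in practice.
csv_path = os.path.join(tempfile.mkdtemp(), "pixels.csv")
data.to_csv(csv_path, index=False)
```

The dates stay as plain strings such as "06.30" in this version; convert them with pd.to_datetime if you need real dates.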

Comment in case of any doubts.

Hope it helps.

Upvotes: 1
