user3493124

Reputation: 11

Python line by line data processing

I am new to Python. I searched a few articles but could not find the correct syntax for reading a file and doing awk-style line processing in Python. I need your help solving this problem.

This is how my bash script for build and deploy looks. It reads a configuration file that looks like this:

backup             /apps/backup
oracle             /opt/qosmon/qostool/oracle    oracle-client-12.1.0.1.0

The section of the bash script that reads it looks like this:

while read line
  do
    case "$line" in */package*) continue ;; esac
    host_file_array+=("$line")
  done < ${HOST_FILE}
 for ((i=0 ; i < ${#host_file_array[*]}; i++))
  do
    # echo "${host_file_array[i]}"
     host_file_line="${host_file_array[i]}"

     if [[ "$host_file_line" != "#"* ]];
     then
       COMPONENT_NAME=$(echo $host_file_line  | awk '{print $1;}' )
       DIRECTORY=$(echo $host_file_line  | awk '{print $2;}' )
       VERSION=$(echo $host_file_line  | awk '{print $3;}' )

       if [[ ("${COMPONENT_NAME}" == *"oracle"*)  ]];
       then
         print_parameters "Status ${DIRECTORY}/${COMPONENT_NAME}"
         /bin/bash ${DIRECTORY}/${COMPONENT_NAME}/current/script/manage-oracle.sh  ${FORMAT_STRING} start
       fi
     etc .........

How can the same be converted to Python? This is what I have prepared so far:

f  = open ('%s' % host_file,"r")
array = []
line = f.readline()
index = 0
while line:
    line = line.strip("\n ' '")
    line=line.split()
    array.append([])
    for item in line:
        array[index].append(item)
    line = f.readline()
    index+= 1
f.close()

I tried split in Python, but since the config file does not have the same number of columns in every row, I get an index out of range error. What is the best way to process it?

Upvotes: 0

Views: 296

Answers (2)

msvalkon

Reputation: 12077

I think dictionaries might be a good fit here. You can generate them as follows:

>>> result = []
>>> keys = ["COMPONENT_NAME", "DIRECTORY", "VERSION"]
>>> with open(host_file) as f:
...     for line in f:
...         result.append(dict(zip(keys, line.strip().split())))
...     
>>> result
[{'DIRECTORY': '/apps/backup', 'COMPONENT_NAME': 'backup'},
 {'DIRECTORY': '/opt/qosmon/qostool/oracle', 'VERSION': 'oracle-client-12.1.0.1.0', 'COMPONENT_NAME': 'oracle'}]

As you can see, this creates a list of dictionaries. When you access them, keep in mind that some of them might not contain a 'VERSION' key. There are multiple ways of handling this: either wrap the lookup in try/except KeyError, or get the value using dict.get() with a default.

Example:

>>> for r in result:
...     print(r.get('VERSION', "No version"))
...
No version
oracle-client-12.1.0.1.0
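
To tie this back to the bash script in the question, here is a minimal sketch that also applies the same filters (skip lines containing "/package" and lines starting with "#"). The function name parse_host_lines and the in-memory sample are my own, for illustration only. Skipping blank lines is an extra assumption that is not in the original script.

```python
import io

KEYS = ["COMPONENT_NAME", "DIRECTORY", "VERSION"]

def parse_host_lines(lines):
    """Turn config lines into dicts, mirroring the filters in the bash script."""
    result = []
    for line in lines:
        line = line.strip()
        # Skip comment lines and lines containing "/package", as the bash
        # script does. Skipping blank lines is an added assumption.
        if not line or line.startswith("#") or "/package" in line:
            continue
        # zip() stops at the shorter sequence, so a row without a third
        # column simply has no 'VERSION' key instead of raising IndexError.
        result.append(dict(zip(KEYS, line.split())))
    return result

# Stand-in for the real file; with a file use: with open(host_file) as f: ...
sample = io.StringIO(
    "# a comment line\n"
    "backup             /apps/backup\n"
    "oracle             /opt/qosmon/qostool/oracle    oracle-client-12.1.0.1.0\n"
)
entries = parse_host_lines(sample)

for entry in entries:
    if "oracle" in entry["COMPONENT_NAME"]:
        print(entry["DIRECTORY"], entry.get("VERSION", "No version"))
```

Any iterable of lines works as input, so an open file object can be passed in directly.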

Upvotes: 1

dugres

Reputation: 13093

result = [line.strip().split() for line in open(host_file)]
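
With a plain list of lists like this, rows with fewer columns are shorter, so indexing row[2] on them is what raises the IndexError from the question. One possible way around it, sketched here assuming at most three columns are of interest, is to pad each row to a fixed width. split_padded is a hypothetical helper name, and the sample lines stand in for the real file.

```python
def split_padded(line, width=3):
    """Split on whitespace and pad with None (or truncate) to exactly `width` fields."""
    fields = line.split()
    return (fields + [None] * width)[:width]

# Stand-in for the lines of the config file.
sample_lines = [
    "backup             /apps/backup\n",
    "oracle             /opt/qosmon/qostool/oracle    oracle-client-12.1.0.1.0\n",
]
rows = [split_padded(line) for line in sample_lines]

# Every row now has three items, so unpacking is safe.
for component_name, directory, version in rows:
    print(component_name, directory, version)
```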

Upvotes: 0
