Shahad

Reputation: 25

How to use the StringIO function on a DataFrame in Python?

I have a data frame with two columns (Service Name, Port Number), where the values in Service Name are objects and the values in Port Number are ints. When I tried to convert it to StringIO, I got TypeError: initial_value must be str or None, not DataFrame.

I then converted the data frame to a string with str(data), which StringIO accepts, but when I loop over the result I get the following error: ValueError: not enough values to unpack (expected 2, got 1).
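The second error can be reproduced without pandas. The sketch below assumes that str(data) produces whitespace-aligned text roughly like the sample string (the exact layout is an assumption, not taken from the real data frame). Because that text contains no commas, csv.reader returns every line as a single field, and unpacking one field into two names fails.

```python
import csv
from io import StringIO

# Assumed stand-in for str(data): columns aligned with spaces, no commas.
text = """\
  Service Name  Port Number
0       Port_0            0
1       tcpmux            1
"""

rows = list(csv.reader(StringIO(text)))  # default delimiter is ','

# Each line comes back as a one-element list.
field_counts = [len(row) for row in rows]
print(field_counts)

try:
    service_name, port_number = rows[1]
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```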

These are the first rows of my file.

Service Name    Port Number
Port_0  0
tcpmux  1
compressnet 2
compressnet 3
Unassigned  4
rje 5
Unassigned  6
echo    7
Unassigned  8
discard 9
Unassigned  10
systat  11
Unassigned  12

This is the loop I am trying to run:

#converting the "-" into a range and adding back to the data frame

import csv

def extend_ports(file, delim=','):
   handle = csv.reader(file, delimiter=delim)
   yield next(handle)  # pass the header row through unchanged
   for row in handle:
      try:
         service_name, port_number = row
      except ValueError:
          print(f"Could not parse line '{row}'")
          raise
      if '-' not in port_number:
         yield [service_name, port_number]  # simple result
      else:
         start, end = map(int, port_number.split('-'))
         for port in map(str, range(start, end+1)):
            yield [service_name, port]  # expanded result

# get the result
result = list(extend_ports(data3))

This code converts each "-" range into individual port numbers and adds them back to the data frame with their service name. For example, 272-276 mapped to "portx" is expanded to 272, 273, 274, 275, 276, each mapped to "portx".
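The range expansion described above can be looked at in isolation. This is a minimal sketch; expand_range is a hypothetical helper name used only for illustration and is not part of the code in this question.

```python
def expand_range(port_spec):
    """Return the ports named by port_spec as a list of strings.

    'start-end' is expanded inclusively; a single port is returned as-is.
    """
    if '-' not in port_spec:
        return [port_spec]
    start, end = map(int, port_spec.split('-'))
    return [str(port) for port in range(start, end + 1)]

print(expand_range("272-276"))
```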

I think the error message I get when I try to loop is more important than the code here.

I have solved this problem the hard way, by giving the input directly as text:

from io import StringIO

data = StringIO("""\
Service Name,Port Number
pt-tls,271
pt-tls,271
Unassigned,272-279
http-mgmt,280
http-mgmt,280
personal-link,281
personal-link,281
cableport-ax,282
cableport-ax,282
rescap,283
rescap,283
corerjd,284
corerjd,284
Unassigned,285
fxp,286
fxp,286
k-block,287
k-block,287
Unassigned,288-307
novastorbakcup,308
novastorbakcup,308
""")

With the above input I got the following result:

['Service Name', 'Port Number']
['pt-tls', '271']
['pt-tls', '271']
['Unassigned', '272']
['Unassigned', '273']
['Unassigned', '274']
['Unassigned', '275']
['Unassigned', '276']
['Unassigned', '277']
...
['Unassigned', '306']
['Unassigned', '307']
['novastorbakcup', '308']
['novastorbakcup', '308']

The above result is what I want to get from the data frame. Thanks in advance.

Upvotes: 0

Views: 647

Answers (2)

anujagarwal

Reputation: 1

Try loading all the data in the same loop, directly from the file. There is no need for StringIO.

For example, for a tab-separated file:

result = list(extend_ports(open("path/filename.csv"), "\t"))

Upvotes: 0

Rakesh

Reputation: 82785

This also works when reading from a CSV file.

Demo:

import csv

def extend_ports(file, delim=','):
    handle = csv.reader(file, delimiter=delim)
    yield next(handle)  # pass the header row through unchanged
    for row in handle:
        try:
            service_name, port_number = row
        except ValueError:
            print(f"Could not parse line '{row}'")
            raise
        if '-' not in port_number:
            yield [service_name, port_number]  # simple result
        else:
            start, end = map(int, port_number.split('-'))
            for port in map(str, range(start, end+1)):
                yield [service_name, port]  # expanded result

# get the result
with open(filename, "r", newline="") as f:  # open the file for reading
    result = list(extend_ports(f))
print(result)

Output:

[['Service Name', 'Port Number'],
 ['pt-tls', '271'],
 ['pt-tls', '271'],
 ['Unassigned', '272'],
 ['Unassigned', '273'],
 ['Unassigned', '274'],
 ['Unassigned', '275'],
 ['Unassigned', '276'],
 ['Unassigned', '277'],
 ['Unassigned', '278'],
 ['Unassigned', '279'],
 ['http-mgmt', '280'],
 ['http-mgmt', '280'],
 ['personal-link', '281'],
 ['personal-link', '281'],
 ['cableport-ax', '282'],
 ['cableport-ax', '282'],
 ['rescap', '283'],
 ['rescap', '283'],
 ['corerjd', '284'],
 ['corerjd', '284'],
 ['Unassigned', '285'],
 ['fxp', '286'],
 ['fxp', '286'],
 ['k-block', '287'],
 ['k-block', '287'],
 ['Unassigned', '288'],
 ['Unassigned', '289'],
 ['Unassigned', '290'],
 ['Unassigned', '291'],
 ['Unassigned', '292'],
 ['Unassigned', '293'],
 ['Unassigned', '294'],
 ['Unassigned', '295'],
 ['Unassigned', '296'],
 ['Unassigned', '297'],
 ['Unassigned', '298'],
 ['Unassigned', '299'],
 ['Unassigned', '300'],
 ['Unassigned', '301'],
 ['Unassigned', '302'],
 ['Unassigned', '303'],
 ['Unassigned', '304'],
 ['Unassigned', '305'],
 ['Unassigned', '306'],
 ['Unassigned', '307'],
 ['novastorbakcup', '308'],
 ['novastorbakcup', '308']]
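If the data is already in a pandas DataFrame, as in the question, one possible route is to serialize it to CSV text first and wrap that text in StringIO. The sketch below assumes pandas is installed and that the Port Number column holds strings such as "272-274"; the small frame is made-up sample data, not the asker's file.

```python
import csv
from io import StringIO

import pandas as pd


def extend_ports(file, delim=','):
    handle = csv.reader(file, delimiter=delim)
    yield next(handle)  # pass the header row through unchanged
    for row in handle:
        service_name, port_number = row
        if '-' not in port_number:
            yield [service_name, port_number]
        else:
            start, end = map(int, port_number.split('-'))
            for port in map(str, range(start, end + 1)):
                yield [service_name, port]


# Made-up sample frame standing in for the real data.
df = pd.DataFrame({
    "Service Name": ["pt-tls", "Unassigned", "http-mgmt"],
    "Port Number": ["271", "272-274", "280"],
})

# to_csv() with no path returns the CSV as a str, which StringIO accepts.
buffer = StringIO(df.to_csv(index=False))
rows = list(extend_ports(buffer))

# Rebuild a DataFrame from the expanded rows, using the first row as header.
expanded = pd.DataFrame(rows[1:], columns=rows[0])
print(expanded)
```

StringIO(str(df)) does not work for this purpose because str(df) is a display format rather than CSV, while to_csv(index=False) produces comma-separated text without the index column.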

Upvotes: 1
