Appending data into pandas dataframe

Question

I'm building a system where raspberry pi receives data via bluetooth and parses it into pandas dataframe for further processing. However, there are a few issues. The bluetooth packets are converted into a pandas Series object which I attempted to append into the empty dataframe unsuccesfully. Splitting below is performed in order to extract telemetry from a bluetooth packet.

Code creates a suitable dataframe with correct column names, but when I append into it, the Series object's row numbers become new columns. Each appended series is a single row in the final dataframe. What I want to know is: How do I add Series object into the dataframe so that values are put into columns with indices from 0 to 6 instead of from 7 to 14?

Edit: Added a screenshot with, output on the top, multiple of pkt below.

Edit2: Added full code per request. Added error traceback.

import time
import sys
import subprocess
import pandas as pd
import numpy as np

class Scan:
    def __init__(self, count, columns):
        self.running = True
        self.count = count
        self.columns = columns

    def run(self):
        i_count = 0
        p_data = pd.DataFrame(columns=self.columns, dtype='str')

        while self.running:
            output = subprocess.check_output(["commands", "to", "follow.py"]).decode('utf-8')
            p_rows = output.split(";")
            series_list = []
            print(len(self.columns))

            for packet in p_rows:
                pkt = pd.Series(packet.split(","),dtype='str', index=self.columns)
                pkt = pkt.replace('
','',regex=True)
                print(len(pkt))
                series_list.append(pkt)
            p_data = pd.DataFrame(pd.concat(series_list, axis=1)).T

            print(p_data.head())
            print(p_rows[0])
            print(list(p_data.columns.values))

            if i_count  == self.count:
                self.running = False
                sys.exit()
            else:
                i_count += 1
            time.sleep(10)

def main():
    columns = ['mac', 'rssi', 'voltage', 'temperature', 'ad count', 't since boot', 'other']
    scan = Scan(0, columns)

while True:
    scan.run()

if __name__ == '__main__':
    main()

Traceback (most recent call last): File "blescanner.py", line 48, in main() File "blescanner.py", line 45, in main scan.run()

File "blescanner.py", line 24, in run pkt = pd.Series(packet.split(","),dtype='str', index=self.columns)

File "/mypythonpath/site-packages/pandas/core/series.py", line 262, in init .format(val=len(data), ind=len(index)))

ValueError: Length of passed values is 1, index implies 7

dan_g · Accepted Answer

You don't want to append to a DataFrame in that way. What you can do instead is create a list of series, and concatenate them together.

So, something like this:

series_list = []
for packet in p_rows:
    pkt = pd.Series(packet.split(","),dtype='str')
    print(pkt)
    series_list.append(pkt)
p_data = pd.DataFrame(pd.concat(series_list), columns=self.columns, dtype='str')

As long as you don't specify ignore_index=True in the pd.concat call the index will not be reset (the default is ignore_index=False)

Edit:

It's not clear from your question, but if you're trying to add the series as new columns (instead of stack on top of each other), then change the last line from above to:

p_data = pd.concat(series_list, axis=1)
p_data.columns = self.columns

Edit2:

Still not entirely clear, but it sounds like (from your edit) that you want to transpose the series to be the rows, where the index of the series becomes your columns. I.e.:

series_list = []
for packet in p_rows:
    pkt = pd.Series(packet.split(","), dtype='str', index=self.columns)
    series_list.append(pkt)
p_data = pd.DataFrame(pd.concat(series_list, axis=1)).T

Edit 3: Based on your picture of output, when you split on ; the last element in your list is empty. E.g.:

output = """f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None;
            f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None;"""

output.split(';')

['f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None',
 '
            f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None',
 '']

So instead of for packet in p_rows do for packet in p_rows[:-1]

Full example:

columns = ['mac', 'rssi', 'voltage', 'temperature', 'ad count', 't since boot', 'other']

output = """f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None;
            f1:07:ad:6b:97:c8,-24,2800,23.00,17962365,25509655,None;"""
p_rows = output.split(";")
series_list = []

for packet in p_rows[:-1]:
    pkt = pd.Series(packet.strip().split(","), dtype='str', index=columns)
    series_list.append(pkt)
p_data = pd.DataFrame(pd.concat(series_list, axis=1)).T

produces

                 mac rssi voltage temperature  ad count t since boot other
0  f1:07:ad:6b:97:c8  -24    2800       23.00  17962365     25509655  None
1  f1:07:ad:6b:97:c8  -24    2800       23.00  17962365     25509655  None

Appending data into pandas dataframe

Answers (2)

Related Questions