Add one dataframe after another but increase the index

Question

I have an original dataframe and a second one that I would like to append to it. There is a column with IDs (QID), and I would like the IDs of the appended rows to continue from the highest QID of the first dataframe. I already know how to append one dataframe to another; the column names of the second dataframe are all contained in the first.

df_qb.append(dfgrouped)

So far I have tried to get the maximum of the QID column of the original dataframe:

# get highest QID and start the QID of the appended rows from here
max_qid = df_qb.QID.astype(dtype = int, errors = 'ignore').max()

But it raises:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in 
      1 # get highest QID and start the QID of the appended rows from here
----> 2 max_qid = df_qb.QID.astype(dtype = int, errors = 'ignore').max()

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
  11213             return self._agg_by_level(name, axis=axis, level=level, skipna=skipna)
  11214         return self._reduce(
> 11215             f, name, axis=axis, skipna=skipna, numeric_only=numeric_only
  11216         )
  11217 

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3889                 )
   3890             with np.errstate(all="ignore"):
-> 3891                 return op(delegate, skipna=skipna, **kwds)
   3892 
   3893         # TODO(EA) dispatch to Index

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\nanops.py in f(values, axis, skipna, **kwds)
    123                     result = alt(values, axis=axis, skipna=skipna, **kwds)
    124             else:
--> 125                 result = alt(values, axis=axis, skipna=skipna, **kwds)
    126 
    127             return result

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\nanops.py in reduction(values, axis, skipna, mask)
    835                 result = np.nan
    836         else:
--> 837             result = getattr(values, meth)(axis)
    838 
    839         result = _wrap_results(result, dtype, fill_value)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\_methods.py in _amax(a, axis, out, keepdims, initial, where)
     28 def _amax(a, axis=None, out=None, keepdims=False,
     29           initial=_NoValue, where=True):
---> 30     return umr_maximum(a, axis, None, out, keepdims, initial, where)
     31 
     32 def _amin(a, axis=None, out=None, keepdims=False,

TypeError: '>=' not supported between instances of 'str' and 'float'

jezrael · Accepted Answer

If you check the documentation for Series.astype:

errors : {'raise', 'ignore'}, default 'raise'
Control raising of exceptions on invalid data for provided dtype.

raise : allow exceptions to be raised
ignore : suppress exceptions. On error return original object.
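To illustrate the difference, here is a minimal sketch with a made-up QID column (the values are hypothetical, not taken from the question's data). With errors='ignore' the failed conversion hands back the original, non-numeric column, while to_numeric with errors='coerce' turns unparseable entries into NaN:

```python
import pandas as pd

# Hypothetical QID column: numeric strings, one unparseable string, one missing value
qid = pd.Series(["1", "2", "abc", None])

# errors="ignore": the conversion fails on "abc", so the original column comes back
unchanged = qid.astype(int, errors="ignore")
print(pd.api.types.is_numeric_dtype(unchanged))  # False

# errors="coerce": unparseable entries become NaN, the rest become numbers
numeric = pd.to_numeric(qid, errors="coerce")
print(numeric.max())  # 2.0
```

Because the column returned by astype is still non-numeric, calling max() on it ends up comparing strings with missing values, which is consistent with the TypeError in the question.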

So you need to_numeric with errors = 'coerce' to convert the values to numbers; values that cannot be parsed become NaN:

max_qid = pd.to_numeric(df_qb.QID, errors = 'coerce').max()
dfgrouped['QID'] = np.arange(max_qid + 1, max_qid + len(dfgrouped) + 1)

df = df_qb.append(dfgrouped)
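Note that DataFrame.append was removed in pandas 2.0, so on current versions the last step needs pd.concat instead. A self-contained sketch of the same approach follows; the two small dataframes and the text column are hypothetical stand-ins for df_qb and dfgrouped, and the int() cast assumes that at least one QID is numeric:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the question's df_qb and dfgrouped
df_qb = pd.DataFrame({"QID": ["1", "2", "abc", None], "text": ["a", "b", "c", "d"]})
dfgrouped = pd.DataFrame({"text": ["e", "f"]})

# Coerce to numbers; unparseable entries become NaN and are skipped by max().
# int() would fail on NaN, so this assumes at least one QID is numeric.
max_qid = int(pd.to_numeric(df_qb["QID"], errors="coerce").max())

# Continue the numbering from the highest existing QID
dfgrouped["QID"] = np.arange(max_qid + 1, max_qid + len(dfgrouped) + 1)

# pd.concat replaces DataFrame.append; ignore_index rebuilds the row index 0..n-1
df = pd.concat([df_qb, dfgrouped], ignore_index=True)
print(dfgrouped["QID"].tolist())  # [3, 4]
```

Passing ignore_index=True is optional; without it, the appended rows keep their own index labels, which matches what append did by default.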
