Reputation: 3
I want to split the strings in this DataFrame column into words, where each word starts with an uppercase letter.
Unicainstancia_DF['TesteNomeJuiz']
0 ClinicadeOlhosSaoPauloLtda-Me
1 PatriciaAparecidaMendesFerreira
2 CarraroHoldingParticipaçõesLtda
3 IsadoraCentofantiFonseca
4 Petruso&PetrusoSupermercadosLtda
....
Name: TesteNomeJuiz, Length: 1510, dtype: object
I already tried a function that should do this, but it doesn't seem to work:
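from re import finditer
def camel_case_split(identifier):
    # Lazily match up to each lower-to-upper boundary, or up to the last capital of a run of capitals
    matches = finditer('.+?(?:(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])|$)', identifier)
    return [m.group(0) for m in matches]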
Unicainstancia_DF['TesteNomeJuiz'].astype('str')
splitted = re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', Unicainstancia_DF['TesteNomeJuiz'])).split
TypeError
Traceback (most recent call last)
<ipython-input-56-00cc9d0b832f> in <module>
1 Unicainstancia_DF['TesteNomeJuiz'].astype('str')
--> 2 splitted = re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1',
Unicainstancia_DF['TesteNomeJuiz'])).split
F:\Anaconda\lib\re.py in sub(pattern, repl, string, count, flags)
208 a callable, it's passed the Match object and must return
209 a replacement string to be used."""
--> 210 return _compile(pattern, flags).sub(repl, string, count)
211
212 def subn(pattern, repl, string, count=0, flags=0):
TypeError: expected string or bytes-like object
I also tried calling the info() function on the column, but that doesn't work either:
Unicainstancia_DF['TesteNomeJuiz'].info()
AttributeError Traceback (most recent call last)
<ipython-input-57-403d0ae3c1ac> in <module>
--> 1 Unicainstancia_DF['TesteNomeJuiz'].info()
F:\Anaconda\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
5272 if self._info_axis._can_hold_identifiers_and_holds_name(name):
5273 return self[name]
->5274 return object.__getattribute__(self, name)
5275
5276 def __setattr__(self, name: str, value) -> None:
AttributeError: 'Series' object has no attribute 'info'
Upvotes: 0
Views: 1054
Reputation: 1794
In the pandas version you're running, .info() can only be called on a pandas.DataFrame, not on a pandas.Series (Series.info() was only added in pandas 1.4).
Assuming Unicainstancia_DF is a DataFrame, you can call Unicainstancia_DF.info(), but not Unicainstancia_DF['TesteNomeJuiz'].info().
Unicainstancia_DF['TesteNomeJuiz'] is a column selector: it picks a single column (a Series) out of the DataFrame so that you can then do something with it.
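To make the DataFrame/Series distinction concrete, here is a minimal sketch. The frame df is a hypothetical stand-in for Unicainstancia_DF built from two of the names in the question, not your real data:

```python
import io
import pandas as pd

# Hypothetical stand-in for Unicainstancia_DF, using two names from the question
df = pd.DataFrame({"TesteNomeJuiz": ["IsadoraCentofantiFonseca",
                                     "PatriciaAparecidaMendesFerreira"]})
column = df["TesteNomeJuiz"]

frame_type = type(df).__name__       # "DataFrame"
column_type = type(column).__name__  # "Series"
print(frame_type, column_type)

# DataFrame.info() writes its summary to stdout, or to buf if one is given
buffer = io.StringIO()
df.info(buf=buffer)
info_text = buffer.getvalue()
print(info_text)

# These Series attributes and methods give similar information
print(column.dtype)
print(column.describe())
```

If all you need is a quick look at the column, dtype and describe() should work on a Series regardless of the pandas version.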
What, precisely, you want to do with that Series isn't clear to me from your example. If you want to split on A-Z, then you could do something like this:
import re
print([re.split(r'[A-Z]', x) for x in Unicainstancia_DF['TesteNomeJuiz']])
But as Chris suggests, if you clarify your expected output and where you want to store the splits, I can be more specific. It seems doubtful that you actually want to split on A-Z, since that discards the uppercase letters themselves. More likely you want to split on the boundary between A-Z and any other character. Is that the case?
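If the boundary split is what you are after, something along these lines might work. It is only a sketch under assumptions: the helper split_on_case_boundary and the column name Palavras are made up for illustration, and the sample frame stands in for your real one. It also addresses the TypeError in the question: re.sub and re.split expect a single string, so the function has to be applied per element (here with Series.map) rather than being handed the whole Series. Note too that .astype('str') returns a new Series, so its result has to be assigned or chained to have any effect.

```python
import re
import pandas as pd

def split_on_case_boundary(text):
    """Split wherever a lowercase ASCII letter is directly followed by an uppercase one."""
    # Zero-width split: no characters are consumed, so the capitals are kept.
    # Needs Python 3.7+; accented lowercase letters such as 'é' are not in [a-z].
    return re.split(r"(?<=[a-z])(?=[A-Z])", text)

# Hypothetical stand-in for Unicainstancia_DF, with names taken from the question
df = pd.DataFrame({"TesteNomeJuiz": [
    "ClinicadeOlhosSaoPauloLtda-Me",
    "IsadoraCentofantiFonseca",
    "Petruso&PetrusoSupermercadosLtda",
]})

# Apply the function to each string; the result is a Series of lists
parts = df["TesteNomeJuiz"].astype(str).map(split_on_case_boundary)

# Optionally store the words as one space-separated string per row
df["Palavras"] = parts.map(" ".join)

print(parts.tolist())
print(df["Palavras"].tolist())
```

With this pattern 'Ltda-Me' and 'Petruso&Petruso' stay in one piece, because '-' and '&' are not lowercase letters. Whether that is the behavior you want depends on the expected output.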
Upvotes: 1