Reputation: 23
I am relatively new to Python, so I am struggling with the problem below. Essentially, I am trying to move a bunch of files from one folder to another and rename them using a pandas DataFrame I built, states_mapping_df. I tried converting the DataFrame to strings using states_mapping_df = states_mapping_df.astype("string"), but that didn't help.
import pandas as pd
from shutil import copyfile

covid_src_dir = r"I:\COVID\COVID tracker\UserA\Hospitalizations"
covid_new_dir = r"I:\COVID\COVID tracker\Hospitalizations_images"
states_mapping_df = pd.DataFrame({
    "Abbr": ['CA','FL','IL','NJ','NY','NC','OH','PA','TX','VA'],
    "State_Name": ['California','Florida','Illinois','New Jersey','New York','North Caroliina','Ohio','Pennsylvania','Texas','Virginia']})
for row in states_mapping_df['Abbr']:
    #oldname = states_mapping_df['Abbr']+'.png'
    #newname = states_mapping_df['State_Name']+'.png'
    oldpath_covid = covid_src_dir + "\\" + row +'.png'
    newpath_covid = covid_new_dir + "\\" + states_mapping_df['State_Name'].astype('string') +'.png'
    copyfile(oldpath_covid, newpath_covid)
I get the error below when I run it:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-260-40e52504bb33> in <module>
19 oldpath_covid = covid_src_dir + "\\" + row +'.png'
20 newpath_covid = covid_new_dir + "\\" + states_mapping_df['State_Name'].astype('string') +'.png'
---> 21 copyfile(oldpath_covid, newpath_covid)
22 #shutil.copy(oldpath_covid, newpath_covid)
~\Anaconda3\lib\shutil.py in copyfile(src, dst, follow_symlinks)
238 sys.audit("shutil.copyfile", src, dst)
239
--> 240 if _samefile(src, dst):
241 raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
242
~\Anaconda3\lib\shutil.py in _samefile(src, dst)
215 if hasattr(os.path, 'samefile'):
216 try:
--> 217 return os.path.samefile(src, dst)
218 except OSError:
219 return False
~\Anaconda3\lib\genericpath.py in samefile(f1, f2)
99 """
100 s1 = os.stat(f1)
--> 101 s2 = os.stat(f2)
102 return samestat(s1, s2)
103
TypeError: stat: path should be string, bytes, os.PathLike or integer, not Series
Upvotes: 1
Views: 611
Reputation: 560
I believe your problem is here: states_mapping_df['State_Name']
The error is telling you that you passed a Series where a single path was expected. You are trying to name one file after a whole column of values (a Series) from the DataFrame.
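To see why the error mentions a Series, a small check along these lines may help. It is only a sketch: it uses a shortened two-row version of the question's DataFrame, and the folder name some_dir is a made-up placeholder rather than a real path.

```python
import pandas as pd

states_mapping_df = pd.DataFrame({
    "Abbr": ["CA", "FL"],
    "State_Name": ["California", "Florida"],
})

# str + Series is applied element-wise, so the result is still a Series
# holding one path per row, not a single path string.
all_paths = "some_dir" + "\\" + states_mapping_df["State_Name"] + ".png"
print(type(all_paths).__name__)

# Picking a single element gives an ordinary str, which copyfile can accept.
one_path = all_paths.iloc[0]
print(type(one_path).__name__)
```

Converting the column with astype("string") changes the dtype of the values inside the Series, but the object you concatenate is still the whole Series.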
You need to filter the actual value you want.
Try this.
for row in states_mapping_df['Abbr']:
    #oldname = states_mapping_df['Abbr']+'.png'
    #newname = states_mapping_df['State_Name']+'.png'
    # build a boolean filter that is True only for the row matching the current abbreviation
    filt = (states_mapping_df['Abbr'] == row)
    # use .loc with the filter and the column name to isolate the specific cell
    row_filtered = states_mapping_df.loc[filt, 'State_Name']
    # a Series is returned; .values gives an array whose first element is the cell value
    state_name = row_filtered.values[0]
    oldpath_covid = covid_src_dir + "\\" + row + '.png'
    # use the single state name here instead of the whole Series
    newpath_covid = covid_new_dir + "\\" + state_name + '.png'
    copyfile(oldpath_covid, newpath_covid)
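An alternative that avoids filtering the DataFrame on every iteration is to walk both columns together with zip. The sketch below is only an illustration under some assumptions: it uses pathlib, temporary folders and empty placeholder files in place of the real I:\ paths, and copy_renamed is a hypothetical helper name, not part of shutil or pandas.

```python
import tempfile
from pathlib import Path
from shutil import copyfile

import pandas as pd


def copy_renamed(mapping_df, src_dir, dst_dir):
    """Copy <Abbr>.png from src_dir to <State_Name>.png in dst_dir."""
    copied = []
    # zip pairs each abbreviation with the state name from the same row
    for abbr, state_name in zip(mapping_df["Abbr"], mapping_df["State_Name"]):
        old_path = Path(src_dir) / f"{abbr}.png"
        new_path = Path(dst_dir) / f"{state_name}.png"
        copyfile(old_path, new_path)
        copied.append(new_path.name)
    return copied


states_mapping_df = pd.DataFrame({
    "Abbr": ["CA", "NY"],
    "State_Name": ["California", "New York"],
})

# Temporary folders stand in for the real source and destination folders.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for abbr in states_mapping_df["Abbr"]:
        (Path(src) / f"{abbr}.png").touch()  # empty placeholder "images"
    copied_names = copy_renamed(states_mapping_df, src, dst)
    print(copied_names)
```

With the real data you would pass covid_src_dir and covid_new_dir instead of the temporary folders. Building paths with pathlib also removes the need to join pieces with "\\" by hand.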
Edit, a bit more info:
.loc is a means of filtering a pandas DataFrame. You call df.loc[a, b] with two arguments, where a selects rows and b selects columns.
Most people use it the way I did above: first build a filter to use for a. For example, (df['state'] == 'California') returns a boolean Series (True/False values) that is True only for the rows where the value is California. When you pass that to .loc[] along with a column name, you get back the matching cell or cells as a Series (or as a DataFrame, if you pass a list of column names for b). Calling .values on the result then returns a NumPy array of those values.
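The filter, .loc and .values steps described above can be tried on a small example. This is a minimal sketch with a three-row DataFrame made up for illustration; the column names follow the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Abbr": ["CA", "FL", "NY"],
    "State_Name": ["California", "Florida", "New York"],
})

# Boolean Series, one True/False value per row
filt = df["Abbr"] == "FL"
print(filt.tolist())

# Series containing only the rows where the filter is True
selected = df.loc[filt, "State_Name"]

# .values exposes the underlying array; [0] takes its first element
state_name = selected.values[0]
print(state_name)
```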
Another method is .iloc[], which works the same way, except that the i stands for integer: it selects by position rather than by label. Positions start at 0 and the end of a slice is excluded, so df.iloc[10, 5:8] returns the row at position 10 (the 11th row) and the columns at positions 5, 6 and 7.
If you wanted to return everything, you could use df.iloc[:, :]. If you wanted all columns for the rows where the value equals California, using the same filter expression as above, you could use df.loc[filt, :].
The colon expressions represent index slicing, just like you do on a list.
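A short sketch of these positional selections, using a small made-up numeric DataFrame (the column names a to d are arbitrary and chosen only for illustration):

```python
import pandas as pd

# Three rows: [0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]
df = pd.DataFrame(
    [[r * 10 + c for c in range(4)] for r in range(3)],
    columns=["a", "b", "c", "d"],
)

# Row at position 1 (the second row), columns at positions 1 and 2;
# the slice end, position 3, is excluded.
row_slice = df.iloc[1, 1:3]
print(row_slice.tolist())

# Every row and every column
everything = df.iloc[:, :]

# All columns for the rows selected by a boolean filter
filt = df["a"] == 20
matching = df.loc[filt, :]
print(matching.shape)
```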
More here:
loc https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
iloc https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html
Indexing and slicing https://realpython.com/lessons/indexing-and-slicing/
Various other filtering methods including those mentioned https://towardsdatascience.com/7-different-ways-to-filter-pandas-dataframes-9e139888382a
Upvotes: 1