Alex

Reputation: 44749

Extracting extension from filename

Is there a function to extract the extension from a filename?

Upvotes: 1858

Views: 1618664

Answers (30)

jeromej

Reputation: 11626

pathlib is new in Python 3.4.

import pathlib

print(pathlib.Path('yourPath.example').suffix)  # '.example'
print(pathlib.Path("hello/foo.bar.tar.gz").suffixes)  # ['.bar', '.tar', '.gz']
print(pathlib.Path('/foo/bar.txt').stem)  # 'bar'

I'm surprised no one has mentioned pathlib yet. pathlib IS awesome!

Upvotes: 726

Eric

Reputation: 503

I'm definitely late to the party, but in case anyone wanted to achieve this without the use of another library:

file_path = "example_tar.tar.gz"
file_name, file_ext = [file_path if "." not in file_path else file_path.split(".")[0], "" if "." not in file_path else file_path[file_path.find(".") + 1:]]
print(file_name, file_ext)

The 2nd line is basically just the following code but crammed into one line:

def name_and_ext(file_path):
    if "." not in file_path:
        file_name = file_path
    else:
        file_name = file_path.split(".")[0]
    if "." not in file_path:
        file_ext = ""
    else:
        file_ext = file_path[file_path.find(".") + 1:]
    return [file_name, file_ext]

Even though this works, it might not work with all types of files, specifically dotfiles such as .zshrc. I would recommend using os.path.splitext instead; example below:

import os
file_path = "example.tar.gz"
file_name, file_ext = os.path.splitext(file_path)
print(file_name, file_ext)

Cheers :)

Upvotes: 1

Yes But No

Reputation: 11

Here is how to extract only the last extension when a file has several:

import os

class functions:
    def listdir(self, filepath):
        return os.listdir(filepath)

func = functions()

os.chdir(r"C:\Users\Asus-pc\Downloads")  # absolute path, change this to your directory
current_dir = os.getcwd()

for fileName in func.listdir(current_dir):  # every file and directory in the current directory
    if os.path.isfile(fileName) and '.' in fileName:  # only files whose name contains a dot
        rev_fileName = fileName[::-1]  # reverse the filename
        currentFileExtension = rev_fileName[:rev_fileName.index('.')][::-1]  # text after the last dot
        print(currentFileExtension)  # e.g. mp3, pdf, ini, exe, depending on the files in your directory

The output is mp3 in my case. This also works if the file has only one extension.

Upvotes: 0

Waleed Khaled

Reputation: 134

Well, I know I'm late, but here's my simple solution:

file = '/foo/bar/whatever.ext'
extension = file.split('.')[-1]
print(extension)

#output will be ext

Upvotes: 3

Harris Khan

Reputation: 247

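Another option is the mimetypes module. Note that guess_type returns the MIME type guessed from the file name (for example text/plain for a .txt file), not the extension string itself:

import mimetypes

mt = mimetypes.guess_type("file name")
mime_type = mt[0]  # mt is a (type, encoding) tuple
print(mime_type)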

Upvotes: 1

Import Error

Reputation: 41

This method requires a dictionary, list, or set of extensions. You can use the built-in string method str.endswith to check whether a file name ends with one of the entries in that collection, for example str.endswith(fileName[index]). This is more for detecting and comparing extensions than for extracting them.

https://docs.python.org/3/library/stdtypes.html#string-methods

Example 1:

dictonary = {0:".tar.gz", 1:".txt", 2:".exe", 3:".js", 4:".java", 5:".python", 6:".ruby",7:".c", 8:".bash", 9:".ps1", 10:".html", 11:".html5", 12:".css", 13:".json", 14:".abc"} 
for x in dictonary.values():
    str = "file" + x
    str.endswith(x, str.index("."), len(str))

Example 2:

set1 = {".tar.gz", ".txt", ".exe", ".js", ".java", ".python", ".ruby", ".c", ".bash", ".ps1", ".html", ".html5", ".css", ".json", ".abc"}
for x in set1:
   str = "file" + x
   str.endswith(x, str.index("."), len(str))

Example 3:

fileName = [".tar.gz", ".txt", ".exe", ".js", ".java", ".python", ".ruby", ".c", ".bash", ".ps1", ".html", ".html5", ".css", ".json", ".abc"];
for x in range(0, len(fileName)):
    str = "file" + fileName[x]
    str.endswith(fileName[x], str.index("."), len(str))

Example 4

fileName = [".tar.gz", ".txt", ".exe", ".js", ".java", ".python", ".ruby", ".c", ".bash", ".ps1", ".html", ".html5", ".css", ".json", ".abc"];
str = "file.txt"
str.endswith(fileName[1], str.index("."), len(str))

Examples 5, 6, 7 with output: (image not included)

Example 8

fileName = [".tar.gz", ".txt", ".exe", ".js", ".java", ".python", ".ruby", ".c", ".bash", ".ps1", ".html", ".html5", ".css", ".json", ".abc"];
exts = []
str = "file.txt"
for x in range(0, len(fileName)):
    if str.endswith(fileName[x]):  # collect every listed extension that matches
        exts += [fileName[x]]
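
As a side note to the examples above, str.endswith also accepts a tuple, so the loop over indices is not strictly needed. Here is a minimal sketch of that idea. KNOWN_EXTENSIONS and matching_extension are names I made up for illustration, and the list of extensions is only a sample:

```python
# Hypothetical sample of extensions to recognise; adjust to your needs.
KNOWN_EXTENSIONS = (".tar.gz", ".txt", ".exe", ".js", ".html", ".json")


def matching_extension(file_name):
    """Return the longest known extension that file_name ends with, or None."""
    # Try longer extensions first so a longer entry wins over a shorter one.
    for ext in sorted(KNOWN_EXTENSIONS, key=len, reverse=True):
        if file_name.endswith(ext):
            return ext
    return None


print("backup.tar.gz".endswith(KNOWN_EXTENSIONS))  # True: endswith accepts a tuple
print(matching_extension("backup.tar.gz"))         # .tar.gz
print(matching_extension("notes.md"))              # None
```

This only recognises extensions you list yourself, so it suits filtering by type better than general extraction.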
     

Upvotes: 0

nosklo

Reputation: 223102

Use os.path.splitext:

>>> import os
>>> filename, file_extension = os.path.splitext('/path/to/somefile.ext')
>>> filename
'/path/to/somefile'
>>> file_extension
'.ext'

Unlike most manual string-splitting attempts, os.path.splitext will correctly treat /a/b.c/d as having no extension instead of having extension .c/d, and it will treat .bashrc as having no extension instead of having extension .bashrc:

>>> os.path.splitext('/a/b.c/d')
('/a/b.c/d', '')
>>> os.path.splitext('.bashrc')
('.bashrc', '')

Upvotes: 2659

cng.buff

Reputation: 583

You can use endswith to identify the file extension in Python, as in the example below:

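import os
import pandas as pd

frames = []
for file in os.listdir():
    if file.endswith('.csv'):  # only pick up CSV files
        df1 = pd.read_csv(file)
        frames.append(df1)
result = pd.concat(frames)  # combine once, after all files are read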

Upvotes: 2

dataninsight

Reputation: 1343

Extracting extension from filename in Python

Python os module splitext()

splitext() function splits the file path into a tuple having two values – root and extension.

import os
# unpacking the tuple
file_name, file_extension = os.path.splitext("/Users/Username/abc.txt")
print(file_name)
print(file_extension)

Get File Extension using Pathlib Module

Pathlib module to get the file extension

import pathlib
pathlib.Path("/Users/pankaj/abc.txt").suffix
#output:'.txt'

Upvotes: 12

Muhammad Salman

Reputation: 85

You can use the following code to split the file name and extension.

import os.path
filepath = "/path/to/file.txt"  # example path, replace with your own
filenamewithext = os.path.basename(filepath)
filename, ext = os.path.splitext(filenamewithext)
# print file name
print(filename)
# print file extension
print(ext)

Upvotes: 5

lendoo

Reputation: 573

a = ".bashrc"
b = "text.txt"
extension_a = a.split(".")
extension_b = b.split(".")
print(extension_a[-1])  # bashrc
print(extension_b[-1])  # txt

Upvotes: -2

Murat Çorlu

Reputation: 8545

For simple use cases one option may be splitting from dot:

>>> filename = "example.jpeg"
>>> filename.split(".")[-1]
'jpeg'

No error when file doesn't have an extension:

>>> "filename".split(".")[-1]
'filename'

But you must be careful:

>>> "png".split(".")[-1]
'png'    # But file doesn't have an extension

Also will not work with hidden files in Unix systems:

>>> ".bashrc".split(".")[-1]
'bashrc'    # But this is not an extension

For general use, prefer os.path.splitext

Upvotes: 114

Ibnul Husainan

Reputation: 231

try this:

files = ['file.jpeg','file.tar.gz','file.png','file.foo.bar','file.etc']
pen_ext = ['foo', 'tar', 'bar', 'etc']

for file in files: #1
    if (file.split(".")[-2] in pen_ext): #2
        ext =  file.split(".")[-2]+"."+file.split(".")[-1]#3
    else:
        ext = file.split(".")[-1] #4
    print (ext) #5
  1. iterate over every file name in the list
  2. split the file name and check whether the penultimate part is in the pen_ext list
  3. if it is, join it with the last part and use that as the file's extension
  4. if it is not, use only the last part as the file's extension
  5. print the result

Upvotes: 1

Victor Wang

Reputation: 937

A true one-liner, if you like regex. It works even if the name has additional "." characters in the middle.

import re

file_ext = re.search(r"\.([^.]+)$", filename).group(1)
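
One thing to watch with the one-liner: re.search returns None when the name has no dot, so calling .group(1) on it raises AttributeError. Below is a rough sketch that guards against that. regex_extension is a name I made up, and returning an empty string for "no extension" is my own assumption about what you would want. Like the one-liner, it still reports "bashrc" for ".bashrc" and can match across a directory separator (as in /a/b.c/d), so it is best used on bare file names.

```python
import re

# Same pattern as the one-liner: a dot followed by non-dot characters at the end.
_EXT_PATTERN = re.compile(r"\.([^.]+)$")


def regex_extension(filename):
    """Return the text after the last dot, or "" if there is no match."""
    match = _EXT_PATTERN.search(filename)
    return match.group(1) if match else ""


print(regex_extension("archive.tar.gz"))  # gz
print(regex_extension("README"))          # prints an empty line
```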


Upvotes: 2

eatmeimadanish

Reputation: 3907

For funsies... just collect the extensions in a dict, and track all of them in a folder. Then just pull the extensions you want.

import os

search = {}

for f in os.listdir(os.getcwd()):
    fn, fe = os.path.splitext(f)
    try:
        search[fe].append(f)
    except KeyError:
        search[fe]=[f,]

extensions = ('.png','.jpg')
for ex in extensions:
    found = search.get(ex,'')
    if found:
        print(found)

Upvotes: 0

r3t40

Reputation: 637

You can find some great stuff in the pathlib module (available in Python 3.4+).

import pathlib
x = pathlib.PureWindowsPath("C:\\Path\\To\\File\\myfile.txt").suffix  # Windows-style path, so use the Windows flavour
print(x)

# Output 
'.txt'

Upvotes: 22

Ripon Kumar Saha

Reputation: 307

This gets both the filename and the extension in a single line, as long as the name contains exactly one dot.

fName, ext = 'C:/folder name/Flower.jpeg'.split('/')[-1].split('.')

>>> print(fName)
Flower
>>> print(ext)
jpeg

Unlike other solutions, you don't need to import any package for this.

Upvotes: -3

Alex

Reputation: 1375

Just join all pathlib suffixes.

>>> import pathlib
>>> x = 'file/path/archive.tar.gz'
>>> y = 'file/path/text.txt'
>>> ''.join(pathlib.Path(x).suffixes)
'.tar.gz'
>>> ''.join(pathlib.Path(y).suffixes)
'.txt'

Upvotes: 19

Kenstars

Reputation: 660

This is a direct string-manipulation technique. I see a lot of solutions mentioned, but most of them use split, which splits at every occurrence of ".". What you are really looking for is rpartition, which splits only at the last one.

string = "folder/to_path/filename.ext"
extension = string.rpartition(".")[-1]
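
A caveat worth knowing: when the string has no dot at all, rpartition(".")[-1] returns the whole string. Here is a small sketch that checks the separator element to tell the two cases apart. rpartition_extension is a hypothetical helper name, and returning "" when there is no dot is my assumption:

```python
def rpartition_extension(path):
    """Return the text after the last dot, or "" when there is no dot."""
    head, sep, tail = path.rpartition(".")
    # rpartition returns ("", "", path) when the separator is missing,
    # so an empty sep means there was no dot at all.
    return tail if sep else ""


print(rpartition_extension("folder/to_path/filename.ext"))  # ext
print(rpartition_extension("folder/to_path/filename"))      # prints an empty line
```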

Upvotes: 6

soheshdoshi

Reputation: 624

You can use a split on a filename:

f_extns = filename.split(".")
print("The extension of the file is : " + repr(f_extns[-1]))

This does not require any additional library.

Upvotes: 12

Execuday

Reputation: 79

Even though this question has already been answered, I'd add a regex solution.

>>> import re
>>> file_suffix = r".*(\..*)"
>>> result = re.search(file_suffix, "somefile.ext")
>>> result.group(1)
'.ext'

Upvotes: 6

wonzbak

Reputation: 8124

import os.path
extension = os.path.splitext(filename)[1][1:]

To get only the text of the extension, without the dot.

Upvotes: 146

weiyixie

Reputation: 581

Although this is an old topic, I wonder why no one has mentioned a very simple Python API for this case, rpartition.

To get the extension from a file's absolute path, you can simply type:

filepath.rpartition('.')[-1]

Example:

path = '/home/jersey/remote/data/test.csv'
print(path.rpartition('.')[-1])

will give you: csv

Upvotes: 17

PascalVKooten

Reputation: 21461

Surprised this wasn't mentioned yet:

import os
fn = '/some/path/a.tar.gz'

basename = os.path.basename(fn)  # os independent
Out[] a.tar.gz

base = basename.split('.')[0]
Out[] a

ext = '.'.join(basename.split('.')[1:])   # <-- main part

# if you want a leading '.', and if no result `None`:
ext = '.' + ext if ext else None
Out[] .tar.gz

Benefits:

  • Works as expected for anything I can think of
  • No modules
  • No regex
  • Cross-platform
  • Easily extendible (e.g. no leading dots for extension, only last part of extension)

As function:

def get_extension(filename):
    basename = os.path.basename(filename)  # os independent
    ext = '.'.join(basename.split('.')[1:])
    return '.' + ext if ext else None

Upvotes: 12

user5535053

Reputation: 7

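import os

def NewFileName(fichier):
    # Return a file name that does not exist yet by inserting -(1), -(2), ... before the extension
    cpt = 0
    fic, *ext = fichier.split('.')  # name up to the first dot, then every extension part
    ext = '.'.join(ext)
    while os.path.isfile(fichier):
        cpt += 1
        fichier = '{0}-({1}).{2}'.format(fic, cpt, ext)
    return fichier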

Upvotes: -3

DragonX

Reputation: 105

# try this, it works for anything, any length of extension
# e.g www.google.com/downloads/file1.gz.rs -> .gz.rs

import os.path

class LinkChecker:

    @staticmethod
    def get_link_extension(link: str)->str:
        if link is None or link == "":
            return ""
        else:
            paths = os.path.splitext(link)
            ext = paths[1]
            new_link = paths[0]
            if ext != "":
                return LinkChecker.get_link_extension(new_link) + ext
            else:
                return ""

Upvotes: -2

wookie

Reputation: 331

name_only = filename[:filename.index(".")]

That will give you the file name up to the first ".", which covers the most common case.

Upvotes: -4

staytime

Reputation: 143

filename='ext.tar.gz'
extension = filename[filename.rfind('.'):]

Upvotes: 11

Another solution with right split:

# to get extension only

s = 'test.ext'

if '.' in s: ext = s.rsplit('.', 1)[1]

# or, to get file name and extension

def split_filepath(s):
    """
    get filename and extension from filepath 
    filepath -> (filename, extension)
    """
    if '.' not in s: return (s, '')
    r = s.rsplit('.', 1)
    return (r[0], r[1])

Upvotes: 5

XavierCLL

Reputation: 1211

With splitext there are problems with files that have a double extension (e.g. file.tar.gz, file.tar.bz2, etc.):

>>> import os
>>> fileName, fileExtension = os.path.splitext('/path/to/somefile.tar.gz')
>>> fileExtension
'.gz'

but it should be: .tar.gz

The possible solutions are here
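
For what it's worth, one possible way to handle this is to check a list of known compound extensions first and fall back to os.path.splitext otherwise. This is only a sketch under that assumption. COMPOUND_EXTENSIONS and split_compound_ext are names I invented, and the list is a sample you would extend yourself:

```python
import os

# Hypothetical sample of compound extensions to keep together; extend as needed.
COMPOUND_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tar.xz")


def split_compound_ext(path):
    """Like os.path.splitext, but keeps the listed compound extensions whole."""
    lowered = path.lower()  # compare case-insensitively, keep the original casing in the result
    for ext in COMPOUND_EXTENSIONS:
        if lowered.endswith(ext):
            return path[:-len(ext)], path[-len(ext):]
    # Anything else is handled by the standard library as usual.
    return os.path.splitext(path)


print(split_compound_ext("/path/to/somefile.tar.gz"))  # ('/path/to/somefile', '.tar.gz')
print(split_compound_ext("/path/to/report.v2.txt"))    # ('/path/to/report.v2', '.txt')
```

The explicit list is the trade-off: it avoids treating every dot as part of the extension (report.v2.txt stays .txt), but it only knows the combinations you put in it.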

Upvotes: 24
