Excluding a specific string of characters in a str()-function

A small issue I've encountered during coding.

I'm looking to print out the name of a .txt file. For example, the file is named verdata_florida.txt or verdata_newyork.txt. How can I exclude .txt and verdata_ but keep the string between them? It must work for a middle part of any length, but .txt and verdata_ must always be excluded.

This is where I am so far. I've already defined filename as the result of input():

print("Average TAM at", str(filename[8:**????**]), "is higher than ")

Upvotes: 0

Views: 736

Answers (4)

Eric Ed Lohmar

Reputation: 1922

Assuming you want to split on the first _ and the last ., you can use slicing together with the index and rindex methods. index returns the position of the first occurrence of the substring passed as its argument, and rindex returns the position of the last occurrence. If the substring is not found, both raise a ValueError. If you want the search but not the ValueError, you can use find and rfind instead, which do the same thing but return -1 when there is no match.

s = 'verdata_new_hampshire.txt'
s_trunc = s[s.index('_') + 1: s.rindex('.')]  # or s[s.find('_') + 1: s.rfind('.')]

print(s_trunc)  # new_hampshire

Of course, if you are always going to exclude verdata_ and .txt you could always hardcode the slice as well.

print(s[8:-4])  # new_hampshire
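To make the difference between index/rindex and find/rfind concrete, here is a minimal sketch. The helper name extract_location and the sample string 'nodelimiters' are made up for illustration and are not part of the question.

```python
def extract_location(filename):
    """Return the text between the first '_' and the last '.'.

    Hypothetical helper for illustration. It raises ValueError when
    either delimiter is missing, because index/rindex raise on no match.
    """
    start = filename.index('_') + 1
    end = filename.rindex('.')
    return filename[start:end]


print(extract_location('verdata_new_hampshire.txt'))  # new_hampshire

# find/rfind return -1 instead of raising, so a name without the
# delimiters is sliced silently as bad[0:-1]:
bad = 'nodelimiters'
print(bad[bad.find('_') + 1: bad.rfind('.')])  # nodelimiter
```

With find/rfind a malformed name quietly loses its last character, so the index/rindex version is probably the safer default when the input comes straight from input().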

Upvotes: 2

Jean-François Fabre

Reputation: 140168

3 ways of doing it:

using str.split twice:

>>> "verdata_florida.txt".split("_")[1].split(".")[0]
'florida'

using str.partition twice (you won't get an exception if the format doesn't match, and probably faster too):

>>> "verdata_florida.txt".partition("_")[2].partition(".")[0]
'florida'

using re, keeping only center part:

>>> import re
>>> re.sub(r".*_(.*)\..*", r"\1", "verdata_florida.txt")
'florida'

All of the above must be tuned if _ or . appears more than once in the name (depending on whether you want to keep the longest or the shortest match).
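As a rough illustration of that tuning, the sketch below runs the string methods on a name with several underscores and dots. The file name verdata_new_hampshire.v2.txt is a made-up example, not one from the question.

```python
# Hypothetical file name with two underscores and two dots
name = "verdata_new_hampshire.v2.txt"

# split("_")[1] keeps only the piece between the first and second underscore
print(name.split("_")[1])          # new

# partition cuts at the first match, rpartition at the last match
middle = name.partition("_")[2]    # 'new_hampshire.v2.txt'
print(middle.partition(".")[0])    # new_hampshire
print(middle.rpartition(".")[0])   # new_hampshire.v2
```

Which of these is right depends on whether the extra dot belongs to the name or to the extension.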

EDIT: In your case, though, prefixes & suffixes seem fixed. In that case, just use str.replace twice:

>>> "verdata_florida.txt".replace("verdata_","").replace(".txt","")
'florida'
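One caveat: str.replace removes every occurrence of the substring, not only the one at the start or end. If you are on Python 3.9 or later, str.removeprefix and str.removesuffix only touch the ends. A minimal sketch, where strip_affixes and the odd file name are hypothetical and only there to show the difference:

```python
def strip_affixes(filename, prefix="verdata_", suffix=".txt"):
    """Hypothetical helper: drop a fixed prefix and suffix if present.

    Requires Python 3.9+ for str.removeprefix / str.removesuffix.
    """
    return filename.removeprefix(prefix).removesuffix(suffix)


print(strip_affixes("verdata_florida.txt"))        # florida

# replace() also removes ".txt" from the middle of the name:
print("verdata_a.txt_b.txt".replace(".txt", ""))   # verdata_a_b
print(strip_affixes("verdata_a.txt_b.txt"))        # a.txt_b
```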

Upvotes: 2

user7757483

Reputation:

You can just split the string on the dot and the underscore like this:

filename = "verdata_prague.txt"
name = filename.split(".")[0]  # 'verdata_prague'
name = name.split("_")[1]  # 'prague'

or use the replace method:

filename = "verdata_prague.txt"
name = filename.replace(".txt", "")  # 'verdata_prague'
name = name.replace("verdata_", "")  # 'prague'

Upvotes: 1

floatingpurr

Reputation: 8559

You can leverage str.split() on strings. For example:

s = 'verdata_florida.txt'

s.split('verdata_')
# ['', 'florida.txt']

s.split('verdata_')[1]
# 'florida.txt'

s.split('verdata_')[1].split('.txt')
# ['florida', '']

s.split('verdata_')[1].split('.txt')[0]
# 'florida'

Upvotes: 1
