Reputation: 46969
Chat.txt
ID674 25/01/1986 Thank you for choosing Optimus prime. Please wait for an Optimus prime Representative to respond. You are currently number 0 in the queue. You should be connected to an agent in approximately 0 minutes.. You are now chatting with 'Tom' 0 <br/>
ID674 2gb Hi there! Welcome to Optus Web Chat 0/0/0 . How can I help you today? 1
ID674 25-01-1986 I would like to change my bill plan from $0 with 0 expiry to something else $136. I find it very unuseful. Sam my phone no is 9838383821 2
The text above is just an example of a few lines from a file. My requirement is that all dates, for example 25/01/1986 or 0/0/0, should be replaced with "DATE123".
Then :) should be replaced with "smileys123".
Currencies, i.e. $0 or $136, should be replaced with "Currency123".
'TOM' (usually the agent's name in single quotes) should be replaced with "AGENT123",
and many more. The output should be the number of occurrences of each replacement string, as shown:
DATE123=2 smileys123=2 Currency123=6 AGENT123=5
This is my approach so far; please let me know what you think:
class Replace:
    dateformat=DATE123
    smileys=smileys123
    currency=currency123
count_dict={}
function count_data(type,count):
    global count_dict
    if type in count_dict:
        count_dict[type]+=count
    else:
        count_dict = {type:count}
f=open("chat.txt")
while True:
    for line in f.readlines():
        print line,
        if ":)" in line:
            smileys = line.count(":)")
            count_data("smileys",smileys)
        elif "$number" in line : #how to see whether it is currency or nor??
            currency=line.count("$number") //how can i do this
            count_data("currecny",currency)
        elif "1/2/3" in line : #how to validate date format
            dateformat=line.count("dateformat") #how can i do this
            count_data("currency",currency)
        elif validate-agent-name in line:
            agent_name=line.count("agentname") #How to do this get agentname in single quotes
            count_data("agent_name",agent_name)
        else:
            break
f.close()
for keys in count_dict:
    print keys,count_dict[keys]
The following would be the output:
DATE123=2 smileys123=2 Currency123=6 AGENT123=5
Upvotes: 1
Views: 759
Reputation: 4674
Currencies i.e, $0 or $136 should be replaced with "Currency123" and 'TOM' (usually agents name in single quotes) should be replaced with AGENT123 and many more
I think your class Replace should be replaced by a dictionary; that way you can do more (because it comes with methods) while writing less code. The dictionary can keep track of what you need to replace each item with, and it gives you more options for changing your replacements dynamically. Doing so should also make your code cleaner and easier to understand, and certainly shorter as you add more replacement words.
Edit: You might want to keep your list of replacement words in a text file and load them into your dictionary, instead of hard-coding them into a class, which I don't think is a good idea. Since you said there are many more, that makes more sense: less code to write (and cleaner!).
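As a rough sketch of that idea, assuming a made-up rules format of one PLACEHOLDER=regex pair per line (the file format, the load_rules name and the patterns themselves are my assumptions, not something from your code):

```python
import io
import re

# Hypothetical rules file: one "PLACEHOLDER=regex" pair per line.
# io.StringIO stands in for open("rules.txt") so the sketch is self-contained.
RULES_TEXT = r"""
# lines starting with '#' are comments
DATE123=\d{1,4}[/-]\d{1,4}[/-]\d{1,4}
Currency123=\$\d+(?:\.\d+)?
smileys123=:-?\)
"""


def load_rules(fileobj):
    """Read PLACEHOLDER=regex lines into a dict of compiled patterns."""
    rules = {}
    for raw in fileobj:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so the regex itself may contain '='.
        placeholder, pattern = line.split("=", 1)
        rules[placeholder] = re.compile(pattern)
    return rules


rules = load_rules(io.StringIO(RULES_TEXT))
for placeholder, pattern in rules.items():
    print(placeholder, pattern.pattern)
```

Adding a new replacement then means adding one line to the rules file rather than another attribute and another elif branch.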
To comment... use
# Here is a comment
The style of your code isn't the best; read http://www.python.org/dev/peps/pep-0008/#pet-peeves, or even the whole document, if you want to learn better coding style.
Here are regular expressions to check whether a string is a currency amount, a name like 'Tom', or a date.
import re

while True:
    myString = input('Enter your string: ')
    # re.match with ^...$ tests the whole input, not a token inside a longer line.
    isMoney = re.match(r'^\$[0-9]+(,[0-9]{3})*(\.[0-9]{2})?$', myString)
    isName = re.match(r"^'+\w+'$", myString)
    # Expects MM/DD/YYYY with a year that starts with 0 or 1.
    isDate = re.match(r'^[0-1][0-9]/[0-3][0-9]/[0-1][0-9]{3}$', myString)
    # or try r'^[0-1]*?/[0-9]*/[0-9]*$' if you want 0/0/0 too...
    if isMoney:
        print('It is Money:', myString)
    elif isName:
        print('It is a Name:', myString)
    elif isDate:
        print('It is a Date:', myString)
    else:
        print('Not good.')
Sample output:
Enter your string: $100
It is Money: $100
Enter your string: 100
Not good.
Enter your string: 'Tom'
It is a Name: 'Tom'
Enter your string: Tom
Not good.
Enter your string: 01/15/1989
It is a Date: 01/15/1989
Enter your string: 01151989
Not good.
You can replace the condition with one of these isSomething variables; which one depends on what exactly needs to be done. I hope this helps. If you want to learn more about regular expressions, check out "Regular Expression Primer" or Python's re page.
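One caveat: because the patterns above are anchored with ^ and $, they only say yes or no for a whole string. To pick dates, names and amounts out of a full chat line you would probably drop the anchors and use findall instead. A small sketch, where the loosened patterns and the sample line are my own adaptation and would need tuning for real data:

```python
import re

# Unanchored variants of the patterns above (adapted for illustration only).
MONEY = re.compile(r"\$[0-9]+(?:,[0-9]{3})*(?:\.[0-9]{2})?")
NAME = re.compile(r"'\w+'")
# Accepts any digit/digit/digit token, so both 25/01/1986 and 0/0/0 match.
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{1,4}\b")

line = "ID674 25/01/1986 You are now chatting with 'Tom' about a $136 plan"

# findall returns every non-overlapping match in the line as a list.
print(MONEY.findall(line))
print(NAME.findall(line))
print(DATE.findall(line))
```

The length of each list is the per-line count you can feed into your count_data function.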
Upvotes: 1
Reputation: 4852
This doesn't do all the replacements you said you need. But here's a way to count things in your data, using regular expressions and a default dictionary. If you really want the string replacements, I'm sure you can figure that out:
import re
from collections import defaultdict

lines = [
    "ID674 25/01/1986 Thank you for :) choosing Optimus prime. Please wait for an Optimus prime Representative to respond. You are currently number 0 in the queue. You should be connected to an agent in approximately 0 minutes.. You are now chatting with 'Tom' 0",
    "ID674 2gb Hi there! Welcome to Optus Web Chat 0/0/0 . $5.45 How can I help you today? 1",
    "ID674 25-01-1986 I would like to change my bill plan from $0 with 0 expiry to something else $136. I find it very unuseful. Sam my phone no is 9838383821 2'"
]

# One compiled pattern per kind of token to count.
p_smiley = re.compile(r':\)|:-\)')
p_currency = re.compile(r'\$[\d.]+')
p_date = re.compile(r'(\d{1,4}[/-]\d{1,4}[/-]\d{1,4})')

# defaultdict(int) starts every missing key at 0, so no "if key in dict" check is needed.
count_dict = defaultdict(int)

def count_data(type, count):
    global count_dict
    count_dict[type] += count

for line in lines:
    count_data('smiley', len(re.findall(p_smiley, line)))
    count_data('date', len(re.findall(p_date, line)))
    count_data('currency', len(re.findall(p_currency, line)))

# Show the totals collected over all lines.
print(dict(count_dict))
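If you do want the replacements as well, one possible approach is re.subn, which returns both the new string and the number of substitutions it made, so replacing and counting happen in one step. This is only a sketch: the replace_and_count helper, the sample line and the agent pattern (a single word in single quotes) are assumptions about your data.

```python
import re
from collections import Counter

# Placeholder -> pattern. The agent pattern is an assumption about how
# agent names appear in the log (one word wrapped in single quotes).
PATTERNS = [
    ("DATE123", re.compile(r"\d{1,4}[/-]\d{1,4}[/-]\d{1,4}")),
    ("Currency123", re.compile(r"\$\d+(?:\.\d+)?")),
    ("smileys123", re.compile(r":-?\)")),
    ("AGENT123", re.compile(r"'\w+'")),
]


def replace_and_count(line, counts):
    """Replace every match with its placeholder and tally the replacements."""
    for placeholder, pattern in PATTERNS:
        # subn returns (new_string, number_of_substitutions).
        line, n = pattern.subn(placeholder, line)
        counts[placeholder] += n
    return line


counts = Counter()
sample = "ID674 25/01/1986 Thanks :) You are now chatting with 'Tom' about $136"
print(replace_and_count(sample, counts))
print(dict(counts))
```

Calling replace_and_count on each line of the file with the same Counter gives the totals for the whole chat log.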
Upvotes: 1