Reputation: 1954
I am working on a chatbot program in Perl using regular expressions, but I am not getting the desired results. As you can see, I have all the pronouns and verbs in a hash. I loop over its keys, and if a key matches in the string, I replace the matching substring with that key's hash value.
Program output:
Eliza: Hi, I'm a psychotherapist. What is your name?
adam: my name is adam
Eliza: Hello adam , How are you?
adam: i am feeling bad
Eliza: Why you am feeling bad ?
adam: because i am sick
Eliza: Why because you am sick?
Setting aside the word "because" in the last question, the output should be something like this:
Eliza: Why because you are sick?
Any suggestions on how I can solve this issue?
Code:
sub makeQuestion{
    my ($patient) = @_;
    my %reflections = (
        "am"    => "are ",
        "was"   => "were ",
        "i"     => "you ",
        "i'd"   => "you would ",
        "i've"  => "you have ",
        "i'll"  => "you will ",
        "my"    => "your ",
        "are"   => "am ",
        "you've"=> "I have ",
        "you'll"=> "I will ",
        "your"  => "my ",
        "yours" => "mine ",
        "you"   => "me ",
        "me"    => "you "
    );
    my @toBes = keys %reflections;
    foreach my $toBe (@toBes) {
        if ($patient =~/$toBe\b/)
        {
            $patient=~ s/$toBe /$reflections{$toBe}/i;
        }
    }
    print "Why $patient? \n";
}
Upvotes: 2
Views: 242
Reputation: 490
EDIT: As suggested by @zdin, I replaced '\s' with ' ' in the split call, which splits on any run of whitespace and also ignores leading whitespace.
That's because you are looping over all the keys of your %reflections hash and applying every replacement to the whole string. Therefore you may find the "am" key at, say, iteration 1 and replace it with "are". Then you find the "are" key at a later iteration, say 8, and replace it back with "am".
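To make the round trip visible, here is a minimal sketch that applies just the two conflicting entries to the whole string in a fixed order. The helper name reflect_whole_string and the fixed ordering are assumptions for illustration; the posted code iterates over the hash keys in whatever order Perl returns them.

```perl
use strict;
use warnings;

# Hypothetical helper: applies each replacement to the WHOLE string, in a
# fixed order, to reproduce the circular effect deterministically.
sub reflect_whole_string {
    my ($text) = @_;
    my @pairs = ( [ 'am', 'are' ], [ 'are', 'am' ] );
    for my $pair (@pairs) {
        my ($from, $to) = @$pair;
        $text =~ s/\b$from\b/$to/i;   # second pass undoes the first
    }
    return $text;
}

print reflect_whole_string('i am sick'), "\n";   # prints: i am sick
```

The string comes back unchanged because "am" becomes "are" and is then immediately turned back into "am".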
Since you make single-word replacements, you should instead make sure each word is processed only once, by using split:
#!/usr/bin/perl
use strict;
use warnings;

my $question = '';
# Keep reading lines until the user types 'stop'
while ($question ne 'stop') {
    $question = <STDIN>;
    last unless defined $question;   # stop cleanly at end of input
    chomp $question;
    print makeQuestion($question)."\n";
}

sub makeQuestion{
    my ($patient) = @_;
    my @new_question;
    my %reflections = (
        "am"    => "are",
        "was"   => "were",
        "i"     => "you",
        "i'd"   => "you would",
        "i've"  => "you have",
        "i'll"  => "you will",
        "my"    => "your",
        "are"   => "am",
        "you've"=> "I have",
        "you'll"=> "I will",
        "your"  => "my",
        "yours" => "mine",
        "you"   => "me",
        "me"    => "you",
    );
    # Process each word exactly once, so a replacement is never replaced again
    WORDS: foreach my $word (split ' ', $patient) {
        REPLACE: foreach my $key (keys %reflections) {
            # Only whole-word, case-insensitive matches count
            if ($word =~ m{\A$key\Z}i) {
                $word =~ s{$key}{$reflections{$key}}i;
                last REPLACE;
            }
        }
        push @new_question, $word;
    }
    return join ' ', @new_question;
}
Upvotes: 2
Reputation: 66964
Your code makes circular replacements, since it always processes the whole phrase. It replaces a word, only to later replace that replacement. The answer by David Verdin explains it and shows a way to fix this.
Here is another way:
my $phrase = join ' ', map { $reflections{$_} // $_ } split ' ', $patient;
The list of words produced by split is fed to map, which applies the code in the block to each. Inside the block the currently processed element is in the default $_ variable.
If the word is a key in the hash with a defined value then that value is returned, otherwise the word itself is returned. This is achieved by //, the defined-or operator. Thus all words that have a hash key get replaced by the corresponding values, while others are passed along unchanged. Their order is kept as well so we get our list of words processed as needed.
That output list is then joined by space, forming the phrase to be prepended by 'Why '.
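As a rough sketch of how that one-liner could sit inside a sub, the following uses an abbreviated table. The name reflect_phrase is hypothetical, and the lc in the lookup is an assumption added here so that a capitalized "I" is also found; the one-liner above looks words up exactly as typed. The // operator needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Abbreviated reflection table, for illustration only
my %reflections = (
    "am" => "are",
    "i"  => "you",
    "my" => "your",
    "me" => "you",
);

# Hypothetical helper: swaps each word at most once via a single hash lookup
sub reflect_phrase {
    my ($patient) = @_;
    # lc is an assumption: it makes the lookup case-insensitive
    my $phrase = join ' ', map { $reflections{lc $_} // $_ } split ' ', $patient;
    return "Why $phrase?";
}

print reflect_phrase('because I am sick'), "\n";   # prints: Why because you are sick?
```

Because every word goes through the hash exactly once, "am" becoming "are" can never be turned back into "am".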
Note that the pattern ' ' in split matches any amount of any whitespace and discards leading whitespace. It is commonly used to break strings "by space" (into "words") and is split's default.
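A small comparison may help here. The sketch below contrasts the special ' ' pattern with a regex that matches a single literal space; the helper names split_special and split_literal are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers, named only for this comparison
sub split_special { my ($s) = @_; return split ' ', $s; }   # special whitespace mode
sub split_literal { my ($s) = @_; return split / /, $s; }   # one literal space per separator

my $input  = "  i   am\tsick ";
my @words  = split_special($input);   # 'i', 'am', 'sick'
my @fields = split_literal($input);   # '', '', 'i', '', '', "am\tsick"

print scalar(@words), " vs ", scalar(@fields), "\n";   # prints: 3 vs 6
```

With / / the leading spaces and repeated spaces produce empty fields and the tab is not treated as a separator, which is why ' ' is the usual choice for breaking input into words.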
I'd like to add a comment on the regular expression use in the posted code. You don't need to first test for a match in order to do the substitution. You can just do
foreach my $item (@list) {
$item =~ s/$pattern/$repl/;
}
and if the $pattern doesn't match in $item nothing happens; the $item is unchanged.
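If you do want to know whether anything changed, the substitution itself reports it, since s/// returns the number of substitutions made (false when there was no match). A minimal sketch, with a hypothetical helper replace_in_all that works on a copy of the list:

```perl
use strict;
use warnings;

# Hypothetical helper: applies one substitution to every item and counts
# how many items were actually changed. No separate match test is needed.
sub replace_in_all {
    my ($pattern, $repl, @items) = @_;
    my $changed = 0;
    for my $item (@items) {
        $changed++ if $item =~ s/$pattern/$repl/;   # true only when it matched
    }
    return ($changed, @items);
}

my ($changed, @result) = replace_in_all(qr/\bam\b/, 'are', 'i am sick', 'feeling bad');
print "$changed changed: @result\n";   # prints: 1 changed: i are sick feeling bad
```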
Upvotes: 3
Reputation: 21
This happens because, in this particular phrase, there are two eligible candidates that match your regex. In addition, Perl hashes have no guaranteed key order. In your code you just take the keys of %reflections without sorting them. Therefore keys %reflections may return the keys in a different order on each run. For example, in run 1 it might be ( 'am', 'i', 'my', 'me' ... ), then in the next run ( 'i', "you'll", 'my', 'yours' ... ).
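As a hedged illustration, sorting the keys makes the order reproducible from run to run, but on its own it does not remove the circular replacement while every key is still applied to the whole string. The helper name reflect_sorted and the three-entry table are assumptions for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: same whole-string approach as the posted code, but
# with the keys visited in sorted (reproducible) order.
sub reflect_sorted {
    my ($text, %table) = @_;
    for my $key (sort keys %table) {
        $text =~ s/\b\Q$key\E\b/$table{$key}/i;
    }
    return $text;
}

# Order is always: am, are, i
print reflect_sorted('i am sick', 'am' => 'are', 'are' => 'am', 'i' => 'you'), "\n";
# prints: you am sick
```

The result is now the same on every run, yet it is still "you am sick", so processing each word only once is still needed.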
Upvotes: 0