Reputation: 1961
I'm trying to extract the first paragraph, but I haven't had any luck. Can anyone help me? Here is the text: http://dpaste.com/638776/. My text is dynamic. Thanks.
UPDATE: I'm reading an XML file using the eTree module. The XML contains a tag called <text></text>. The data between <text></text> is here. I just want to print the following data from the text tags. Is it possible? Thanks.
'''Zamindar''' ({{te|జమీందార్}}) is a 1965 [[Telugu language|Telugu]] "Thriller" film
directed by [[V. Madhusudhan Rao]] and produced by [[Tammareddy Krishna Murthy]]
of Ravindra Art Pictures.This is variety role for [[Akkineni Nageswara Rao]]
who is more popular with soft Romantic roles.He plays the role of a tough CID Officer very well.The Movie has some Good songs.This movie has a considerable resemblance with the 1963 [[Cary Grant]] English Movie ''[[Charade (1963 film)|Charade]]''.
Upvotes: 2
Views: 1715
Reputation: 4954
If you build a regex in which the dot also matches newlines, the pattern below does the job (tested in Ruby, but I expect it to work as-is in Python). It is much the same as the answer by Niall Byrne:
}}\n(.*?)\n\n
Please see the effect at rubular.
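For Python, here is a minimal sketch of the same pattern, using re.DOTALL so the dot matches newlines. The sample string is an assumption on my part (a template block closed by "}}" on its own line, then the paragraph, then a blank line), since the real text is only available at the link above.

```python
import re

# Hypothetical sample: a template block closed by "}}" on its own line,
# then the first paragraph, then a blank line. The real text may differ.
data = (
    "{{Infobox film\n"
    "| name = Zamindar\n"
    "}}\n"
    "'''Zamindar''' is a 1965 film\n"
    "directed by V. Madhusudhan Rao.\n"
    "\n"
    "==Plot==\n"
)

# re.DOTALL lets "." match newlines, so the lazy group can span several lines.
match = re.search(r"\}\}\n(.*?)\n\n", data, re.DOTALL)
first_paragraph = match.group(1) if match else None
print(first_paragraph)
```

The lazy quantifier stops at the first blank line after the closing braces. If the text has no "}}" followed directly by a newline, the search returns None, so the result should be checked before use.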
Upvotes: 1
Reputation: 2460
Revised based on new info ...
If you are able to produce the text between the tags, you just need a pattern for the first paragraph that fits all cases. Based on this example:
import re

# data: the string found between the <text> tags
firstparagraph = re.search(r"}}(.*?)\r*\n\r*\n", data, re.DOTALL)
if firstparagraph:  # re.search returns None when nothing matches
    print(firstparagraph.group(1))
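To tie this to the XML side of the question, here is a rough end-to-end sketch using xml.etree.ElementTree. The XML document and its <page> wrapper are made up for illustration; only the <text> tag comes from the question. If the real file declares an XML namespace, the tag lookup would need to include it.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML standing in for the real file; only <text> is from the question.
xml_source = """<page>
<text>{{Infobox film
| name = Zamindar
}}
'''Zamindar''' is a 1965 film.

==Plot==
</text>
</page>"""

root = ET.fromstring(xml_source)
node = root.find(".//text")  # first <text> element anywhere below the root
data = node.text if node is not None and node.text else ""

# Same pattern as above: everything after the first "}}" up to the first blank line.
match = re.search(r"\}\}(.*?)\r*\n\r*\n", data, re.DOTALL)
first_paragraph = match.group(1).strip() if match else None
print(first_paragraph)
```

The strip() call removes the newline that follows the closing braces. For a file on disk, ET.parse(path).getroot() could replace ET.fromstring.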
Upvotes: 0