Reputation: 1288
At the moment I am working with a huge file that contains hundreds of thousands of XML entries. After changing them I have to upload them to specific systems as a new database. The file contents look like this:
<Row ss:AutoFitHeight="0">
<Cell><Data ss:Type="String">Product</Data></Cell>
<Cell><Data ss:Type="String">Home > Connectors > Power Entry</Data></Cell>
<Cell><Data ss:Type="Number">10430</Data></Cell>
<Cell><Data ss:Type="String">CAMDEN-BOSS CONTACT, 6AWG, 75A CBCAG14</Data></Cell>
<Cell><Data ss:Type="String">CONTACT, 6AWG, 75A; Connector Mounting:Cable; Contact Termination:Crimp; Current Rating:75A; SVHC:No SVHC (18-Jun-2012); Series:CBC; Voltage Rating:600V; Flammability Rating:UL94 V0; Wire Area Size Max:11mm; Wire Size AWG Max:6AWG; Wire Size AWG Min:6AWG<br /><br /><strong>Price for pack of: 1</strong><br /><br /><strong>Country Of Origin: CN</strong><br /><br /><a href="http://LALA.co.uk/datasheets/1508502.pdf"><img alt="" src="/ekmps/shops/LALA/resources/Design/icon-pdf.gif" style="width: 16px; height: 16px;" />&nbsp;Technical Data Sheet</a><br /></Data></Cell>
</Row>
My job is to remove all entries that do not contain a link to a .pdf file. The example above has one, so it would be kept, but if "http://LALA.co.uk/datasheets/1508502.pdf" were not in the description, the whole row should be removed. I can work with different things, from C# to.. so the type of solution doesn't really matter. Can anyone suggest something?
Upvotes: 0
Views: 1533
Reputation: 1385
In Notepad++, open the Replace dialog (Ctrl+H) and search for:
<Row[^>]*>((?!\.pdf).)*?</Row>
Replace with
(leave blank)
The "Regular expression" and ". matches newline" boxes have to be checked.
Upvotes: 1
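Since the question says any language would do, the same search-and-replace can also be scripted. The following is only a minimal sketch in Python, not part of the original answer: it reuses the regex from the answer unchanged, with `re.DOTALL` standing in for the ". matches newline" checkbox. The function name `drop_rows_without_pdf` is made up for illustration, and the sketch assumes the whole file fits in memory.

```python
import re

# Same pattern as in the Notepad++ answer. re.DOTALL makes "." match
# newlines too, like the ". matches newline" checkbox.
# The tempered dot ((?!\.pdf).) refuses to step over the text ".pdf",
# so only <Row>...</Row> blocks that contain no ".pdf" can match.
ROW_WITHOUT_PDF = re.compile(r"<Row[^>]*>((?!\.pdf).)*?</Row>", re.DOTALL)


def drop_rows_without_pdf(text: str) -> str:
    """Return text with every <Row> block lacking '.pdf' removed.

    Hypothetical helper name; assumes the input is already a single string.
    """
    return ROW_WITHOUT_PDF.sub("", text)
```

To use it, read the export into a string, pass it through `drop_rows_without_pdf`, and write the result to a new file. As in Notepad++, the line breaks that followed the removed rows stay behind as blank lines.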