Reputation: 63
I am trying to strip everything out of a data dump and keep only the ten-digit numbers, using a Notepad++ regex.
I tried something like this: (?<!\d)0\d{7}(?!\d)
but had no luck.
Upvotes: 2
Views: 17972
Reputation: 69
Here is a different procedure that answers the question by example: how to get a list of the member IDs of your Facebook group so that active users are not removed. I used it to reduce a group from 10,000 to 5,000 members by removing inactive members.
The program may be outdated, but that does not matter; look at the Find and Replace steps below, since the point is to understand what each of them does.
It is also an example of how to parse text and code out of HTML, and of how to match numbers that are between 2 and 30 digits long.
You can use this to purge everything from the file except the member_id= entries and the numbers of 2 to 30 digits that follow them, so that only whole tokens such as "member_id=12456" or "member_id=12" are left. Later you can blank out member_id= with a second replace, then run the list through a duplicate remover so that all IDs are unique, and then use it in the Java code linked at the end.
This is used to pull all Facebook user IDs of a group out of a single HTML file, saved after scrolling all the way down the group page.
Select "Regular expression" and tick ". matches newline" for the expression below. Every match is replaced by $1, which is empty unless a member_id= token was captured, so everything else is zeroed out:
Find: (member_id=\d{2,30})|.
Replace: $1
Second, switch to Extended mode for this step:
Find: member_id=
Replace: \n
That puts each ID on its own line (\n) and makes it easy to manually remove the Fx0 and any other extra characters that a buggy Notepad++ leaves in the lines.
You can then easily remove all duplicates and join the lines into one, with a single space between IDs. I used this tool, which removes the duplicates and leaves the whole text with one space between each ID: https://www.tracemyip.org/tools/remove-duplicate-words-in-text/
Then use Normal mode in Notepad++ for the last replace, and remember to add ' to the beginning and end of the line:
Find: " " (one space, without the quotes)
Replace: ','
Then you can copy the whole line into your Java code and remove all members who are not active. This works best if you use the HTML of a fully scrolled-down page. The list should look like ['21','234','124234'], so remember the right characters at the beginning and end. To be extra safe, add your own IDs to the beginning.
The Facebook group removal Java code is here: https://gist.github.com/michaelv/11145168
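The same extract, de-duplicate and quote steps can also be sketched in a few lines of Python instead of several Notepad++ passes. This is only an illustration under assumptions: the function names and the sample HTML are made up, and the only thing taken from the steps above is the member_id= pattern with 2 to 30 digits.

```python
import re
from typing import List

def extract_member_ids(html: str) -> List[str]:
    """Return unique member IDs (2 to 30 digits) in order of first appearance."""
    ids = re.findall(r"member_id=(\d{2,30})", html)
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(ids))

def as_quoted_list(ids: List[str]) -> str:
    """Format the IDs like ['21','234','124234']."""
    return "[" + ",".join("'" + i + "'" for i in ids) + "]"

# Hypothetical saved page fragment, for illustration only.
sample = (
    '<a href="?member_id=21">A</a>'
    '<a href="?member_id=234">B</a>'
    '<a href="?member_id=21">A</a>'
    '<a href="?member_id=124234">C</a>'
)
print(as_quoted_list(extract_member_ids(sample)))  # prints ['21','234','124234']
```

Because only the digits after member_id= are captured, no separate step is needed to blank out the prefix.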
Upvotes: 0
Reputation: 15000
There were problems in older versions of Notepad++, which wouldn't handle PCRE expressions. This proposed solution was tested in Notepad++ v6.8.8, but should work in any version later than v6.2.
([0-9]{10})|.
Replace with: $1
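This expression captures every run of ten digits in group 1 and matches every other character one at a time; replacing each match with $1 keeps the captured digits and deletes the rest.
From Notepad++:
Press Ctrl+H to enter the find and replace mode
Select the Regular Expression option
In the "Find what" field place the regular expression
In the "Replace with" field enter $1
Click Replace All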
Live Demo
https://regex101.com/r/fZ9vH7/1
Source Text
fdsafasfa1234567890zzzzzzz12345
After Replacement
1234567890
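To check the behaviour outside Notepad++, here is a minimal Python sketch of the same find and replace. It assumes Python's re module treats this simple pattern the same way Notepad++ does; the function name keep_ten_digit_runs is made up for illustration.

```python
import re

# Same pattern as in the "Find what" field: capture ten digits,
# or match any single character other than a newline.
PATTERN = re.compile(r"([0-9]{10})|.")

def keep_ten_digit_runs(text: str) -> str:
    """Keep every captured ten-digit run and drop all other characters."""
    # group(1) is None when the "." branch matched, so substitute nothing.
    return PATTERN.sub(lambda m: m.group(1) or "", text)

print(keep_ten_digit_runs("fdsafasfa1234567890zzzzzzz12345"))  # prints 1234567890
```

Kept numbers are written back to back, so two ten-digit numbers on the same line come out joined into one twenty-digit string; line breaks are left in place because . does not match them here.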
NODE                     EXPLANATION
----------------------------------------------------------------------
(                        group and capture to \1:
----------------------------------------------------------------------
  [0-9]{10}              any character of: '0' to '9' (10 times)
----------------------------------------------------------------------
)                        end of \1
----------------------------------------------------------------------
|                        OR
----------------------------------------------------------------------
.                        any character except \n
----------------------------------------------------------------------
The OP wasn't clear on what to do with runs of digits longer than 10 characters. If runs longer than 10 digits are undesirable and need to be removed in their entirety, then use this:
([0-9]{10})(?![0-9])|[0-9]+|.
Replace with: $1
Live Demo: https://regex101.com/r/aS4sN1/1
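The stricter variant can be tried the same way. Again a sketch that assumes Python's re module matches this pattern like Notepad++ does; the function name and the sample string are invented for illustration.

```python
import re

# Ten digits not followed by another digit are captured; any other run of
# digits, and any other single character, is matched without being captured.
STRICT = re.compile(r"([0-9]{10})(?![0-9])|[0-9]+|.")

def keep_exact_ten_digit_runs(text: str) -> str:
    """Keep only digit runs that are exactly ten digits long."""
    return STRICT.sub(lambda m: m.group(1) or "", text)

# The 11-digit run and the 3-digit run are dropped, the 10-digit run is kept.
print(keep_exact_ten_digit_runs("x12345678901y1234567890z123"))  # prints 1234567890
```

No lookbehind is needed: matching proceeds left to right, so each digit run is always consumed from its first digit, either by the capture or by [0-9]+.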
Upvotes: 7
Reputation: 520968
Try this:
Find: .*(\d{10}).*
Replace: \1
This has been tested in Notepad++.
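A small Python sketch shows how this pattern behaves on lines with more than one number, assuming Python's re module treats it like Notepad++ does (the function name is made up). Because the leading .* is greedy, the capture ends up on the last ten-digit run of each line.

```python
import re

# The greedy leading .* swallows as much of the line as it can,
# so the group captures the last run of ten digits on that line.
GREEDY = re.compile(r".*(\d{10}).*")

def keep_last_ten_digit_run(text: str) -> str:
    """Replace each matching line with the last ten digits found on it."""
    return GREEDY.sub(r"\1", text)

print(keep_last_ten_digit_run("a1111111111b2222222222c"))  # prints 2222222222
```

Lines without any ten-digit run do not match at all and are left unchanged.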
Upvotes: 2