Rajiv

Reputation: 63

Finding ten-digit numbers using regex in Notepad++

I am trying to strip everything out of a data dump and keep only the ten-digit numbers from that dump, using a Notepad++ regex.

I tried something like (?<!\d)0\d{7}(?!\d) but had no luck.

Upvotes: 2

Views: 17972

Answers (3)

Dealazer

Reputation: 69

Here is a different procedure that answers the question by example: how to get a list of the member IDs of your Facebook group so that active users are not removed. It was used to reduce a group from 10,000 to 5,000 members and to remove inactive members from a group on Facebook.

It might be outdated, but never mind the old program; the point of the explanation is to show what each Find and Replace step does.

It is also an example of how to parse text and code out of HTML, and how to match a range of numbers from 2 up to 30 digits long.

You can try this to reduce the file to the member_id= entries together with their numbers, from 2 up to 30 digits long, so that only whole entries such as "member_id=12456" or "member_id=12" are left in the file. Later you can blank out the member_id= prefix, then run the whole list through a duplicate remover so that only unique IDs are left, and then use them in the Java code below.

"This is used to extract all Facebook user IDs of a group from a single HTML file that you saved after scrolling down the group"

First, enable "Regular expression" and ". matches newline" for the step below. Every match is replaced by $1, which is empty for everything except a captured member_id entry, so everything else is deleted:

Find: (member_id=\d{2,30})|.
Replace: $1

Second, switch to Extended mode for this step:

Find: member_id=
Replace: \n

That puts each ID on a new line (\n), which makes it easy to manually remove the Fx0 and any other extra characters that a buggy Notepad++ leaves in the lines.

Then you can easily remove all duplicates and join the lines so that the IDs are separated by a single space. One option is this tool, which removes all duplicates and lays the text out with one space between each ID: https://www.tracemyip.org/tools/remove-duplicate-words-in-text/

Then, using the Normal search mode in Notepad++, run the replacement below. Remember to add ' to the beginning and end of the line:

Find: "ONE SPACE"
Replace: ','

Then you can copy the whole line into the Java code and remove all members who are not active, provided you used the HTML of a fully scrolled-down page. The list should look like ['21','234','124234']; remember the correct characters at the beginning and end. To be extra safe, add your IDs to the beginning of the list.

The Facebook group removal Java code is here: https://gist.github.com/michaelv/11145168
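The same clean-up can be sketched outside Notepad++. The Python below is a minimal sketch rather than the procedure above verbatim: it assumes the saved page is already loaded into a string, and the function names extract_member_ids and as_quoted_array are made up for illustration. It collects the digits after each member_id=, drops duplicates while keeping first-seen order, and formats the result as the quoted list.

```python
import re


def extract_member_ids(html):
    """Return the unique member IDs found in html, in first-seen order."""
    # Same digit range as the Notepad++ pattern: 2 to 30 digits after "member_id=".
    ids = re.findall(r"member_id=(\d{2,30})", html)
    # dict.fromkeys keeps the first occurrence of each ID and preserves order.
    return list(dict.fromkeys(ids))


def as_quoted_array(ids):
    """Format IDs as ['21','234'] with single quotes and no spaces."""
    return "[" + ",".join("'{}'".format(i) for i in ids) + "]"


# Hypothetical sample input, standing in for the saved group page.
sample = '<a href="?member_id=21">A</a> <a href="?member_id=234">B</a> member_id=21 member_id=124234'
print(as_quoted_array(extract_member_ids(sample)))  # ['21','234','124234']
```

Doing the deduplication in code avoids the round trip through an external duplicate-removal tool, at the cost of leaving Notepad++.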

Upvotes: 0

Ro Yo Mi

Reputation: 15000

Foreword

There were problems in older versions of Notepad++, which wouldn't handle PCRE expressions. This proposed solution was tested in Notepad++ v6.8.8, but should work in any version later than v6.2.

Description

([0-9]{10})|.


Replace with: $1

This expression will do the following:

  • captures 10-digit numbers into capture group 1, which is then reinserted into the output string
  • matches everything else and removes it.

How to in Notepad++

From Notepad++

  1. Press Ctrl+H to open the Find and Replace dialog

  2. Select the Regular Expression option

  3. In the "Find what" field place the regular expression

  4. In the "Replace with" field enter $1

  5. Click Replace All

Example

Live Demo

https://regex101.com/r/fZ9vH7/1

Source Text

fdsafasfa1234567890zzzzzzz12345

After Replacement

1234567890
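The example above can also be reproduced outside Notepad++. This is a minimal sketch in Python, assuming its re module treats this simple pattern the same way Notepad++ does; the function name keep_ten_digit_runs is made up for illustration.

```python
import re

# Same pattern as above: capture a ten-digit run, or match any single character.
TEN_DIGITS_OR_ANY = re.compile(r"([0-9]{10})|.")


def keep_ten_digit_runs(text):
    """Keep only what group 1 captured; every other matched character is dropped."""
    # Group 1 is None when the "." branch matched, so substitute an empty string.
    return TEN_DIGITS_OR_ANY.sub(lambda m: m.group(1) or "", text)


print(keep_ten_digit_runs("fdsafasfa1234567890zzzzzzz12345"))  # 1234567890
```

Note that "." does not match a newline here, so line breaks in the input are left in place.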

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [0-9]{10}                any character of: '0' to '9' (10 times)
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  .                        any character except \n
----------------------------------------------------------------------

Extra credit

The OP wasn't clear on what to do with runs of digits longer than 10 characters. If strings of digits longer than 10 digits are undesirable and need to be removed in their entirety, then use this:

([0-9]{10})(?![0-9])|[0-9]+|.


Replace with: $1

Live Demo: https://regex101.com/r/aS4sN1/1
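To see how the stricter expression treats longer digit runs, here is a minimal Python sketch under the same assumption as before (that Python's re module handles this pattern the way Notepad++ does); the function name keep_exact_ten_digit_runs is hypothetical.

```python
import re

# Ten digits not followed by another digit, else a whole digit run, else any character.
EXACT_TEN = re.compile(r"([0-9]{10})(?![0-9])|[0-9]+|.")


def keep_exact_ten_digit_runs(text):
    """Keep digit runs of exactly ten digits; drop everything else."""
    # Only the first alternative sets group 1; the other two leave it as None.
    return EXACT_TEN.sub(lambda m: m.group(1) or "", text)


# The 12-digit run is swallowed whole by [0-9]+, so only the final run survives.
print(keep_exact_ten_digit_runs("abc123456789012xyz1234567890"))  # 1234567890
```

The second alternative is what makes this work: because [0-9]+ consumes a whole run whenever the first alternative fails, a match can never start in the middle of a digit run.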

Upvotes: 7

Tim Biegeleisen

Reputation: 520968

Try this:

Find: .*(\d{10}).*
Replace: \1

This has been tested in Notepad++.
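This pattern works line by line, which leads to a few differences from the other answers. The Python sketch below shows them, assuming "." does not match newlines (as with ". matches newline" unchecked in Notepad++); the function name last_ten_digits_per_line is made up for illustration.

```python
import re

# Greedy ".*" on both sides of a ten-digit capture, applied within each line.
LINE_PATTERN = re.compile(r".*(\d{10}).*")


def last_ten_digits_per_line(text):
    """Replace each matching line with the ten digits captured by group 1."""
    return LINE_PATTERN.sub(r"\1", text)


print(last_ten_digits_per_line("fdsafasfa1234567890zzzzzzz12345"))  # 1234567890
print(last_ten_digits_per_line("a1111111111b2222222222c"))          # 2222222222
print(last_ten_digits_per_line("no digits here"))                   # no digits here
```

Because the leading .* is greedy, only the last ten-digit run on a line is kept, and a line with no ten-digit run is left unchanged rather than deleted.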

Upvotes: 2
