Simon Goodman

Reputation: 1204

Using pcre2 in a c++ project

I am looking at using pcre2 in a simple C++ app (built with VS2015). I have compared various regex libraries, and the general feeling is that pcre/pcre2 are the most flexible.

First I downloaded pcre2 from the official location (http://sourceforge.net/projects/pcre/files/pcre2/10.20/) and created a very simple example.

#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
...
PCRE2_SPTR subject = (PCRE2_SPTR)std::string("this is it").c_str();
PCRE2_SPTR pattern = (PCRE2_SPTR)std::string("([a-z]+)|\\s").c_str();

...
int errorcode;
PCRE2_SIZE erroroffset;
pcre2_code *re = pcre2_compile(pattern, PCRE2_ZERO_TERMINATED, 
                                PCRE2_ANCHORED | PCRE2_UTF, &errorcode,  
                                &erroroffset, NULL);
...

First of all, the file "pcre2.h" does not exist, so I renamed pcre2.h.generic to pcre2.h.

But then I get linker errors with unresolved externals.

I am guessing I need to add one or more files from the source to my project. But I am reluctant to just randomly add files without knowing what they all do.

Can someone give some simple steps to follow to successfully build a project using pcre2?

UPDATE
This is not an import library issue: pcre2.h does not come with a library (not one that I can see in their release location).

Upvotes: 13

Views: 8526

Answers (6)

MelS

Reputation: 119

Here's more detail on Charles Thomas's answer...

If you're using this on Windows from C++ and you built PCRE2 as a static library, note that pcre2.h contains this:

#if defined(_WIN32) && !defined(PCRE2_STATIC)
#  ifndef PCRE2_EXP_DECL
#    define PCRE2_EXP_DECL  extern __declspec(dllimport)
#  endif
#endif

_WIN32 is defined because you're on Windows, but you need to define PCRE2_STATIC at the top of pcre2.h, like so...

#define PCRE2_STATIC 1

This makes it put extern "C" in front of each function, instead of extern __declspec(dllimport), so you can link statically.
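To make the mechanism concrete, here is a miniature of it that compiles without PCRE2. This is only a sketch: the DEMO_* macros and demo_answer() are hypothetical stand-ins for PCRE2_STATIC, PCRE2_EXP_DECL and a real PCRE2 function, and the fallback branch is written from the description above rather than copied from pcre2.h. It also shows that the define does not have to live inside the header; placing it in your own source before the include has the same effect.

```cpp
#include <cassert>

// Hypothetical stand-in for PCRE2_STATIC. It must appear before the
// "header" section below, just as PCRE2_STATIC must precede pcre2.h.
#define DEMO_STATIC 1

// ---- miniature of what the header does (not the real pcre2.h) ----
#if defined(_WIN32) && !defined(DEMO_STATIC)
#  ifndef DEMO_EXP_DECL
#    define DEMO_EXP_DECL extern __declspec(dllimport)
#  endif
#endif
#ifndef DEMO_EXP_DECL
#  ifdef __cplusplus
#    define DEMO_EXP_DECL extern "C"
#  else
#    define DEMO_EXP_DECL extern
#  endif
#endif

// With DEMO_STATIC defined this is a plain extern "C" declaration,
// so the linker looks for an ordinary symbol, not a DLL import.
DEMO_EXP_DECL int demo_answer(void);

// ---- stand-in for the code inside the static library ----
extern "C" int demo_answer(void) { return 42; }

// Small sanity check that the declaration and definition link together.
inline void self_check() { assert(demo_answer() == 42); }
```

If DEMO_STATIC were left undefined on Windows, the declaration would become a dllimport and the linker would look for an import stub instead, which is the kind of unresolved external described in the question.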

Upvotes: 0

Don F

Reputation: 159

You can follow these steps:

  1. Download and install cmake.
  2. Set the source folder location and a VS project folder.
  3. Hit configure and select your VS version.
  4. Once the configure process is done you can select 8, 16, and / or 32 bit from the list.
  5. Press generate, then open the VS solution file in the project folder. This will open the solution in VS.
  6. There are about 6 projects. Highlight the pcre2._ project. Go to preferences and make sure the output file is for a DLL. Repeat this step for the pcre2posix project. And then set the greptest to be built as an exe (executable) and the other one as executable.
  7. At this point you can try to build all, but you might need to build the DLLs first because the executables rely on them (or rather their static libraries) for linking.
  8. After all 6 projects are built successfully you should have shared/static libraries and test programs in your debug or release folders.

Upvotes: 1

Jahid

Reputation: 22428

If you don't mind using a wrapper, here's mine: JPCRE2

You need to select the basic character type (char, wchar_t, char16_t, char32_t) according to the string classes you will use (respectively std::string, std::wstring, std::u16string, std::u32string):

typedef jpcre2::select<char> jp;
//Selecting char as the basic character type will require
//8 bit PCRE2 library where char is 8 bit,
//or 16 bit PCRE2 library where char is 16 bit,
//or 32 bit PCRE2 library where char is 32 bit.
//If char is not 8, 16 or 32 bit, it's a compile error.

Match Examples:

Check if a string matches a pattern:

if(jp::Regex("(\\d)|(\\w)").match("I am the subject")) 
    std::cout<<"\nmatched";
else
    std::cout<<"\nno match";

Match all and get the match count:

size_t count = 
jp::Regex("(\\d)|(\\w)","mi").match("I am the subject", "g");
// 'm' modifier enables multi-line mode for the regex
// 'i' modifier makes the regex case insensitive
// 'g' modifier enables global matching

Get numbered substrings/captured groups:

jp::VecNum vec_num;
count = 
jp::Regex("(\\w+)\\s*(\\d+)","im").initMatch()
                                  .setSubject("I am 23, I am digits 10")
                                  .setModifier("g")
                                  .setNumberedSubstringVector(&vec_num)
                                  .match();
std::cout<<"\nTotal match of first match: "<<vec_num[0][0];      
std::cout<<"\nCaptured group 1 of first match: "<<vec_num[0][1]; 
std::cout<<"\nCaptured group 2 of first match: "<<vec_num[0][2]; 

std::cout<<"\nTotal match of second match: "<<vec_num[1][0];
std::cout<<"\nCaptured group 1 of second match: "<<vec_num[1][1];
std::cout<<"\nCaptured group 2 of second match: "<<vec_num[1][2]; 

Get named substrings/captured groups:

jp::VecNas vec_nas;
count = 
jp::Regex("(?<word>\\w+)\\s*(?<digit>\\d+)","m")
                         .initMatch()
                         .setSubject("I am 23, I am digits 10")
                         .setModifier("g")
                         .setNamedSubstringVector(&vec_nas)
                         .match();
std::cout<<"\nCaptured group (word) of first match: "<<vec_nas[0]["word"];
std::cout<<"\nCaptured group (digit) of first match: "<<vec_nas[0]["digit"];

std::cout<<"\nCaptured group (word) of second match: "<<vec_nas[1]["word"];
std::cout<<"\nCaptured group (digit) of second match: "<<vec_nas[1]["digit"];

Iterate through all matches and substrings:

//Iterating through numbered substring
for(size_t i=0;i<vec_num.size();++i){
    //i=0 is the first match found, i=1 is the second and so forth
    for(size_t j=0;j<vec_num[i].size();++j){
        //j=0 is the capture group 0 i.e the total match
        //j=1 is the capture group 1 and so forth.
        std::cout<<"\n\t("<<j<<"): "<<vec_num[i][j]<<"\n";
    }
}

Replace/Substitute Examples:

std::cout<<"\n"<<
///replace all occurrences of a digit with @
jp::Regex("\\d").replace("I am the subject string 44", "@", "g");

///swap two parts of a string
std::cout<<"\n"<<
jp::Regex("^([^\t]+)\t([^\t]+)$")
             .initReplace()
             .setSubject("I am the subject\tTo be swapped according to tab")
             .setReplaceWith("$2 $1")
             .replace();

Replace with Match Evaluator:

jp::String callback1(const jp::NumSub& m, void*, void*){
    return "("+m[0]+")"; //m[0] is capture group 0, i.e total match (in each match)
}
int main(){
    jp::Regex re("(?<total>\\w+)", "n");
    jp::RegexReplace rr(&re);
    jp::String s3 = "I am ঋ আা a string 879879 fdsjkll ১ ২ ৩ ৪ অ আ ক খ গ ঘ আমার সোনার বাংলা";
    rr.setSubject(s3)
      .setPcre2Option(PCRE2_SUBSTITUTE_GLOBAL);
    std::cout<<"\n\n### 1\n"<<
            rr.nreplace(jp::MatchEvaluator(callback1));
            //nreplace() treats the returned string from the callback as literal,
            //while replace() will process the returned string
            //with pcre2_substitute()

    #if __cplusplus >= 201103L
    //example with lambda
    std::cout<<"\n\n### Lambda\n"<<
            rr.nreplace(
                jp::MatchEvaluator(
                    [](const jp::NumSub& m1, const jp::MapNas& m2, void*){
                        return "("+m1[0]+"/"+m2.at("total")+")";
                    }
                ));
    #endif
    return 0;
}

You can read the complete documentation here.

Upvotes: 14

Charles Thomas

Reputation: 1005

I don't know if this is still something you're looking at or not, but just in case, does this help?

From pcre2api man page:

In a Windows environment, if you want to statically link an application program against a non-dll PCRE2 library, you must define PCRE2_STATIC before including pcre2.h.

Upvotes: 1

Arne Vogel

Reputation: 6666

PCRE2_SPTR pattern = (PCRE2_SPTR)std::string("([a-z]+)|\\s").c_str();

Using this pointer with any of the PCRE functions will result in undefined behavior. The std::string temporary is destroyed at the end of the definition of pattern, causing pattern to dangle.

My recommendation is to change pattern's type to std::string and call c_str() when passing arguments to a PCRE function. It is a very fast operation in C++11 (provided you are not using the old GCC 4 ABI).

There are also several C++ wrappers for PCRE that might help you avoid such issues and make PCRE easier to use, but I do not know the status of Windows support.
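A minimal sketch of that recommendation, kept free of pcre2.h so it compiles on its own. The names here are assumptions for illustration: Sptr stands in for PCRE2_SPTR (with PCRE2_CODE_UNIT_WIDTH 8 that is a pointer to const 8-bit code units, typically unsigned char), and pattern_length() stands in for a call such as pcre2_compile() that reads a zero-terminated pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for PCRE2_SPTR in the 8-bit library.
using Sptr = const unsigned char*;

// Hypothetical stand-in for a PCRE2 call such as pcre2_compile():
// it only reads the zero-terminated pattern it is handed.
std::size_t pattern_length(Sptr pattern) {
    assert(pattern != nullptr);
    std::size_t n = 0;
    while (pattern[n] != 0) ++n;
    return n;
}

// Convert at the call site. The returned pointer is valid only while
// `s` is alive and unmodified, so do not store it for later use.
inline Sptr as_sptr(const std::string& s) {
    return reinterpret_cast<Sptr>(s.c_str());
}

std::size_t compile_demo() {
    // A named std::string outlives the call below, unlike the temporary
    // in (PCRE2_SPTR)std::string("...").c_str().
    const std::string pattern = "([a-z]+)|\\s";
    return pattern_length(as_sptr(pattern));
}
```

With the real library the call would look the same in shape: keep the std::string objects for the subject and pattern in scope, and do the cast only in the argument list of the PCRE2 function.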

Upvotes: 4

Simon Goodman

Reputation: 1204

In case someone wants to build the library using Visual Studio:

  1. Download pcre2 from the website (http://www.pcre.org/).
  2. In Visual Studio 2015 (and maybe others), create an empty "Win32 project" and call it pcre2.
  3. Copy all the files in \pcre2\src\ to your newly created empty project.
  4. Add all the files listed in "NON-AUTOTOOLS-BUILD" (located in the base folder):
    • pcre2_auto_possess.c
    • pcre2_chartables.c
    • pcre2_compile.c
    • pcre2_config.c
    • etc...
  5. Rename the file config.h.generic to config.h
  6. Add the config.h file to the project.
  7. In your project, select all the *.c files and go to Properties > C/C++ > Precompiled Headers > "Not Using Precompiled Headers".
  8. Select the project, go to Properties > C/C++ > Preprocessor > Preprocessor Definitions, open the drop-down list, and add:
    • PCRE2_CODE_UNIT_WIDTH=8
    • HAVE_CONFIG_H

Compile and the lib file should be created fine.

Upvotes: 10
