Maddy

Reputation: 1379

Installing ICU libraries and compiling program

I'm exploring using IBM's ICU to work with Unicode strings, by writing sample code.

Following the steps from ICU's page, I unpacked the contents of icu4c-57_1-RHEL6-x64.tgz into /usr/local/include and /usr/local/bin on my Linux box. Is this sufficient to start using the ICU library?

Sample code:

#include <iostream>
#include "unicode/coll.h"
#include "unicode/utypes.h"

using namespace icu;
using namespace std;

int main()
{
    UErrorCode success = U_ZERO_ERROR;
    // createInstance() reports failure through the UErrorCode argument.
    Collator *collator = Collator::createInstance(success);
    if (U_FAILURE(success)) {
        cerr << "Could not create collator: " << u_errorName(success) << endl;
        return 1;
    }
    // PRIMARY strength ignores accent and case differences.
    collator->setStrength(Collator::PRIMARY);

    if (collator->compare("débárquér", "debarquer") == 0) {
        cout << "Strings are equal" << endl;
    } else {
        cout << "Strings are unequal" << endl;
    }
    delete collator;
    return 0;
}

Compilation of this code fails:

$ g++ unicode.cc
/tmp/ccUknunM.o: In function `main':
unicode.cc:(.text+0x20): undefined reference to `icu_4_2::Collator::createInstance(UErrorCode&)'
unicode.cc:(.text+0x61): undefined reference to `icu_4_2::UnicodeString::UnicodeString(char const*)'
unicode.cc:(.text+0x72): undefined reference to `icu_4_2::UnicodeString::UnicodeString(char const*)'
unicode.cc:(.text+0x97): undefined reference to `icu_4_2::UnicodeString::~UnicodeString()'
unicode.cc:(.text+0xaa): undefined reference to `icu_4_2::UnicodeString::~UnicodeString()'
unicode.cc:(.text+0xc3): undefined reference to `icu_4_2::UnicodeString::~UnicodeString()'
unicode.cc:(.text+0xdd): undefined reference to `icu_4_2::UnicodeString::~UnicodeString()'
collect2: ld returned 1 exit status

From the output, it appears that the ICU installation is either incorrect or incomplete. What am I missing?

Thanks!

Edit:

When I search for the file coll.h, this is what I see:

$ find /usr/local/ -name coll.h
/usr/local/bin/usr/local/include/unicode/coll.h
/usr/local/include/usr/local/include/unicode/coll.h

Does this look alright?

Upvotes: 1

Views: 7156

Answers (1)

DevSolar

Reputation: 70213

The "steps from the ICU page" state that...

...the .tgz file unpacks to a "/usr/local" type hierarchy.

Looking at the archive contents...

$ tar tzf icu4c-57_1-RHEL6-x64.tgz
readme.txt
usr/
usr/local/
usr/local/lib/
usr/local/lib/libicudata.so.57.1
usr/local/lib/libicudata.so
usr/local/lib/libicudata.so.57
...

...you are supposed to extract that archive to root (/), not once to /usr/local/bin and once to /usr/local/include as you did. (The path repetition in your find results should have been a hint. No, it does not look alright at all.)
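To see why the paths in the find output are doubled, here is a small sketch that needs no root access. It builds a stand-in archive with the same usr/local/... layout and extracts it twice into a scratch directory. The names demo.tgz, wrong and right are made up for this illustration; only the layout mirrors the real archive.

```shell
# Scratch area so nothing on the real system is touched.
work=$(mktemp -d)

# Stand-in archive whose members start with usr/local/..., like the ICU tarball.
mkdir -p "$work/src/usr/local/include/unicode"
touch "$work/src/usr/local/include/unicode/coll.h"
tar czf "$work/demo.tgz" -C "$work/src" usr

# Wrong: extracting inside .../usr/local/include nests the hierarchy a second time.
mkdir -p "$work/wrong/usr/local/include"
tar xzf "$work/demo.tgz" -C "$work/wrong/usr/local/include"

# Right: extracting at the top of the tree puts files where the member paths say.
mkdir -p "$work/right"
tar xzf "$work/demo.tgz" -C "$work/right"

# Compare where coll.h ended up in each case.
wrong_path=$(cd "$work/wrong" && find . -name coll.h)
right_path=$(cd "$work/right" && find . -name coll.h)
echo "wrong: $wrong_path"
echo "right: $right_path"
```

On the real system the equivalent of the "right" case would be something like tar xzf icu4c-57_1-RHEL6-x64.tgz -C / run with root privileges, after removing the misplaced usr/ directories under /usr/local/bin and /usr/local/include.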


That being said, what you really should have done is check your distribution's package manager for the ICU packages (libicu, libicu-dev, ...). Installing via your package manager has several advantages:

  • It avoids problems like the one you just encountered.
  • It ensures that the version of ICU you are using for your programs is the same as the one used by other packages' programs; ending up with a binary that links two different versions of the ICU libraries is just asking for trouble.
  • It keeps the package updated automatically.

Depending on your distribution, you might not get the absolute latest release, but that usually does not matter much.


Once you have a functioning installation, note that third-party frameworks commonly require you to state an include subdirectory explicitly, either in the source or in the build file, to make it clear which framework(s) you are actually using. For ICU, the prefix is (somewhat unintuitively) unicode/, and the documentation does not make this explicit, where it reads...

#include <unistr.h>

So, if you instead write...

#include "unicode/coll.h"

...it should work.
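One more thing to check, since the includes in the question already carry the unicode/ prefix: "undefined reference" messages come from the linker, not from the compiler, so the ICU libraries also have to be named on the g++ command line. The icu_4_2 in those messages also suggests that the headers being picked up belong to an ICU 4.2 installation rather than to 57.1, which is another reason to stick to a single, properly installed version. A minimal sketch, assuming the source file is called unicode.cc as in the question (the output name is arbitrary):

```shell
# File names: unicode.cc is taken from the question, the output name is arbitrary.
SRC=unicode.cc
OUT=unicode

# Collator lives in the i18n library and UnicodeString in the common library;
# the data library holds the locale data both of them rely on.
ICU_LIBS="-licui18n -licuuc -licudata"

if [ -f "$SRC" ]; then
    # $ICU_LIBS is left unquoted on purpose so it expands to three separate flags.
    g++ "$SRC" -o "$OUT" $ICU_LIBS
else
    echo "no $SRC in this directory; would run: g++ $SRC -o $OUT $ICU_LIBS"
fi
```

If the libraries end up under /usr/local/lib rather than a system directory, you may additionally need -L/usr/local/lib at link time and a matching runtime library path; that depends on your setup.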

Upvotes: 2
