SS Hegde

Reputation: 739

Unable to successfully instantiate BreakIterator even after u_setDataDirectory was set

I am using ICU's BreakIterator (ICU 68.2) for word segmentation. I use u_setDataDirectory to initialise the data path, as shown in the first line of the snippet below. However, when I check the status after createWordInstance(), I get U_MISSING_RESOURCE_ERROR. As far as I understand, this kind of error should be solved by calling u_setDataDirectory. I have done that, but the problem persists.

u_setDataDirectory;
UErrorCode status = U_ZERO_ERROR;
BreakIterator *wordIterator = BreakIterator::createWordInstance(Locale("zh"), status);

if (U_FAILURE(status)) 
{
   std::cout<<"failed to create break iterator.  status = "<<u_errorName(status)<<std::endl;
   std::exit(1);
}
UnicodeString text = "sample input string";
wordIterator->setText(text);
delete wordIterator;

Upvotes: 0

Views: 338

Answers (1)

TeaAge Solutions

Reputation: 473

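If u_setDataDirectory is the only cause of your error, you must actually call it with the correct path to the data directory. The statement u_setDataDirectory; without parentheses only names the function; it does not call it, so no data path is ever set.

Change your first line from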

 u_setDataDirectory;

to (on Linux/Unix)

 u_setDataDirectory( "/path/to/ICU/data/" );

or on Windows to

 u_setDataDirectory( "C:\\Path\\To\\ICU\\Data\\" );

Unfortunately, I don't know where the ICU data files are located on your machine. You need to replace the placeholder path with the actual path on your system.

But from reading the ICU documentation https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/putil_8h.html#a550ea8502df5fed47ae4359be7f8e6a2 I suspect that this alone may not be enough to solve your problem.

If the call above does not solve your problem, you can try calling u_init( UErrorCode *status ) right after u_setDataDirectory and before any other ICU call:

UErrorCode status = U_ZERO_ERROR;
u_init( &status );
if (U_FAILURE(status)) 
{
   std::cout<<"failed to init. status = "<<u_errorName(status)<<std::endl;
   std::exit( 1 );
}

and then check which error code it reports.

EDIT: A second possible root cause is that the data for Locale("zh") is missing. Does it work with other locales, for example if you change it to "en_US"?

You can also test whether the Locale object itself is valid like this:

if( Locale("zh").isBogus() )
{
    std::cout << "Locale is not working!" << std::endl;
    std::exit( 1 );
}
    

Upvotes: 2
