Reputation: 1013
I need to extract data from some old interbase databasefiles created with InterBase 4.2.1. I'm using the embedded version of Firebird (version 2.5.1) and the .NetProvider (version 2.7.0). I never worked with interbase of firebird before (but i have some experience with SQL SERVER and SQLite) and after cruising the web and experimenting for 2 days, I have yet to find a solution.
The tables in the database contain data in English, but also data in Hebrew. Optimistic as I was, I started by creating my connection string using UTF8:
FbConnectionStringBuilder builder = new FbConnectionStringBuilder();
builder.Database = m_DatabaseName;
builder.ServerType = FbServerType.Embedded;
builder.Charset = FbCharset.Utf8.ToString();
But that gave me the following exception:
bad parameters on attach or create database
CHARACTER SET Utf8 is not defined
I correctly used the fbintl.dll. (see the files in my application directory and subdirectories below). I even used ProcessMonitor to check if the fbintl.dll was loaded.
fbembed.dll
firebird.log
firebird.msg
FirebirdSql.Data.FirebirdClient.dll
ib_util.dll
icudt30.dll
icuin30.dll
icuuc30.dll
MyApplication.exe
Microsoft.VC80.CRT.manifest
msvcp80.dll
msvcr80.dll
intl\fbintl.conf
intl\fbintl.dll
udf\fbudf.dll
udf\ib_udf.dll
So I tried enumerating FbCharset
and trying to connect with each character set, more than half of them threw the same exception, and when I connected with the others and queried one of the Hebrew fields (by using IDataReader.GetString()
), I always got the same garbage as result. It doesn't seem to matter what character set I specify in the connection string, the result are always the same, even if I don't specify any character set at all.
Next I queried the character sets defined in the database SELECT RDB$CHARACTER_SET_NAME FROM RDB$CHARACTER_SETS
and I enumrated those, trying to connect with each one of them, some threw an exception, the others gave the same result as before.
I have no clue with what character set the database was created, but I checked the character sets of each field in the database and all text fields have their character sets set to 'NONE'.
SELECT r.RDB$RELATION_NAME, r.RDB$FIELD_NAME, f.RDB$FIELD_NAME, cset.RDB$CHARACTER_SET_NAME
FROM RDB$RELATION_FIELDS r
LEFT JOIN RDB$FIELDS f ON r.RDB$FIELD_SOURCE = f.RDB$FIELD_NAME
LEFT JOIN RDB$CHARACTER_SETS cset ON f.RDB$CHARACTER_SET_ID = cset.RDB$CHARACTER_SET_ID
ORDER BY r.RDB$RELATION_NAME ASC, r.RDB$FIELD_POSITION ASC
But I noticed some of the system tables' text fields have UNICODE_FSS as character set. I already tried that character set in the connection string, but I still get garbadge for the requested text fields.
My last try was to retrieve the bytes (by using IDataReader.GetBytes()
) and encode the string myself, but this gives me a cast exception (Unable to cast object of type 'System.String' to type 'System.Byte[]'.
)
Does anyone have any ideas on how to read this data ? I don't need to convert the databases permanently, as they won't be used anymore once i have extracted the data.
EDIT: btw, are there any free lightweight interbase/firebird database viewers, I can't seem to find any good ones (comparable to SQLiteSpy) ?
Marc
Upvotes: 2
Views: 3849
Reputation: 1013
I've found the solution.
The original database was written on a windows with codepage 1255. When I read the data now c# generates a unicode string using the default encoding (which is not 1255). So i just decode the string to bytes using the default encoding and then convert the bytes to string using the correct encoding.
Encoding encoding = Encoding.GetEncoding(1255);
...
if (!datareader.IsDBNull(i))
{
string value = dataReader.GetString(i);
if (value.Length > 0)
{
byte[] bytes = Encoding.Default.GetBytes(value);
value = encoding.GetString(bytes);
}
// store value
}
This solution works fine for me, but I still don't understand why I can't specify "WIN1255" as charset in my connectionstring without getting the exception bad parameters on attach or create database - CHARACTER SET WIN1255 is not defined
? (I even tried charset "WINDOWS1255" but then i get the exception Invalid character set specified
).
Upvotes: 3