Reputation: 825
We are working on a project where we need to migrate data stored in an Access database to a cache database. The Access database contains columns with a data type of Attachment
; some of the tuples contain multiple attachments. I am able to obtain the filenames of these files by using .FileName
, but I'm unsure how to determine when one file ends and another starts in .FileData
.
I am using the following to obtain this data:
System.Data.OleDb.OleDbCommand command= new System.Data.OleDb.OleDbCommand();
command.CommandText = "select [Sheet1].[pdf].FileData,* from [Sheet1]";
command.Connection = conn;
System.Data.OleDb.OleDbDataReader rdr = command.ExecuteReader();
Upvotes: 9
Views: 11549
Reputation: 1010
According to Gord Thompson's answer I would like to provide the following information.
The first byte is the hexadecimal representation of the metadata's length. Byte 8 (0x04) is the hexadecimal representation of the length of the extension + 1. In this example this means that we need to remove the first 20 bytes (0x14):
This can be achieved very easily with the following functions:
Function SaveBinaryData(sFileName As String, ByteArray() As Byte)
Dim stream As New ADODB.stream 'Create Stream object
With stream
.type = adTypeBinary 'Specify stream type - we want To save binary data.
.Open 'Open the stream And write binary data To the object
.Write ByteArray
.SaveToFile sFileName, adSaveCreateOverWrite 'Save binary data To disk
End With
End Function
Public Function ReadBinaryData(sFileName As String) As Byte()
Dim stream As New ADODB.stream
With stream
.type = adTypeBinary
.Open
.LoadFromFile sFileName
ReadBinaryData = .Read
End With
End Function
Public Function ShiftL(arrBytes() As Byte, iShift As Integer) As Byte()
Dim i As Integer
Dim arrReturn() As Byte
For i = 0 To iShift - 1
ReDim Preserve arrReturn(i)
arrReturn(i) = Shift(arrBytes)
Next
ShiftL = arrReturn
End Function
Public Function Shift(arrBytes() As Byte) As Byte
Dim b As Long
If Not IsArray(arrBytes) Then
Err.Raise 13, , "Type Mismatch"
Exit Function
End If
Shift = arrBytes(0)
For b = 1 To UBound(arrBytes)
arrBytes(b - 1) = arrBytes(b)
Next b
ReDim Preserve arrBytes(UBound(arrBytes) - 1)
End Function
When you are accessing the value
of the attachment field, just left shift the byte array by CDec(.Fields("FileData")(0))
. After the shift you can process the file data as desired, e.g.:
Dim fldAttachment As DAO.Field2
Dim arrBytes() As Byte
Set fldAttachment = .Fields("FileData")
With fldAttachment
arrBytes = fldAttachment.value
ShiftL arrBytes, CDec(arrBytes(0))
SaveBinaryData .Fields("FileName").value, ByteArray
End With
Upvotes: 1
Reputation: 32
It took me a while to piece together information to retrieve a file stored from within the attachment field so I just thought I'd share it.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Data.OleDb;
using System.IO;
using System.Diagnostics;
namespace AttachCheck
{
public partial class Form1 : Form
{
DataSet Set1 = new DataSet();
int ColId;
public Form1()
{
InitializeComponent();
OleDbConnection connect = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source='db/Adb.accdb'"); //set up connection
//CL_ID is a fk so attachments can be linked to users
OleDbCommand sql = new OleDbCommand("SELECT at_ID, [at_Name].[FileData], [at_Name].[FileName], [at_Name].[FileType] FROM Attachments WHERE at_ID =1;", connect);
//adding sql to addapter to be ran
OleDbDataAdapter OleDA = new OleDbDataAdapter(sql);
//attempting to open connection
try { connect.Open(); }
catch (Exception err) { System.Console.WriteLine(err); }
OleDA.Fill(Set1); //create and fill dataset
connect.Close();for (int i = 0; i < Set1.Tables[0].Rows.Count; i++)
{
System.Console.WriteLine(Set1.Tables[0].Rows[i]["at_Name.FileName"].ToString() + "This is the file name");
// by using a datagrid it allows you to display the attachments and select which to open, the open should be a button.
dataGridView1.Rows.Add(new object[] { Set1.Tables[0].Rows[i]["at_ID"].ToString(), Set1.Tables[0].Rows[i]["at_Name.FileName"].ToString(), "Open" });
}
}
private void dataGridView1_CellContentClick(object sender, DataGridViewCellEventArgs e)
{
DataGridViewCell cell = (DataGridViewCell)
dataGridView1.Rows[e.RowIndex].Cells[e.ColumnIndex];
System.Console.WriteLine(dataGridView1.Rows[e.RowIndex].Cells[e.ColumnIndex]);
string FullRow = dataGridView1.Rows[e.RowIndex].ToString(); //data retrieved from click on datagrid
//need to sub string to cut away row index and leave number
string SubRow = FullRow.Substring(24, 1); //cutting string down from position 24 for 1 character
System.Console.WriteLine(SubRow + " This is Row"); //
int RowId = int.Parse(SubRow); //turn row number from string into integer that can be used
string FullRow2 = dataGridView1.Rows[e.RowIndex].Cells[e.ColumnIndex].ToString(); //data retrieved from click on datagrid
//need to sub string to cut away row index and leave number
string SubRow2 = FullRow2.Substring(37, 1); //cutting string down from position 24 for 1 character
System.Console.WriteLine(SubRow2 + " This is Column"); //
int ColId = int.Parse(SubRow2); //turn row number from string into integer that can be used
if (ColId == 2)
{
string fileName = Set1.Tables[0].Rows[RowId]["at_Name.FileName"].ToString(); //assign the file to variable
//retrieving the file contents from the database as an array of bytes
byte[] fileContents = (byte[])Set1.Tables[0].Rows[RowId]["at_Name.FileData"];
fileContents = GetFileContents(fileContents); //send filecontents array to be decrypted
string fileType = Set1.Tables[0].Rows[RowId]["at_Name.FileType"].ToString();
DisplayTempFile(fileName, fileContents, fileType); //forward the file type to display file contents
}
}
private const int CONTENT_START_INDEX_DATA_OFFSET = 0; //values used for decoding
private const int UNKNOWN_DATA_OFFSET = 4; //the files
private const int EXTENSION_LENGTH_DATA_OFFSET = 8; //storedw within the access database
private const int EXTENSION_DATA_OFFSET = 12; //and this one
private byte[] GetFileContents(byte[] fileContents)
{
int contentStartIndex = BitConverter.ToInt32(fileContents, CONTENT_START_INDEX_DATA_OFFSET);
//'The next four bytes represent a value whose meaning is unknown at this stage, although it may represent a Boolean value indicating whether the data is compressed or not.
int unknown = BitConverter.ToInt32(fileContents, UNKNOWN_DATA_OFFSET);
//'The next four bytes contain the the length, in characters, of the file extension.
int extensionLength = BitConverter.ToInt32(fileContents, EXTENSION_LENGTH_DATA_OFFSET);
//'The next field in the header is the file extension, not including a dot but including a null terminator.
//'Characters are Unicode so double the character count to get the byte count.
string extension = Encoding.Unicode.GetString(fileContents, EXTENSION_DATA_OFFSET, extensionLength * 2);
return fileContents.Skip(contentStartIndex).ToArray();
}
private void DisplayTempFile(string fileName, byte[] fileContents, string fileType)
{
// System.Console.WriteLine(fileName + "File Name");
// System.Console.WriteLine(fileType + "File Type");
// System.Console.WriteLine(fileContents + "File Contents");
string tempFolderPath = Path.GetTempPath(); //creating a temperary path for file to be opened from
string tempFilePath = Path.Combine(tempFolderPath, fileName); // assigning the file to the path
if (!string.IsNullOrEmpty(tempFilePath)) //checking the temp file exists
{
tempFilePath = Path.Combine(tempFolderPath, //combines the strings 0 and 1 below
String.Format("{0}{1}",
Path.GetFileNameWithoutExtension(fileName), //0
Path.GetExtension(fileName))); //1
}
//System.Console.WriteLine(tempFolderPath + " tempFolderPath");
//System.Console.WriteLine(tempFilePath + " tempFilePath");
//'Save the file and open it.
File.WriteAllBytes(tempFilePath, fileContents);
//creates new file, writes bytes array to it then closes the file
//File.ReadAllBytes(tempFilePath);
//'Open the file.
System.Diagnostics.Process attachmentProcess = Process.Start(tempFilePath);
//chooses the program to open the file if available on the computer
}
}
}
Hope this helps someone
Upvotes: 1
Reputation: 321
The following code goes through all the records of the Microsoft Access database data table and assigns each row to a recordset. The goes through all the attachments which are saved in field "Docs". Then extracts and saves those files on the disk. This code is an extension on the code introduced by "Gord Thompson" above. The only thing that I did was that I wrote the code for Visual Basic.NET.
Imports Microsoft.Office.Interop.Access.Dao
Put a reference to Dao by using the above line of code.
'Visual Basic.NET
Private Sub ReadAttachmentFiles()
'required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
'define a new database engine and a new database
Dim dbe = New DBEngine
Dim db As Database = dbe.OpenDatabase("C:\Users\Meisam\Documents\Databases\myDatabase.accdb")
'define the main recordset object for each row
Dim rstMain As Recordset = db.OpenRecordset( _
"SELECT * FROM Companies", _
RecordsetTypeEnum.dbOpenSnapshot)
'evaluate whether the recordset is empty of records
If Not (rstMain.BOF And rstMain.EOF) Then
'if not empty, then move to the first record
rstMain.MoveFirst()
'do until the end of recordset is not reached
Do Until rstMain.EOF
Dim myID As Integer = -1
' ID is the name of primary field with uniqe values field
myID = CInt(rstMain.Fields("ID").Value)
'define the secondary recordset object for the attachment field "Docs"
Dim rstAttach As Recordset2 = rstMain.Fields("Docs").Value
'evaluate whether the recordset is empty of records
If Not (rstAttach.BOF And rstAttach.EOF) Then
'if not empty, then move to the first record
rstAttach.MoveFirst()
'do until the end of recordset is not reached
Do Until rstAttach.EOF
'get the filename for each attachment in the field "Docs"
Dim fileName As String = rstAttach.Fields("FileName").Value
Dim fld As Field2 = rstAttach.Fields("FileData")
fld.SaveToFile("C:\Users\Meisam\Documents\test\" & myID & "_" & fileName)
rstAttach.MoveNext()
Loop
End If
rstMain.MoveNext()
Loop
End If
'close the database
db.Close()
End Sub
Upvotes: 0
Reputation: 123829
(My original answer to this question was misleading. It worked okay for PDF files that were subsequently opened with Adobe Reader, but it did not always work properly for other types of files. The following is the corrected version.)
Unfortunately we cannot directly retrieve the contents of a file in an Access Attachment
field using OleDb. The Access Database Engine prepends some metadata to the binary contents of the file, and that metadata is included if we retrieve the .FileData
via OleDb.
To illustrate, a document named "Document1.pdf" is saved to an Attachment field using the Access UI. The beginning of that PDF file looks like this:
If we use the following code to try and extract the PDF file to disk
using (OleDbCommand cmd = new OleDbCommand())
{
cmd.Connection = con;
cmd.CommandText =
"SELECT Attachments.FileData " +
"FROM AttachTest " +
"WHERE Attachments.FileName='Document1.pdf'";
using (OleDbDataReader rdr = cmd.ExecuteReader())
{
rdr.Read();
byte[] fileData = (byte[])rdr[0];
using (var fs = new FileStream(
@"C:\Users\Gord\Desktop\FromFileData.pdf",
FileMode.Create, FileAccess.Write))
{
fs.Write(fileData, 0, fileData.Length);
fs.Close();
}
}
}
then the resulting file will include the metadata at the beginning of the file (20 bytes in this case)
Adobe Reader is able to open this file because it is robust enough to ignore any "junk" that may appear in the file before the '%PDF-1.4' signature. Unfortunately not all file formats and applications are so forgiving of extraneous bytes at the beginning of the file.
The only Official™ way of extracting files from an Attachment
field in Access is to use the .SaveToFile
method of an ACE DAO Field2
object, like so:
// required COM reference: Microsoft Office 14.0 Access Database Engine Object Library
//
// using Microsoft.Office.Interop.Access.Dao; ...
var dbe = new DBEngine();
Database db = dbe.OpenDatabase(@"C:\Users\Public\Database1.accdb");
Recordset rstMain = db.OpenRecordset(
"SELECT Attachments FROM AttachTest WHERE ID=1",
RecordsetTypeEnum.dbOpenSnapshot);
Recordset2 rstAttach = rstMain.Fields["Attachments"].Value;
while ((!"Document1.pdf".Equals(rstAttach.Fields["FileName"].Value)) && (!rstAttach.EOF))
{
rstAttach.MoveNext();
}
if (rstAttach.EOF)
{
Console.WriteLine("Not found.");
}
else
{
Field2 fld = (Field2)rstAttach.Fields["FileData"];
fld.SaveToFile(@"C:\Users\Gord\Desktop\FromSaveToFile.pdf");
}
db.Close();
Note that if you try to use the .Value
of the Field2 object you will still get the metadata at the beginning of the byte sequence; the .SaveToFile
process is what strips it out.
Upvotes: 16