Reputation: 4274
I'm trying to use DocumentFormat.OpenXml to read an uploaded Excel file. When I get the file (an HttpPostedFileWrapper), I'm simply trying to read the cells and write them to a text string. (Later I will do more, but I'm just trying to get used to OpenXml right now.)
My data in Excel looks something like this:
Field1 - Field2 - Phone - City
IT Department - Emp - 7175551234 - Springfield
HR - Emp - 7175556543 - W Springfield
Code looks like this:
var doc = SpreadsheetDocument.Open(file.InputStream, false);
WorkbookPart workbookPart = doc.WorkbookPart;
WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
SheetData thisSheet = worksheetPart.Worksheet.Elements<SheetData>().First();
System.Text.StringBuilder text = new System.Text.StringBuilder();
foreach (Row r in thisSheet.Elements<Row>())
{
    foreach (Cell c in r.Elements<Cell>())
    {
        text.Append(c.CellValue.Text + ",");
    }
    text.AppendLine();
}
And the string it creates looks like this:
49,51,50,0,1,2,3,4,5,6,7,8,9,10,11,12,13,16,14,15,17,18,19,20,21,22,40,41,42,43,44,45,54,\r\n
52,24,23,25,26,27,7306,33,28,29,30,31,17033,32,34,7175555555,7175551234,7175554321,7175550000,35,36,37,36526,40179,38,39,30,31,17033,32,55,\r\n
53,46,47,48,555,\r\n
It seems like only the numeric values come through; the text cells show up as numbers instead. Is it because I'm using the wrong stream type?
Edit: I've updated my code to now look like this, but it still doesn't work right. There seems to be no way for me to see the text data.
public ActionResult ProfileImport(IEnumerable<HttpPostedFileBase> files)
{
    // Build file list
    int i = 1;
    foreach (var file in files)
    {
        if (file.ContentLength > 0)
        {
            var doc = SpreadsheetDocument.Open(file.InputStream, false);
            WorkbookPart workbookPart = doc.WorkbookPart;
            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            SheetData thisSheet = worksheetPart.Worksheet.Elements<SheetData>().First();
            System.Text.StringBuilder text = new System.Text.StringBuilder();
            foreach (Row r in thisSheet.Elements<Row>())
            {
                foreach (Cell c in r.Elements<Cell>())
                {
                    string value = c.InnerText;
                    if (c.DataType != null && c.DataType.Value == CellValues.SharedString) // Check DataType exists
                    {
                        var stringTable = workbookPart.GetPartsOfType<SharedStringTablePart>()
                            .FirstOrDefault(); // Get Table parts from workbookPart
                        if (stringTable != null)
                            value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
                        text.Append(value + ",");
                    }
                    else
                        text.Append(value + ",");
                }
                text.AppendLine();
            }
            var outText = text.ToString();
        }
    }
}
Actual data from 1st row of the file:
AddressDescription, Address1, Address2, City, State, PostalCode, CountryCode, Email, CellPhone, HomePhone, WorkPhone, Fax, OrganizationName, Department, Position, StartDate, EndDate, OrganizationAddress1, OrganizationAddress2, OrganizationCity, OrganizationState, OrganizationPostalCode, OrganizationCountryCode, Keywords
Row.InnerText of that row:
"49515001234567891011121316141517181920212240414243444554"
Row.OuterXml:
"<x:row r=\"1\" spans=\"1:33\" s=\"3\" customFormat=\"1\" x14ac:dyDescent=\"0.25\" xmlns:x14ac=\"http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac\" xmlns:x=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">
<x:c r=\"A1\" s=\"3\" t=\"s\"><x:v>49</x:v></x:c>
<x:c r=\"B1\" s=\"3\" t=\"s\"><x:v>51</x:v></x:c>
<x:c r=\"C1\" s=\"3\" t=\"s\"><x:v>50</x:v></x:c>
<x:c r=\"D1\" s=\"3\" t=\"s\"><x:v>0</x:v></x:c>
<x:c r=\"E1\" s=\"3\" t=\"s\"><x:v>1</x:v></x:c>
<x:c r=\"F1\" s=\"3\" t=\"s\"><x:v>2</x:v></x:c>
<x:c r=\"G1\" s=\"3\" t=\"s\"><x:v>3</x:v></x:c>
<x:c r=\"H1\" s=\"3\" t=\"s\"><x:v>4</x:v></x:c>
<x:c r=\"I1\" s=\"3\" t=\"s\"><x:v>5</x:v></x:c>
<x:c r=\"J1\" s=\"3\" t=\"s\"><x:v>6</x:v></x:c>
<x:c r=\"K1\" s=\"3\" t=\"s\"><x:v>7</x:v></x:c>
<x:c r=\"L1\" s=\"3\" t=\"s\"><x:v>8</x:v></x:c>
<x:c r=\"M1\" s=\"3\" t=\"s\"><x:v>9</x:v></x:c>
<x:c r=\"N1\" s=\"3\" t=\"s\"><x:v>10</x:v></x:c>
<x:c r=\"O1\" s=\"3\" t=\"s\"><x:v>11</x:v></x:c>
<x:c r=\"P1\" s=\"4\" t=\"s\"><x:v>12</x:v></x:c>
<x:c r=\"Q1\" s=\"4\" t=\"s\"><x:v>13</x:v></x:c>
<x:c r=\"R1\" s=\"3\" t=\"s\"><x:v>16</x:v></x:c>
<x:c r=\"S1\" s=\"3\" t=\"s\"><x:v>14</x:v></x:c>
<x:c r=\"T1\" s=\"3\" t=\"s\"><x:v>15</x:v></x:c>
<x:c r=\"U1\" s=\"3\" t=\"s\"><x:v>17</x:v></x:c>
<x:c r=\"V1\" s=\"3\" t=\"s\"><x:v>18</x:v></x:c>
<x:c r=\"W1\" s=\"3\" t=\"s\"><x:v>19</x:v></x:c>
<x:c r=\"X1\" s=\"3\" t=\"s\"><x:v>20</x:v></x:c>
<x:c r=\"Y1\" s=\"3\" t=\"s\"><x:v>21</x:v></x:c>
<x:c r=\"Z1\" s=\"3\" t=\"s\"><x:v>22</x:v></x:c>
<x:c r=\"AA1\" s=\"3\" t=\"s\"><x:v>40</x:v></x:c>
<x:c r=\"AB1\" s=\"3\" t=\"s\"><x:v>41</x:v></x:c>
<x:c r=\"AC1\" s=\"3\" t=\"s\"><x:v>42</x:v></x:c>
<x:c r=\"AD1\" s=\"3\" t=\"s\"><x:v>43</x:v></x:c>
<x:c r=\"AE1\" s=\"3\" t=\"s\"><x:v>44</x:v></x:c>
<x:c r=\"AF1\" s=\"3\" t=\"s\"><x:v>45</x:v></x:c>
<x:c r=\"AG1\" s=\"3\" t=\"s\"><x:v>54</x:v></x:c>
</x:row>"
Upvotes: 5
Views: 4736
Reputation: 717
It looks like these are indices into the shared strings table. In the Excel file format, string data is stored in a shared strings table, which is then referenced at the cell level. Per the documentation, CellValue returns an index into the StringTable if the data type is text.
I don't know what type of data is in your cells, and the way to retrieve a value depends on its data type. If it's what I think it is, it'll be a SharedString, which you will need to look up through the SharedStringTablePart, as shown in this MSDN page:
https://msdn.microsoft.com/en-us/library/hh298534%28v=office.14%29.aspx?f=255&MSPPError=-2147217396
Your code would look something like this:
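foreach (Cell c in r.Elements<Cell>())
{
    string value = c.InnerText;
    // DataType can be null (e.g. for plain numeric cells), so check it before reading Value
    if (c.DataType != null && c.DataType.Value == CellValues.SharedString)
    {
        var stringTable = workbookPart.GetPartsOfType<SharedStringTablePart>()
            .FirstOrDefault();
        // The cell holds an index into the shared string table; look up the real text
        if (stringTable != null)
            value = stringTable.SharedStringTable.ElementAt(int.Parse(value)).InnerText;
    }
    text.Append(value + ",");
}
text.AppendLine(); // end the line once per row, after the cell loop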
Upvotes: 8
Reputation: 51
I had the same problem and found a solution. You just need to add this method; it returns the actual text you need rather than the numbers:
private string ReadExcelCell(Cell cell, WorkbookPart workbookPart)
{
    var cellValue = cell.CellValue;
    var text = (cellValue == null) ? cell.InnerText : cellValue.Text;
    // For shared strings, the cell value is an index into the shared string table
    if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
    {
        text = workbookPart.SharedStringTablePart.SharedStringTable
            .Elements<SharedStringItem>().ElementAt(
                Convert.ToInt32(cell.CellValue.Text)).InnerText;
    }
    return (text ?? string.Empty).Trim();
}
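For context, here is a rough sketch of how the method above might be called from the row loop in the question. The class name SheetTextReader and the method ReadFirstWorksheetPart are made up for illustration, and ReadExcelCell is repeated as a static method only so the sketch compiles on its own. It assumes the DocumentFormat.OpenXml package is referenced and that every shared-string cell has a value.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Hypothetical helper class, not part of the Open XML SDK.
public static class SheetTextReader
{
    // Returns the cells of the first worksheet part as comma-separated lines.
    public static string ReadFirstWorksheetPart(Stream input)
    {
        var text = new StringBuilder();

        // The using block closes the document once reading is finished.
        using (var doc = SpreadsheetDocument.Open(input, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();

            foreach (Row r in sheetData.Elements<Row>())
            {
                foreach (Cell c in r.Elements<Cell>())
                {
                    // Resolve shared-string indices to the real text.
                    text.Append(ReadExcelCell(c, workbookPart)).Append(',');
                }
                text.AppendLine(); // one output line per row
            }
        }

        return text.ToString();
    }

    // Same logic as the method above, made static for this sketch.
    private static string ReadExcelCell(Cell cell, WorkbookPart workbookPart)
    {
        var cellValue = cell.CellValue;
        var text = (cellValue == null) ? cell.InnerText : cellValue.Text;
        if ((cell.DataType != null) && (cell.DataType.Value == CellValues.SharedString))
        {
            text = workbookPart.SharedStringTablePart.SharedStringTable
                .Elements<SharedStringItem>().ElementAt(
                    Convert.ToInt32(cell.CellValue.Text)).InnerText;
        }
        return (text ?? string.Empty).Trim();
    }
}
```

In the controller action from the question, this could be called as SheetTextReader.ReadFirstWorksheetPart(file.InputStream) for each uploaded file.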
Upvotes: 5