We are experiencing some difficulty unpacking COMP-3 fields containing both numeric and date data from a file provided to us by one of our vendors.
The file specification provides the following information:
0066-0070 DATE-OPENED S9(9) COMP-3
The specification indicates that the date will expand to MMDDYYYY format.
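To make the MMDDYYYY expectation concrete, here is a minimal sketch of turning an already-unpacked numeric value into a date. The names `DateFieldSketch` and `ToDate` are hypothetical, and the sketch assumes the unpacked value is simply the eight date digits with any leading zero dropped.

```csharp
using System;
using System.Globalization;

public static class DateFieldSketch
{
    // Interprets an unpacked numeric value as MMDDYYYY. Returns null when
    // the digits do not form a real calendar date.
    public static DateTime? ToDate(long unpacked)
    {
        // Restore the leading zero that a numeric value drops (6152008 -> "06152008").
        string digits = unpacked.ToString("D8", CultureInfo.InvariantCulture);

        return DateTime.TryParseExact(digits, "MMddyyyy", CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, out DateTime parsed)
            ? parsed
            : (DateTime?)null;
    }
}
```

With this sketch, an illustrative value such as 6152008 yields 15 June 2008, while a value such as 1001200 (month 01, day 00) yields null.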
When I retrieve this block of data from the file, I can load it into memory and see that I retrieve 5 bytes of data. (In the file, there is one byte per character.) The bytes retrieved are as follows:
0: 10
1: 0
2: 18
3: 0
4: 2
There's no sign overpunched into the least significant digit (where it would always appear), so that's not an issue here. The bits expand into the following nibbles:
0 1 0 0 1 2 0 0 0 2
There are a couple of problems here:
It's highly unlikely that 01001200 represents a valid date in MMDDYYYY format, and yet this seems to be how the data was packed into the field.
When a COMP-3 field is unpacked, the template specifies that it should expand to 9 characters, but when a COMP-3 field is expanded, its size will ALWAYS double (producing a string with an even number of characters). As a result, there is a mismatch between the expected size and the unpacked size.
No algorithm that I can find on the web seems to work for unpacking this data. Nothing seems to be able to come up with a recognizable date for any of the (supposedly) BCD values in our source file.
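For comparison, here is a minimal sketch of the conventional packed-decimal layout: two decimal digits per byte, except the last byte, which holds one digit in its high nibble and the sign in its low nibble (C, A, E or F for positive or unsigned, D or B for negative). Under that layout a PIC S9(9) COMP-3 field is 9 digit nibbles plus 1 sign nibble, which is 10 nibbles or 5 bytes. The names `Comp3Sketch` and `Unpack` are hypothetical, and the sketch assumes the field fits in a `long` and arrives as raw, unconverted bytes.

```csharp
using System;

public static class Comp3Sketch
{
    // Unpacks a packed-decimal field: two BCD digits per byte, except the
    // last byte, which holds one digit (high nibble) and the sign (low nibble).
    public static long Unpack(byte[] field)
    {
        if (field == null || field.Length == 0)
            throw new ArgumentException("Field must contain at least one byte.", nameof(field));

        long value = 0;
        for (int i = 0; i < field.Length; i++)
        {
            int hi = field[i] >> 4;
            int lo = field[i] & 0x0F;
            bool isLast = i == field.Length - 1;

            // Every nibble except the final sign nibble must be a digit 0-9.
            if (hi > 9 || (!isLast && lo > 9))
                throw new FormatException($"Byte {i} (0x{field[i]:X2}) is not valid packed decimal.");

            value = value * 10 + hi;
            if (!isLast)
            {
                value = value * 10 + lo;
            }
            else if (lo < 0x0A)
            {
                throw new FormatException("The last nibble is a digit, not a sign.");
            }
            else if (lo == 0x0D || lo == 0x0B)
            {
                value = -value; // D (and, less commonly, B) marks a negative value
            }
        }
        return value;
    }
}
```

As a usage example, `Comp3Sketch.Unpack(new byte[] { 0x00, 0x61, 0x52, 0x00, 0x8C })` returns 6152008. If the byte values listed earlier are decimal, passing `{ 10, 0, 18, 0, 2 }` throws a `FormatException` on the first byte, because 10 is 0x0A and its low nibble is not a decimal digit.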
At this point I suspect that we may not be dealing with a true BCD format. However, keeping in mind that I should always doubt myself and not the tool, I am seeking suggestions for what I could be doing wrong in both my understanding of the COMP-3 format and the nature of the data I'm looking at.
My understanding of the format is taken from the following sources:
It's worth noting that I have attempted converting the data from EBCDIC to ASCII and vice versa before attempting to unpack it; neither produced any intelligible results. I've attempted every algorithm I could find on the Internet, and none of them seem to be producing any useful results.
I suppose, in the end, my question is: Am I actually dealing with BCD or COMP-3 data here?
Update
In answer to some questions:
We have both original and unparsed files on hand to use as reference materials. The date I'm expecting to get back is something along the lines of 06152008 (that's off the top of my head, but you get the gist). The value I'm computing is nothing like that.
Per request, the individual nibbles:
0 1 0 0 1 2 0 0 0 2
And for those who are interested in how I'm doing it, the class that's unpacking:
using System.Collections.Generic;
using System.Linq;
using System.Text;
internal class PackedDecimal
{
#region Fields
private bool _isPositive;
private bool _isNegative;
private bool _isUnsigned = true;
#endregion
#region Constructor
/// <summary>
/// Initializes a new instance of the <see cref="PackedDecimal"/> class.
/// </summary>
public PackedDecimal()
{
}
/// <summary>
/// Initializes a new instance of the <see cref="PackedDecimal"/> class.
/// </summary>
/// <param name="compressedDecimal">The compressed decimal.</param>
public PackedDecimal(string compressedDecimal)
{
this.ParsedValue = this.Parse(compressedDecimal);
}
#endregion
#region Properties
/// <summary>
/// Gets the bytes.
/// </summary>
public IEnumerable<byte> Bytes { get; private set; }
/// <summary>
/// Gets the hexadecimal values.
/// </summary>
public IEnumerable<string> HexValues { get; private set; }
/// <summary>
/// Gets or sets a value indicating whether this instance is positive.
/// </summary>
/// <value>
/// <c>true</c> if this instance is positive; otherwise, <c>false</c>.
/// </value>
public bool IsPositive
{
get { return this._isPositive; }
set
{
this._isPositive = value;
this._isNegative = !value;
this._isUnsigned = false;
}
}
/// <summary>
/// Gets or sets a value indicating whether this instance is negative.
/// </summary>
/// <value>
/// <c>true</c> if this instance is negative; otherwise, <c>false</c>.
/// </value>
public bool IsNegative
{
get { return this._isNegative; }
set
{
this._isNegative = value;
this._isPositive = !value;
this._isUnsigned = false;
}
}
/// <summary>
/// Gets a value indicating whether this instance is unsigned.
/// </summary>
/// <value>
/// <c>true</c> if this instance is unsigned; otherwise, <c>false</c>.
/// </value>
public bool IsUnsigned { get { return this._isUnsigned; } }
/// <summary>
/// Gets the nibbles.
/// </summary>
public IEnumerable<int> Nibbles { get; private set; }
/// <summary>
/// Gets the parsed value.
/// </summary>
public string ParsedValue { get; private set; }
#endregion
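/// <summary>
/// Parses the specified value.
/// </summary>
/// <param name="value">The packed value, held in a string.</param>
/// <param name="sourceEncoding">The encoding of the source data (currently unused).</param>
/// <param name="decimalPlaces">The number of implied decimal places (currently unused).</param>
/// <returns>The unpacked digits, prefixed with "+" or "-" when a sign was detected.</returns>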
public string Parse(string value, SourceEncoding sourceEncoding = SourceEncoding.Ascii, int decimalPlaces = 0)
{
var localValue = value; // Encoding.Convert(Encoding.ASCII, Encoding.GetEncoding("IBM037"), value.ToByteArray()).FromByteArray();
var sign = this.GetSign(localValue, out localValue);
var bytes = localValue.ToByteArray();
var nibbles = new List<int>();
var buffer = new StringBuilder();
foreach (var b in bytes)
{
var hi = (int)b.HiNibble();
var lo = (int)b.LoNibble();
nibbles.Add(hi);
nibbles.Add(lo);
buffer.AppendFormat("{0}{1}", hi, lo);
}
this.Bytes = bytes;
this.Nibbles = nibbles;
this.HexValues = nibbles.Select(v => v.ToString("X"));
switch (sign)
{
case Sign.Unsigned:
this.ParsedValue = buffer.ToString();
break;
case Sign.Positive:
this.ParsedValue = "+" + buffer;
break;
case Sign.Negative:
this.ParsedValue = "-" + buffer;
break;
}
this.IsPositive = sign == Sign.Positive;
this.IsNegative = sign == Sign.Negative;
return this.ParsedValue;
}
#region GetSign Method
/// <summary>
/// Gets the sign for the packed decimal represented by this instance.
/// </summary>
/// <param name="value">The value to analyze.</param>
/// <param name="buffer">Receives <paramref name="value"/>, less the sign digit if it is present.</param>
/// <returns>The sign for the packed decimal represented by this instance.</returns>
/// <remarks>If the value provided does not include a sign digit, it is assumed to be unsigned.</remarks>
private Sign GetSign(string value, out string buffer)
{
var lastDigit = value.ToByteArray().Last();
var loNibble = lastDigit.LoNibble();
var hiNibble = lastDigit.HiNibble();
var result = Sign.Unsigned;
var hasSignDigit = true;
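// HiNibble() already shifts the nibble down into the range 0x0-0xF,
// so compare against 0x0C/0x0D/0x0F rather than 0xC0/0xD0/0xF0.
switch (hiNibble)
{
case 0x0C: // "c"
result = Sign.Positive;
break;
case 0x0D: // "d"
result = Sign.Negative;
break;
case 0x0F: // "f"
result = Sign.Unsigned;
break;
default:
hasSignDigit = false;
break;
}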
// Remove the sign digit if it's present.
buffer = hasSignDigit
? value.Substring(0, value.Length - 1) + loNibble
: value;
return result;
}
#endregion
#region Sign Enum
private enum Sign
{
Unsigned,
Positive,
Negative
}
#endregion
}
And the extension methods that support it:
using System;
using System.Linq;
using System.Text;
public static class Extensions
{
/// <summary>
/// Gets the high nibble (the high 4 bits) from a byte.
/// </summary>
/// <param name="value">The byte from which the high 4-bit nibble will be retrieved.</param>
/// <returns>A byte containing the value of this byte, with all bits shifted four bits to the right.</returns>
public static byte HiNibble(this byte value)
{
return (byte)((value & 0xF0) >> 4);
}
/// <summary>
/// Gets the low nibble (the lowest 4 bits) from this byte.
/// </summary>
/// <param name="value">The byte from which the low 4-bit nibble will be retrieved.</param>
/// <returns>A byte containing the value of this byte, with the high four bits discarded.</returns>
public static byte LoNibble(this byte value)
{
return (byte)(value & 0x0F);
}
/// <summary>
/// Gets the individual bytes from a string.
/// </summary>
/// <param name="value">The string to convert to a byte array.</param>
/// <returns>An array of bytes representing the string.</returns>
public static byte[] ToByteArray(this string value)
{
var bytes = new byte[Encoding.ASCII.GetByteCount(value)];
Buffer.BlockCopy(value.ToCharArray(), 0, bytes, 0, bytes.Length);
return bytes;
}
}
public enum SourceEncoding
{
Ascii,
Ebcdic
}
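One detail in `ToByteArray` may be worth checking. `Buffer.BlockCopy` measures its offsets and count in bytes, and a .NET `char` occupies two bytes, so copying `bytes.Length` bytes out of a `char[]` yields the UTF-16 bytes of roughly the first half of the string instead of one byte per character. The sketch below reproduces that effect with a hypothetical five-character input; `ByteCopySketch`, `BlockCopyBytes` and `ToSingleBytes` are illustrative names, and `ToSingleBytes` assumes every character is at most 0xFF (for example, data decoded as ISO-8859-1). Whether this explains the values seen here depends on how the string was read from the file in the first place.

```csharp
using System;
using System.Linq;
using System.Text;

public static class ByteCopySketch
{
    // Mirrors the ToByteArray extension: the count passed to Buffer.BlockCopy
    // is in bytes, but each source char occupies two bytes.
    public static byte[] BlockCopyBytes(string value)
    {
        var bytes = new byte[Encoding.ASCII.GetByteCount(value)];
        Buffer.BlockCopy(value.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Hypothetical alternative: one byte per char. The checked cast throws
    // an OverflowException if any char is greater than 0xFF.
    public static byte[] ToSingleBytes(string value)
    {
        return value.Select(c => checked((byte)c)).ToArray();
    }
}
```

For the input `new string(new[] { (char)0x0A, (char)0x12, (char)0x02, (char)0x00, (char)0x8C })` on a little-endian machine, `BlockCopyBytes` returns 10, 0, 18, 0, 2, whereas `ToSingleBytes` returns 10, 18, 2, 0, 140. Reading the field as raw bytes from the file, without going through a string at all, would avoid the question entirely.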