Reputation: 284
I am trying to write a program that can take a list of a specific type for example a double
or float
and convert it into bytes and write it to a file that can be converted back into that original list. I came up with the following struct
s and functions. It works for a float
list, however, it does not if it is a list of doubles
and I am trying to understand why.
template<typename T>
struct writer{
static constexpr size_t Size = sizeof(T);
//the size to determing if it needs to be converted to uint16_t, uint32_t, etc...
static constexpr bool U = std::conditional<std::is_unsigned_v<T>, std::true_type,
typename std::conditional<std::is_same_v<T, float>, std::true_type,
typename std::conditional<std::is_same_v<T, double>, std::true_type, std::false_type>::type >::type >::type::value;
//this is used to determine if the storing_num variable needs to be stored as unsigned or not
using value_t = std::conditional_t<sizeof(T) == 1,
std::conditional_t<U, uint8_t, int8_t>,
std::conditional_t<sizeof(T) == 2,
std::conditional_t<U, uint16_t, int16_t>,
std::conditional_t<sizeof(T) == 4,
std::conditional_t<U, uint32_t, int32_t>, //by default the only options are 1, 2, 4, and 8
std::conditional_t<U, uint64_t, int64_t> > > >;
//the value that will either be entered or bit_casted to (shown in convert_num function)
value_t storing_num;
using my_byte = std::conditional_t<U == true, uint8_t, int8_t>;
std::array<my_byte, Size> _arr;
bool convert_num(T inp){
static_assert(sizeof(T) == sizeof(value_t), "T and value_t need to be the same size");
if constexpr (!std::is_same_v<T, value_t>){
storing_num = std::bit_cast<value_t>(inp);
}else{
storing_num = inp;
}
auto begin = _arr.begin();
for(int32_t i = _arr.size() - 1; i >= 0; --i, ++begin){
*begin = ((storing_num >> (i << 3)) & 0xFF);
}
return true;
}
bool write(std::ostream& outfile){
auto begin = _arr.cbegin();
auto end = _arr.cend();
for(;begin != end; ++begin)
outfile << (char)(*begin);
return true;
}
};
The following can be used to write a float
or uint32_t
to a text file successfully. The following can be used to read one of the numbers back in:
template<typename T>
struct reader{
static constexpr size_t Size = sizeof(T);
static constexpr bool U = std::conditional<std::is_unsigned_v<T>, std::true_type,
typename std::conditional<std::is_same_v<T, float>, std::true_type,
typename std::conditional<std::is_same_v<T, double>, std::true_type, std::false_type>::type >::type >::type::value;
using value_t = std::conditional_t<sizeof(T) == 1,
std::conditional_t<U, uint8_t, int8_t>,
std::conditional_t<sizeof(T) == 2,
std::conditional_t<U, uint16_t, int16_t>,
std::conditional_t<sizeof(T) == 4,
std::conditional_t<U, uint32_t, int32_t>, //by default the only options are 1, 2, 4, and 8
std::conditional_t<U, uint64_t, int64_t> > > >;
value_t outp;
std::array<int8_t, Size> _arr;
bool add_nums(std::ifstream& in){
static_assert(sizeof(T) == sizeof(value_t), "T and value_t need to be the same size");
_arr[0] = in.get();
if(_arr[0] == -1)
return false;
for(uint32_t i = 1; i < _arr.size(); ++i){
_arr[i] = in.get();
}
return true;
}
bool convert(){
if(std::any_of(_arr.cbegin(), _arr.cend(), [](int v){return v == -1;}))
return false;
outp = 0;
if(U){
auto begin = _arr.cbegin();
for(int32_t i = _arr.size()-1; i >= 0; i--, ++begin){
outp += ((uint8_t)(*begin) << (i * 8));
}
return true;
}
auto begin = _arr.cbegin();
for(int32_t i = _arr.size() - 1; i >= 0; --i, ++begin)
outp += ((*begin) << (i << 3));
return true;
}
};
Then I use the following functions to iterate over a text file and read/write to said file:
template<typename T>
void read_list(T* begin, const char* filename){
reader<T> my_reader;
std::ifstream in(filename);
if(in.is_open()){
while(in.good()){
if(!my_reader.add_nums(in))
break;
if(!my_reader.convert()){
std::cerr << "error reading, got -1 from num reading " << filename;
return;
}
if(std::is_same_v<T, typename reader<T>::value_t >) *begin = my_reader.outp;
else *begin = std::bit_cast<T>(my_reader.outp);
++begin;
}
}
if(!in.eof() && in.fail()){
std::cerr << "error reading " << filename;
return;
}
in.close();
return;
}
template<typename T>
void write_list(T* begin, T* end, const char* filename){
writer<T> my_writer;
std::ofstream outfile(filename, std::ios::out | std::ios::binary | std::ios::trunc);
for(;begin != end; ++begin){
my_writer.convert_num(*begin);
my_writer.write(outfile);
}
}
For example, the following would work as expected:
void write_float_vector(){
std::vector<float> my_floats = {4.981, 832.991, 33.5, 889.56, 99.8191232, 88.192};
std::cout<<"my_floats: " << my_floats<<std::endl;
write_list(&my_floats[0], &my_floats[my_floats.size()], "binary_save/float_try.nt");
}
void read_floats(){
std::vector<float> my_floats(6);
read_list(&my_floats[0], "binary_save/float_try.nt");
std::cout<<"my_floats: " << my_floats<<std::endl;
}
int main(){
write_double_vector();
std::cout<<"reading..."<<std::endl;
read_doubles();
}
However, if it was converted to double
s instead of float
s it fails to read the doubles back in correctly. Why is it failing with doubles?
For example, the following fails based on what is outputted from the read_doubles
function:
void write_double_vector(){
std::vector<double> my_doubles = {4.981, 832.991, 33.5, 889.56, 99.8191232, 88.192};
std::cout<<"my_doubles: " << my_doubles<<std::endl;
write_list(&my_doubles[0], &my_doubles[my_doubles.size()], "binary_save/double_try.nt");
}
void read_doubles(){
std::vector<double> my_doubles(6);
read_list(&my_doubles[0], "binary_save/double_try.nt");
std::cout<<"my_doubles: " << my_doubles<<std::endl;
}
If you would like to run the code yourself, I added these helper functions, and used the following headers to make it easier to reproduce:
#include <cstddef>
#include <cstdint>
#include <ios>
#include <stdio.h>
#include <iostream>
#include <fstream>
#include <vector>
#include <array>
#include <bit>
template<typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v){
os << "{";
for(uint32_t i = 0; i < v.size()-1; ++i)
os << v[i]<<',';
os << v.back() << "}";
return os;
}
Upvotes: 0
Views: 393
Reputation: 1
The use of byte vector will make it possible to do subsequent processing.
template <typename T>
int fn_convertToVb(T v, vector<BYTE> & vb)
{
vb.resize(sizeof(T));
memcpy(&vb[0], &v, sizeof(T));
return sizeof(T);
}
Upvotes: 0
Reputation: 4257
The only correct (not UB) and constexpr
way to do it in modern era(C++20) is std::bit_cast
:
#include <bit>
#include <array>
#include <algorithm>
double x;
auto x_bytes = std::bit_cast<std::array<std::unit8_t, sizeof(x)>(x);
if (std::endian::native==std::endian::little)
std::ranges::reverse(x_bytes);
auto y = bit_cast<double>(x_bytes);
C++ standard has eliminated all other options. They may work on sime platforms; but being UB, they may break under unspecified conditions and lead to surprise results.
std::bit_cast
works if the input and output are of same size and trivially copyable. The other options include reinterpret_cast
which creates problems with strict aliasing rule, std::memcpy
which is source of bugs due to programmer mistakes, and union
based type punning which has the worst of both previous ones. std::bit_cast
has also the constexpr
property, which means it can be used in expressions that need be evaluated at compile-time(create a none-type template parameter, or element count of an array...).
Upvotes: 1
Reputation: 63481
The main issue with your program is the shifting of a smaller data type when you read. You have this in two places (shown together here for clarity, despite being out of context):
outp += ((uint8_t)(*begin) << (i * 8));
outp += ((*begin) << (i << 3));
In both these cases, the iterator begin
references an underlying type of int8_t
. Not only is this a signed type, but you're placing trust in the compiler to promote the type to a large enough integer. I actually ran into a similar issue some years ago.
See here: Undefined high-order of uint64_t while shifting and masking 32-bit values
The fix is to cast to the correct size before shifting:
outp += (((value_t)(uint8_t)*begin) << (i * 8));
outp += (((value_t)(uint8_t)*begin) << (i << 3));
I would also like to suggest that you avoid using constructs like i << 3
instead of i * 8
. There is simply no need to try and be clever. The compiler will do the right thing, and you're just making the code harder to read. It also lacks consistency, since you're using both forms in the same function.
Upvotes: 2
Reputation: 118077
Building a float
or double
character-by-character can cause a trap representation so don't do that. Replace the _arr
definition with std::array<char, Size> _arr;
and then use in.read
+ std::memcpy
:
Example:
#include <cstring> // std::memcpy
// writer:
std::array<char, Size> _arr;
bool convert_num(T inp) {
static_assert(sizeof(T) == sizeof(value_t),
"T and value_t need to be the same size");
std::memcpy(_arr.data(), &inp, Size);
return true;
}
bool write(std::ostream& outfile) {
return static_cast<bool>(outfile.write(_arr.data(), Size));
}
// reader:
std::array<char, Size> _arr;
bool add_nums(std::istream& in) {
return static_cast<bool>(in.read(_arr.data(), Size));
}
bool convert() {
std::memcpy(&outp, _arr.data(), Size);
return true;
}
Demo (originally from Paddy)
Upvotes: 4