Reputation: 6978
My end-goal was getting the list of DLL names from the static import data table.
I thought I could do something like,
auto data_dirs = p_loaded_image->FileHeader->OptionalHeader.DataDirectory;
And then somehow iterate over that list of addresses and then get the DLL names that way; something like that.
So for baby steps, I was just trying to verify that I could match the value of p_loaded_image->FileHeader->OptionalHeader.SizeOfStackCommit against a manual pointer-math equivalent. I can't seem to do this without Access Violation exceptions, which suggests that I'm doing this incorrectly.
What did I do wrong, and how specifically do I get my pointer-math query to match the value that the loaded image's API returns for SizeOfStackCommit? If you can teach me that much, I can hopefully progress from this point in my DLL-name-finding WIP.
To save time, if your compiler supports std::experimental::filesystem, you can start at the // Skip to here comment to avoid all the console and file verification boilerplate. Otherwise you'll need to stub it out or change it to something friendlier to older C++ standards.
#include "Windows.h"
#include "Imagehlp.h"
#include "tchar.h"
#include "stdio.h"
#include "stdlib.h"
#include <string>
#include <vector>
#include <experimental\filesystem>
// All hard-coded values taken directly from latest PE/COFF .docx Documentation from MS:
// => http://go.microsoft.com/fwlink/p/?linkid=84140
const int MAGIC_32_NUM = 0x10b;
const int MAGIC_64_NUM = 0x20b;
// These two require C++17 || If needed, replace with older valid file-verification.
namespace fs = std::experimental::filesystem;
bool verify_loaded_file(std::string);
int _tmain(int argc, _TCHAR* argv[])
{
std::string image_to_load;
if (argc == 2) {
image_to_load = argv[1];
}
else {
printf("A valid path to a loadable image needs to be your only command-line parameter for %s", argv[0]);
return -1;
}
bool validFile = verify_loaded_file(image_to_load);
if (!validFile) {
printf("A valid file path of a DLL or EXE needs to be your only command-line parameter for %s", argv[0]);
return -1;
}
auto filesystem_image = fs::absolute(fs::path(image_to_load));
std::string image_directory = filesystem_image.parent_path().string();
std::string image_name = filesystem_image.stem().string();
std::string image_name_and_extension = image_name + filesystem_image.extension().string();
bool is64bit = false, is32bit = false;
// To use MapAndLoad, you need to manually include Imagehlp.lib in your project.
// The Imagehlp.h header alone does not suffice.
LOADED_IMAGE loaded_image = { 0 };
LOADED_IMAGE * p_loaded_image = &loaded_image;
bool image_loaded = MapAndLoad(image_name_and_extension.c_str(), image_directory.c_str(), p_loaded_image, FALSE, TRUE);
int error_check = GetLastError();
if (!image_loaded) {
printf("Something went wrong when trying to load %s with error code %d", image_to_load.c_str(), error_check);
UnMapAndLoad(p_loaded_image);
return -1;
}
int magic_number = loaded_image.FileHeader->OptionalHeader.Magic;
if (magic_number == MAGIC_32_NUM) { is32bit = true; }
else if (magic_number == MAGIC_64_NUM) { is64bit = true; }
else {
printf("The magic number from the optional header wasn't detected as 32-bit or 64-bit\n");
printf("Magic number read from the optional header: 0x%x\n", magic_number);
UnMapAndLoad(p_loaded_image);
return -1;
}
// Skip to here
UCHAR * module_base_address = p_loaded_image->MappedAddress;
size_t coverted_base_address = size_t(module_base_address);
size_t windows_optional_header_offset;
if (is64bit) { windows_optional_header_offset = size_t(24); }
else { windows_optional_header_offset = size_t(28); }
size_t data_directory_optional_header_offset;
if (is64bit) { data_directory_optional_header_offset = size_t(112); }
else { data_directory_optional_header_offset = size_t(96); }
size_t size_stack_commit_offset;
if (is64bit) { size_stack_commit_offset = size_t(80); }
else { size_stack_commit_offset = size_t(76); }
// The commented out line below breaks with Access Violations, as does the line following it:
// auto sum_for_size_stack = size_t(coverted_base_address + size_stack_commit_offset);
auto sum_for_size_stack = size_t(coverted_base_address +
windows_optional_header_offset +
data_directory_optional_header_offset +
size_stack_commit_offset);
auto direct_access_size_stack = p_loaded_image->FileHeader->OptionalHeader.SizeOfStackCommit;
DWORD64 * addy = &direct_access_size_stack;
printf("Direct: %s\n", direct_access_size_stack);
printf("Pointer-Math: %s\n", sum_for_size_stack);
UnMapAndLoad(p_loaded_image);
return 0;
}
//
bool verify_loaded_file(std::string file_to_verify)
{
if (fs::exists(file_to_verify))
{
size_t extension_query = file_to_verify.find(".dll", 0);
if (extension_query == std::string::npos)
{
extension_query = file_to_verify.find(".DLL", 0);
if (extension_query == std::string::npos)
{
extension_query = file_to_verify.find(".exe", 0);
if (extension_query == std::string::npos)
{
extension_query = file_to_verify.find(".EXE", 0);
}
else { return true; }
if (extension_query != std::string::npos) { return true; }
}
else { return true; }
}
else { return true; }
}
return false;
}
The latest documentation for the Windows PE file format is in a whitepaper attachment, which holds the .docx: http://www.microsoft.com/whdc/system/platform/firmware/PECOFF.mspx
I've got my pure pointer-arithmetic traversal working for up until just before my end-goal. Getting to that point required me to remove two layers of complexity.
Don't bother loading the image with special APIs. Since I'm trying to get the static import information, my solution was to load the EXE into a char vector to snapshot it in memory.
Forget the over-complicated-sounding RVA stuff unless you need it. Just use byte offsets for the header parts of the PE; the sections are where you need RVAs. Treat the address of element 0 of the char vector as your base address, from which every offset is calculated. The docx also tells you when to use an offset versus an actual address, which is good to know. Check my added answer, where I briefly talk about using the RVA to get the import table.
My program still isn't doing what I want it to, but at least I have the gist of pointer-arithmetic matching the pointer accessors, which was the goal of this question.
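For readers following along, here is a minimal sketch of the byte-offset approach described above, assuming the whole file has been read into a byte vector. The names read_at and size_of_stack_commit are hypothetical and not taken from the gist. The offsets (e_lfanew at 0x3C, a 4-byte signature, a 20-byte COFF header, then SizeOfStackCommit at 76 for PE32 or 80 for PE32+) come from the PE/COFF layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copy an integer out of the buffer at a byte offset.
// memcpy sidesteps the alignment and aliasing issues of casting the address to T*.
// Assumes a little-endian host, which holds for x86/x64 Windows.
template <typename T>
T read_at(const std::vector<std::uint8_t>& image, std::size_t offset) {
    if (offset > image.size() || image.size() - offset < sizeof(T)) {
        throw std::out_of_range("read past end of image");
    }
    T value;
    std::memcpy(&value, image.data() + offset, sizeof(T));
    return value;
}

// Reads SizeOfStackCommit using nothing but offsets from the start of the file.
std::uint64_t size_of_stack_commit(const std::vector<std::uint8_t>& image) {
    const std::uint32_t pe_offset = read_at<std::uint32_t>(image, 0x3C);  // e_lfanew
    if (read_at<std::uint32_t>(image, pe_offset) != 0x00004550u) {         // "PE\0\0"
        throw std::runtime_error("missing PE signature");
    }
    // Skip the 4-byte signature and the 20-byte COFF file header.
    const std::size_t optional_header = static_cast<std::size_t>(pe_offset) + 4 + 20;
    const std::uint16_t magic = read_at<std::uint16_t>(image, optional_header);
    if (magic == 0x20B) {  // PE32+: 8-byte field at offset 80
        return read_at<std::uint64_t>(image, optional_header + 80);
    }
    if (magic == 0x10B) {  // PE32: 4-byte field at offset 76
        return read_at<std::uint32_t>(image, optional_header + 76);
    }
    throw std::runtime_error("unknown optional header magic");
}
```

The key difference from summing offsets into a size_t is that the sum is only an address: the value has to be read from that address, and with a width that matches the field.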
I think my remaining blockers are related to which structures to load with data, and where. You can build and run my WIP gist on a Win10 box, or update the value of ph_file to point to some other locally installed 64-bit program on your OS, preferably one without an .idata section. The .idata section isn't guaranteed to exist even when the image imports DLLs; Calculator.exe doesn't have one, for example.
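Since the section name can't be relied on, one option is to locate the import table through the optional header's data directories, where entry 1 is the import table. The following is a rough sketch under the assumption that the file sits in a byte vector. The names read_at, DataDirectory, and import_directory are hypothetical and not from the gist.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper: bounds-checked read of an integer at a byte offset.
// Assumes a little-endian host.
template <typename T>
T read_at(const std::vector<std::uint8_t>& image, std::size_t offset) {
    if (offset > image.size() || image.size() - offset < sizeof(T)) {
        throw std::out_of_range("read past end of image");
    }
    T value;
    std::memcpy(&value, image.data() + offset, sizeof(T));
    return value;
}

// Mirrors the 8-byte layout of one data directory entry.
struct DataDirectory {
    std::uint32_t rva;
    std::uint32_t size;
};

// Returns the import table's data directory entry, or {0, 0} if the image has none.
DataDirectory import_directory(const std::vector<std::uint8_t>& image) {
    const std::uint32_t pe_offset = read_at<std::uint32_t>(image, 0x3C);  // e_lfanew
    const std::size_t optional_header = static_cast<std::size_t>(pe_offset) + 4 + 20;
    const std::uint16_t magic = read_at<std::uint16_t>(image, optional_header);

    std::size_t directories_offset = 0;
    if (magic == 0x20B) {
        directories_offset = 112;  // PE32+
    } else if (magic == 0x10B) {
        directories_offset = 96;   // PE32
    } else {
        throw std::runtime_error("unknown optional header magic");
    }

    // NumberOfRvaAndSizes is the 4-byte field directly before the directories.
    const std::uint32_t count =
        read_at<std::uint32_t>(image, optional_header + directories_offset - 4);
    if (count < 2) {
        return DataDirectory{0, 0};
    }

    const std::size_t entry = optional_header + directories_offset + 1 * 8;  // index 1
    return DataDirectory{read_at<std::uint32_t>(image, entry),
                         read_at<std::uint32_t>(image, entry + 4)};
}
```

The returned rva is still a relative virtual address, so it has to be mapped through the section headers before it can index the file buffer.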
I got some debugging help and finally got this working. The code is POC, so not much of anything is cleaned up or optimized, but it is functional. Tested against x86/win32 and x64 binaries. Gist here.
Upvotes: 0
Views: 456
Reputation: 6978
Giving Ben credit for this one, as even though it didn't unblock me, he pointed me down the right direction when he identified my lack of understanding for pointers in correlation to memory allocation and pointer initialization from address instead of object. To overcome that, I did some studying and exercises in order to:
Doing those two things unblocked the path for pure pointer-arithmetic usage.
My problem in those regards is that I only ever needed the most basic understanding of pointers when using them in the past. This process required actually understanding memory allocation and traversal.
Overall, it's been a great re-learning experience!
At a certain point, accessing the data through the char array that holds the file requires using the RVA. To do that, you need the correct IMAGE_SECTION_HEADER struct, the one whose section contains your desired data. You use that struct to translate the RVA into a file offset with something like this, say for the import table:
// Find the section whose virtual range contains the import table's RVA.
if (import_table_data_dir->VirtualAddress >= queried_section_header->VirtualAddress &&
(import_table_data_dir->VirtualAddress < (queried_section_header->VirtualAddress + queried_section_header->SizeOfRawData)))
{
// File offset = RVA - section's virtual address + section's raw-data file offset.
DWORD import_table_offset = import_table_data_dir->VirtualAddress - queried_section_header->VirtualAddress + queried_section_header->PointerToRawData;
}
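To show where that translation leads, here is a self-contained sketch that maps an RVA to a file offset and then walks the import descriptors to collect DLL names. It assumes the file is in a byte vector and that the section table has already been parsed into a simple list. Section, rva_to_file_offset, read_u32, and import_dll_names are hypothetical names. The 20-byte descriptor size and the Name field at offset 12 follow the import directory table layout in the PE/COFF document.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, trimmed-down view of the section header fields needed here.
struct Section {
    std::uint32_t virtual_address;
    std::uint32_t size_of_raw_data;
    std::uint32_t pointer_to_raw_data;
};

// Maps an RVA to a file offset; returns false if no section contains it.
bool rva_to_file_offset(const std::vector<Section>& sections, std::uint32_t rva,
                        std::uint32_t& file_offset) {
    for (const Section& s : sections) {
        if (rva >= s.virtual_address && rva - s.virtual_address < s.size_of_raw_data) {
            file_offset = rva - s.virtual_address + s.pointer_to_raw_data;
            return true;
        }
    }
    return false;
}

// Bounds-checked 4-byte read; assumes a little-endian host.
std::uint32_t read_u32(const std::vector<std::uint8_t>& image, std::size_t offset) {
    if (offset > image.size() || image.size() - offset < 4) {
        throw std::out_of_range("read past end of image");
    }
    std::uint32_t value;
    std::memcpy(&value, image.data() + offset, 4);
    return value;
}

// Collects the DLL name of each import descriptor until the zeroed terminator.
std::vector<std::string> import_dll_names(const std::vector<std::uint8_t>& image,
                                          const std::vector<Section>& sections,
                                          std::uint32_t import_table_rva) {
    std::vector<std::string> names;
    std::uint32_t start = 0;
    if (!rva_to_file_offset(sections, import_table_rva, start)) {
        return names;
    }
    for (std::size_t descriptor = start;; descriptor += 20) {
        const std::uint32_t name_rva = read_u32(image, descriptor + 12);
        if (name_rva == 0) {
            break;  // treat a zero Name RVA as the end of the table
        }
        std::uint32_t name_offset = 0;
        if (!rva_to_file_offset(sections, name_rva, name_offset)) {
            break;
        }
        std::string name;
        for (std::size_t i = name_offset; i < image.size() && image[i] != 0; ++i) {
            name.push_back(static_cast<char>(image[i]));
        }
        names.push_back(name);
    }
    return names;
}
```

Checking only the Name field for the terminator is a simplification; the specification describes the final descriptor as entirely zero-filled.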
I haven't personally used this guide for understanding pointers, but at a glance it looks pretty promising: http://home.netcom.com/~tjensen/ptr/pointers.htm
In case it expires, this snapshot might still be around: https://web.archive.org/web/20161208002919/http://home.netcom.com/~tjensen/ptr/pointers.htm
Upvotes: 0
Reputation: 283793
I don't believe you that the line indicated (computation of sum_for_size_stack) causes an access violation. It's just unsigned arithmetic, which cannot overflow or result in a trap value.
I do believe that you get an access violation from printf, because you're using the %s format specifier with an argument that is not a pointer to a NUL-terminated ASCII string. I have no idea what gave you the idea that stack sizes are stored as strings, or that it's a good idea to pass size_t to a variadic function that requires const char*, but neither is true.
Pay attention to the preconditions of printf. The correct format string for a size_t parameter is %zx.
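As an illustration, the two print statements could be written roughly as below. This is a sketch that assumes a C99-conforming printf family (to my knowledge the Universal CRT in Visual Studio 2015 and later supports the z length modifier) and a 64-bit SizeOfStackCommit. format_stack_sizes is a hypothetical helper that returns the text so it can be inspected.

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats both values in hex with specifiers that match the argument types:
// PRIx64 for a 64-bit unsigned integer, %zx for size_t.
std::string format_stack_sizes(std::uint64_t direct, std::size_t pointer_math) {
    char buffer[128];
    std::snprintf(buffer, sizeof buffer,
                  "Direct: 0x%" PRIx64 "\nPointer-Math: 0x%zx\n",
                  direct, pointer_math);
    return std::string(buffer);
}
```

Use %zu instead of %zx if decimal output is preferred.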
Upvotes: 4