Reputation: 394
I have been going through the assembly code of the sort program in GNU Coreutils and found something I can't explain technically.
It started when I dumped the DWARF debug information of the sort program with this command:
~/coreutils/new_build/src (master*) » objdump --dwarf=info ./sort &> sort.objdwarf
This gives me useful DWARF information for learning about the program. Looking through it, however, I found two numcompare functions that are very different. The first one is:
<1><5978>: Abbrev Number: 55 (DW_TAG_subprogram)
<5979> DW_AT_name : (indirect string, offset: 0x718): numcompare
...
<5986> DW_AT_low_pc : 0x6fb0
<598e> DW_AT_high_pc : 0x703a
<5996> DW_AT_frame_base : 0x1464 (location list)
<599a> DW_AT_GNU_all_tail_call_sites: 1
<599b> DW_AT_sibling : <0x59bc>
<2><599f>: Abbrev Number: 56 (DW_TAG_formal_parameter)
<59a0> DW_AT_name : a
...
<2><59ad>: Abbrev Number: 56 (DW_TAG_formal_parameter)
<59ae> DW_AT_name : b
...
<2><59bb>: Abbrev Number: 0
And the second one is:
<1><f882>: Abbrev Number: 10 (DW_TAG_subprogram)
<f883> DW_AT_name : (indirect string, offset: 0x718): numcompare
...
<f88f> DW_AT_low_pc : 0x19118
<f897> DW_AT_high_pc : 0x1957e
<f89f> DW_AT_frame_base : 0x5d34 (location list)
<f8a3> DW_AT_GNU_all_tail_call_sites: 1
<f8a4> DW_AT_sibling : <0xf92e>
<2><f8a8>: Abbrev Number: 6 (DW_TAG_formal_parameter)
<f8a9> DW_AT_name : a
...
<2><f8b5>: Abbrev Number: 6 (DW_TAG_formal_parameter)
<f8b6> DW_AT_name : b
...
<2><f8c2>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<f8c3> DW_AT_name : (indirect string, offset: 0x20af): decimal_point
...
<2><f8d2>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<f8d3> DW_AT_name : (indirect string, offset: 0xa1e): thousands_sep
...
After grepping the source code, I could find only one numcompare function in the src directory:
~/coreutils/src (master*) » grep -Hnriw numcompare
sort.c:1998:numcompare (char const *a, char const *b)
sort.c:2694: diff = numcompare (ta, tb);
Looking further through the source code, I noticed something interesting in how it is defined:
/* Compare strings A and B as numbers without explicitly converting them to
machine numbers. Comparatively slow for short strings, but asymptotically
hideously fast. */
ATTRIBUTE_PURE
static int
numcompare (char const *a, char const *b)
{
while (blanks[to_uchar (*a)])
a++;
while (blanks[to_uchar (*b)])
b++;
return strnumcmp (a, b, decimal_point, thousands_sep);
}
From this, I realized that the second DWARF entry named numcompare actually matches the signature of strnumcmp, as shown here:
<1><f810>: Abbrev Number: 5 (DW_TAG_subprogram)
<f811> DW_AT_external : 1
<f812> DW_AT_name : (indirect string, offset: 0x56b8): strnumcmp
...
<f81e> DW_AT_low_pc : 0x1957e
<f826> DW_AT_high_pc : 0x195ac
<f82e> DW_AT_frame_base : 0x5cd4 (location list)
<f832> DW_AT_GNU_all_tail_call_sites: 1
<f833> DW_AT_sibling : <0xf870>
<2><f837>: Abbrev Number: 6 (DW_TAG_formal_parameter)
<f838> DW_AT_name : a
...
<2><f844>: Abbrev Number: 6 (DW_TAG_formal_parameter)
<f845> DW_AT_name : b
...
<2><f851>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<f852> DW_AT_name : (indirect string, offset: 0x20af): decimal_point
...
<2><f860>: Abbrev Number: 7 (DW_TAG_formal_parameter)
<f861> DW_AT_name : (indirect string, offset: 0xa1e): thousands_sep
...
I then looked into strnumcmp. It is an external library function used by GNU Coreutils, defined as:
int
strnumcmp (char const *a, char const *b,
int decimal_point, int thousands_sep)
{
return numcompare (a, b, decimal_point, thousands_sep);
}
So now I'm very confused about what is going on here. The ATTRIBUTE_PURE attribute used together with static seems to mean:
"ATTRIBUTE_PURE" is a function attribute in C programming language that can be used to indicate that a function has no side effects and only depends on its arguments, and "static" is a storage class specifier that indicates that the function or variable is only visible within the file it is declared in.
This doesn't quite help me explain what is going on here.
My main question is: why and how are there two separate DWARF entries for the function numcompare? Thank you in advance.
Upvotes: 1
Views: 74
Reputation: 140990
why and how are there two separate DWARF information of a function numcompare?
The functions are static, so they have internal linkage. Every object file can have its own static function with the same name, and DWARF information can be emitted for each of them.
One numcompare is here: https://github.com/coreutils/coreutils/blob/master/src/sort.c#L1998 and the other is here: https://github.com/coreutils/coreutils/blob/master/gl/lib/strnumcmp-in.h#L114 . One is compiled into sort.o and the other into strnumcmp.o, and the object files are then linked together.
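To make the linkage point concrete, here is a minimal sketch of the library side only. A single translation unit cannot hold two functions with the same name, so the two-parameter static numcompare from sort.c would have to live in a separate file. The body of numcompare below is a hypothetical stand-in built on strtod, not the gnulib implementation; only the names and the four-parameter signature come from the question.

```c
/* Sketch of the strnumcmp.c side only.  The body of numcompare is a
   hypothetical stand-in, not the real gnulib code.  */
#include <stdlib.h>

/* Internal linkage: this name is invisible to other object files, so
   sort.c is free to define its own, unrelated static numcompare.  */
static int
numcompare (char const *a, char const *b,
            int decimal_point, int thousands_sep)
{
  /* Assumption for illustration: ignore the separator arguments and
     let strtod parse plain numbers.  */
  (void) decimal_point;
  (void) thousands_sep;
  double x = strtod (a, NULL);
  double y = strtod (b, NULL);
  return (x > y) - (x < y);   /* -1, 0 or 1 */
}

/* External linkage: the only symbol here that other object files,
   such as sort.o, can call.  */
int
strnumcmp (char const *a, char const *b,
           int decimal_point, int thousands_sep)
{
  return numcompare (a, b, decimal_point, thousands_sep);
}
```

If this file is compiled with -g next to a second file that defines its own static numcompare, and both are linked into one program, the debug information should contain one DW_TAG_subprogram entry named numcompare per compilation unit, which is the pattern seen in the dump above.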
Upvotes: 2