Reputation: 21

Find the end of a list in Assembly

I'm new to x86 Assembly. I encountered a question in the book I'm learning from:

We have discussed three approaches to finding the end of the list: using a special number, using the ending address and using the length count. Which approach do you think is best? Why? Which approach would you use if you knew that the list was sorted? Why?

Using a special number refers to putting a number at the end of the list and stopping when reached to that.

I can't figure out the answer. Which approach is more effective? Why? How does a sorted list differ from normal list in finding the end? Thanks in advance.

Upvotes: 1

Answers (2)

Rptx

Reputation: 1189

I believe you are talking about a question in Chapter 3 of "Programming from the Ground up". If that is the case, the list of items is hard-coded in the source code as


data_items:                       #These are the data items
.long 3,67,34,222,45,75,54,34,44,33,22,11,66,0

In the example, they use 0 to mark the end of the list. In my opinion it's not a good approach because it's a valid value of the list, so if it appears earlier, you would not check the entire list. Using the length in this particular case would not be ideal as you would have to remember to update it every time you add or remove items from the list. So using the address of the last item sounds like the best approach to me (you can add a label just after the list and use it to compute the address of the last item).

I'm not sure how a sorted list would changes this. Although the example is to find the maximum of a list of numbers, so if you know it's sorted, there is no reason to check all of them, just return last one (assuming it's ascending order).

Upvotes: 2

Peter Cordes

Reputation: 365832

Best for what purpose? Size efficiency? Implicit-length C-strings are pretty compact, just costing 1 extra byte for the terminator, not a whole extra pointer.

But for performance, explicit-length strings and arrays are much easier to work with especially using SIMD: you can check that there are at least N elements left before entering an unrolled loop. Compare glibc's x86-64 SSE2 implementations of memchr vs strchr: you really have to jump through hoops with strchr to make sure you don't read into an unmapped page if the end of the string is right at the end of a page.

https://code.woboq.org/userspace/glibc/sysdeps/x86_64/memchr.S.html
alignment handling could be simpler for sizes >= 16 bytes, but actually glibc chooses to avoid cache-line splits. But note the main loop, once it reaches a 64-byte alignment boundary: the loop condition based on end-of-buffer is just sub $64, %rdx / jbe L(exit_loop), without any per-byte work.
https://code.woboq.org/userspace/glibc/sysdeps/x86_64/strchr.S.html
Note the andl $4095, %eax / cmpl $4032, %eax to check for being near a page-crossing, and the extra SIMD work to check for 0 as well as the target character in each of the 16 bytes being checked in parallel.

Upvotes: 2

Find the end of a list in Assembly

Answers (2)

Related Questions