Lester Bonker

Reputation: 127

What is the difference between "code" and "data" if all code accesses data, and all data accesses code?

In Assembly there are great theoretical debates on memory, code, data, segments, etc.

To be blunt, it doesn't make complete sense.

What is code, text, data, etc.?

I've read tutorials, and they focus only on the application, not on what the code is actually doing underneath the written words.

I can't be an Assembly programmer if this is unclear; do any tutorials clarify this better?

Upvotes: 3

Views: 5008

Answers (4)

old_timer

Reputation: 71576

I have neither heard nor seen "great theoretical debates" on those terms, unless they are between folks who don't understand and are speculating.

Bits are just bits; the lower the level, the less long-term meaning they have. In a high-level language a variable may be very clearly defined as some "type", and it stays that type for its whole life in that program or scope. But as you get closer to the logic, the bits become just bits or signals and lose their type. Types stop making sense and don't matter. If you are "computing an address" to something, you are doing that with data bits; they don't become an address until the brief moment they land on an address bus. Only then are they an address. The rest of the time they are just bits lying around.
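To make the "bits lose their type" point concrete, here is a minimal C sketch in which one 32-bit pattern is read three different ways. The function names are made up for illustration, and the float result assumes the platform uses IEEE 754 single precision for `float`, which is common but not guaranteed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: the names are illustrative, not from any library. */

/* Read the 32 bits as a float. memcpy copies the raw bits unchanged.
   Assumes float is 32-bit IEEE 754 single precision. */
float bits_as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Read the same 32 bits as a signed two's-complement integer. */
int32_t bits_as_signed(uint32_t bits) {
    int32_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}

/* Split the same 32 bits into four bytes, most significant byte first.
   Shifting (instead of copying memory) keeps this independent of the
   host's byte order. */
void bits_as_bytes(uint32_t bits, unsigned char out[4]) {
    out[0] = (unsigned char)((bits >> 24) & 0xFFu);
    out[1] = (unsigned char)((bits >> 16) & 0xFFu);
    out[2] = (unsigned char)((bits >> 8) & 0xFFu);
    out[3] = (unsigned char)(bits & 0xFFu);
}
```

Under those assumptions, the pattern `0x42280000` comes back as 42.0 from `bits_as_float`, while `0x48656C6C` splits into the ASCII bytes for "Hell". Nothing in the bits themselves says which reading is the right one; the code that consumes them decides.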

When a processor "fetches" an "instruction" from somewhere, the act of fetching it for the purpose of ultimately executing it is what makes those bits code. But if you take the bits that make up an instruction and break them down into smaller parts, some of them might be immediate values: data being loaded into a register, or an address, or a fraction of an address used to access some data. So some of that "code" is just plain data closely bundled with the code. From one perspective, bits might be called code only for the brief moment they are fetched and executed as an instruction. While lying about in RAM, in a cache, or on a disk, they are most of the time just data.
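As one illustration of data bundled inside an instruction, consider the x86 encoding of `mov eax, imm32` in 32-bit or 64-bit mode: the opcode byte `0xB8` followed by a 4-byte little-endian immediate. The sketch below treats the instruction bytes as ordinary data and pulls the immediate back out. The function name is hypothetical, and it recognizes only this one encoding; it is not a general decoder.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: recognizes only "mov eax, imm32" (opcode 0xB8
   followed by a 32-bit little-endian immediate).
   Returns 1 and stores the immediate in *imm on success, 0 otherwise. */
int decode_mov_eax_imm32(const unsigned char *code, size_t len, uint32_t *imm) {
    if (len < 5 || code[0] != 0xB8) {
        return 0;   /* too short, or not this particular instruction */
    }
    /* Reassemble the immediate from its four bytes, lowest byte first. */
    *imm = (uint32_t)code[1]
         | ((uint32_t)code[2] << 8)
         | ((uint32_t)code[3] << 16)
         | ((uint32_t)code[4] << 24);
    return 1;
}
```

Given the five bytes `B8 2A 00 00 00`, the function reports an immediate of 42. The same five bytes are "code" when the processor fetches them and "data" when this function reads them.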

When that code is being read from the hard disk and written to memory, long before the operating system does whatever it does to execute it, it is just data bits. At the lower levels it cannot be distinguished from a JPEG image, an MPEG video, or any other bits being read from the hard disk and moved into memory.

When an instruction reads or writes memory, rather than that memory being fetched as an instruction to execute (as far as that operation is concerned at that moment), what it touches is just data, a somewhat generic term for those bits. They are always data; they take on a second definition, code, if they are also actually or possibly destined to be executed as instructions, directly or indirectly.

As for text vs. data: for whatever reason (it has been explained by others in more detail elsewhere on Stack Overflow), text is basically the code part of your program; it is another word for code, the instructions part. And data is just data, the part that isn't instructions and isn't going to be instructions. At least in that context.

For whatever historical or practical reasons, which don't matter, your program often consists of some percentage of bits that are instructions and some percentage that might be considered data: the memory that holds your variables, for example, or perhaps an image embedded in the program, or an uninitialized array that your program will fill at run time with data from somewhere and then operate on. Compiler jargon tends to use the term text for the code part and data for the other part, if there is any.

It is not incorrect for all the bits regardless of purpose or destination, in many contexts, to be called "data".

What is memory? Well, it is something that stores the bits. There are lots of tangents on that one, but basically it stores your bits, short term or long term or both.

Segments are what the word implies: something broken down into segments. Many folks first think of x86 and its history as a segmented architecture, which folks often overcomplicate; as far as that architecture goes, it just means that addresses are computed using more than one register or entity. What they don't realize is that so-called flat memory architectures use segmenting all the time. Often your video memory can only be accessed a segment at a time. Often your hard disk can only be accessed a segment at a time. You have a small aperture you can look through and address into, with the rest of the address managed elsewhere; the two addresses come together at some point outside your program to access the thing you wanted. The boundaries and rules for accessing the segment vary based on the hardware and sometimes the software.

If the goal here is to learn assembly language and the word segment is coming up, that probably means x86, and x86 is the last assembly language you want to learn, if ever. There are many better first assembly languages (already having the hardware is the worst excuse for choosing a first assembly language; maybe, just maybe, 50 years ago you could make a valid argument there, absolutely not today).
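For the x86 case mentioned above, where an address is computed from more than one register, the classic example is 16-bit real mode: the physical address is the segment value shifted left by four bits plus the offset. A minimal C sketch of that arithmetic follows; the function name is made up for illustration, and it ignores the 20-bit wraparound behaviour of the original 8086.

```c
#include <stdint.h>

/* Hypothetical helper: combine a real-mode segment:offset pair into a
   linear address. Address wraparound at 1 MiB is deliberately not modelled. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}
```

For example, `real_mode_linear(0xB800, 0x0000)` gives `0xB8000`, and the pairs `0x1234:0x0010` and `0x1235:0x0000` both give `0x12350`, which shows that different segment and offset combinations can name the same byte.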

Your question is quite vague. What are you trying to do, what are you hung up on, and what resources have you used to try to answer the question on your own? If you are trying to learn assembly language and have only looked at one book, that book could suck, or that language could suck, or both, and you are just wasting your time; find another book or another assembly language. There are piles of different books that teach any particular programming language, C for example, but that doesn't mean every one of them is really good or is going to work for everyone. Ask your question specifically: "I read 'blah' in this 'blah' book and I don't understand what this word means. How does it apply?" Or take another approach, which you had better not ask because it has been asked and answered so many times: "I want to learn xyz assembly language, is there a good book?" or what is the "best" (a very bad word to use here) book, etc.

Upvotes: 0

Peter Wooster

Reputation: 6089

There really are no distinctions between code and data at the lowest levels. In the end all code is data (just ask a compiler), and any data can be executed. Operating systems apply rules to how various blocks of memory can be accessed. That's where the distinction starts. As you go higher in language and system levels, the distinctions get more complex: you get heaps and stacks and paged memory and virtual memory and virtual machines. It all gets very complex, but at its root it is a von Neumann machine, even if it looks like a cool gadget from Apple.

And yes, I used to be an Assembler programmer.

Upvotes: 4

GreyBeardedGeek

Reputation: 30088

Maybe some historical background is in order here. Currently popular architectures are of the von Neumann type, where code and data share the same memory. Not all architectures are like this; some segregate code and data.

Upvotes: 0

Carl Norum

Reputation: 225112

OK, so some of this stuff is a bit subjective in that it can vary from system to system and toolchain to toolchain, but:

  • code & text are usually synonyms meaning "this section/segment contains executable code"

  • data usually means "this section/segment contains non-executable data"

If your hardware supports it, the memory pages that the data sections get loaded into may be marked "not-executable" so that if your program tries to jump into that area, it will crash immediately rather than doing something crazy.

Likewise, the code/text sections may have their pages marked "read-only", so that they aren't accidentally modified by the program. Some systems have "read-only data" sections too, where they put string literals and constant variables, and so forth.
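As a rough sketch of how this looks from C, the declarations below are annotated with the section each one typically lands in on common ELF toolchains. The section names in the comments are assumptions about typical behaviour, not guarantees; the actual placement varies by compiler, linker, and target, and all identifiers here are invented for illustration.

```c
/* All names below are illustrative. Section placement is typical for
   common ELF toolchains, not guaranteed. */

int initialized_counter = 7;        /* initialized, writable: typically .data   */

int zeroed_buffer[4];               /* no initializer, zero-filled at startup:
                                       typically .bss                           */

const int limit = 100;              /* constant: typically a read-only data
                                       section such as .rodata                  */

const char *greeting = "hello";     /* the pointer is writable data; the string
                                       literal itself is typically read-only    */

/* The machine instructions for this function: typically .text */
int add_limit(int x) {
    return x + limit;
}
```

On a system with GNU binutils, compiling this to an object file and inspecting it with a tool such as `objdump -h` or `nm` should show which sections were actually used.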

The most extreme example might be a Harvard architecture, in which the code and data memories aren't the same physical device.

Upvotes: 8
