Reputation: 3036
In functional programming, we tend to distinguish between data and functions, but what is the difference?
If I consider a constant, I could think of it as a function, which just returns the same value:
(def x 5)
So what is the distinction between data and a function? I fail to see the difference.
Upvotes: 4
Views: 1478
Reputation: 28414
To be honest, I think the two existing answers are entirely missing the point, because if you have a Turing-complete language you can clearly write code to turn data into code and run code to get data.
The original question
What is the difference between functions and data?
implicitly asks, first of all, whether there is a difference at all. Asking that is equivalent to asking whether you can accomplish, using only functions, something that you "normally" accomplish by using data, and vice versa.
So the answer is contained in SICP (Structure and Interpretation of Computer Programs), §2.1.3 What Is Meant by Data? (as user5536315 hinted, I believe), where the authors make the point that you can implement data structures by just using functions.
For instance, cons is the function that lets you create a pair:
(define p (cons 3 7))
(display p)
; (3 . 7)
(display (car p))
; 3
(display (cdr p))
; 7
But "pair" is just the concept of "something" that behaves like the object you get by calling cons with two arguments; as long as that "something" meets those requirements, it is a pair, no matter how you achieve that.
So you can define your own pair by re-defining its interface functions:
(define (cons a b) ; override builtin cons
  (define (pair i)
    (cond ((= i 0) a)
          ((= i 1) b)
          (else (error "caio"))))
  pair)
(define (car p) ; override builtin car
  (p 0))
(define (cdr p) ; override builtin cdr
  (p 1))
The pair function returned by cons is the "pair", because it implements the interface of "pair", i.e. you can call car and cdr on it, so it works just like the original data,
(define p (cons 3 7))
(display p)
; #<procedure pair (i)>
(display (car p))
; 3
(display (cdr p))
; 7
where the output #<procedure pair (i)> differs from before precisely because (cons 3 7) using the builtin cons gives you a data structure that the REPL knows how to print, whereas the one using our new cons gives you a function that the REPL doesn't know how to print as a pair, so it prints #<procedure pair (i)>.
Notice also that this identity between functions and data is possible only because functions are first-class citizens in the given language, so you can return pair from cons.
In a language where functions are not first class, e.g. C++, that identity is not really possible. The closest you can get still implicitly makes use of data, which is more versatile because it is first class. E.g. an implementation of pair could look something along these lines,
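// Requires C++20 (templated lambdas, abbreviated function templates).
constexpr auto make_pair(auto a, auto b) {
    // Capture by value so the closure owns both elements; capturing the
    // parameters by reference would dangle once make_pair(3, 7) returns.
    return [a, b]<int n>() mutable -> auto& {
        static_assert(n == 0 || n == 1);
        if constexpr (n == 0) return a; else return b;
    };
}
template<int n>
constexpr auto get(auto& p) -> auto& { // take the closure by reference so writes persist
    return p.template operator()<n>();
}
int main() {
    auto p = make_pair(3, 7);
    get<0>(p) = 4;
    get<1>(p) = 8;
}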
but make_pair is returning a lambda, which is not really a function but a function object, i.e. a struct (though anonymous) with operator().
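To make that last point concrete, here is a small sketch (C++17 assumed; add_one, make_adder and call_through_pointer are names made up for this illustration) suggesting that a closure is an object of class type, and that only a capture-less one can be converted to a plain function pointer:

```cpp
#include <cassert>
#include <type_traits>

// A capture-less lambda: still an object of an unnamed class type,
// but it converts implicitly to an ordinary function pointer.
inline constexpr auto add_one = [](int x) { return x + 1; };

// Hypothetical helper: returns a closure that stores n as a data member.
constexpr auto make_adder(int n) {
    return [n](int x) { return x + n; };
}

using AddOne = std::remove_const_t<decltype(add_one)>;
using Adder  = decltype(make_adder(0));

static_assert(std::is_class_v<AddOne>);                      // a struct, not a function
static_assert(std::is_class_v<Adder>);
static_assert(std::is_convertible_v<AddOne, int (*)(int)>);  // no state: converts to a function pointer
static_assert(!std::is_convertible_v<Adder, int (*)(int)>);  // carries data: does not convert
static_assert(sizeof(Adder) >= sizeof(int));                 // the captured int lives inside the object

// Hypothetical helper: invoke add_one through a plain function pointer.
inline int call_through_pointer(int x) {
    int (*fp)(int) = add_one;  // implicit conversion, available only without captures
    assert(fp != nullptr);
    return fp(x);
}
```

Calling make_adder(3)(4) yields 7, but the value 3 is stored in the closure object itself, which is the sense in which the "function" here is still backed by data.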
Upvotes: 1
Reputation: 29958
Data is a value (with a specific type).
For example, 5 is a value of type Integer, and "abc" is a value of type String. A composite value such as [5 "abc"] has the type Vector.
Two data values of the same type can always be compared for equality.
Data is never executed. That is, the thread of control (aka program counter or PC) never enters the data structure.
A function's only type is "code".
A function produces a value (with a specific type) when it is executed (possibly with arguments).
Execution means the thread of control enters the code data structure. The code and data values encountered there have complete control over any side-effects that occur, as well as the return value.
Both compiled and interpreted code produce the same results. The only differences between them are implementation details that trade off complexity vs speed.
The eval function accepts data (a form) as input, compiles it to code, and executes it, returning the result. If the form describes a function, e.g. (fn [x] (inc x)), the result is a function that can be executed (i.e. invoked) so the thread of control enters it.
For clarity, the above elides details such as the reader, etc.
Macros are best viewed as a compiler extension embedded within the code, and do not affect the data vs code distinction.
It occurred to me that the original question has not been fully answered. Consider the following:
; A Clojure Var pointing to the value 5
(def five 5)
; A Clojure Var pointing to a function that always returns the value 5
(def ->five (fn [& args] 5))
and then use these 2 Vars:
five => 5
(->five) => 5
The parentheses make all the difference.
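The same contrast can be sketched outside Clojure. The following is only an analogy in C++ (C++17 assumed), where to_five stands in for ->five and compare_uses is a hypothetical helper added for illustration:

```cpp
#include <cassert>

// Data: a name bound to the value 5.
constexpr int five = 5;

// A function object that accepts any arguments and always returns 5.
constexpr auto to_five = [](auto&&...) { return 5; };

// Hypothetical helper showing both uses side by side.
inline void compare_uses() {
    assert(five == 5);                   // naming the data yields the value
    assert(to_five() == 5);              // the function must be invoked
    assert(to_five(1, "ignored") == 5);  // arguments are accepted and discarded
}
```

Writing five gives the value directly, while to_five on its own names the callable; only to_five() produces the 5.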
Upvotes: 6
Reputation: 9865
In languages with the property of homoiconicity, code is data and data is code.
This code-data duality blurs the distinction between code and data.
(I think your question is about what the difference between lambda and data is, if lambda itself is actually also just a data structure that has to be executed ...)
In homoiconic languages, data can become lambda (if it contains the instructions for a lambda) and vice versa.
So perhaps, the distinction is only by their type (function vs. any other data structure or primitive data type).
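As a rough illustration of data becoming something executable, here is a sketch in C++ (C++17 assumed). C++ is not homoiconic, so the expression tree has to be built by hand; Expr, num, call and compile are hypothetical names invented for this example, not part of any library:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical expression tree: plain data describing a computation.
struct Expr {
    char op;                                  // 'n' = number, '+' or '*'
    int value;                                // used when op == 'n'
    std::vector<std::shared_ptr<Expr>> args;  // the two operands otherwise
};

std::shared_ptr<Expr> num(int v) {
    return std::make_shared<Expr>(Expr{'n', v, {}});
}

std::shared_ptr<Expr> call(char op, std::shared_ptr<Expr> a, std::shared_ptr<Expr> b) {
    return std::make_shared<Expr>(Expr{op, 0, {a, b}});
}

// "Data becomes code": turn the tree into a callable.
std::function<int()> compile(const std::shared_ptr<Expr>& e) {
    if (e->op == 'n') {
        int v = e->value;
        return [v] { return v; };
    }
    assert(e->args.size() == 2);  // operator nodes carry exactly two operands
    auto lhs = compile(e->args[0]);
    auto rhs = compile(e->args[1]);
    if (e->op == '+') return [lhs, rhs] { return lhs() + rhs(); };
    return [lhs, rhs] { return lhs() * rhs(); };  // any other op is treated as '*'
}
```

The tree built by call('+', num(2), call('*', num(3), num(4))) can be inspected like any other data (its op field is '+'), and compile turns it into a function that returns 14 when invoked. In a homoiconic language no separate Expr type is needed, because the program text is already written in the language's own data structures.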
Upvotes: 1