abk07

Reputation: 49

Extract variable definitions and variable references from a C program

I've been assigned a strange project at college: I have to extract variable definitions and references from a given input C program.

Each line of the input program begins with its line number, followed by a space and then the actual code.

Consider the following program:

1 int main()  
2 {  
3 int a,b,c;  
4 printf("Enter the values of a and b\n");  
5 scanf("%d%d",&a,&b);  
6 c=a+b;  
7 printf("The sum of two numbers is %d",c);  
8 }  

The input to the program I'm developing is a C program in which each line holds a single statement. A whole C program could be written on one line, but not in my case: after each terminating semicolon, whatever follows is moved to the next line.

Anyway, my job is to extract the variable definitions/declarations and the variable uses/references in the given input C program.

In the program above, variables a, b and c are declared on line 3, so they have to be printed under the "definition" column of the output.

Similarly, on line 5 the values of a and b are set by a scanf call, so a and b should also be printed under the "definition" column.

Now consider line 6. The value of c is assigned there, so c must be printed under the "definition" column. At the same time, the values of a and b are used to compute c, so a and b must be printed under the "reference" column.

Lastly, the value of c is used on line 7, so c has to be printed under the "reference" column.

The sample output of the program is shown below:

Line Number     Defined Variable     Referenced Variable
________________________________________________________
  1                   --                     --
  2                   --                     --
  3                  a,b,c                   --
  4                   --                     --
  5                  a,b                     --
  6                   c                      a,b
  7                   --                     c
  8                   --                     --

Can anyone tell me how to solve this problem? I need to write a C/C++ program; a shell script is also allowed for the project. I also need to handle mathematical expressions, logical expressions, built-in function calls, user-defined function calls and function definitions.

Thanks in advance.

Upvotes: 1

Views: 1551

Answers (2)

Oliver Charlesworth

Reputation: 272537

You basically need to start with a full-blown C parser. You could write this yourself, but you're probably better off using something pre-existing, such as Clang.

Upvotes: 1

Milimetric

Reputation: 13549

The standard thing to do would be to write a tokenizer and a parser to partially compile your input program. Then you'll know what expressions are on each line. For the purposes of this assignment, though, you can probably get away with matching one regular expression for each of:

  1. variable definition
  2. variable reference

and spit out the captures for each line. For example, the variable reference pattern might be something like "a valid C identifier anywhere except right after a valid C data type". The capture here would be the identifier itself, so just print those out under the "Referenced Variable" column.

Upvotes: 1
