shahooo

Reputation: 593

I want to capture comments from a Java file using a Python script

For documentation purposes, I want to capture the comment that sits directly above each function's code.

I am able to iterate over the file and find the function name lines. As soon as I reach a function name line, I want to capture the comment above it. The comments are in '/** xxx */' blocks:

/**
* this is the comment
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";

/**
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.LONG.class)
String TIME = "clTimestamp";

Upvotes: 0

Views: 710

Answers (3)

Matthijs990

Reputation: 629

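This should work:

# Read the whole Java source file into one string
with open(file_name) as f:
    data = f.read()
# Split on the comment opener, then split each piece on the closer
old = data.split('/**')
data = []
for chunk in old:
    data.extend(chunk.split('*/'))
# With well-formed /** ... */ blocks, the comment bodies sit at the odd indices
comments = []
for i in range(1, len(data), 2):
    comments.append(data[i])
for k in comments:
    print(k)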

Upvotes: 1

Arif Ratul

Reputation: 11

def find_comment(n_array, start_string, end_string, add_index):
    # Print and strip every start_string ... end_string span from n_array
    comment_index = n_array.find(start_string)
    if comment_index != -1:
        # Search for the end marker only after the start marker itself
        comment_end_index = n_array.find(end_string, comment_index + len(start_string))
        if comment_end_index != -1:
            print(n_array[comment_index:comment_end_index + add_index])
            n_array = n_array[:comment_index] + n_array[comment_end_index + add_index:]
            # Recurse on the remainder and keep its result
            return find_comment(n_array, start_string, end_string, add_index)
    return n_array

# x holds the Java source text
x = find_comment(x, "/*", "*/", 2)  # block comments
x = find_comment(x, "//", "\n", 0)  # line comments

Upvotes: 1

Daweo

Reputation: 36630

Now that I know the function name line starts with @Attribute, this is quite easy to do with a regular expression (the re module), as follows:

import re
content = '''
/**
* this is the comment
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";

/**
* this is the comment
* this is the comment
*/
@Attribute(type = Attribute.LONG.class)
String TIME = "clTimestamp";
'''
comments = re.findall(r'(/\*\*.*?\*/)\n(@Attribute[^\n]*)',content,re.DOTALL)

print('Function comments:')
for i in comments:
    print(i[1])
    print(i[0])
    print('\n')

Output:

Function comments:
@Attribute(type = Attribute.STRING.class)
/**
* this is the comment
* this is the comment
* this is the comment
*/


@Attribute(type = Attribute.LONG.class)
/**
* this is the comment
* this is the comment
*/

For clarity I hardcoded content. I used re.findall with a pattern that has two groups, one for the comment and one for the name, so it returns a list of 2-tuples, each consisting of a comment and a function name. Note re.DOTALL, which lets . match newlines so that .*? can match across multiple lines, and note the escaping of characters with special meaning, namely * as \*.
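If the comments are meant to end up in documentation, it may help to strip the leading asterisks and to tolerate indentation between the comment and the @Attribute line. Below is a minimal sketch of that idea, not a definitive implementation. The names extract_comments, PATTERN, and sample are hypothetical, and it still assumes every documented declaration starts with @Attribute on the line right after the comment.

```python
import re

# Hypothetical helper: pairs each /** ... */ block with the @Attribute
# line directly below it, allowing indentation around the line break.
PATTERN = re.compile(r'/\*\*(.*?)\*/\s*\n\s*(@Attribute[^\n]*)', re.DOTALL)

def extract_comments(source):
    """Return a list of (declaration_line, cleaned_comment_text) tuples."""
    results = []
    for body, declaration in PATTERN.findall(source):
        lines = []
        for line in body.splitlines():
            # Drop surrounding whitespace and the decorative leading '*'
            text = line.strip().lstrip('*').strip()
            if text:
                lines.append(text)
        results.append((declaration.strip(), '\n'.join(lines)))
    return results

# Made-up sample input in the same shape as the question's Java code
sample = '''
/**
* first comment
* second line
*/
@Attribute(type = Attribute.STRING.class)
String RESPONSE_TEXT = "responseText";

    /** indented one-liner */
    @Attribute(type = Attribute.LONG.class)
    String TIME = "clTimestamp";
'''

for declaration, comment in extract_comments(sample):
    print(declaration)
    print(comment)
    print()
```

To run it on a real file, read the file into a string first (for example with open(file_name).read()) and pass that string to extract_comments instead of sample.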

Upvotes: 1
