Read file contents present in SharePoint

Question

I have a directory on SharePoint with n subdirectories. Each subdirectory contains files with either a .doc or .docx extension. I want to read the text (ideally as plain text, so that I can parse every string). I know about docx2txt, but I believe it needs the file to be present on the local machine. Is there a better way?

I am using the Graph API to scan/browse the SharePoint directory.

Looking for some direction, please.

import pathlib

import requests

# Copy access_token and specify the MS Graph API endpoint you want to call, e.g. 'https://graph.microsoft.com/v1.0/groups' to get all groups in your organization
#access_token = '{ACCESS TOKEN YOU ACQUIRED PREVIOUSLY}'

# token_result, client, scope and finalReportNames are defined earlier (not shown here).
url = "https://graph.microsoft.com/v1.0/......"
headers = {
    'Authorization': 'Bearer ' + token_result['access_token']
}

consentfilecount = 0
clientreportcount = 0
graphlinkcount = 0

while True:
    graph_result = requests.get(url=url, headers=headers)
    if graph_result.status_code == 401:
        # The token has probably expired: acquire a new one and retry this page once.
        token_result = client.acquire_token_for_client(scopes=scope)
        headers = {
            'Authorization': 'Bearer ' + token_result['access_token']
        }
        graph_result = requests.get(url=url, headers=headers)
    graph_result.raise_for_status()

    page = graph_result.json()
    # 'item' replaces the original loop variable 'list', which shadowed the built-in.
    for item in page.get('value', []):
        for ele in finalReportNames:
            if ele.lower() in item["name"].lower():
                clientreportcount += 1
                response = requests.get(item["webUrl"], headers=headers)
                print(response)
                print(item["name"])
                print(item["webUrl"])
                print(pathlib.Path(item["name"]).suffix)

    # Follow pagination until there is no next page.
    if '@odata.nextLink' in page:
        url = page['@odata.nextLink']
        graphlinkcount += 1
    else:
        break

print(consentfilecount)
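
One possible direction, sketched under assumptions: a .docx file is a ZIP archive whose main text is stored in word/document.xml, so it can be parsed from bytes held in memory without saving anything to disk. The helper name docx_bytes_to_text is hypothetical, and the sketch uses only the standard library instead of docx2txt. It does not handle legacy .doc files, which use a different binary format and would need another tool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_bytes_to_text(data: bytes) -> str:
    """Return the paragraph text of a .docx file held in memory (hypothetical helper)."""
    # Open the archive straight from the bytes; no temporary file is written.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # The visible text of a paragraph is split across w:t elements.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)
```

The input bytes would have to come from a download of the file itself rather than from webUrl, which points at the browser view of the document. As an assumption to check against your own listing response: each entry in the 'value' array may carry an '@microsoft.graph.downloadUrl' property, in which case requests.get(entry["@microsoft.graph.downloadUrl"]).content could supply the bytes for this function.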

