Martin

Reputation: 117

How to search for substrings in a long string and create a list in Python?

I have a long string:

query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD,
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD,
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

Now I need to create a list of all the substrings that start with '?'. The list should look like this:

schema = ['Age', 'Sex', 'Chest_Pain_Type', 'Trestbps', 'Chol', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'ThalachD', 
             'Exercise_Induced_Angina', 'OldpeakD', 'CaD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis']

I tried str.startswith(str, beg=0, end=len(string)), but it does not work as I expected.

How can I do this in Python?

Upvotes: 0

Views: 74

Answers (1)

Rakesh

Reputation: 82815

Using regex:

import re
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
         PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 

         SELECT ?Age, ?SexTypes, ?Chest_Pain_Type, ?trestbpsD, ?cholD, 
                    ?Fasting_Glucose_Level, ?Resting_ECG_Type, ?thalachD, 
                    ?Exercise_Induced_Angina, ?oldpeakD, ?caD, ?Slope, ?Thallium_Scintigraphy, ?Diagnosis
                      WHERE {?URI a sct:125676002. }"""

# Raw string so "\?" reaches the regex engine unchanged; the group captures the name without the "?"
print(re.findall(r"\?(\w+)", query))

Output:

['Age', 'SexTypes', 'Chest_Pain_Type', 'trestbpsD', 'cholD', 'Fasting_Glucose_Level', 'Resting_ECG_Type', 'thalachD', 'Exercise_Induced_Angina', 'oldpeakD', 'caD', 'Slope', 'Thallium_Scintigraphy', 'Diagnosis', 'URI']
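
Note that 'URI' shows up because the pattern matches every ?variable in the query, including the one in the WHERE clause. If only the selected variables are wanted, one option is to cut out the text between SELECT and WHERE first. A minimal sketch, assuming the query has a single SELECT ... WHERE pair; the helper name select_variables is made up for illustration and the query is shortened:

```python
import re

# Shortened version of the query from the question (assumption: same layout).
query = """PREFIX pht: <http://datalab.rwth-aachen.de/vocab/pht/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?Age, ?SexTypes, ?Chest_Pain_Type
WHERE {?URI a sct:125676002. }"""


def select_variables(sparql):
    """Return the variable names listed between SELECT and WHERE (hypothetical helper)."""
    # DOTALL lets "." span line breaks; the lazy ".*?" stops at the first WHERE.
    match = re.search(r"SELECT(.*?)WHERE", sparql, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return []
    # The capture group keeps the name and drops the leading "?".
    return re.findall(r"\?(\w+)", match.group(1))


print(select_variables(query))
# ['Age', 'SexTypes', 'Chest_Pain_Type']
```

This is not a SPARQL parser, so nested subqueries or a "?" inside a string literal would need a real parsing library instead.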

Upvotes: 5
