Reputation: 231
I want to read a file in Python and save its contents in a list, one element per record, without losing any data.
loadingFile = open('lorem.txt','r')
Data = loadingFile.read()
#print(Data)
data = Data.split("#*")
print(data)
Input from the dataset:
#*OQL[C++]: Extending C++ with an Object Query Capability.
#@José A. Blakeley
#t1995
#cModern Database Systems
#index0
#*Transaction Management in Multidatabase Systems.
#@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz
#t1995
#cModern Database Systems
#index1
Required Output:
List = ['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A. Blakeley #t1995 #cModern Database Systems #index0','#*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']
Upvotes: 0
Views: 47
Reputation: 11
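This should work.
with open('file.txt', 'r') as loadingFile:
    data = loadingFile.read()
# Named `lines` rather than `list` to avoid shadowing the built-in.
lines = data.split("\n")
# Join with a space instead of '-': joining on '-' and then removing every '-'
# would also strip hyphens that belong to the data (e.g. "Garcia-Molina").
c = " ".join(lines)
print(c)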
Upvotes: 0
Reputation: 195623
One possible solution using the re module:
data = '''#*OQL[C++]: Extending C++ with an Object Query Capability.
#@José A. Blakeley
#t1995
#cModern Database Systems
#index0
#*Transaction Management in Multidatabase Systems.
#@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz
#t1995
#cModern Database Systems
#index1'''
import re
lst = re.findall(r'(#\*.*?)\s*(?=#\*|\Z)', re.sub(r'\n+', ' ', data), flags=re.DOTALL)
# pprint is used here only for pretty printing, all the data are in list `lst`
from pprint import pprint
pprint(lst, width=180)
Prints:
['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A. Blakeley #t1995 #cModern Database Systems #index0',
'#*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']
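To apply the same pattern to a file rather than an inline string, something like the following might work. It is only a sketch: the helper name load_records, the temporary sample file, and the UTF-8 encoding are assumptions, not part of the question. Passing the encoding explicitly is one way to keep non-ASCII names such as "José" intact on platforms whose default encoding is not UTF-8.

```python
import re
import tempfile
from pathlib import Path


def load_records(path, encoding="utf-8"):
    """Read the file at `path` and return one string per '#*' record.

    `load_records` is a hypothetical helper; the encoding is an assumption.
    """
    text = Path(path).read_text(encoding=encoding)
    # Collapse line breaks so that each record ends up on a single line.
    flat = re.sub(r'\n+', ' ', text)
    # Same pattern as above: capture from '#*' up to the next '#*' or the end.
    return re.findall(r'(#\*.*?)\s*(?=#\*|\Z)', flat, flags=re.DOTALL)


# Small made-up sample written to a temporary file for demonstration.
sample = (
    "#*First title.\n#@José A. Blakeley\n#t1995\n#index0\n"
    "#*Second title.\n#t1995\n#index1\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.txt"
    sample_path.write_text(sample, encoding="utf-8")
    records = load_records(sample_path)

print(records)
```

Each element keeps its leading '#*', and the trailing newline at the end of the file does not produce an extra empty record.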
Upvotes: 1
Reputation: 1092
How about this:
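d = "#*"
# Data is the whole file as one string (from loadingFile.read()), so iterating
# over it would yield single characters. Flatten the newlines, split on the
# delimiter, and put the delimiter back in front of every non-empty piece.
output = [d + e.strip() for e in Data.replace("\n", " ").split(d) if e.strip()]
print(output)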
Upvotes: 1
Reputation: 2600
How about that:
data = ['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A.Blakeley #t1995#cModern Database Systems #index0 #*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']
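# Skip the empty string that split() yields before the first '#*',
# and trim the space left in front of each following '#*'.
lst = ['#*' + segment.strip() for segment in data[0].split(sep='#*') if segment.strip()]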
print(lst)
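If the data comes straight from the file rather than from an already joined string, an alternative that avoids splitting altogether is to walk the lines and start a new record whenever a line begins with '#*'. This is a rough sketch; the function name group_records and the sample lines are made up for illustration.

```python
def group_records(lines, marker="#*"):
    """Group consecutive lines into records; a line starting with `marker`
    opens a new record. `group_records` is a hypothetical helper."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(marker) or not records:
            records.append(line)          # start a new record
        else:
            records[-1] += " " + line     # append the field to the current record
    return records


# Made-up sample; an open file object could be passed in the same way.
sample_lines = [
    "#*First title.\n",
    "#@Hector Garcia-Molina\n",
    "#index0\n",
    "#*Second title.\n",
    "#index1\n",
]
grouped = group_records(sample_lines)
print(grouped)
```

Because nothing is split on '#*', the marker is never removed and does not have to be added back afterwards, and hyphens or other punctuation inside the fields are left untouched.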
Upvotes: 1