Ahmed Ashraf

Reputation: 231

Reading a whole file and storing it in a list without losing content when splitting it

I want to read a file in Python and store it in a list, one element per record, without losing any data. Splitting on the "#*" record marker drops the marker itself from every element.

loadingFile = open('lorem.txt','r')
Data = loadingFile.read()

#print(Data)

data = Data.split("#*")
print(data)

Input from the dataset:

#*OQL[C++]: Extending C++ with an Object Query Capability.

#@José A. Blakeley

#t1995

#cModern Database Systems

#index0

#*Transaction Management in Multidatabase Systems.

#@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz

#t1995

#cModern Database Systems

#index1

Required Output:

List = ['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A.Blakeley #t1995#cModern Database Systems #index0','#*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']

Upvotes: 0

Views: 47

Answers (4)

Vansh

Reputation: 11

This should work.


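with open('file.txt', 'r') as loadingFile:
    data = loadingFile.read()

# Split into lines (named `lines` so the built-in `list` is not shadowed)
lines = data.split("\n")

# Join the lines directly; joining on "-" and stripping it afterwards
# would also delete real hyphens such as the one in "Garcia-Molina"
joined = "".join(lines)

# Drop any literal backslashes before printing
print(joined.replace("\\", ""))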

Upvotes: 0

Andrej Kesely

Reputation: 195623

One possible solution with the `re` module:

data = '''#*OQL[C++]: Extending C++ with an Object Query Capability.

#@José A. Blakeley

#t1995

#cModern Database Systems

#index0

#*Transaction Management in Multidatabase Systems.

#@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz

#t1995

#cModern Database Systems

#index1'''

import re

lst = re.findall(r'(#\*.*?)\s*(?=#\*|\Z)', re.sub(r'\n+', ' ', data), flags=re.DOTALL)

# pprint is used here only for pretty printing, all the data are in list `lst`
from pprint import pprint
pprint(lst, width=180)

Prints:

['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A. Blakeley #t1995 #cModern Database Systems #index0',
 '#*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']
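A variation on the same idea, offered only as a sketch: instead of matching each record with `re.findall`, split at a zero-width lookahead so the `#*` marker is never consumed. The sample text is shortened to two fields per record, and the names `flat` and `records` are chosen here for illustration. Splitting on an empty match assumes Python 3.7 or later.

```python
import re

# Shortened sample: two records, two fields each
data = '''#*OQL[C++]: Extending C++ with an Object Query Capability.

#t1995

#*Transaction Management in Multidatabase Systems.

#t1995'''

# Collapse every run of whitespace (including blank lines) to one space
flat = re.sub(r'\s+', ' ', data).strip()

# Split at each position that is followed by "#*"; the lookahead consumes
# nothing, so the marker stays attached to its record (Python 3.7+)
records = [part.strip() for part in re.split(r'(?=#\*)', flat) if part.strip()]

print(records)
```

The `if part.strip()` filter drops the empty string that `re.split` returns when the text begins with the marker.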

Upvotes: 1

Nidhin Bose J.

Reputation: 1092

How about this:

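d = "#*"
# Data is the whole file as one string (from loadingFile.read()), so split it
# directly; looping over it would yield single characters, not lines
output = [d + e for e in Data.split(d) if e]
print(output)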
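If looping over lines is preferred to splitting the whole string at once, the records can also be grouped line by line. This is only a sketch: `group_records` is a hypothetical helper, and the sample lines are given inline rather than read from the file.

```python
def group_records(lines, marker="#*"):
    """Group lines into records, starting a new record at each marker line."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip the blank lines between fields
        if line.startswith(marker) or not records:
            records.append(line)       # a new record begins here
        else:
            records[-1] += " " + line  # a field of the current record
    return records


# Inline stand-in for the lines of the file
sample = [
    "#*OQL[C++]: Extending C++ with an Object Query Capability.\n",
    "\n",
    "#@José A. Blakeley\n",
    "#t1995\n",
    "#*Transaction Management in Multidatabase Systems.\n",
    "#index1\n",
]
output = group_records(sample)
print(output)
```

Since an open file object iterates over its lines, the same helper could be called as `group_records(f)` inside a `with open('lorem.txt') as f:` block, which avoids loading the whole file into memory first.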

Upvotes: 1

logicOnAbstractions

Reputation: 2600

How about that:

data = ['#*OQL[C++]: Extending C++ with an Object Query Capability. #@José A.Blakeley #t1995#cModern Database Systems #index0 #*Transaction Management in Multidatabase Systems. #@Yuri Breitbart,Hector Garcia-Molina,Abraham Silberschatz #t1995 #cModern Database Systems #index1']


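# `if segment` skips the empty string that split() returns before the first '#*'
lst = ['#*' + segment.strip() for segment in data[0].split(sep='#*') if segment]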
print(lst)

Upvotes: 1
