user3932077

Reputation: 77

How to keep particular words together when using split() in Python

I am using split() to convert my string into a list, but some values that I want to keep together get separated. Below is my example.

I have the string "Ambala Cantt. 1.2 Bitter Gourd 1200 2000 1500", and after splitting it I want ['Ambala Cantt.', '1.2', 'Bitter Gourd', '1200', '2000', '1500'], but I am getting ['Ambala', 'Cantt.', '1.2', 'Bitter', 'Gourd', '1200', '2000', '1500'], which is not what I want.

I am using split() because I need to convert my string into a list so that I can store each value in my database. Can anyone tell me how to resolve this, or suggest a better way to convert my string into a list?

Upvotes: 2

Views: 111

Answers (2)

TkTech

Reputation: 4976

Looks like you're trying to parse results for Mandi pricing from http://agmarknet.nic.in/. These have a predictable pattern.

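import re

example = "Ambala Cantt. 1.2 Bitter Gourd 1200 2000 1500"
print([c.strip() for c in re.match(r"""
    (?P<market>[^0-9]+)    # market name: everything before the first digit
    (?P<arrivals>[^ ]+)    # arrivals: the next run of non-space characters
    (?P<variety>[^0-9]+)   # variety: everything up to the next digit
    (?P<min>[0-9]+)        # minimum price
    \ (?P<max>[0-9]+)      # maximum price, after a literal space
    \ (?P<modal>[0-9]+)    # modal price, after a literal space""",
    example,
    re.VERBOSE
).groups()])
# Output: ['Ambala Cantt.', '1.2', 'Bitter Gourd', '1200', '2000', '1500']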
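
Since the goal is to store each value in a database, it may be more convenient to get the fields by name and to handle lines that do not fit the pattern. The following is a minimal sketch that reuses the same pattern; `ROW_PATTERN` and `parse_row` are hypothetical names introduced here for illustration, and it assumes every row follows the layout of the example string.

```python
import re

# Same pattern as in the answer above, compiled once so it can be reused.
ROW_PATTERN = re.compile(r"""
    (?P<market>[^0-9]+)
    (?P<arrivals>[^ ]+)
    (?P<variety>[^0-9]+)
    (?P<min>[0-9]+)
    \ (?P<max>[0-9]+)
    \ (?P<modal>[0-9]+)""", re.VERBOSE)


def parse_row(line):
    """Return a dict of named fields, or None if the line does not match.

    Hypothetical helper, not part of any library.
    """
    match = ROW_PATTERN.match(line)
    if match is None:
        return None
    # groupdict() maps each named group to the text it captured.
    return {name: value.strip() for name, value in match.groupdict().items()}


row = parse_row("Ambala Cantt. 1.2 Bitter Gourd 1200 2000 1500")
print(row)
```

Returning None for a non-matching line lets the caller skip or log bad rows instead of failing with an AttributeError on `.groups()`.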

Upvotes: 2

mikequentel

Reputation: 288

You need to find a consistent pattern in the input (I'm assuming this dataset contains a lot of strings with inconsistent delimiters), then possibly use a regex to perform the split: https://docs.python.org/2/library/re.html
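
As one possible illustration of a regex-based split, the sketch below splits only on whitespace that sits next to a digit, so multi-word names stay together. It assumes that text fields never start or end with a digit, which holds for the example string but may not hold for the whole dataset; `FIELD_BOUNDARY` and `split_fields` are hypothetical names.

```python
import re

# Split on whitespace that is followed by a digit or preceded by a digit.
# Whitespace between two words (no digit on either side) is left alone.
FIELD_BOUNDARY = re.compile(r"\s+(?=\d)|(?<=\d)\s+")


def split_fields(line):
    """Hypothetical helper: split a row into fields at digit boundaries."""
    return FIELD_BOUNDARY.split(line.strip())


fields = split_fields("Ambala Cantt. 1.2 Bitter Gourd 1200 2000 1500")
print(fields)
# ['Ambala Cantt.', '1.2', 'Bitter Gourd', '1200', '2000', '1500']
```

This is shorter than a full match pattern but does not validate the row, so a name that contains a digit would be split in the wrong place.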

OpenRefine could help with scrubbing the strings if they come from an input file.

Upvotes: 1
