An Yan

Reputation: 131

How do I write my own split function without using the .split and .strip methods?

How can I write my own split function? I think I need to remove spaces, '\t' and '\n', but I don't know enough yet to see how to approach it.

Here is the original question:

Write a function split(string) that returns a list of words in the given string. Words may be separated by one or more spaces ' ' , tabs '\t' or newline characters '\n' .

And there are examples:

words = split('duff_beer 4.00') # ['duff_beer', '4.00']
words = split('a b c\n') # ['a', 'b', 'c']
words = split('\tx y \n z ') # ['x', 'y', 'z']

Restrictions: Don't use the str.split method! Don't use the str.strip method!

Upvotes: 2

Views: 10609

Answers (14)

Dos Berdimbet

Reputation: 1

def split(data: str, sep=None, maxsplit=-1):
    words = []
    word = ''
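    if maxsplit == 0:
        # No splitting requested: return the text as one item. Like str.split,
        # skip leading whitespace when sep is None, without calling str.strip.
        start = 0
        while sep is None and start < len(data) and data[start].isspace():
            start += 1
        if sep is not None or start < len(data):
            words.append(data[start:])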
    elif (maxsplit == -1 or maxsplit > 0) and sep is not None:
        parts = []
        part = ''
        i = 0
        splits = 0
        while i < len(data):
            if data[i:i + len(sep)] == sep:
                parts.append(part)
                part = ''
                i += len(sep) - 1
                splits += 1
                if maxsplit != -1 and splits >= maxsplit:
                    break
            else:
                part += data[i]
            i += 1
        part += data[i + 1:]
        parts.append(part)
        return parts
    elif maxsplit == -1 and sep is None:
        for char in data:
            if char.isspace():
                if word != '' and word:
                    words.append(word)
                word = ''
            elif char:
                word += char
        if word:
            words.append(word)
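    elif maxsplit > 0 and sep is None:
        splits = 0
        i = 0
        while i < len(data):
            char = data[i]
            if char.isspace():
                if word:
                    words.append(word)
                    word = ''
                    splits += 1
                    if splits >= maxsplit:
                        # Skip the whitespace run and keep the rest as one item.
                        while i < len(data) and data[i].isspace():
                            i += 1
                        if i < len(data):
                            words.append(data[i:])
                        break
            else:
                word += char
            i += 1
        # Keep a last word that was not followed by whitespace.
        if word:
            words.append(word)
    return words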

Upvotes: 0

Vansh Singhal

Reputation: 1

The built-in split ignores leading and trailing whitespace and then turns the string into a list, splitting on the separator you pass to it. By default it splits on any whitespace; I have implemented this for the space character " " only.

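def my_split(string):
    ans = []
    word = ""
    for character in string:
        if character != " ":
            word += character
        elif word:
            # A space ends the current word. Skipping empty words also handles
            # leading, trailing and repeated spaces without calling strip().
            ans.append(word)
            word = ""
    if word:
        ans.append(word)
    return ans

print(my_split(input("Enter string")))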
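
For reference, here is a small check of how the built-in behaves (used only for comparison, not as part of the solution). With no argument it splits on runs of any whitespace and drops empty strings, while an explicit " " splits at every single space. The sample text is just an illustration:

```python
text = " a  b\tc "

# No argument: a run of any whitespace acts as one separator, no empty strings.
default_result = text.split()
print(default_result)

# Explicit " ": every single space is a separator, so empty strings appear
# and the tab stays inside the word.
space_result = text.split(" ")
print(space_result)
```

So a hand-written version that should match the default behaviour has to skip empty words and treat tabs and newlines as separators too.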

Upvotes: 0

Praveen Paliwal

Reputation: 65

All the above answers are good. Here is a similar solution that uses an extra list to filter out the empty strings.

def my_split(s):
    l1 = []
    l2 = []
    word = ''
    spaces = ['', '\t', ' ']
    for letters in s:
        if letters != ' ':
            word += letters
        else:
            l1.append(word)
            word = ''
    if word:
        l1.append(word)

    for words in l1:
        if words not in spaces:
            l2.append(words)

    return l2


my_string = '       The old fox jumps into the deep river'
y = my_split(my_string)
print(y)

Upvotes: 0

Shck Tchamna

Reputation: 135

It is always a good idea to write out the algorithm before coding. This is the procedure for splitting words on delimiters without using str.split or str.strip:

  1. Initialize an empty list called result, which will hold the resulting words, and an empty string word = "", which accumulates the characters of the current word.

  2. As long as the current character is not a delimiter, append it to word.

  3. When you reach a delimiter and len(word) == 0, skip to the next iteration. This discards leading and repeated delimiters.

  4. When you reach a delimiter and len(word) != 0, append word to result, reset word to "" and skip to the next iteration.

  5. After the loop, append word to result if it is not empty (the string may not end with a delimiter), then return result.


def my_split(s, delimiter=(" ", "\t", "\n")):
  result, word = [], ""  # Step 1

  for char in s:
    if char in delimiter:
      if len(word) != 0:  # Step 4: a delimiter ends the current word
        result.append(word)
        word = ""
      continue  # Steps 3 and 4: jump to the next iteration

    word = word + char  # Step 2

  if len(word) != 0:  # Step 5: keep a last word not followed by a delimiter
    result.append(word)
  return result

print(my_split("        how are    you?  please split me now!       "))

Upvotes: 0

BorisConstantin

Reputation: 1

def mysplit(strng):
    my_string = ''
    liste = []
    for x in range(len(strng)):
        if strng[x] not in ' \t\n':
            my_string += strng[x]
        # A separator or the end of the input closes the current word.
        if strng[x] in ' \t\n' or x + 1 == len(strng):
            if my_string != '':
                liste.append(my_string)
            my_string = ''
    return liste

Upvotes: 0

Chris Campbell

Reputation: 1

This handles extra spaces in the string and returns an empty list if the input is empty or contains only spaces.

def mysplit(strng):
    result = []
    words = ''

    for char in strng:
        if char != ' ':
            words += char
        else:
            if words:
                result.append(words)
            words = ''

    # Append the last word only if there is one, so no empty string ever
    # enters the list and nothing has to be removed afterwards.
    if words:
        result.append(words)

    return result

print(mysplit("To be or not to be, that is the question"))
print(mysplit("To be or not to be,that is the question"))
print(mysplit("   "))
print(mysplit(" abc "))
print(mysplit(""))

Upvotes: 0

Tapas

Reputation: 75

Here a is the string and s is the separator pattern.

a = "Tapas Pall Tapas TPal TapP al Pala"
s = "Tapas"

def fun(a, s):
    # Assumes s is not empty.
    st = ""
    l = len(s)
    li = []
    i = 0
    while i < len(a):
        if a[i:i + l] == s:
            # Separator found: store the text collected so far and jump past it.
            li.append(st)
            st = ""
            i += l
        else:
            st += a[i]
            i += 1
    li.append(st)
    return li

print(fun(a, s))
print(a.split(s))  # built-in result, for comparison only

Upvotes: 0

Benjamin Smith

Reputation: 1

Some of the solutions here are very good, but it seems to me you can also do this inline, without defining a function:

values = 'This is a sentence'
split_values = []
tmp = ''
for words in values:
    if words == ' ':
        split_values.append(tmp)
        tmp = ''
    else:
        tmp += words
if tmp:
    split_values.append(tmp)
print(split_values)

Upvotes: 0

kalehmann

Reputation: 5011

Some of the comments on your question provide really interesting ideas to solve the problem with the given restrictions.

But assuming you should not use any of Python's built-in split functions, here is another solution:

def split(string, delimiters=' \t\n'):
    result = []
    word = ''
    for c in string:
        if c not in delimiters:
            word += c
        elif word:
            result.append(word)
            word = ''

    if word:
        result.append(word)

    return result

Example output:

>>> split('duff_beer 4.00')
['duff_beer', '4.00']
>>> split('a b c\n')
['a', 'b', 'c']
>>> split('\tx y \n z ')
['x', 'y', 'z']
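
Because delimiters is an ordinary parameter, the same loop can also split on other single characters. A minimal sketch, repeating the function so the snippet runs on its own; the sample string and the name csv_fields are only illustrations:

```python
def split(string, delimiters=' \t\n'):
    result = []
    word = ''
    for c in string:
        if c not in delimiters:
            word += c
        elif word:
            result.append(word)
            word = ''

    if word:
        result.append(word)

    return result

# Treat both ',' and ';' as separators; repeated separators are skipped.
csv_fields = split('a,b;;c', delimiters=',;')
print(csv_fields)
```

Note that, like the whitespace case, this never produces empty strings, which differs from what str.split does when given an explicit separator.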

Upvotes: 7

Artiom Kozyrev

Reputation: 3836

Here is my solution. It is not the best one, but it works:

def convert_list_to_string(b):
    localstring=""
    for i in b:
        localstring+=i
    return localstring

def convert_string_to_list(b):
    locallist=[]
    for i in b:
        locallist.append(i)
    return locallist

def mysplit(inputString, separator):
    listFromInputString=convert_string_to_list(inputString)
    part=[]
    result=[]
    j=0
    for i in range(0, len(listFromInputString)):
        if listFromInputString[i]==separator:
            part=listFromInputString[j:i]
            j=i+1
            # Use the helper defined above (convert_to_string does not exist).
            result.append(convert_list_to_string(part))
    if j != 0:
        result.append(convert_list_to_string(listFromInputString[j:]))
    if len(result)==0:
        result.append(inputString)
    return result

Test:

mysplit("deesdfedefddfssd", 'd')

Result: ['', 'ees', 'fe', 'ef', '', 'fss', '']

Upvotes: 1

Igl3

Reputation: 5108

One approach is to iterate over every character until you find a separator, build a string from those characters and append it to the output list, like this:

def split(input_str):
    out_list = []
    word = ""
    for c in input_str:
        if c not in "\t\n ":
            word += c
        elif word:
            # Only store non-empty words, so repeated separators are skipped.
            out_list.append(word)
            word = ""
    if word:
        out_list.append(word)
    return out_list

a = "please\nsplit\tme now"
print(split(a))

# will print: ['please', 'split', 'me', 'now']

Another option is to use a regular expression:

import re

def split(input_str):
    out_list = []
    for m in re.finditer(r'\S+', input_str):
        out_list.append(m.group(0))

    return out_list

a = "please\nsplit\tme now"
print(split(a))

# will print: ['please', 'split', 'me', 'now']

The regex \S+ matches any sequence of non-whitespace characters, and re.finditer returns an iterator of match objects over all non-overlapping matches of the pattern.
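
To see what each match carries, here is a small sketch (the variable names are only illustrative) that collects every word together with its start and end index:

```python
import re

text = "please\nsplit\tme now"

# Each match knows the matched text and where in the input it was found.
matches = [(m.group(0), m.start(), m.end()) for m in re.finditer(r'\S+', text)]
for word, start, end in matches:
    print(word, start, end)
```

The positions can be handy if you later need to know where a word came from, which a plain list of words does not tell you.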

Upvotes: 3

Pomonoli

Reputation: 41

I think using regular expressions is your best option as well.

I would try something like this:

import re
def split(string):
    return re.findall(r'\S+', string)

This should return a list of all runs of non-whitespace characters in your string.

Example output:

>>> split('duff_beer 4.00')
['duff_beer', '4.00']
>>> split('a b c\n')
['a', 'b', 'c']
>>> split('\tx y \n z ')
['x', 'y', 'z']

Upvotes: 4

blhsing

Reputation: 106618

You can use the following function that sticks to the basics, as your professor apparently prefers:

def split(s):
    output = []
    delimiters = {' ', '\t', '\n'}
    delimiter_found = False
    for c in s:
        if c in delimiters:
            delimiter_found = True
        elif output:
            if delimiter_found:
                output.append('')
                delimiter_found = False
            output[-1] += c
        else:
            output.append(c)
            # Reset the flag so a leading delimiter does not split the first word.
            delimiter_found = False
    return output

so that:

print(split('duff_beer 4.00'))
print(split('a b c\n'))
print(split('\tx y \n z '))

would output:

['duff_beer', '4.00']
['a', 'b', 'c']
['x', 'y', 'z']

Upvotes: 2

Karn Kumar

Reputation: 8816

Here is what you can do by building up a list. This was tested on Python 3.6.

Below is just an example:

values = 'This is a sentence'
split_values = []
tmp  = ''
for words in values:
    if words == ' ':
        split_values.append(tmp)
        tmp = ''
    else:
        tmp += words
if tmp:
    split_values.append(tmp)
print(split_values)

Desired output:

$ ./splt.py
['This', 'is', 'a', 'sentence']
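
Note that the example above splits on single spaces only. One way to check a hand-written version against the examples from the question is to compare it with the built-in in a test only. In this sketch, my_split is a hypothetical stand-in for whichever implementation you want to check, and the sample list is an assumption:

```python
def my_split(text):
    # Hypothetical stand-in: collect characters until a space, tab or newline.
    words, current = [], ''
    for ch in text:
        if ch in ' \t\n':
            if current:
                words.append(current)
            current = ''
        else:
            current += ch
    if current:
        words.append(current)
    return words

samples = ['duff_beer 4.00', 'a b c\n', '\tx y \n z ', '', '   ']
for sample in samples:
    # str.split() is used only as a reference here, not inside my_split.
    assert my_split(sample) == sample.split(), sample
print('all samples match')
```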

Upvotes: 2
