OzzyW

Reputation: 117

get text of xml element is not working: AttributeError: 'NoneType' object has no attribute 'findall'

The code below gets information about an artist using the Last.fm API. It then stores the name in bands[Name] and each tag name (e.g. rock) of the artist in bands[Tags]. Getting and storing the name works fine, but it does not work for the tags. This error appears:

Traceback (most recent call last):
  File "C:/Users/Ozzy/PycharmProjects/getData/getData.py", line 19, in <module>
    for tag in artist.find('tags').findall('tag'):
AttributeError: 'NoneType' object has no attribute 'findall'

Minimal working example to demonstrate the error:

import xml.etree.ElementTree as ET
import requests

ID = 1

chosen = "U2"
artist = requests.get(
    'http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=U2&api_key=b088cbedecd40b35dd89e90f55227ac2')
tree = ET.fromstring(artist.content)
for child in tree:
    for artist in child:
        print(artist)
        for tag in artist.find('tags').findall('tag'):
            print(tag.find('name').text)

The response has this format:

<lfm status="ok">
<artist>
<name>U2</name>
<tags>
<tag>
<name>rock</name>
<url>https://www.last.fm/tag/rock</url>
</tag>
<tag>
<name>alternative</name>
<url>https://www.last.fm/tag/alternative</url>
</tag>
</tags>  
</artist>
</lfm>

Full working example: the code gets the top artists for a specific country and then collects information about each artist. The part that gets and stores each artist's tags is commented out because it raises the NoneType error above:

import xml.etree.ElementTree as ET
import requests
import json
ID = 1

api_key = "b088cbedecd40b35dd89e90f55227ac2" 

bands = {}
# GET TOP ARTISTS
for i in range(2, 3):
    artistslist = requests.get(
        'http://ws.audioscrobbler.com/2.0/?method=geo.gettopartists&country=spain&page='+str(i) +'&api_key=' + api_key)
    tree = ET.fromstring(artistslist.content)
    for child in tree:
        for artist in child.findall('artist'):
            name = artist.find('name').text
            url = artist.find('url').text
            bands[ID] = {}
            bands[ID]['ID'] = ID
            bands[ID]['Name'] = name
            bands[ID]['URL'] = url
            ID += 1



# GET ARTIST INFO
for i, v in bands.items():

    chosen = bands[i]['Name'].replace(" ", "+")
    artist = requests.get(
        'http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist=' + chosen + '&api_key=' + api_key)
    tree = ET.fromstring(artist.content)
    for child in tree:
        for artist in child:
            #for tag in artist.find('tags').findall('tag'):
                #print(tag.find('name').text)
                #bands[i][Tags] = tag.find('name').text


            if (artist.get('size') == "large"):
                if (artist.text is not None):
                    bands[i]['Image'] = artist.text
            for bio in artist.findall('summary'):
                if (bio.text is not None):
                    bands[i]['Description'] = bio.text
                else:
                    bands[i]['Description'] = bio.text
    print(bands[i]['Name'] + " INFO RETRIEVED")

with open('artists.json', 'w') as outfile:
    json.dump(bands, outfile)

with open('artists.json') as data_file:
    bands = json.load(data_file)

data_file.close()

Do you know how to fix this issue?

Upvotes: 2

Views: 3122

Answers (1)

Maximilian Peters

Reputation: 31649

Your loops go one level too deep.

<lfm status="ok">       --> tree
  <artist>                --> child in tree
    <name>U2</name>         --> for artist in child
    <tags>..</tags>

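<tags> is a direct child of child (the <artist> element), not of the elements you get by looping over child. Since find() only searches direct children, artist.find('tags') returns None.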
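
To see the difference in isolation, here is a minimal sketch that parses a trimmed copy of the response from a string instead of calling the API. The sample XML is an assumption modelled on the format shown in the question, and the variable names are only illustrative:

```python
import xml.etree.ElementTree as ET

# Trimmed sample modelled on the response format in the question (not a live API call)
SAMPLE = """<lfm status="ok">
  <artist>
    <name>U2</name>
    <tags>
      <tag><name>rock</name></tag>
      <tag><name>alternative</name></tag>
    </tags>
  </artist>
</lfm>"""

tree = ET.fromstring(SAMPLE)      # tree is the <lfm> root
artist = tree.find('artist')      # <artist> is a direct child of <lfm>

# One level too deep: <name> has no <tags> child, so find() returns None
name_elem = artist.find('name')
missing = name_elem.find('tags')

# Correct level: <tags> is a direct child of <artist>
tag_names = [tag.find('name').text for tag in artist.find('tags').findall('tag')]
print(missing, tag_names)
```

Calling .findall('tag') on missing would raise the same AttributeError as in the traceback, because missing is None.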

You can shorten your loop to:

for band in bands.values():       
    url = 'http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&artist={}&api_key={}'.format(band['Name'], api_key)
    artist = requests.get(url)
    tree = ET.fromstring(artist.content)
    if tree.find('artist') is None:
        continue
    # Iterate over the element directly; getchildren() was removed in Python 3.9
    for child in tree.find('artist'):
        if child.get('size') == "large":
            if (child.text is not None):
                band['Image'] = child.text
        for bio in child.findall('summary'):
            if bio.text is not None:
                band['Description'] = bio.text
            else:
                band['Description'] = ""
        for tag in child.findall('tag'):
            if band.get('Tags'):
                band['Tags'].append(tag.find('name').text)
            else:
                band['Tags'] = [tag.find('name').text]
    print(band['Name'] + " INFO RETRIEVED")

A few notes:

  • It is easier and more efficient to loop over the keys in a dict by using for k in my_dict or over the values with for val in my_dict.values()
  • You are overwriting your Tags with the last value; using a list and appending to it makes sure you save all values
  • Your if/else statement (if (bio.text is not None):) behaves identically regardless of the condition, because both branches assign bio.text
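
As a rough sketch of the second note, the tag names can be collected into a list with dict.setdefault and an ElementTree path expression. collect_tags is a hypothetical helper, and the sample XML is again an assumption based on the response format in the question:

```python
import xml.etree.ElementTree as ET

def collect_tags(band, artist_elem):
    """Append every tag name found under <artist> to band['Tags'] (hypothetical helper)."""
    # 'tags/tag/name' walks <artist> -> <tags> -> <tag> -> <name> in one call
    for name in artist_elem.findall('tags/tag/name'):
        # setdefault creates the list on first use, so no value is overwritten
        band.setdefault('Tags', []).append(name.text)
    return band

# Assumed sample data, not a live API response
sample = ET.fromstring(
    '<lfm status="ok"><artist><name>U2</name><tags>'
    '<tag><name>rock</name></tag><tag><name>alternative</name></tag>'
    '</tags></artist></lfm>'
)
band = collect_tags({'Name': 'U2'}, sample.find('artist'))
print(band)
```

If an artist has no tags, findall returns an empty list and band is left without a 'Tags' key, so callers may want band.get('Tags', []) when reading it back.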

Upvotes: 1
