Graham Klyne
Graham Klyne

Reputation: 846

Cinder volume creation failing on openstack: No valid host was found

I'm posting this as a question to report a problem (and workaround) I encountered that doesn't seem to be covered by other questions. It's probably quite specific to the software setup I'm using, but in case it helps...

This was on a single-node configuration that had been running successfully for many years (Ubuntu 12.04, Havana OpenStack), but this was the first time in a while I had attempted to create a new VM image.

The command I ran is this:

cinder create 50 --display_name bionic-test-annalist-50Gb \
                 --volume_type lvm-scsi \
                 --image-id 5121d3e9-ef3d-4ff9-a5b9-f2f31c08cbbe \
                 --availability-zone nova

Following which I see this volume status:

root@seldon:/etc/cinder# cinder list
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+
|                  ID                  | Status |      Display Name     | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+
| 26277f8f-e0cd-43e7-8e5c-c42b0be21706 | in-use |  dhoxss-annalist-50Gb |  50  |   lvm-scsi  |   true   | d436f20c-5f8f-47cb-9ad5-eacaf6bda882 |
| 852fd771-71ec-4d0a-ae62-b48b5e35ff93 | in-use |   demo-annalist-50Gb  |  50  |   lvm-scsi  |   true   | eac53b50-54f3-4e93-804d-91569e1ed337 |
| abe7e7e6-502c-48b5-95ef-207891076e11 | in-use |   test-databank-50Gb  |  50  |   lvm-scsi  |   true   | 367bddfe-da43-40f2-a23c-75a5dac5225e |
| afa05ae4-e956-446b-bb26-a1439502435c | error  |  bionic-annalist-50Gb |  50  |   lvm-scsi  |  false   |                                      |
| ce7e0d7b-dfe3-4c8a-a541-91d9b6b388d9 | in-use | fast-performance-50Gb |  50  |   lvm-scsi  |   true   | 233a8924-cfd0-4f2c-a242-d596f1bb0cee |
| da9a5222-246e-4697-b10e-02c9a912d4b6 | in-use |   dev-annalist-50Gb   |  50  |   lvm-scsi  |   true   | 463ffed0-7a31-467b-9ec6-a5acdbf72723 |
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+

The Cinder log file (I think it was /var/log/cinder/cinder-scheduler.log) shows this:

2018-10-10 18:29:49.803 2111 WARNING cinder.scheduler.host_manager [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] volume service is down or disabled. (host: seldon)
2018-10-10 18:29:49.804 2111 WARNING cinder.scheduler.host_manager [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] volume service is down or disabled. (host: seldon@lvmdriver-scsi)
2018-10-10 18:29:49.805 2111 ERROR cinder.volume.flows.create_volume [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] Failed to schedule_create_volume: No valid host was found.

Specifically note: Failed to schedule_create_volume: No valid host was found.

And the service-list confirms that the service is not running.

root@seldon:/etc/cinder# cinder service-list
+------------------+-----------------------+------+---------+-------+----------------------------+
|      Binary      |          Host         | Zone |  Status | State |         Updated_at         |
+------------------+-----------------------+------+---------+-------+----------------------------+
| cinder-scheduler |         seldon        | nova | enabled |   up  | 2018-10-10T17:30:07.000000 |
|  cinder-volume   |         seldon        | nova | enabled |  down | 2014-03-11T14:17:02.000000 |
|  cinder-volume   |  seldon@lvmdriver-sas | nova | enabled |   up  | 2018-10-10T17:30:12.000000 |
|  cinder-volume   | seldon@lvmdriver-scsi | nova | enabled |  down | 2018-10-10T17:27:55.000000 |
+------------------+-----------------------+------+---------+-------+----------------------------+

Given that system was previously working, and the existing VMs are still fine, what's going on here? Googling didn't uncover any fixes.

Upvotes: 0

Views: 2022

Answers (1)

Graham Klyne
Graham Klyne

Reputation: 846

TL;DR: tgt-admin --show was adding non-ASCII characters to its output, which was causing the output parser in Cinder to barf. A patch to the code skips lines with non-ASCII characters (see below).


Digging around in the log files uncovered this report:

2018-10-10 17:57:17.067 6970 ERROR cinder.service [req-a950a5bb-4f24-42dd-8ffc-4b2dd9153659 None None] Unhandled exception
2018-10-10 17:57:17.067 6970 TRACE cinder.service Traceback (most recent call last):
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 228, in _start_child
2018-10-10 17:57:17.067 6970 TRACE cinder.service     self._child_process(wrap.server)
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 205, in _child_process
2018-10-10 17:57:17.067 6970 TRACE cinder.service     launcher.run_server(server)
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 96, in run_server
2018-10-10 17:57:17.067 6970 TRACE cinder.service     server.start()
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 385, in start
2018-10-10 17:57:17.067 6970 TRACE cinder.service     self.manager.init_host()
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 209, in init_host
2018-10-10 17:57:17.067 6970 TRACE cinder.service     self.driver.ensure_export(ctxt, volume)
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/lvm.py", line 525, in ensure_export
2018-10-10 17:57:17.067 6970 TRACE cinder.service     old_name=old_name)
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/lvm.py", line 444, in _create_tgtadm_target
2018-10-10 17:57:17.067 6970 TRACE cinder.service     old_name=old_name)
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/brick/iscsi/iscsi.py", line 231, in create_iscsi_target
2018-10-10 17:57:17.067 6970 TRACE cinder.service     if not self._verify_backing_lun(iqn, tid):
2018-10-10 17:57:17.067 6970 TRACE cinder.service   File "/usr/lib/python2.7/dist-packages/cinder/brick/iscsi/iscsi.py", line 114, in _verify_backing_lun
2018-10-10 17:57:17.067 6970 TRACE cinder.service     if iqn in line and "Target %s" % tid in line:
2018-10-10 17:57:17.067 6970 TRACE cinder.service UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 0: ordinal not in range(128)
2018-10-10 17:57:17.067 6970 TRACE cinder.service
2018-10-10 17:57:17.088 6965 INFO cinder.service [-] Child 6970 exited with status 2
2018-10-10 17:57:17.088 6965 INFO cinder.service [-] _wait_child 1
2018-10-10 17:57:17.089 6965 INFO cinder.service [-] wait wrap.failed True

Note the error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 0.

The code at the reported error looks like this:

    for line in lines:
        if iqn in line and "Target %s" % tid in line:
            capture = True
        if capture:
            target_info.append(line)
        if iqn not in line and 'Target ' in line:
            capture = False

Looking at the stack trace and source code, I figured the code was trying to parse output generated by tgt-admin --show (see method TgtAdm._get_target, which is called from create_iscsi_target (about line 220), which then calls _verify_backing_lun where the error occurs). This was checked by running the command manually into less, and noting extra characters at the end of the output.

My patch/fix is to add a test to the output parser, in the form of a try block, thus:

    for line in lines:
        try:
            line.decode('ascii')
        except UnicodeDecodeError:
            continue # @@@@ skip lines with non-ASCII characters
        if iqn in line and "Target %s" % tid in line:
            capture = True
        if capture:
            target_info.append(line)
        if iqn not in line and 'Target ' in line:
            capture = False

This isn't ideal, but it got me out of the hole I was in.

Upvotes: 0

Related Questions