Reputation: 846
I'm posting this as a question to report a problem (and workaround) I encountered that doesn't seem to be covered by other questions. It's probably quite specific to the software setup I'm using, but in case it helps...
This was on a single-node configuration that had been running successfully for many years (Ubuntu 12.04, Havana OpenStack), but this was the first time in a while I had attempted to create a new VM image.
The command I ran is this:
cinder create 50 --display_name bionic-test-annalist-50Gb \
--volume_type lvm-scsi \
--image-id 5121d3e9-ef3d-4ff9-a5b9-f2f31c08cbbe \
--availability-zone nova
Following which I see this volume status:
root@seldon:/etc/cinder# cinder list
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+
| ID | Status | Display Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+
| 26277f8f-e0cd-43e7-8e5c-c42b0be21706 | in-use | dhoxss-annalist-50Gb | 50 | lvm-scsi | true | d436f20c-5f8f-47cb-9ad5-eacaf6bda882 |
| 852fd771-71ec-4d0a-ae62-b48b5e35ff93 | in-use | demo-annalist-50Gb | 50 | lvm-scsi | true | eac53b50-54f3-4e93-804d-91569e1ed337 |
| abe7e7e6-502c-48b5-95ef-207891076e11 | in-use | test-databank-50Gb | 50 | lvm-scsi | true | 367bddfe-da43-40f2-a23c-75a5dac5225e |
| afa05ae4-e956-446b-bb26-a1439502435c | error | bionic-annalist-50Gb | 50 | lvm-scsi | false | |
| ce7e0d7b-dfe3-4c8a-a541-91d9b6b388d9 | in-use | fast-performance-50Gb | 50 | lvm-scsi | true | 233a8924-cfd0-4f2c-a242-d596f1bb0cee |
| da9a5222-246e-4697-b10e-02c9a912d4b6 | in-use | dev-annalist-50Gb | 50 | lvm-scsi | true | 463ffed0-7a31-467b-9ec6-a5acdbf72723 |
+--------------------------------------+--------+-----------------------+------+-------------+----------+--------------------------------------+
The Cinder log file (I think it was /var/log/cinder/cinder-scheduler.log
) shows this:
2018-10-10 18:29:49.803 2111 WARNING cinder.scheduler.host_manager [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] volume service is down or disabled. (host: seldon)
2018-10-10 18:29:49.804 2111 WARNING cinder.scheduler.host_manager [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] volume service is down or disabled. (host: seldon@lvmdriver-scsi)
2018-10-10 18:29:49.805 2111 ERROR cinder.volume.flows.create_volume [req-4d12534f-abcd-499f-99cf-5f49d0308439 c570590c61be4ae5819c9b2d93986df2 1e701a6ab66141b9a64bfd963e301bc6] Failed to schedule_create_volume: No valid host was found.
Specifically note: Failed to schedule_create_volume: No valid host was found.
And the service-list confirms that the service is not running.
root@seldon:/etc/cinder# cinder service-list
+------------------+-----------------------+------+---------+-------+----------------------------+
| Binary | Host | Zone | Status | State | Updated_at |
+------------------+-----------------------+------+---------+-------+----------------------------+
| cinder-scheduler | seldon | nova | enabled | up | 2018-10-10T17:30:07.000000 |
| cinder-volume | seldon | nova | enabled | down | 2014-03-11T14:17:02.000000 |
| cinder-volume | seldon@lvmdriver-sas | nova | enabled | up | 2018-10-10T17:30:12.000000 |
| cinder-volume | seldon@lvmdriver-scsi | nova | enabled | down | 2018-10-10T17:27:55.000000 |
+------------------+-----------------------+------+---------+-------+----------------------------+
Given that system was previously working, and the existing VMs are still fine, what's going on here? Googling didn't uncover any fixes.
Upvotes: 0
Views: 2022
Reputation: 846
TL;DR: tgt-admin --show
was adding non-ASCII characters to its output, which was causing the output parser in Cinder to barf. A patch to the code skips lines with non-ASCII characters (see below).
Digging around in the log files uncovered this report:
2018-10-10 17:57:17.067 6970 ERROR cinder.service [req-a950a5bb-4f24-42dd-8ffc-4b2dd9153659 None None] Unhandled exception
2018-10-10 17:57:17.067 6970 TRACE cinder.service Traceback (most recent call last):
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 228, in _start_child
2018-10-10 17:57:17.067 6970 TRACE cinder.service self._child_process(wrap.server)
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 205, in _child_process
2018-10-10 17:57:17.067 6970 TRACE cinder.service launcher.run_server(server)
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 96, in run_server
2018-10-10 17:57:17.067 6970 TRACE cinder.service server.start()
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 385, in start
2018-10-10 17:57:17.067 6970 TRACE cinder.service self.manager.init_host()
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 209, in init_host
2018-10-10 17:57:17.067 6970 TRACE cinder.service self.driver.ensure_export(ctxt, volume)
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/lvm.py", line 525, in ensure_export
2018-10-10 17:57:17.067 6970 TRACE cinder.service old_name=old_name)
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/volume/drivers/lvm.py", line 444, in _create_tgtadm_target
2018-10-10 17:57:17.067 6970 TRACE cinder.service old_name=old_name)
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/brick/iscsi/iscsi.py", line 231, in create_iscsi_target
2018-10-10 17:57:17.067 6970 TRACE cinder.service if not self._verify_backing_lun(iqn, tid):
2018-10-10 17:57:17.067 6970 TRACE cinder.service File "/usr/lib/python2.7/dist-packages/cinder/brick/iscsi/iscsi.py", line 114, in _verify_backing_lun
2018-10-10 17:57:17.067 6970 TRACE cinder.service if iqn in line and "Target %s" % tid in line:
2018-10-10 17:57:17.067 6970 TRACE cinder.service UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 0: ordinal not in range(128)
2018-10-10 17:57:17.067 6970 TRACE cinder.service
2018-10-10 17:57:17.088 6965 INFO cinder.service [-] Child 6970 exited with status 2
2018-10-10 17:57:17.088 6965 INFO cinder.service [-] _wait_child 1
2018-10-10 17:57:17.089 6965 INFO cinder.service [-] wait wrap.failed True
Note the error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xf1 in position 0
.
The code at the reported error looks like this:
for line in lines:
if iqn in line and "Target %s" % tid in line:
capture = True
if capture:
target_info.append(line)
if iqn not in line and 'Target ' in line:
capture = False
Looking at the stack trace and source code, I figured the code was trying to parse output generated by tgt-admin --show
(see method TgtAdm._get_target
, which is called from create_iscsi_target
(about line 220), which then calls _verify_backing_lun
where the error occurs). This was checked by running the command manually into less
, and noting extra characters at the end of the output.
My patch/fix is to add a test to the output parser, in the form of a try
block, thus:
for line in lines:
try:
line.decode('ascii')
except UnicodeDecodeError:
continue # @@@@ skip lines with non-ASCII characters
if iqn in line and "Target %s" % tid in line:
capture = True
if capture:
target_info.append(line)
if iqn not in line and 'Target ' in line:
capture = False
This isn't ideal, but it got me out of the hole I was in.
Upvotes: 0