Reputation: 4235
I doubt that I'll get an answer here as AIX is very rare thing but I should try at least.
The background
We have the program. The program uses golang.org/x/crypto/ssh
library to connect to the remote services and do some things. The program is part of the large service and widely tested by end-users. It works without issues (at least related to connection) not only with all Linux-based clients (include quite old things like Ubuntu 12.02) but also with the clients on FreeBSD, OpenBSD, NetBSD, MacOSX, Solaris SPARC, HP-UX and other *nixes. So looks like it wasn't tested only on the Samsung refrigerators. And yesterday I was sure that it will be able to connect to the refrigerator and do what is needed without any issues. But that was yesterday...
The problem
Today we decided to add AIX support to our program. And we partly failed.
The problem description is simple: after pty
request program stops working. I mean I can do ssh.RequestPty
it executes without any issues but when I'm trying to execute commands after the app just hangs. Without errors, without nothing. Just hangs.
When it works?
requestPty
- everything works. But we need pty
for the sudo
. session.Shell
even with pty
requested. So if I write kind of interactive shell, it works perfectly.What have I tried so far
I tried to debug so far as I could. The last command that executes is ch.sendMessage(msg)
from ssh/channel.go
. I mean it writes packet and that's all. No data returned from the remote host.
For the tests, I used 3 versions of AIX - 5.3, 6.1 and 7.1. No difference.
OpenSSH versions are different:
All machines are running in LPARs but I doubt this is related to the issue.
I have no idea what is wrong. And I even can't say if this is common AIX issue or only our test machine. Here is the sample program that should write IT WORKS
if it works
package main
import (
"golang.org/x/crypto/ssh"
)
func main() {
server := "127.0.0.1:22"
user := "root"
p := "password"
config := &ssh.ClientConfig{
User: user,
Auth: []ssh.AuthMethod{ssh.Password(p)},
}
conn, err := ssh.Dial("tcp", server, config)
if err != nil {
panic(err.Error())
}
defer conn.Close()
session, err := conn.NewSession()
if err != nil {
panic(err.Error())
}
defer session.Close()
// Comment below and everything works
modes := ssh.TerminalModes{
ssh.ECHO: 0,
ssh.TTY_OP_ISPEED: 14400,
ssh.TTY_OP_OSPEED: 14400,
}
if err := session.RequestPty("xterm", 80, 40, modes); err != nil {
panic(err.Error())
}
// Comment above and everything works
session.Run("echo 1")
println("IT WORKS")
}
If you have AIX somewhere around and can run this code against it I'd appreciate your feedback.
If you have any ideas (even crazy) why it may fail and where else I can look, don't be shy.
Update (2017-03-02):
By suggestion from @LorinczyZsigmond I launched sshd
in debug mode. Results are a bit strange.
Here is part of Debian 9.0 OpenSSH_6.0p1 Debian-4+deb7u3, OpenSSL 1.0.1t 3 May 2016
log after sample program execution:
debug1: session_input_channel_req: session 0 req pty-req
debug1: Allocating pty.
debug1: session_pty_req: session 0 alloc /dev/pts/1
debug1: SELinux support disabled
debug1: server_input_channel_req: channel 0 request exec reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
debug2: fd 3 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug1: Setting controlling tty using TIOCSCTTY.
debug2: channel 0: rfd 10 isatty
debug2: fd 10 setting O_NONBLOCK
debug3: fd 8 is O_NONBLOCK
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
It works as expected.
Now the same block from AIX 7.1 OpenSSH_6.0p1, OpenSSL 1.0.1e 11 Feb 2013
log:
debug1: session_input_channel_req: session 0 req pty-req
debug1: Allocating pty.
debug1: session_pty_req: session 0 alloc /dev/pts/42
debug1: server_input_channel_req: channel 0 request exec reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
debug1: Values: options.num_allow_users: 0
debug1: RLOGIN VALUE :1
debug1: audit run command euid 0 user root command 'whoami'
setsid: Operation not permitted.
After setsid: Operation not permitted.
it does nothing until I kill it with Ctrl+C. When I kill it it returns:
debug2: fd 4 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug2: channel 0: rfd 10 isatty
debug2: fd 10 setting O_NONBLOCK
debug3: fd 8 is O_NONBLOCK
debug2: notify_done: reading
Exiting on signal 2
debug1: do_cleanup
debug1: session_pty_cleanup: session 0 release /dev/pts/42
debug1: audit session close euid 0 user root tty name /dev/pts/42
debug1: audit event euid 0 user root event 12 (SSH_connabndn)
debug1: Return Val-1 for auditproc:0
And sends the result of whoami
back to the client. This looks like a bug in SSH server, but is this possible for the 2 different versions?
Another interesting fact is when I run sshd
with truss
(kind of strace
for AIX) the output looks like this:
debug1: session_input_channel_req: session 0 req pty-req
debug1: Allocating pty.
debug1: session_pty_req: session 0 alloc /dev/pts/42
debug1: server_input_channel_req: channel 0 request exec reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req exec
debug1: Values: options.num_allow_users: 0
debug1: RLOGIN VALUE :1
debug1: audit run command euid 0 user root command 'whoami'
debug2: fd 4 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x10
debug2: channel 0: rfd 10 isatty
debug2: fd 10 setting O_NONBLOCK
debug3: fd 8 is O_NONBLOCK
setsid: Operation not permitted.
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
debug2: channel 0: close_write
debug2: channel 0: output drain -> closed
But truss
output is a bit more strange than strace
one (at least for someone who don't use *nix trace tools on daily basis) so I don't understand what is going on in the logs. If there is someone more skilled with this stuff here is the part of the trace data http://pastebin.com/YdzQwbt2 from debug1: RLOGIN VALUE :1
.
Also, in the logs, I found that ssh.Shell()
works because it doesn't request pty
. It starts an interactive session (or something like that). But in my case, the interactive session is not an option.
Upvotes: 4
Views: 2545
Reputation: 84
better late than never
IBM said it was a bug in openssh - race condition while PTY allocation https://www-01.ibm.com/support/docview.wss?uid=isg1IV82042
fixed in package openssh.base.server:7.5.102.1500
it strange that bug only occurs in aix, never in linux. nevertheless, problem is solved in my case
Upvotes: 4
Reputation: 1
I had similar problem with "Allocating pty" and then exiting from ssh session. Here is log of my sshd debug:
sshd drops connection with error :3004-010 Failed setting terminal ownership and mode.
debug1: Allocating pty.
debug1: session_pty_req: session 0 alloc /dev/pts/2
debug1: Ignoring unsupported tty mode opcode 13 (0xd)
debug1: Ignoring unsupported tty mode opcode 18 (0x12)
debug1: server_input_channel_req: channel 0 request env reply 0
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req env
debug2: Ignoring env request LANG: disallowed name
debug1: server_input_channel_req: channel 0 request shell reply 1
debug1: session_by_channel: session 0 channel 0
debug1: session_input_channel_req: session 0 req shell
debug1: Values: options.num_allow_users: 0
debug1: RLOGIN VALUE :1setsid: Operation not permitted.
The OS is AIX 7.1 (7100-04-03-1642)
The goal of my environment is to authenticate user on AIX through remote ldap user over ssh (ldap server actually is novell eDirectory). So, I had similar issue with user authentication.
I fixed login over ssh as in eDirectory Schema (rfc2703), added following object extensions to the user:
posixAccount
posixGroup
shadowAccount
uamPosixUser (as I am not sure is it necessary this object)
I just want to note that on OS AIX following user isn't local, not exist in /etc/passwd
and /etc/group
.
V.Davidov
Upvotes: 0