This is the mail archive of the cygwin mailing list for the Cygwin project.



messy death of ssh bash session: bash $$ and its ppid still in /proc, but not in ps


Greetings:

I'm not expecting help; I just want to share a problem that I have seen
repeatedly, every week or so, on one host (Windows 2000 Server with the
latest service packs and fixes), using Cygwin 1.5.17.

I have not seen a pattern or a cause, but I seem to recall that
the shell that "goes south", often (always?) has several suspended
jobs - usually a mix of "vim" and "less".  (Notice the child processes
of pid 6084 shown below.)
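
The orphaned children mentioned above can be picked out of a PID/PPID listing mechanically. This is only a rough sketch: the function name find_orphans is made up for illustration, and it assumes the input has a header line followed by rows whose first two columns are PID and PPID (as in the procps listing further down). It prints every process whose parent is neither pid 1 nor present in the listing.

```shell
#!/bin/bash
# find_orphans (hypothetical helper): read a process listing on stdin and
# print "PID PPID" for each row whose parent PID is not itself listed.
# Assumes: line 1 is a header, column 1 is PID, column 2 is PPID.
find_orphans() {
    awk '
        NR == 1 { next }                  # skip the header line
        { ppid[$1] = $2; seen[$1] = 1 }   # remember each PID and its parent
        END {
            for (p in ppid)
                if (ppid[p] != 1 && !(ppid[p] in seen))
                    print p, ppid[p]
        }
    ' | sort -n                           # awk array order is unspecified
}
```

Piping a listing in which the shell itself is missing through find_orphans would list the suspended vim and less jobs with the shell's pid as their parent.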

I will be running fairly routine sysadmin commands in a bash session on
a remote host through an ssh session.  The bash prompt returns after a
command succeeds, then I type a simple command ("ls" in the example
below), and after the cursor moves down to the first column of the next
line, *nothing* further happens.  If I look for the bash process that
was running the interactive shell with cygwin 'ps', it is not there
(pid 6084 in the example below).  Strangely, though, /proc has both the
bash session and its parent: "cat /proc/6084/ppid" shows 1052, which is
the parent sshd process ("/usr/sbin/sshd -D -R"; see example below).

It turns out cygwin's ps shows neither process 1052 nor 6084.  Both
the sshd and its child bash session mysteriously vanished, but the
child processes of the bash session remain.
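
The comparison described above (a pid that still has a /proc entry but is absent from ps) could be automated along these lines. This is a minimal sketch, not a tested diagnostic: the function name proc_only_pids is made up, the list of "known" pids has to be extracted from your own ps output and supplied on stdin, and the per-process "ppid" file is assumed to exist as it does in the Cygwin /proc used in this report.

```shell
#!/bin/bash
# proc_only_pids (hypothetical helper): print every numeric entry under a
# /proc-style directory whose pid is NOT in the list of pids read from stdin.
# Usage: proc_only_pids [PROC_ROOT] < pid_list     (PROC_ROOT defaults to /proc)
proc_only_pids() {
    local proc_root=${1:-/proc}
    local known entry pid
    # Flatten the pid list from stdin into one space-delimited string.
    known=" $(tr -s '[:space:]' ' ') "
    for entry in "$proc_root"/[0-9]*; do
        [ -d "$entry" ] || continue
        pid=${entry##*/}
        # Skip names that merely start with a digit.
        case $pid in *[!0-9]*) continue ;; esac
        case $known in
            *" $pid "*) ;;   # ps knows about this pid; nothing to report
            *) printf '%s ppid=%s\n' "$pid" "$(cat "$entry/ppid" 2>/dev/null)" ;;
        esac
    done
}
```

How the pid column is pulled out of ps depends on the ps variant and its options, so that step is deliberately left out of the sketch.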

( I'm not sure how much I should trust "procps -H -Ao pid,ppid,%cpu,user,bsdstart,args",
  but that's a side issue. Search ahead for "defunct".)

--
regards,
Tom

--v-v------------------C-U-T---H-E-R-E-------------------------v-v-- 
# -------------------------------------------------------------------- 
# this bash session shows my checks *after* bash session w/pid
# 6084 died. 6084's ppid was 1052
# -------------------------------------------------------------------- 
[16:36:03 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ tty  
/dev/tty1
[16:36:05 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ which ps_
ps_ is aliased to `procps -H -Ao pid,ppid,%cpu,user,bsdstart,args'
[16:36:14 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ ps_
  PID  PPID %CPU USER      START COMMAND
 3040  6084  0.0 scmcron   16:03 vim rc_startup
 5240  6084  0.0 scmcron   15:33 less -I -j4 -x2 -S /var/log/rc_startup.log
 4148  6084  0.0 scmcron   15:30 vim basename.shinc
 3196  6084  0.0 scmcron   14:51 vim _logrotate
 1624     1  0.0 SYSTEM    00:38 /usr/bin/cygrunsrv
 1696  1624  0.0 SYSTEM    00:38   /usr/sbin/sshd -D
 1052  1696  0.0 SYSTEM    09:43     /usr/sbin/sshd -D -R
 6188  1696  0.0 SYSTEM    16:14     /usr/sbin/sshd -D -R
 2768  6188  0.0 scmcron   16:14       -bash
 3396  2768  0.0 scmcron   16:32         procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
 1028     1  0.0 SYSTEM    00:38 /usr/bin/cygrunsrv
 1116  1028  0.0 SYSTEM    00:38   /usr/sbin/cron -D
[16:36:17 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ cat /proc/6084/ppid
1052
[16:36:43 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ which ps
ps is aliased to `ps -elW '
ps is /usr/bin/ps
[16:37:07 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $ ps|egrep '\<1052|6084\>'
[16:37:34 Thu Jul 21 ~ ourhost scmcron
-bash-2.05b] $
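
A note on the egrep pattern in the check above: in '\<1052|6084\>' the alternation binds more loosely than the word anchors, so it matches words that start with 1052 or words that end with 6084. That cannot hide either pid, so the empty result still stands, but a stricter whole-word match might look like the following sketch (the function name match_pids is made up; -E and -w are standard grep options).

```shell
#!/bin/bash
# match_pids (hypothetical helper): filter stdin, keeping only lines that
# contain one of the given pids as a whole word.
# Usage: some_ps_command | match_pids 1052 6084
match_pids() {
    [ "$#" -gt 0 ] || return 2
    local IFS='|'          # "$*" joins the arguments with the first IFS char
    grep -Ew -- "$*"
}
```

With this, a line for pid 16084 or 10521 would no longer be reported as a hit.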

--v-v------------------C-U-T---H-E-R-E-------------------------v-v--
# -------------------------------------------------------------------- 
# Here are the last two commands typed in bash session w/pid 6084
# The "ls" command never completed.
# Sorry for the prompt, PS1 below is "\t \d \jj \l 4880 \w\n> \h \u >"
# This session has tty of tty0, and 4 suspended jobs.
# -------------------------------------------------------------------- 
> 16:15:07 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> ourhost scmcron > regtool -s set /HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment/PATH_ADM 'c:'
> 16:16:01 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> ourhost scmcron > ls

--v-v------------------C-U-T---H-E-R-E-------------------------v-v-- 
# -------------------------------------------------------------------- 
# change in "procps -H -Ao pid,ppid,%cpu,user,bsdstart,args" output;
#  cause again unknown
#  (about 1 hour had passed, and I had looked at processes via a TS session and Task Manager)
# I waited ~10 min and re-ran procps, and the output returned to normal; i.e.,
# all the defunct entries were replaced again w/the earlier values!
# -------------------------------------------------------------------- 

[17:30:49 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ ps_
  PID  PPID %CPU USER      START COMMAND
 3040  6084  0.0 scmcron   16:03 <defunct>
 5240  6084  0.0 scmcron   15:33 <defunct>
 4148  6084  0.0 scmcron   15:30 <defunct>
 3196  6084  0.0 scmcron   14:51 <defunct>
 1624     1  0.0 SYSTEM    00:38 <defunct>
 1696  1624  0.0 SYSTEM    00:38   <defunct>
 1052  1696  0.0 SYSTEM    09:43     <defunct>
 6188  1696  0.0 SYSTEM    16:14     <defunct>
 2768  6188  0.0 scmcron   16:14       <defunct>
 2936  2768  0.7 scmcron   17:27         procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
 1028     1  0.0 SYSTEM    00:38 <defunct>
 1116  1028  0.0 SYSTEM    00:38   <defunct>
[17:31:01 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ cat /proc/6084/ppid
1052
<snip>
[17:40:56 Thu Jul 21 /proc/registry/HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment c7mkes108 scmcron
-bash-2.05b] $ ps_
  PID  PPID %CPU USER      START COMMAND
 3040  6084  0.0 scmcron   16:03 vim rc_startup
 5240  6084  0.0 scmcron   15:33 less -I -j4 -x2 -S /var/log/rc_startup.log
 4148  6084  0.0 scmcron   15:30 vim basename.shinc
 3196  6084  0.0 scmcron   14:51 vim _logrotate
 1624     1  0.0 SYSTEM    00:38 /usr/bin/cygrunsrv
 1696  1624  0.0 SYSTEM    00:38   /usr/sbin/sshd -D
 6188  1696  0.0 SYSTEM    16:14     /usr/sbin/sshd -D -R
 2768  6188  0.0 scmcron   16:14       -bash
 5388  2768  0.0 scmcron   17:44         procps -H -Ao pid,ppid,%cpu,user,bsdstart,args
 1028     1  0.0 SYSTEM    00:38 /usr/bin/cygrunsrv
 1116  1028  0.0 SYSTEM    00:38   /usr/sbin/cron -D

--v-v------------------C-U-T---H-E-R-E-------------------------v-v-- 
# -------------------------------------------------------------------- 
# I finally killed the ssh client, which had been started from a Linux box.
# -------------------------------------------------------------------- 
> 16:15:07 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> c7mkes108 scmcron > regtool -s set /HKEY_LOCAL_MACHINE/SYSTEM/CurrentControlSet/Services/rc_startup/Parameters/Environment/PATH_ADM 'c:'
> 16:16:01 Thu Jul 21 4j tty0 6084 /drv/c/adm/config/rc
> c7mkes108 scmcron > ls
Killed by signal 15.
[17:41:43 Thu Jul 21 0j 19 17149 ~]
[cmke6-75 rodmant]$  #now we're back on the linux host


