This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: listen/accept/fork behavior problem between cygwin1 1.5.18 and cygwin1.dll 1.5.19


> From: Corinna Vinschen <corinna-cygwin at cygwin dot com>
> To: cygwin at cygwin dot com
> Subject: Re: listen/accept/fork behavior problem between cygwin1 1.5.18 and cygwin1.dll 1.5.19
> Message-ID: <20060310101341.GB3184@calimero.vinschen.de>
> Reply-To: cygwin at cygwin dot com
> 
> On Mar  9 17:54, BRC wrote:
> > Hi,
> > 
> > I have a perl based server that creates a TCP listen
> > socket in a parent process, then forks off an N-child
> > process pool that normally accepts connections
> > round-robin style alla apache 1.3.  Everything worked
> > great when I was based on cygwin1.dll version 1.5.18. 
> > I would see many requests being simulatenously
> > sometimes lasting up to a minute.
> > 
> > Then I did a recent upgrade and suddenly all requests
> > became serialized.  That is, an entire request would
> > need to be handled before a pending one a child began.
> >  I immediately downgraded perl to 5.8.6-4 from
> > 5.8.7-5.  Nothing changed.  Then I did a complete
> > cygwin rollback and the concurrency went back to
> > normal.  I was then able to advance just the version
> > of perl back to 5.8.7-5 and everything still works. 
> > After lots of back and forth, I was able to isolate
> > the behavior difference to cygwin1.dll.
> > 
> > [snip]
> > 
> > I realize that Perl's fork under cygwin is technically
> > using threads so I left wondering if this is a
> > blocking
> > io problem (that changed with cygwin revs) since it
> > seems to transcend perl.
> 
> I have no idea how that should happen at all and so I assume I don't
> understand exactly how your application works.
> 
> Could you please create a simple testcase in plain C, stripped to the
> bare minimum of lines, building and working OOTB, which allows to
> reproduce the problem?  That would be most helpful.
> 
> Another test you could also make is this:  Can you easily change your
> listen/accept/action code so that only the server accepts and then 
> a child is forked for the associated action?  Does that maintain
> concurrency in your server with 1.5.19?
> 
> Corinna
> 
> -- 
> Corinna Vinschen                  Please, send mails regarding Cygwin to
> Cygwin Project Co-Leader          cygwin AT cygwin DOT com
> Red Hat

I'm not using perl, but I observed this behavior the other night when trying
to debug an accept() issue with pthreads.

Test case (my original network wrappers left in, since it would create more space
not to leave them as functions; error checking stripped down for the test code
and printf()s added):

/* code */
#include <ctype.h>
#include <errno.h>
#include <fcntl.h>
#include <netdb.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <net/if.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <pthread.h>

#define NE(x) (sizeof(x) / sizeof((x)[0]))

int llimit;
int pa;
int pc;

int n_nodelay(int s, socklen_t op)
{
        if (setsockopt(s, IPPROTO_TCP, TCP_NODELAY, (void *)&op, sizeof((op = 1))) == -1) {
                perror("setsockopt");
                return -1;
        }

        return 0;
}

int n_nonblock(int s)
{
        if (fcntl(s, F_SETFL, (fcntl(s, F_GETFL, 0) | O_NONBLOCK)) == -1) {
                perror("fcntl");
                return -1;
        }

        return 0;
}

int n_accept(int s)
{
        struct sockaddr_in sa_r;
        socklen_t b;
        int c;

        b = sizeof(sa_r);
        if ((c = accept(s, (struct sockaddr *)&sa_r, &b)) == -1) {
                perror("accept");
                return -1;
        }

        n_nonblock(c);

        return c;
}

int n_listen(int s, int backlog)
{
        if (listen(s, backlog) == -1) {
                perror("listen");
                return -1;
        }

        return 0;
}

int n_bind(int s, struct sockaddr_in *sa)
{
        if (bind(s, (struct sockaddr *)sa, sizeof(struct sockaddr)) == -1) {
                perror("bind");
                return -1;
        }

        return 0;
}

int n_init(struct sockaddr_in *sa, const char *addr, unsigned short port)
{
        struct hostent *hp;

        memset(sa, 0, sizeof(struct sockaddr_in));

        sa->sin_family = AF_INET;
        if (port) sa->sin_port = htons(port);
        if (isdigit(*addr)) {
                if (memcmp(addr, "0.0.0.0", 8) == 0) {
                        sa->sin_addr.s_addr = INADDR_ANY;
                } else if ((sa->sin_addr.s_addr = inet_addr(addr)) == INADDR_NONE) {
                        perror("inet_addr");
                        return -1;
                }
        } else {
                if ((hp = gethostbyname(addr)) == NULL) {
                        if (h_errno == HOST_NOT_FOUND)
                                fprintf(stderr, "gethostbyname: unknown host\n");
                        else
                                fprintf(stderr, "gethostbyname: error\n");

                        return -1;
                }

                sa->sin_family = hp->h_addrtype;
                memcpy((char *)&sa->sin_addr, hp->h_addr, hp->h_length);
        }

        return 0;
}

int n_socket(void)
{
        int s;
        int ruse;

        if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
                perror("socket");
                return -1;
        }

        ruse = 1;
        if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &ruse, sizeof(ruse)) == -1) {
                perror("setsockopt");
                return -1;
        }

        return s;
}

int n_conn(int s, struct sockaddr_in *sa, int tout)
{
        struct timeval  to;
        fd_set rfds, wfds;
        socklen_t lres;
        int res;

        FD_ZERO(&rfds);
        FD_SET(s, &rfds);
        wfds = rfds;
        to.tv_sec = tout;
        to.tv_usec = 0;

        n_nonblock(s);

        if ((res = connect(s, (struct sockaddr *)sa, sizeof(struct sockaddr))) == 0)
                return 0;
        else if (res == -1) {
                switch (errno) {
                case EINPROGRESS:
                case EWOULDBLOCK:
                case EALREADY:
                case EINTR:
                        break;
                default:
                        perror("connect");
                        return -1;
                }

                if ((res = select(s + 1, &rfds, &wfds, NULL, &to)) == 0) {
                        return -1;
                } else if (res == -1) {
                        perror("select");
                        return -1;
                } else if (FD_ISSET(s, &rfds) || FD_ISSET(s, &wfds)) {
                        lres = sizeof(res);
                        if (getsockopt(s, SOL_SOCKET, SO_ERROR, &res, &lres) == -1) {
                                perror("getsockopt");
                                return -1;
                        }

                        if (res) {
                                errno = res;
                                perror("connect");
                                return -1;
                        }
                }
        }

        return 0;
}

void *t_accept(void *arg)
{
        pthread_t t;
        int lfd, c;

        t = pthread_self();
        lfd = *(int *)arg;

        while (1) {
                if (pa) fprintf(stderr, "%-10lu: %16s(): blocking (lfd == %d)\n", (long)t, "t_accept", lfd);

                c = n_accept(lfd);

                if (pa) fprintf(stderr, "%-10lu: %16s(): accepted connection (c == %d)\n", (long)t, "t_accept", c);

                sleep(1);
                close(c);

                if (pa) fprintf(stderr, "%-10lu: %16s(): closed connection (c == %d)\n", (long)t, "t_accept", c);

                sleep(time(NULL) % 3);
        }

        return NULL;
}

void *t_connect(void *arg)
{
        struct sockaddr_in sa_c;
        pthread_t t;
        int cport, s;

        t = pthread_self();
        cport = *(unsigned short *)arg;
        n_init(&sa_c, "127.0.0.1", cport);

        while (1) {
                s = n_socket();

                if (pc) fprintf(stderr, "%-10lu: %16s(): entering (s == %d)\n", (long)t, "t_connect", s);

                n_conn(s, &sa_c, 60);

                if (pc) fprintf(stderr, "%-10lu: %16s(): connected (s == %d)\n", (long)t, "t_connect", s);

                sleep(1);
                close(s);

                if (pc) fprintf(stderr, "%-10lu: %16s(): closed (s == %d)\n", (long)t, "t_connect", s);

                sleep(time(NULL) % 4);
        }

        return NULL;
}

int main(int argc, char **argv)
{
        unsigned short lport = 9090;
        pthread_t ta[4];
        pthread_t tc[4];
        struct sockaddr_in sa_l;
        int lfd;
        size_t i;

        llimit = pa = pc = 1;
        switch (argc - 1) {
        case 3: if (isdigit(argv[3][0])) pc = atoi(argv[3]);
        case 2: if (isdigit(argv[2][0])) pa = atoi(argv[2]);
        case 1: if (isdigit(argv[1][0])) llimit = atoi(argv[1]);
        default: break;
        }

        n_init(&sa_l, "0.0.0.0", lport);
        n_bind((lfd = n_socket()), &sa_l);
        n_listen(lfd, llimit);

        for (i = 0; i < NE(ta); i++)
                pthread_create(&ta[i], NULL, t_accept, (void *)&lfd);

        for (i = 0; i < NE(tc); i++)
                pthread_create(&tc[i], NULL, t_connect, (void *)&lport);

        pthread_join(ta[0], NULL);

        return 0;
}


------------------------------------------------------------------------

/* test case */

$ uname -a
CYGWIN_NT-5.2 opteron 1.5.20s(0.155/4/2) 20060309 10:35:06 i686 Cygwin
$ md5sum pt_accept.c
3af48aebdda92068cd86f5c7c92c03a5 *pt_accept.c
$ gcc -v
Reading specs from /usr/lib/gcc/i686-pc-cygwin/3.4.4/specs
Configured with: /gcc/gcc-3.4.4/gcc-3.4.4-1/configure --verbose --prefix=/usr --exec-prefix=/usr --sysconfdir=/etc --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --enable-languages=c,ada,c++,d,f77,java,objc --enable-nls --without-included-gettext --enable-version-specific-runtime-libs --without-x --enable-libgcj --disable-java-awt --with-system-zlib --enable-interpreter --disable-libgcj-debug --enable-threads=posix --enable-java-gc=boehm --disable-win32-registry --enable-sjlj-exceptions --enable-hash-synchronization --enable-libstdcxx-debug : (reconfigured)
Thread model: posix
gcc version 3.4.4 (cygming special) (gdc 0.12, using dmd 0.125)
$ gcc -g3 -pedantic -W -Wall -O0 -o pta pt_accept.c -lpthread
$ ./pta
4981088   :         t_accept(): blocking (lfd == 3)
4981224   :         t_accept(): blocking (lfd == 3)
4981456   :         t_accept(): blocking (lfd == 3)
4981592   :         t_accept(): blocking (lfd == 3)
4981728   :        t_connect(): entering (s == 4)
4981864   :        t_connect(): entering (s == 5)
4981592   :         t_accept(): accepted connection (c == 6)
4982000   :        t_connect(): entering (s == 7)
4981728   :        t_connect(): connected (s == 4)
5006824   :        t_connect(): entering (s == 8)
5006824   :        t_connect(): connected (s == 8)
4981864   :        t_connect(): connected (s == 5)
4982000   :        t_connect(): connected (s == 7)
4981592   :         t_accept(): closed connection (c == 6)
4981592   :         t_accept(): blocking (lfd == 3)
4981592   :         t_accept(): accepted connection (c == 6)
5006824   :        t_connect(): closed (s == 8)
4981864   :        t_connect(): closed (s == 5)
4981728   :        t_connect(): closed (s == 4)
4982000   :        t_connect(): closed (s == 7)
4981592   :         t_accept(): closed connection (c == 6)
4981592   :         t_accept(): blocking (lfd == 3)
5006824   :        t_connect(): entering (s == 4)
4981592   :         t_accept(): accepted connection (c == 5)
5006824   :        t_connect(): connected (s == 4)
4981864   :        t_connect(): entering (s == 6)
4981864   :        t_connect(): connected (s == 6)
4982000   :        t_connect(): entering (s == 7)
4981728   :        t_connect(): entering (s == 8)
4982000   :        t_connect(): connected (s == 7)
4981728   :        t_connect(): connected (s == 8)
4981592   :         t_accept(): closed connection (c == 5)
5006824   :        t_connect(): closed (s == 4)
4981864   :        t_connect(): closed (s == 6)
4982000   :        t_connect(): closed (s == 7)
4981728   :        t_connect(): closed (s == 8)
4981592   :         t_accept(): blocking (lfd == 3)
4981592   :         t_accept(): accepted connection (c == 4)
4981592   :         t_accept(): closed connection (c == 4)
4981592   :         t_accept(): blocking (lfd == 3)
5006824   :        t_connect(): entering (s == 4)
4981592   :         t_accept(): accepted connection (c == 5)
5006824   :        t_connect(): connected (s == 4)
4981864   :        t_connect(): entering (s == 6)
4982000   :        t_connect(): entering (s == 7)
4982000   :        t_connect(): connected (s == 7)
4981864   :        t_connect(): connected (s == 6)
4981728   :        t_connect(): entering (s == 8)
4981728   :        t_connect(): connected (s == 8)
4981592   :         t_accept(): closed connection (c == 5)
5006824   :        t_connect(): closed (s == 4)
4981864   :        t_connect(): closed (s == 6)
4982000   :        t_connect(): closed (s == 7)
4981728   :        t_connect(): closed (s == 8)
^C

$ ./pta 0
4981224   :         t_accept(): blocking (lfd == 3)
4981088   :         t_accept(): blocking (lfd == 3)
4981544   :         t_accept(): blocking (lfd == 3)
4981408   :         t_accept(): blocking (lfd == 3)
4981680   :        t_connect(): entering (s == 4)
4981816   :        t_connect(): entering (s == 5)
4981408   :         t_accept(): accepted connection (c == 6)
4981816   :        t_connect(): connected (s == 5)
5006736   :        t_connect(): entering (s == 7)
4981680   :        t_connect(): connected (s == 4)
4994368   :        t_connect(): entering (s == 8)
4994368   :        t_connect(): connected (s == 8)
5006736   :        t_connect(): connected (s == 7)
4981408   :         t_accept(): closed connection (c == 6)
4981408   :         t_accept(): blocking (lfd == 3)
4981408   :         t_accept(): accepted connection (c == 6)
4981816   :        t_connect(): closed (s == 5)
4981680   :        t_connect(): closed (s == 4)
4994368   :        t_connect(): closed (s == 8)
5006736   :        t_connect(): closed (s == 7)
4981408   :         t_accept(): closed connection (c == 6)
4981408   :         t_accept(): blocking (lfd == 3)
4981816   :        t_connect(): entering (s == 4)
4981408   :         t_accept(): accepted connection (c == 5)
4981680   :        t_connect(): entering (s == 6)
4981680   :        t_connect(): connected (s == 6)
4994368   :        t_connect(): entering (s == 7)
4994368   :        t_connect(): connected (s == 7)
4981816   :        t_connect(): connected (s == 4)
5006736   :        t_connect(): entering (s == 8)
5006736   :        t_connect(): connected (s == 8)
4981408   :         t_accept(): closed connection (c == 5)
4981680   :        t_connect(): closed (s == 6)
4981408   :         t_accept(): blocking (lfd == 3)
4981408   :         t_accept(): accepted connection (c == 5)
4994368   :        t_connect(): closed (s == 7)
4981816   :        t_connect(): closed (s == 4)
5006736   :        t_connect(): closed (s == 8)
4981680   :        t_connect(): entering (s == 4)
4981680   :        t_connect(): connected (s == 4)
4981408   :         t_accept(): closed connection (c == 5)
4994368   :        t_connect(): entering (s == 5)
4994368   :        t_connect(): connected (s == 5)
4981816   :        t_connect(): entering (s == 6)
5006736   :        t_connect(): entering (s == 7)
4981816   :        t_connect(): connected (s == 6)
5006736   :        t_connect(): connected (s == 7)
^C

Serial behavior observed. I tried various combinations of reinitializing the socket
structure that connect() uses to within the while loop (which shouldn't be necessary),
nagle on and off, reuse on and off, listen() called per-thread, etc. and only serial
behavior is observed. I suspect something thread specific. The connect() call will
actually connect, but it's like being held in a listen queue - as accept() itself
doesn't return until the previous connection is closed. The second test is done with
a backlog of 0.


/* alternate test case */

$ uname -a
Linux ns1.anodized.com 2.6.9-34.EL #1 Wed Mar 8 00:07:35 CST 2006 i686 i686 i386 GNU/Linux
$ md5sum pt_accept.c
3af48aebdda92068cd86f5c7c92c03a5  pt_accept.c
$ gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.0.2/configure --prefix=/usr/local --srcdir=../gcc-4.0.2 --enable-languages=c,c++,objc
Thread model: posix
gcc version 4.0.2
$ gcc -g3 -pedantic -W -Wall -O0 -o pta pt_accept.c -lpthread
$ ./pta
3086691248:         t_accept(): blocking (lfd == 3)
3076201392:         t_accept(): blocking (lfd == 3)
3065711536:         t_accept(): blocking (lfd == 3)
3055221680:         t_accept(): blocking (lfd == 3)
3044731824:        t_connect(): entering (s == 4)
3044731824:        t_connect(): connected (s == 4)
3086691248:         t_accept(): accepted connection (c == 5)
3034241968:        t_connect(): entering (s == 6)
3076201392:         t_accept(): accepted connection (c == 7)
3034241968:        t_connect(): connected (s == 6)
3023752112:        t_connect(): entering (s == 8)
3065711536:         t_accept(): accepted connection (c == 9)
3023752112:        t_connect(): connected (s == 8)
3013262256:        t_connect(): entering (s == 10)
3055221680:         t_accept(): accepted connection (c == 11)
3013262256:        t_connect(): connected (s == 10)
3044731824:        t_connect(): closed (s == 4)
3086691248:         t_accept(): closed connection (c == 5)
3076201392:         t_accept(): closed connection (c == 7)
3034241968:        t_connect(): closed (s == 6)
3065711536:         t_accept(): closed connection (c == 9)
3023752112:        t_connect(): closed (s == 8)
3055221680:         t_accept(): closed connection (c == 11)
3013262256:        t_connect(): closed (s == 10)
3086691248:         t_accept(): blocking (lfd == 3)
3076201392:         t_accept(): blocking (lfd == 3)
3065711536:         t_accept(): blocking (lfd == 3)
3055221680:         t_accept(): blocking (lfd == 3)
3044731824:        t_connect(): entering (s == 4)
3044731824:        t_connect(): connected (s == 4)
3086691248:         t_accept(): accepted connection (c == 5)
3034241968:        t_connect(): entering (s == 6)
3034241968:        t_connect(): connected (s == 6)
3076201392:         t_accept(): accepted connection (c == 7)
3023752112:        t_connect(): entering (s == 8)
3023752112:        t_connect(): connected (s == 8)
3065711536:         t_accept(): accepted connection (c == 9)
3013262256:        t_connect(): entering (s == 10)
3013262256:        t_connect(): connected (s == 10)
3055221680:         t_accept(): accepted connection (c == 11)
3044731824:        t_connect(): closed (s == 4)
3086691248:         t_accept(): closed connection (c == 5)
3086691248:         t_accept(): blocking (lfd == 3)
3034241968:        t_connect(): closed (s == 6)
3076201392:         t_accept(): closed connection (c == 7)
3076201392:         t_accept(): blocking (lfd == 3)
3023752112:        t_connect(): closed (s == 8)
3065711536:         t_accept(): closed connection (c == 9)
3065711536:         t_accept(): blocking (lfd == 3)
3013262256:        t_connect(): closed (s == 10)
3055221680:         t_accept(): closed connection (c == 11)
3055221680:         t_accept(): blocking (lfd == 3)
3044731824:        t_connect(): entering (s == 4)
3044731824:        t_connect(): connected (s == 4)
3086691248:         t_accept(): accepted connection (c == 5)
3034241968:        t_connect(): entering (s == 6)
3034241968:        t_connect(): connected (s == 6)
3076201392:         t_accept(): accepted connection (c == 7)
3023752112:        t_connect(): entering (s == 8)
3023752112:        t_connect(): connected (s == 8)
3065711536:         t_accept(): accepted connection (c == 9)
3013262256:        t_connect(): entering (s == 10)
3013262256:        t_connect(): connected (s == 10)
3055221680:         t_accept(): accepted connection (c == 11)
3044731824:        t_connect(): closed (s == 4)
3086691248:         t_accept(): closed connection (c == 5)
3034241968:        t_connect(): closed (s == 6)
3076201392:         t_accept(): closed connection (c == 7)
3023752112:        t_connect(): closed (s == 8)
3065711536:         t_accept(): closed connection (c == 9)
3013262256:        t_connect(): closed (s == 10)
3055221680:         t_accept(): closed connection (c == 11)
^C

$ ./pta 0
3086515120:         t_accept(): blocking (lfd == 3)
3076025264:         t_accept(): blocking (lfd == 3)
3065535408:         t_accept(): blocking (lfd == 3)
3055045552:         t_accept(): blocking (lfd == 3)
3044555696:        t_connect(): entering (s == 4)
3044555696:        t_connect(): connected (s == 4)
3034065840:        t_connect(): entering (s == 5)
3034065840:        t_connect(): connected (s == 5)
3086515120:         t_accept(): accepted connection (c == 6)
3023575984:        t_connect(): entering (s == 7)
3076025264:         t_accept(): accepted connection (c == 8)
3023575984:        t_connect(): connected (s == 7)
3013086128:        t_connect(): entering (s == 9)
3065535408:         t_accept(): accepted connection (c == 10)
3013086128:        t_connect(): connected (s == 9)
3044555696:        t_connect(): closed (s == 4)
3034065840:        t_connect(): closed (s == 5)
3055045552:         t_accept(): accepted connection (c == 4)
3086515120:         t_accept(): closed connection (c == 6)
3076025264:         t_accept(): closed connection (c == 8)
3023575984:        t_connect(): closed (s == 7)
3065535408:         t_accept(): closed connection (c == 10)
3013086128:        t_connect(): closed (s == 9)
3044555696:        t_connect(): entering (s == 5)
3044555696:        t_connect(): connected (s == 5)
3034065840:        t_connect(): entering (s == 6)
3034065840:        t_connect(): connected (s == 6)
3055045552:         t_accept(): closed connection (c == 4)
3086515120:         t_accept(): blocking (lfd == 3)
3086515120:         t_accept(): accepted connection (c == 4)
3076025264:         t_accept(): blocking (lfd == 3)
3023575984:        t_connect(): entering (s == 7)
3023575984:        t_connect(): connected (s == 7)
3076025264:         t_accept(): accepted connection (c == 8)
3065535408:         t_accept(): blocking (lfd == 3)
3013086128:        t_connect(): entering (s == 9)
3013086128:        t_connect(): connected (s == 9)
3065535408:         t_accept(): accepted connection (c == 10)
3044555696:        t_connect(): closed (s == 5)
3034065840:        t_connect(): closed (s == 6)
3086515120:         t_accept(): closed connection (c == 4)
3086515120:         t_accept(): blocking (lfd == 3)
3086515120:         t_accept(): accepted connection (c == 4)
3023575984:        t_connect(): closed (s == 7)
3076025264:         t_accept(): closed connection (c == 8)
3076025264:         t_accept(): blocking (lfd == 3)
3013086128:        t_connect(): closed (s == 9)
3065535408:         t_accept(): closed connection (c == 10)
3065535408:         t_accept(): blocking (lfd == 3)
3055045552:         t_accept(): blocking (lfd == 3)
3086515120:         t_accept(): closed connection (c == 4)
3086515120:         t_accept(): blocking (lfd == 3)
^C

Concurrency.

-cl

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]