I'm taking an operating systems course and I'm having a hard time understanding how input is redirected with dup2 when you have forks. I wrote this small program to try to get a sense for it, but I wasn't successful in passing the output of a grand-child to a child. I am trying to mimic the Unix command: ps -A | wc -l. I'm new to Unix, but I believe this should count the lines in the list of running processes. So my output should be a single number.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <iostream>
using namespace std;
int main( int argc, char *argv[] ) {
    char *searchArg = argv[ 1 ];
    pid_t pid;
    if ( ( pid = fork() ) > 0 ) {
        wait( NULL );
        cout << "Inside parent" << endl;
    }
    else if ( pid == 0 ) {
        int fd1[ 2 ];
        pipe( fd1 );
        cout << "Inside child" << endl;
        if ( pid = fork() > 0 ) {
            dup2( fd1[ 0 ], 0 );
            close( fd1[ 0 ] );
            execlp( "/bin/wc", "-l", NULL );
        }
        else if ( pid == 0 ) {
            cout << "Inside grand child" << endl;
            execlp( "/bin/ps", "-A", NULL );
        }
    }
    return 0;
}
I don't have it in the code above, but here is my guess on how things should go down:
Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?
Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?
Which one of these is closed and left open for wc to receive and process ps's output? I keep thinking this needs to be thought of backwards since ps needs to give its output to wc...but that doesn't seem to make sense since both child and grand-child are being processed in parallel.
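First off, let's fix your code so that it works and does a tiny bit more error-checking. Three things change: the inner fork test gets parentheses, because pid = fork() > 0 stores the result of the comparison in pid rather than the process ID; the grand-child now dup2s the write end of the pipe onto descriptor 1, which was missing entirely; and each execlp call passes the program name as its first argument, since that argument becomes argv[0] of the new program. Replace the bottom bit with: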
else if ( pid == 0 ) {
    int fd1[ 2 ];
    pipe( fd1 );
    cout << "Inside child" << endl;
    if ( (pid = fork()) > 0 ) {
        if (dup2( fd1[ 0 ] , 0 ) < 0) {
            cerr << "Err dup2 in child" << endl;
        }
        close( fd1[ 0 ] );
        close( fd1[ 1 ] ); // important; see below
        // Note: /usr/bin, not /bin
        execlp( "/usr/bin/wc", "wc", "-l", NULL );
        cerr << "Err execing in child" << endl;
    }
    else if ( pid == 0 ) {
        cout << "Inside grand child" << endl;
        if (dup2( fd1[ 1 ] , 1 ) < 0) {
            cerr << "Err dup2 in gchild" << endl;
        }
        close( fd1[ 0 ] );
        close( fd1[ 1 ] );
        execlp( "/bin/ps", "ps", "-A", NULL );
        cerr << "Err execing in grandchild" << endl;
    }
}
Now, your questions:
Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?
The file descriptors 0, 1, and 2 are special in Unix in that they are the standard input, standard output, and standard error. wc reads from standard input, so whatever is dup2ed onto 0 is what wc will read.
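To see that descriptor 0 really is all that "standard input" means, here is a minimal sketch that does not fork at all. The helper name read_via_stdin is made up for this illustration, and it assumes the text is short enough to fit in the pipe's buffer so the write cannot block.

```cpp
#include <unistd.h>
#include <string>

// Hypothetical helper: pushes `text` through a pipe, points descriptor 0 at
// the pipe's read end, and reads it back through descriptor 0.
// Assumes `text` is small enough to fit in the pipe buffer.
std::string read_via_stdin(const std::string& text) {
    int fds[2];
    if (pipe(fds) < 0) return "";

    int saved_stdin = dup(STDIN_FILENO);       // remember the real stdin, if any

    ssize_t written = write(fds[1], text.data(), text.size());
    close(fds[1]);                             // no writers left: reader will see EOF
    if (written < 0 || dup2(fds[0], STDIN_FILENO) < 0) {
        close(fds[0]);
        if (saved_stdin >= 0) close(saved_stdin);
        return "";
    }
    close(fds[0]);                             // descriptor 0 is now the only read end

    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<size_t>(n));
    }

    if (saved_stdin >= 0) {                    // put the original stdin back
        dup2(saved_stdin, STDIN_FILENO);
        close(saved_stdin);
    }
    return out;
}
```

Anything that reads descriptor 0 after the dup2, including a program started with exec, gets the pipe's contents in the same way.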
Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?
In general, after a process has had its image swapped out with exec, it will have all the open file descriptors it had before exec. (Except for those descriptors with the close-on-exec flag, FD_CLOEXEC, set, but ignore that for now.) So nothing is passed through an execlp parameter: if you dup2 something to 0 before the exec, then wc will read it.
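A small sketch of descriptors surviving exec, seen from the writing side: the child points descriptor 1 at a pipe and then execs echo, and the parent reads what echo printed. The function name capture_echo is hypothetical, and the sketch assumes an echo program can be found on PATH.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: runs `echo word` in a child whose descriptor 1 is a
// pipe, and returns whatever the exec'd program wrote to it.
std::string capture_echo(const char* word) {
    int fds[2];
    if (pipe(fds) < 0) return "";

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        dup2(fds[1], STDOUT_FILENO);   // descriptor 1 now refers to the pipe
        close(fds[0]);
        close(fds[1]);
        // echo knows nothing about the pipe; it just writes to descriptor 1,
        // which it inherits across the exec.
        execlp("echo", "echo", word, static_cast<char*>(nullptr));
        _exit(127);                    // only reached if exec failed
    }

    close(fds[1]);                     // parent must drop its write end to see EOF
    std::string out;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        out.append(buf, static_cast<size_t>(n));
    }
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return out;
}
```

The same mechanism works in the other direction for wc: it inherits descriptor 0 and simply reads from it.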
Which one of these is closed and left open for wc to receive and process ps's output?
As shown above, you can close both ends of the pipe in both child and grandchild, and that'll be fine. In fact, standard practice would recommend that you do that. However, the only truly necessary close line in this specific example is the one I comment as "important" - that's closing the write end of the pipe in the child.
The idea is this: both child and grand-child have both ends of the pipe open when they start. Now, through dup2 we've connected wc to the read end of the pipe. wc is going to keep sucking on that pipe until all descriptors on the write end of the pipe are closed, at which point it'll see that it came to the end of the file and stop. Now, in the grand-child, we can get away with not closing anything because ps -A isn't going to do anything with any of the descriptors but write to descriptor 1, and after ps -A finishes spitting out stuff about some processes it'll exit, closing everything it had. In the child, we don't really need to close the read descriptor stored in fd1[0] because wc isn't going to try to read from anything but descriptor 0. However, we do need to close the write end of the pipe in the child because otherwise wc is just going to sit there with a pipe that's never completely closed.
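The end-of-file rule can be checked in a single process. The sketch below keeps two descriptors on the write end and makes the read end non-blocking, so that a read which would otherwise hang reports EAGAIN instead. The struct and function names are invented for this illustration.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical result type: did read() report end-of-file at each stage?
struct EofProbe {
    bool eof_with_writer_open;   // one write descriptor still open
    bool eof_after_all_closed;   // every write descriptor closed
};

EofProbe probe_pipe_eof() {
    EofProbe result{false, false};
    int fds[2];
    if (pipe(fds) < 0) return result;

    int extra_writer = dup(fds[1]);    // second descriptor on the write end
    // Non-blocking so the first read returns instead of waiting forever.
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);

    char c;
    close(fds[1]);                     // one writer closed, one still open
    ssize_t n = read(fds[0], &c, 1);   // -1 with errno EAGAIN: no data, but no EOF
    result.eof_with_writer_open = (n == 0);

    close(extra_writer);               // last writer closed
    n = read(fds[0], &c, 1);           // 0: end of file
    result.eof_after_all_closed = (n == 0);

    close(fds[0]);
    return result;
}
```

With a blocking read, which is what wc uses, the first read would simply never return. That is the hang you get when the child forgets to close fd1[1].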
As you can see, the reasoning for why we didn't really need any of the close lines except the one marked "important" depends on the details of how wc and ps are going to behave, so the standard practice is to completely close the end of a pipe you aren't using, and to keep the end you are using open through only one descriptor. Since you're using dup2 in both processes, that means four close statements as above.
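The same close discipline can be packaged into a reusable function. Note that this sketch is arranged differently from the code above: instead of child and grand-child, one parent forks both stages as siblings, which means the parent holds both ends of the pipe too and has to close them as well. The name run_pipeline and its argv-style parameters are made up for this illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `producer | consumer` and returns the consumer's
// exit status, or -1 on failure. Both arguments are null-terminated argv arrays.
int run_pipeline(char* const producer[], char* const consumer[]) {
    int fds[2];
    if (pipe(fds) < 0) return -1;

    pid_t left = fork();
    if (left == 0) {
        dup2(fds[1], STDOUT_FILENO);   // producer writes into the pipe
        close(fds[0]);                 // unused end: close completely
        close(fds[1]);                 // used end: keep only descriptor 1
        execvp(producer[0], producer);
        _exit(127);                    // only reached if exec failed
    }

    pid_t right = fork();
    if (right == 0) {
        dup2(fds[0], STDIN_FILENO);    // consumer reads from the pipe
        close(fds[0]);                 // used end: keep only descriptor 0
        close(fds[1]);                 // unused end: without this, no EOF ever
        execvp(consumer[0], consumer);
        _exit(127);
    }

    // The parent uses neither end, so it closes both before waiting.
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    if (left > 0) waitpid(left, nullptr, 0);
    if (right < 0 || waitpid(right, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with the argv arrays {"ps", "-A", nullptr} and {"wc", "-l", nullptr} should behave like ps -A | wc -l, with the count printed on the caller's standard output.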
EDIT: Updated the arguments to execlp.