path: root/qarsh.c
Commit message | Author | Age | Files | Lines
* [qarsh] Add recv_packet with selects and hbeat | Nathan Straz | 2014-09-05 | 1 | -1/+67
| There was an issue when we rebooted a node via qarsh. The packet size would make it back from qarshd, but none of the data, so we would get stuck in a read(). We needed to avoid a blocking read() and check for a heart beat while we wait for the rest of the packet.
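The idea in this message can be sketched as a read loop that waits in select() and consults a liveness check whenever the wait times out. This is a minimal illustration, not the code from qarsh.c: the names `read_full_with_hbeat`, `alive_fn`, and `countdown_alive` are hypothetical, and the heartbeat check is reduced to a callback.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback: returns nonzero while the remote host looks alive. */
typedef int (*alive_fn)(void *ctx);

/* Example callback: ctx points to an int holding how many more
 * timeouts to tolerate before declaring the host dead. */
int countdown_alive(void *ctx)
{
    int *left = ctx;
    return (*left)-- > 0;
}

/*
 * Read exactly len bytes from fd without blocking indefinitely in read().
 * Waits in select() for up to timeout_sec at a time; each time the wait
 * expires with no data, asks still_alive() whether to keep waiting.
 * Returns len on success, 0 on EOF before len bytes arrived,
 * -1 on error, -2 if still_alive() reported the host as dead.
 */
ssize_t read_full_with_hbeat(int fd, void *buf, size_t len,
                             int timeout_sec, alive_fn still_alive, void *ctx)
{
    size_t got = 0;

    while (got < len) {
        fd_set rfds;
        struct timeval tv;
        ssize_t r;
        int n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        n = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {               /* timed out: check the heartbeat */
            if (!still_alive(ctx))
                return -2;
            continue;
        }

        r = read(fd, (char *)buf + got, len - got);
        if (r < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return -1;
        }
        if (r == 0)
            return 0;               /* peer closed the connection */
        got += (size_t)r;
    }
    return (ssize_t)got;
}
```

With this shape, a packet header that arrives without its payload no longer wedges the client: the loop keeps returning to the liveness check until the rest of the data shows up or the host is declared down.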
* [qarsh] Mark the node alive after we get a packet | Nathan Straz | 2014-09-05 | 1 | -6/+5
|
* [qarsh] Remove extra ':' from error message | Nathan Straz | 2014-09-05 | 1 | -1/+1
|
* Include host name in reboot and host down messages | Nathan Straz | 2014-02-25 | 1 | -5/+5
| After a long running command or when multiple qarsh processes are running in parallel, it's hard to know which host is having trouble when a "Remote host rebooted" message shows up. Include the host name in these messages to make them more informative.
* Add -T option as a no-op | Nathan Straz | 2014-02-25 | 1 | -1/+3
| This is an option from ssh which disables pseudo-tty allocation. Since we don't allocate them in the first place, we're compatible with it.
* Add a warning in qarsh about btimed | Nathan Straz | 2013-10-02 | 1 | -0/+3
| If we don't get a heart beat when qarsh starts, print a warning to check on the btimed service. A command could be stopped early if it doesn't produce any output and qarsh thinks the host is down.
* Remove offset from data packet | Nathan Straz | 2013-10-02 | 1 | -2/+2
| We really don't need this field since we always copy data sequentially.
* Only print an error if we didn't hit a broken pipe | Nathan Straz | 2013-09-19 | 1 | -3/+6
|
* Handle stdin pipe closing on us | Nathan Straz | 2013-09-19 | 1 | -0/+6
| If xiogen is flooding requests across qarsh and xdoio decides to stop, we need to handle that gracefully. Also, making the pipe non-blocking was not a good idea: xdoio gets the read error EAGAIN and stops there.
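A common way to handle a reader that goes away, and to stay quiet about it as the "broken pipe" commit above suggests, is to ignore SIGPIPE and treat EPIPE from write() as a normal stop condition. The sketch below assumes a blocking descriptor; `write_all_or_epipe` is a hypothetical name and the code is not taken from qarsh.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes to a blocking pipe whose reader may go away.
 * Returns 0 when everything was written, 1 when the reader has closed
 * the pipe (EPIPE, deliberately not reported), -1 on any other failure.
 */
int write_all_or_epipe(int fd, const char *buf, size_t len)
{
    /* Without this, writing to a closed pipe kills the process with
     * SIGPIPE. A real program would normally do this once at startup. */
    signal(SIGPIPE, SIG_IGN);

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            if (errno == EPIPE)
                return 1;           /* reader is gone: stop quietly */
            fprintf(stderr, "write: %s\n", strerror(errno));
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

Keeping the descriptor blocking matches the second half of the message: a non-blocking pipe shared with the reader can surface EAGAIN on the reader's side.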
* Creat a thin logging layer | Nathan Straz | 2013-09-19 | 1 | -3/+12
| When qarshd is run via xinetd, stderr still goes out the socket and messages from sockutil.c or qarsh_packet.c can interfere with the protocol. Create a thin wrapper which qacp and qarsh can send to stderr and qarshd can send to syslog.
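One possible shape for such a wrapper is a single printf-style function whose destination is chosen once per program. The names `qlog_set_dest` and `qlog` and the `QLOG_*` constants are assumptions made for this sketch; the actual layer in the qarsh tree may be organized differently.

```c
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical destinations: clients log to stderr, the daemon to syslog. */
enum qlog_dest { QLOG_STDERR, QLOG_SYSLOG };

static enum qlog_dest qlog_current = QLOG_STDERR;

/* Select where later qlog() calls send their output. */
void qlog_set_dest(enum qlog_dest d)
{
    qlog_current = d;
}

/*
 * Format a message and send it to the selected destination.
 * Returns the formatted length as reported by vsnprintf(); messages
 * longer than the local buffer are truncated.
 */
int qlog(const char *fmt, ...)
{
    char buf[1024];
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    if (n < 0)
        return n;

    if (qlog_current == QLOG_SYSLOG)
        syslog(LOG_ERR, "%s", buf);   /* keeps text off the socket */
    else
        fputs(buf, stderr);
    return n;
}
```

Shared code then calls `qlog()` everywhere, and only the daemon's startup path calls `qlog_set_dest(QLOG_SYSLOG)`, so library messages can no longer leak into the protocol stream.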
* Check all returns from send_packet in qarsh | Nathan Straz | 2013-09-17 | 1 | -8/+41
|
* Say something if we can't write to stdout | Nathan Straz | 2013-09-17 | 1 | -0/+2
|
* Increase buffer and packet sizes | Nathan Straz | 2013-09-16 | 1 | -1/+1
| This coordinates the buffer sizes with the max packet size. qarshd and qarsh will probably break if this value does not match between client and server builds.
|
| Also increase the value to reduce overhead. A max packet size of 16k only yields 40MB/s. Increase that to 128k and we can do 500MB/s.
* Only check if stdin is a tty once | Nathan Straz | 2013-09-16 | 1 | -1/+2
|
* Don't need to copy this string | Nathan Straz | 2013-09-11 | 1 | -1/+1
|
* Fix up stdin handling | Nathan Straz | 2013-09-11 | 1 | -18/+27
|
* Don't bother freeing remuser before we exit | Nathan Straz | 2013-09-11 | 1 | -1/+0
| If the user is specified as part of the host, we don't need to free it, and if it was a separate option, it will get freed when the process ends.
* Get commands running over one socket | Nathan Straz | 2013-09-11 | 1 | -130/+94
| Added a new packet to limit data sent from the other side.
* Clean up unused variable warning | Nathan Straz | 2013-09-11 | 1 | -3/+0
|
* Move packet sequence numbering to send_packet | Nathan Straz | 2013-09-11 | 1 | -4/+0
|
* Move to a new port for the new protocol | Nathan Straz | 2013-09-11 | 1 | -1/+1
| I don't see any way to coexist with the old "protocol"
* Fill in sequence numbers with a real sequence | Nathan Straz | 2012-12-18 | 1 | -2/+4
|
* Spotted a missing free | Nathan Straz | 2012-04-19 | 1 | -0/+1
|
* Move error message and exit from signal handler. | Nathan Straz | 2012-02-20 | 1 | -2/+6
|
* Check return of send_package and exit on error | Nathan Straz | 2012-02-20 | 1 | -1/+6
|
* Fix exit code for connection failures. | Nathan Straz | 2012-01-26 | 1 | -1/+1
|
* Merge branch 'ipv6' of ssh://sts-a//home/msp/djansa/src/git/qarsh into ipv6 | Nathan Straz | 2010-09-30 | 1 | -4/+5
|\
| | Conflicts: qarsh.c sockutil.c
| * First crack at ipv6/ipv4 agnostic qarsh/qacp. | Dean Jansa | 2010-09-28 | 1 | -4/+5
| |
* | Wait up to 30 second to establish a connection | Nathan Straz | 2010-09-28 | 1 | -1/+2
|/
| The user-specified time is for holding a connection only. If the user chooses too small a time, for example when rebooting a node, the initial connection may fail.
* Close file descriptors left open by parent process | Nathan Straz | 2010-04-28 | 1 | -0/+4
| Running things in parallel with pthreads in perl can lead to file descriptor leaks which may cause hangs in qarsh.
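A typical defence against inherited descriptors is to close everything above stderr early in main(). The helper below is a hedged sketch with a hypothetical name, `close_fd_range`; it takes the upper bound as a parameter rather than guessing what limit qarsh uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/*
 * Close every descriptor in [lowfd, limit). Descriptors that are not
 * open simply fail with EBADF, which is ignored.
 * Returns how many descriptors were actually closed.
 */
int close_fd_range(int lowfd, int limit)
{
    int fd;
    int closed = 0;

    for (fd = lowfd; fd < limit; fd++) {
        if (close(fd) == 0)
            closed++;
    }
    return closed;
}
```

A caller would usually pass 3 as `lowfd` to keep stdin, stdout, and stderr, and derive `limit` from `sysconf(_SC_OPEN_MAX)` with a sensible cap, since that value can be very large on some systems.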
* Only look up local username if remote not specified | Nathan Straz | 2009-10-08 | 1 | -4/+4
| In rare cases the getpwuid() call will fail because of a YP or LDAP timeout. If we're not using the local username we shouldn't even bother looking it up.
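The change amounts to making the password database lookup lazy. A minimal sketch, using a hypothetical helper name `effective_remote_user` rather than anything from qarsh.c:

```c
#define _POSIX_C_SOURCE 200809L
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Pick the user name to send to the remote side. The local account is
 * looked up only when the caller did not request a remote user, so a
 * slow or failing YP/LDAP lookup cannot affect the common case.
 * Returns NULL if the fallback lookup fails.
 */
const char *effective_remote_user(const char *requested)
{
    struct passwd *pw;

    if (requested != NULL)
        return requested;           /* no lookup needed at all */

    pw = getpwuid(getuid());
    return pw != NULL ? pw->pw_name : NULL;
}
```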
* [qarsh] Handle a very broken qarshd | Nathan Straz | 2009-03-27 | 1 | -1/+5
| If qarshd is broken enough that it can't load libxml2.so, it won't return an XML packet which we can parse. set_remote_user() really needs to error out if we didn't get a packet back.
* [qarsh] Remove continue which could make qarsh hang | Nathan Straz | 2009-01-08 | 1 | -4/+0
| I don't know how, but I found one instance of qarsh looping through the pselect loop with a one second timeout. If the command has exited and the output file descriptors are all closed, we fall onto this continue which prevents us from getting to the break at the end of the loop. The only thing the continue skips over is that check, which we really should perform, so remove the continue.
* Don't wait until after the pselect to see if we're done. | Nathan Straz | 2008-11-07 | 1 | -7/+8
| All the actions which need to be done before we exit are done after the pselect. Waiting until after the next pselect can cause us to sit for a second before we exit, which slows down things which use qarsh.
* [qarsh] Fix double-free and command hang | Nathan Straz | 2008-10-20 | 1 | -11/+14
| We need to test all exit conditions at once so we fall back into the hbeat code. I was falling into a case when running "reboot -fin &" where the command would exit, but the sockets would not close and we weren't getting to hbeat() to detect the reboot.
|
| In another case which I can't completely explain, we were getting a double-free error from glibc in the qpfree() at the end of run_remote_cmd(). Instead of waiting until the very end to read the exit status, save it off as soon as we get the packet and use cmd_exitted to determine if we have an exit code or if something went horribly wrong.
* Let's wait one second instead of spinning hard on pselect(). | Nate Straz | 2008-09-23 | 1 | -1/+1
|
* Bob found a problem with commands that exit quickly, the pselect times out so | Nate Straz | 2008-09-23 | 1 | -6/+10
| quickly that we do a hbeat then go back to the pselect.
* Make sure qarsh waits for all output from the remote host. | Nate Straz | 2008-09-23 | 1 | -1/+1
|
* Fix up copyright dates. | Nate Straz | 2008-09-23 | 1 | -1/+1
|
* Add write file descriptors to main select() call. | Nate Straz | 2008-09-23 | 1 | -21/+35
| When running rsync on an existing directory structure, rsync may be too busy to read everything that qarsh is writing to it from the remote rsync daemon. Create a buffer for each of stdin, stdout, and stderr and keep it around until we are able to write it, holding off further reads until it can be written. We still don't handle partial writes.
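The buffering scheme described here can be sketched for a single stream: while the buffer holds unwritten data, only the destination is watched for writability, and the source is not read again until the buffer drains. `struct relay` and `relay_step` are hypothetical names for this illustration. Unlike the commit, which notes that partial writes are not handled, this sketch keeps the unwritten tail in the buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* One direction of traffic with a holding buffer (hypothetical type). */
struct relay {
    int src;            /* descriptor to read from */
    int dst;            /* descriptor to write to */
    char buf[4096];
    size_t len;         /* bytes in buf still waiting to be written */
};

/*
 * Perform one select()-driven step.
 * Returns 1 if data was read or written, 0 on timeout,
 * -1 on EOF or error.
 */
int relay_step(struct relay *r, int timeout_sec)
{
    fd_set rfds, wfds;
    struct timeval tv;
    ssize_t n;
    int maxfd, ready;

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    if (r->len > 0) {
        /* Data pending: wait for dst, and hold off reading src. */
        FD_SET(r->dst, &wfds);
        maxfd = r->dst;
    } else {
        FD_SET(r->src, &rfds);
        maxfd = r->src;
    }
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    ready = select(maxfd + 1, &rfds, &wfds, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;

    if (r->len > 0) {
        n = write(r->dst, r->buf, r->len);
        if (n < 0)
            return -1;
        /* Keep whatever a partial write left behind. */
        memmove(r->buf, r->buf + n, r->len - (size_t)n);
        r->len -= (size_t)n;
        return 1;
    }

    n = read(r->src, r->buf, sizeof r->buf);
    if (n <= 0)
        return -1;
    r->len = (size_t)n;
    return 1;
}
```

A full client would hold one such buffer per stream (stdin, stdout, stderr) and build a single pair of fd sets covering all of them before each select() call.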
* We don't need to check the heartbeat anymore when the command | Nate Straz | 2008-09-23 | 1 | -4/+2
| has exited and we've processed all the output.
* Increase the buffer size we use to read output from the host. This should | Nate Straz | 2008-09-23 | 1 | -6/+7
| help get all output before we exit. There is still a race if the cmdexit packet returns before all output where we could truncate output.
* Update copyright dates. | Nate Straz | 2008-09-23 | 1 | -1/+1
|
* We need to reset the signal handlers and sigmask so the raise() works. | Nate Straz | 2008-09-23 | 1 | -0/+21
|
* Push processing of the remote command status to main() so it can be | Nate Straz | 2008-09-23 | 1 | -9/+10
| reproduced properly by qarsh. Return 127 as an exit code on internal error cases.
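Taken together with the raise() commit just above, reproducing a remote status locally could look like the sketch below. It assumes the status is available in the form wait() produces, which this log does not confirm, and `die_like` is a hypothetical name. The 127 fallback follows the message above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * End the current process the same way a wait()-style status says
 * another process ended. Never returns.
 */
void die_like(int status)
{
    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        struct sigaction sa;
        sigset_t set;

        /* Restore the default action; an installed handler or
         * SIG_IGN would swallow the signal instead of killing us. */
        sa.sa_handler = SIG_DFL;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        sigaction(sig, &sa, NULL);

        /* Unblock it; a blocked signal would only stay pending. */
        sigemptyset(&set);
        sigaddset(&set, sig);
        sigprocmask(SIG_UNBLOCK, &set, NULL);

        raise(sig);
        /* Reached only if the default action does not terminate. */
    }
    if (WIFEXITED(status))
        exit(WEXITSTATUS(status));

    exit(127);                      /* internal error case */
}
```

Both resets matter: with a handler still installed or the signal still blocked, raise() returns normally and the caller never sees the process die from the signal.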
* Add copyright notices and GPL header | Nate Straz | 2008-09-23 | 1 | -0/+17
|
* When no args were given, argv[0] would return NULL and the program | Nate Straz | 2008-09-23 | 1 | -7/+6
| name in the usage output would show "(null)." qarsh isn't called anything else so just hard code qarsh in the usage message.
* Fix a minor whitespace issue | Nate Straz | 2008-09-23 | 1 | -2/+2
|
* rsync puts the hostname before the -l <user> arg so we need to act more like | Nate Straz | 2008-09-23 | 1 | -4/+19
| OpenSSH. When OpenSSH gets to the end of the options and hasn't found a host name yet, it takes the next arg as the hostname and restarts parsing the command line. Now we do too.
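The parsing order described here can be illustrated with a small hand-rolled parser that understands only `-l <user>`. It is a simplified sketch: it does not use getopt(), does not reflect qarsh's full option set, and `struct parsed` and `parse_args` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct parsed {
    const char *host;
    const char *user;
    int cmd_index;      /* index in argv of the first command word */
};

/*
 * Accepts both "qarsh -l user host cmd..." and
 * "qarsh host -l user cmd...". Options are consumed until the first
 * non-option word; if no host is known yet, that word becomes the host
 * and option parsing restarts. The next non-option word starts the
 * command. Returns 0 on success, -1 on a usage error.
 */
int parse_args(int argc, char **argv, struct parsed *p)
{
    int i = 1;

    p->host = NULL;
    p->user = NULL;
    p->cmd_index = argc;

    for (;;) {
        /* Consume any options at the current position. */
        while (i < argc && argv[i][0] == '-' && argv[i][1] != '\0') {
            if (strcmp(argv[i], "-l") != 0 || i + 1 >= argc)
                return -1;          /* unknown option or missing user */
            p->user = argv[i + 1];
            i += 2;
        }
        if (p->host != NULL)
            break;                  /* the rest is the remote command */
        if (i >= argc)
            return -1;              /* ran out of args without a host */
        p->host = argv[i++];        /* take the host, then parse again */
    }
    p->cmd_index = i;
    return 0;
}
```

Because parsing stops at the first word after the host, options that belong to the remote command (for example `ls -l`) are left untouched.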
* Doc the QARSH_TIMEOUT env var | Dean Jansa | 2008-09-23 | 1 | -0/+1
|