[syslinux] TFTP forking thousands of processes

Nikola Ciprich nikola.ciprich at linuxbox.cz
Tue Nov 13 04:49:46 PST 2012


Hi,

I'm trying to debug a crashing server, and I'm starting to suspect the tftp
server as one of the possible culprits.
The box is a quad-core Xeon with 4GB of memory serving as an Asterisk PBX plus some
additional services (reporting, IS integration etc.). Among other things, it also
serves as the provisioning box for over a hundred SPA504 phones. The phones
check their configuration files every 60s, which means an average of a bit over
one TFTP request per second.

It has happened a few times that the box stopped replying and had to be rebooted. Checking
various logs, and especially the atop files, showed that minutes before the box became
unreachable, the number of in.tftpd processes had grown to ~4000. Checking pcap files
showed that the phones were sending file requests but were not getting any replies,
and were therefore retrying every 5s (so the tftp request rate grew to ~100 files every 5s).
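
As a rough sanity check of these numbers, here is a small sketch. It assumes
exactly 100 phones (the real count is only "over a hundred") and, for the last
figure, that every unanswered retry leaves one in.tftpd process behind and that
none of them exits. Both are assumptions made for illustration, not measurements.

```python
def requests_per_second(phones: int, interval_s: float) -> float:
    """Average request rate when each phone sends one request per interval."""
    return phones / interval_s


def seconds_to_reach(process_count: int, rate_per_s: float) -> float:
    """Time needed to accumulate process_count processes at a constant rate,
    assuming one new process per request and no process ever exiting."""
    return process_count / rate_per_s


PHONES = 100  # assumption: the post only says "over a hundred"

normal_rate = requests_per_second(PHONES, 60)  # one config check every 60s
retry_rate = requests_per_second(PHONES, 5)    # one retry every 5s

print(f"normal polling: {normal_rate:.2f} requests/s")
print(f"retry storm:    {retry_rate:.2f} requests/s")
print(f"time to ~4000 processes: {seconds_to_reach(4000, retry_rate):.0f} s")
```

Under those assumptions the retry rate is about twelve times the normal polling
rate, and ~4000 processes would pile up within a few minutes, which fits the
"minutes before the box became unreachable" seen in atop.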

We tried to simulate the problem on a test virtual guest and successfully
reproduced it:

- the VM was set up as a tftp server (tftp started as an xinetd service, details below)
- on the host, we blocked all outgoing tftp traffic using iptables.

After that, a tftp request from some other machine spawns a hanging tftp process
(running the client 1000 times spawns 1000 in.tftpd processes).
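
The client side of this test can be sketched roughly as follows. This is not the
client we actually used, just a minimal stand-in: it builds a TFTP read request
(RRQ, opcode 1, as defined in RFC 1350) and sends each one from a fresh UDP
socket, so every request arrives from a new source port and looks like a new
transfer. The host, file name and count are taken from the command line; any
values used for them are hypothetical.

```python
import socket
import struct
import sys

TFTP_PORT = 69   # standard TFTP port
OPCODE_RRQ = 1   # read request, RFC 1350


def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build an RRQ packet: 2-byte opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OPCODE_RRQ)
        + filename.encode("ascii") + b"\x00"
        + mode.encode("ascii") + b"\x00"
    )


def send_requests(host: str, filename: str, count: int, port: int = TFTP_PORT) -> None:
    """Send `count` read requests without waiting for any reply."""
    packet = build_rrq(filename)
    for _ in range(count):
        # A new socket per request gives each request its own source port.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(packet, (host, port))


if __name__ == "__main__" and len(sys.argv) == 4:
    # usage: script.py <server> <filename> <count>
    send_requests(sys.argv[1], sys.argv[2], int(sys.argv[3]))
```

Because the script never reads a reply, it behaves like a client whose answers
are being dropped, which is the situation the iptables rule creates.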

I see various problems here:
- the phones behave stupidly, retrying every 5s over and over. This seems to be
fixed by newer firmware (after 5 retries, it sleeps for a minute), but the newer
firmware has some other problems, which are not important here.

- I'm not sure whether the root cause isn't somewhere in the network stack, but
other applications seem to be communicating without problems; only tftp stops replying.

Anyway, I wonder whether forking so many processes hanging on select() is correct behaviour.
My xinetd configuration looks like this:

service tftp
{ 
	disable			= no
	socket_type		= dgram 
	protocol		= udp 
	wait			= yes 
	user			= root
	server			= /usr/sbin/in.tftpd
	server_args		= -s /tftpboot -v -v -v -c
	per_source		= 11
	cps			= 100 2 
	flags			= IPv4
	instances		= 500 
} 

From this I'd understand that no more than 500 in.tftpd processes should be spawned.
I guess tftp is forking on its own, right? Is it possible that it blocks somewhere,
unable to send replies, but keeps forking on (repeated) requests?

Does somebody have an idea where the problem could be, or how I should proceed
with debugging?
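
One simple thing that could help while debugging is to log the number of
in.tftpd processes over time and compare it with the `instances` limit. A
minimal sketch, assuming a Linux box where /proc/<pid>/comm is available; the
function name and the `proc_root` parameter are made up for this example:

```python
import os


def count_processes(name: str, proc_root: str = "/proc") -> int:
    """Count processes whose command name (/proc/<pid>/comm) equals `name`."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    count += 1
        except OSError:
            # The process exited between listdir() and open(), or comm is missing.
            continue
    return count


if __name__ == "__main__":
    print("in.tftpd processes:", count_processes("in.tftpd"))
```

Run from cron or in a loop, this would show whether the count grows past 500,
i.e. whether the extra processes are being created outside of xinetd's control.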

The box is x86_64 CentOS; I tried both the 0.49 and 5.2 versions of tftp.

I'd be very grateful for any help.

thanks in advance

with best regards

nik
-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:  +420 591 166 214
fax:   +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis at linuxbox.cz
-------------------------------------