trinity-devel@lists.pearsoncomputing.net

Month: March 2014

Re: [trinity-devel] Bug 1902 - tdeio_imap will exhaust available imap (anvil) if kmail open

From: François Andriot <francois.andriot@...>
Date: Mon, 03 Mar 2014 21:06:54 +0100
On 03/03/2014 15:17, David C. Rankin wrote:
> Yes, I saw where you wrote that opening a 'single' remote file with a unique URL
> does not generate additional tdeio_x slaves. I have confirmed that if I use
> konqueror and type the complete URL:
>
> sftp://somehost.tld/path/to/a/filename.ext
>
> No stale tdeio_sftp processes are created. I have also confirmed that the
> tdeio_http processes are ultimately killed by something (presumably the failsafe
> idle_timeout), but I don't think that behavior is correct.

It looks like tdeio_http is a special case, with hardcoded timeouts
that are longer than the default ones.

>
> The dirlist on remote hosts does look like it is part of the problem. It's like
> TDE loses track of all the tdeio_x processes created to build the remote
> '/path/to/some/' before getting to 'filename.ext'
>
>


To be more precise, TDE does not lose track of the tdeio processes. The
tdeio scheduler is always aware of its slaves.
The actual problem is that the tdeio scheduler never receives the "job
is finished" notification from some slaves.
So it considers those slaves to be eternally busy and keeps spawning
new ones ...

The nominal scenario looks like:
1) An application requests a URL from the tdeio scheduler (e.g. konqueror
asks for "directory listing for sftp://remotehost/").
2) The tdeio scheduler instantiates a "job".
3) The job looks for an idle "slave" that can do the job (e.g. correct
protocol), uses one if it exists, or else asks the scheduler to
instantiate a new slave.
4) The slave spawns a 3rd party process (ssh in my case) and waits for
text output. (note: some slaves do the job directly without spawning
a 3rd party process)
5) The 3rd party process does its job (remote directory listing for
example) and writes output to the slave.
6) After the command is complete, the slave stops receiving data
because the 3rd party process no longer writes anything.
7) The slave sends "finished" to the job.
8) The job sends "finished" to the scheduler.
9) The scheduler deletes the job and puts the slave in the "idle slave
list" so that it can be reused by another job, or killed after
some minutes of idleness.
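
To make the nominal flow easier to follow, here is a toy Python model of
steps 1 to 9. It is only a sketch: the class and method names (Scheduler,
Job, Slave, request, job_finished, ...) are made up for illustration and
are not the real tdeio classes, and the "3rd party process" is replaced by
a plain string.

```python
class Slave:
    """Hypothetical stand-in for a tdeio slave (not the real class)."""

    def __init__(self, protocol):
        self.protocol = protocol

    def run(self, job):
        # Steps 4-6: do the work. A real slave might spawn ssh here;
        # this sketch just fabricates a result string.
        job.result = "listing of " + job.url
        # Step 7: the slave tells the job it has finished.
        job.slave_finished(self)


class Job:
    """Hypothetical job object created by the scheduler (step 2)."""

    def __init__(self, scheduler, url):
        self.scheduler = scheduler
        self.url = url
        self.protocol = url.split("://", 1)[0]
        self.result = None

    def slave_finished(self, slave):
        # Step 8: the job forwards "finished" to the scheduler.
        self.scheduler.job_finished(self, slave)


class Scheduler:
    """Hypothetical scheduler tracking idle and busy slaves."""

    def __init__(self):
        self.idle = []
        self.busy = []

    def acquire_slave(self, protocol):
        # Step 3: reuse an idle slave for this protocol, else create one.
        for slave in self.idle:
            if slave.protocol == protocol:
                self.idle.remove(slave)
                break
        else:
            slave = Slave(protocol)
        self.busy.append(slave)
        return slave

    def request(self, url):
        # Steps 1-2: an application asks for a URL, a job is created.
        job = Job(self, url)
        slave = self.acquire_slave(job.protocol)
        slave.run(job)
        return job

    def job_finished(self, job, slave):
        # Step 9: the slave goes back to the idle list for reuse.
        self.busy.remove(slave)
        self.idle.append(slave)


sched = Scheduler()
for _ in range(3):
    sched.request("sftp://remotehost/")
print(len(sched.idle), len(sched.busy))  # 1 0
```

With every "finished" delivered, three consecutive requests are served by
one and the same slave, which ends up back in the idle list.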

What happens with my "kdirlist" problem (and probably your imap problem
too) is that step 7 never occurs.
For an unknown reason, the slave, after having received the correct
data, never notifies the job that it has finished. The job then never
notifies the scheduler, so the scheduler thinks the slave is still
active and does not mark it as "idle" ... and here is our stale
tdeioslave ... (note: the slave eventually gets killed if the remote
host closes the network connection for idleness ... but it looks like
this does not happen with the ssh protocol)
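
The effect of a missing step 7 can be illustrated with a small counting
sketch in Python. This is an assumption-laden simplification: slaves are
reduced to two counters, and the simulate function and its parameters are
invented for the example, not taken from the tdeio code.

```python
def simulate(requests, slave_reports_finished):
    """Count idle and busy slaves after a number of sequential requests.

    Hypothetical model: each request reuses an idle slave if there is
    one, otherwise a new slave is spawned. A slave only returns to the
    idle pool if it reports "finished" (step 7).
    """
    idle = 0
    busy = 0
    for _ in range(requests):
        if idle > 0:
            idle -= 1      # reuse an idle slave
        busy += 1          # reused or newly spawned slave is now busy
        if slave_reports_finished:
            busy -= 1      # scheduler marks the slave as idle again
            idle += 1
    return idle, busy


print(simulate(5, True))   # (1, 0)
print(simulate(5, False))  # (0, 5)
```

When "finished" always arrives, one slave is enough for all five requests.
When it never arrives, every request leaves one more slave marked busy,
which matches the pile of stale slaves seen here.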

I'm currently looking into the cache mechanism of the "kdirlist" job class.
I believe (to be confirmed) that when kdirlist uses its internal cache,
it still spawns a slave but does NOT use it at all, since it already
has the data it is looking for in its cache.
Then it returns the cached data and ignores the spawned slave, which
sits there forever, waiting for a query from the job that never comes ...
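
A rough Python sketch of this suspicion, which is still unconfirmed. The
DirLister class and all of its methods are hypothetical and do not mirror
the real kdirlist code. list_suspected acquires a slave before checking
the cache, list_cache_first shows one possible alternative ordering for
comparison.

```python
class DirLister:
    """Hypothetical directory lister with a cache (not the real class)."""

    def __init__(self):
        self.cache = {}
        self.busy_slaves = []

    def _acquire_slave(self):
        slave = object()   # placeholder for a spawned slave
        self.busy_slaves.append(slave)
        return slave

    def _release_slave(self, slave):
        self.busy_slaves.remove(slave)

    def _fetch(self, url):
        # Stand-in for the remote listing done through the slave.
        return [url + "filename.ext"]

    def list_suspected(self, url):
        # Suspected behaviour: the slave is acquired unconditionally.
        slave = self._acquire_slave()
        if url in self.cache:
            # Cache hit: the slave is never used and never released.
            return self.cache[url]
        entries = self._fetch(url)
        self.cache[url] = entries
        self._release_slave(slave)
        return entries

    def list_cache_first(self, url):
        # Alternative ordering: no slave is acquired on a cache hit.
        if url in self.cache:
            return self.cache[url]
        slave = self._acquire_slave()
        entries = self._fetch(url)
        self.cache[url] = entries
        self._release_slave(slave)
        return entries


suspected = DirLister()
for _ in range(3):
    suspected.list_suspected("sftp://remotehost/path/")
print(len(suspected.busy_slaves))    # 2

cache_first = DirLister()
for _ in range(3):
    cache_first.list_cache_first("sftp://remotehost/path/")
print(len(cache_first.busy_slaves))  # 0
```

In this model the first listing is a cache miss and releases its slave,
while each later cache hit leaves one unused slave behind.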

Francois