trinity-devel@lists.pearsoncomputing.net

Month: March 2014

Re: [trinity-devel] Bug 1902 - tdeio_imap will exhaust available imap (anvil) if kmail open

From: "David C. Rankin" <drankinatty@...>
Date: Wed, 05 Mar 2014 23:45:24 -0600

On 03/03/2014 02:06 PM, François Andriot wrote:
> To be more precise, TDE does not lose track of the tdeio process. The tdeio
> scheduler is always aware of its slave threads.
> The actual problem is that the tdeio scheduler never receives the "job is
> finished" notification from some slaves.
> So it considers such a slave eternally busy and keeps spawning new ones ...
> 
> The nominal scenario looks like:
> 1) an application requests a URL from the tdeio scheduler (e.g. konqueror asks
> "directory listing for sftp://remotehost/")
> 2) the tdeio scheduler instantiates a "job"
> 3) the job looks for an idle "slave" that can do the job (e.g. correct
> protocol), uses one if it exists, or else asks the scheduler to instantiate a
> new slave.
> 4) the slave spawns a 3rd party process (ssh in my case) and waits for text
> output. (note: some slaves do the job directly without spawning a 3rd party
> process)
> 5) The 3rd party process does its job (remote directory listing for example) and
> writes output to the slave.
> 6) After the command is complete, the slave ceases receiving data because
> nothing is written anymore by the 3rd party process.
> 7) The slave sends "finished" to the job.
> 8) the job sends "finished" to the scheduler.
> 9) the scheduler deletes the job and puts the slave in the "idle slave list" so
> that it can be reused by another job, or will be killed after some minutes of
> idleness.
> 
> What happens with my "kdirlist" problem (and probably your imap problem too), is
> that step 7 never occurs.
> For an unknown reason, the slave, after having received the correct data, never
> notifies the job that it has finished, then the job never notifies the
> scheduler, then the scheduler thinks the slave is still active and does not mark
> it as "idle" ... and here is our stale tdeioslave ... (note: the slave
> eventually gets killed if the remote host closes network connection for idleness
> ... but it looks like this does not happen with ssh protocol)
> 
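
A minimal sketch of the bookkeeping described above, to make the effect of a
missing step 7 concrete. This is a toy model in Python, not the tdeio code:
the names Scheduler, Slave, acquire, job_finished and run_requests are all
hypothetical, and steps 4 to 6 (the actual work) are left out.

```python
class Slave:
    """Toy stand-in for a tdeioslave; only the protocol matters here."""

    def __init__(self, protocol):
        self.protocol = protocol


class Scheduler:
    """Toy scheduler that tracks idle and busy slaves (hypothetical model)."""

    def __init__(self):
        self.idle = []    # slaves that can be reused (step 9)
        self.busy = []    # slaves the scheduler believes are still working
        self.spawned = 0  # total number of slaves ever created

    def acquire(self, protocol):
        # Step 3: reuse an idle slave with the right protocol, else spawn one.
        for slave in self.idle:
            if slave.protocol == protocol:
                self.idle.remove(slave)
                self.busy.append(slave)
                return slave
        slave = Slave(protocol)
        self.spawned += 1
        self.busy.append(slave)
        return slave

    def job_finished(self, slave):
        # Steps 8 and 9: the job reported "finished", the slave becomes idle.
        self.busy.remove(slave)
        self.idle.append(slave)


def run_requests(count, slave_reports_finished):
    """Issue `count` requests; return (slaves spawned, slaves still busy)."""
    scheduler = Scheduler()
    for _ in range(count):
        slave = scheduler.acquire("sftp")
        # Steps 4 to 6 (the real work) are omitted in this model.
        if slave_reports_finished:
            scheduler.job_finished(slave)  # step 7 happened
    return scheduler.spawned, len(scheduler.busy)
```

In this model, run_requests(5, True) reuses a single slave, while
run_requests(5, False) ends with five slaves that are all still marked busy,
which is the same pattern as the stale tdeioslave processes.
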
> I'm currently looking into the cache mechanism of the "kdirlist" job class.
> I believe (to be confirmed) that when kdirlist uses its internal cache, it still
> spawns a slave but does NOT use it at all, since it already has the data it is
> looking for in its cache.
> Then it returns the cached data and ignores the spawned slave, which sits there
> forever, waiting for a query from the job that never comes  ...
> 
> Francois
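
The cache hypothesis above (still to be confirmed, as Francois says) can be
sketched as follows. This is only an illustration in Python of the suspected
control flow, not the real kdirlist code: DirLister, list_dir, handle and
stale_slaves are hypothetical names, and the unconditional spawn is the
assumption being illustrated.

```python
class Slave:
    """Toy slave that only reports "finished" after handling a command."""

    def __init__(self):
        self.commands_received = 0
        self.finished_sent = False

    def handle(self, url):
        # Stand-in for steps 4 to 7: do the work, then send "finished".
        self.commands_received += 1
        self.finished_sent = True
        return ["listing of " + url]


class DirLister:
    """Toy directory lister with an internal cache (hypothetical model)."""

    def __init__(self):
        self.cache = {}
        self.stale_slaves = []  # slaves that were spawned but never queried

    def list_dir(self, url):
        slave = Slave()  # assumption: a slave is spawned on every request
        if url in self.cache:
            # Cache hit: the slave is ignored, so it never gets a command
            # and therefore never reaches the point of sending "finished".
            self.stale_slaves.append(slave)
            return self.cache[url]
        entries = slave.handle(url)
        self.cache[url] = entries
        return entries
```

If the real code behaves like this model, every cache hit would leave one more
slave waiting for a query that never comes.
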

Francois,

  I'll say it again, "you're good!"

  Is the slave in #7 actually sending the "finished" and it never makes it to
the job? Or is it not sending "finished" at all?

  Is there any possibility that #7 does not occur because the slave does not
know where to send "finished"? What links/connects the slave to the job? Is it a
signal/slot, or some memory address that the job originally passed to the slave
in #3? Or does the slave just generate the "finished" and pass some type of job
number along with it after #6??

  Could the slave/job connection created in #3 be broken somehow such that the
reverse path in #7 no longer exists after the delay in #6?
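
  To make the two cases concrete, here is a small toy model in Python of a
callback-style link between slave and job. It is only an assumption about the
general shape of such a link, not a description of how tdeio actually connects
them; Slave, Job, connect, disconnect_all, emit_finished and run are all
hypothetical names.

```python
class Slave:
    """Toy slave with a signal-like list of "finished" listeners."""

    def __init__(self):
        self.listeners = []
        self.emitted = 0  # how many times "finished" was sent

    def connect(self, callback):
        self.listeners.append(callback)

    def disconnect_all(self):
        self.listeners.clear()

    def emit_finished(self):
        # The slave counts the emission even if nobody is listening.
        self.emitted += 1
        for callback in list(self.listeners):
            callback()


class Job:
    """Toy job that records whether it was told the slave finished."""

    def __init__(self):
        self.finished = False

    def on_slave_finished(self):
        self.finished = True


def run(break_connection):
    """Return (times "finished" was emitted, whether the job received it)."""
    slave, job = Slave(), Job()
    slave.connect(job.on_slave_finished)  # link created in step 3
    if break_connection:
        slave.disconnect_all()            # link lost before step 7
    slave.emit_finished()                 # step 7
    return slave.emitted, job.finished
```

  In this model a broken link shows up as "emitted but never received",
whereas a slave that never reaches step 7 would show no emission at all.
Logging on both sides of the real connection should tell the two apart.
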

-- 
David C. Rankin, J.D.,P.E.