| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
at least in important cases (not for non-direct action queues and some
other minor things). This version is definitely buggy, but may be tried
with success on a non-production system. I will continue to work on the
correctness, but needed to commit now to get a baseline.
|
| |
|
|
|
|
|
|
|
|
| |
We now get threads that block inside a retry loop to terminate
without the need to cancel the thread. Avoiding cancellation
helps keep system complexity minimal and thus provides for better
stability. This also solves some issues with improper shutdown while
inside an action retry loop.
|
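A minimal sketch of that approach, with hypothetical names (`bShutdownRequested` and `doAction()` are illustrative, not the actual rsyslog symbols): the worker re-checks a shutdown flag on every retry pass and exits on its own, so pthread_cancel() is never needed.

```c
#include <pthread.h>
#include <unistd.h>

static volatile int bShutdownRequested = 0;   /* set by the shutdown path */

/* stand-in for the real output action; returns 0 on success, -1 on failure */
static int doAction(void) { return -1; }

static void *workerRetryLoop(void *arg)
{
    (void)arg;
    while (!bShutdownRequested) {
        if (doAction() == 0)
            break;              /* action succeeded, leave the loop */
        usleep(100 * 1000);     /* brief pause, then re-check the flag and retry */
    }
    return NULL;                /* thread exits on its own; no pthread_cancel() */
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, workerRetryLoop, NULL);
    sleep(1);
    bShutdownRequested = 1;     /* request cooperative termination */
    pthread_join(tid, NULL);
    return 0;
}
```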
| |
|
|
|
|
| |
code did not compile after merge from v4
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
runtime/Makefile.am
runtime/atomic.h
runtime/queue.c
runtime/queue.h
runtime/wti.c
runtime/wti.h
runtime/wtp.c
runtime/wtp.h
|
| |
| |
| |
| |
| |
| | |
replaced atomic operation emulation with new code. The previous code
seemed to have some issue and also limited concurrency severely. The
whole atomic operation emulation has been rewritten.
|
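For illustration, one common way to emulate atomic operations when compiler builtins are unavailable is to guard them with a mutex; this is a generic sketch, not the actual contents of runtime/atomic.h. A single global lock is simple but serializes all "atomic" operations, which is the kind of concurrency limit mentioned above; per-variable locks relax that.

```c
#include <pthread.h>

/* global lock guarding all emulated atomic operations (simplest variant) */
static pthread_mutex_t atomicEmuLock = PTHREAD_MUTEX_INITIALIZER;

static inline int emuAtomicIncAndFetch(int *p)
{
    int v;
    pthread_mutex_lock(&atomicEmuLock);
    v = ++(*p);
    pthread_mutex_unlock(&atomicEmuLock);
    return v;
}

static inline int emuAtomicDecAndFetch(int *p)
{
    int v;
    pthread_mutex_lock(&atomicEmuLock);
    v = --(*p);
    pthread_mutex_unlock(&atomicEmuLock);
    return v;
}
```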
| |
| |
| |
| |
| | |
This is for another prctl() call, not present in the beta version (looks like it
would make sense to stick these into a utility function)
|
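The commit does not say which prctl() option is involved, but as an illustration of the suggested utility function, here is a sketch that centralizes one Linux-specific prctl() use (setting the calling thread's name via PR_SET_NAME); the function name `uSetThreadName` is hypothetical.

```c
#include <sys/prctl.h>
#include <string.h>

/* hypothetical utility wrapper: keep the prctl() call and its
 * platform guards in one place instead of repeating them */
static void uSetThreadName(const char *name)
{
#if defined(PR_SET_NAME)
    char buf[16];                        /* kernel limit: 15 chars + NUL */
    strncpy(buf, name, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    prctl(PR_SET_NAME, buf, 0, 0, 0);
#else
    (void)name;                          /* not supported on this platform */
#endif
}
```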
| |
| |
| |
| |
| |
| |
| |
| | |
- bugfix: subtle (and usually irrelevant) issue in timeout processing:
  the timeout could be one second too early if the nanoseconds wrapped
- set a more sensible timeout for shutdown, now 1.5 seconds to complete
  processing (this also removes those cases where the shutdown message
  was not written because termination happened before it)
|
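A sketch of the carry handling alluded to in the bugfix, with an illustrative function name: when adding an offset to a timespec, nanoseconds that exceed one second must roll over into the seconds field, otherwise the effective timeout can end up short.

```c
#include <time.h>

/* compute an absolute timeout "ms" milliseconds from now,
 * carrying wrapped nanoseconds into the seconds field */
static void timeoutComp(struct timespec *abs, long ms)
{
    clock_gettime(CLOCK_REALTIME, abs);
    abs->tv_sec  += ms / 1000;
    abs->tv_nsec += (ms % 1000) * 1000000L;
    if (abs->tv_nsec >= 1000000000L) {   /* carry the wrapped nanoseconds */
        abs->tv_nsec -= 1000000000L;
        abs->tv_sec  += 1;
    }
}
```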
| |
| |
| |
| | |
issues
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| | |
Problems could happen if the queue worker needed to be cancelled
and this cancellation happened inside queue-code (including
wtp, wti). We have now solved this by disabling cancellation while
in this code and only enabling it when working inside the user consumer.
This exactly matches the use case for which cancellation may be needed.
|
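An illustrative worker fragment, assuming hypothetical names: cancellation stays disabled while queue/wtp/wti internals run and is enabled only around the call into the user consumer, the one place where a cancel may legitimately land.

```c
#include <pthread.h>

static void processBatch(void (*userConsumer)(void *), void *batch)
{
    int oldState;

    /* inside queue code: cancellation must not hit us here */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldState);

    /* ... dequeue work, do bookkeeping, etc. ... */

    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldState);
    userConsumer(batch);            /* a cancel, if needed, happens here */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldState);

    /* ... update queue state, again protected from cancellation ... */
}
```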
| |
| |
| |
| |
| |
| |
| | |
these occurred in very unusual scenarios where we had a DA-queue running
in parallel and very lengthy actions. Then, in some situations, the
shutdown could hang. The code needs some additional lab time, but
is believed to be much better than any previous version.
|
| |
| |
| |
| |
| |
| |
| |
| | |
support for increasing the probability that invalid memory accesses
fail visibly, by using a non-NULL default fill value for malloc'ed
memory (optional, only if requested by a configure option). This helps
to track down some otherwise undetected issues within the testbench
and is expected to be very useful in the future.
|
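A minimal sketch of the idea, assuming a hypothetical configure switch spelled DEBUG_MALLOC_FILL here: fresh allocations are filled with a non-zero pattern, so code that wrongly relies on zeroed memory or NULL pointers inside malloc'ed blocks fails fast instead of silently "working".

```c
#include <stdlib.h>
#include <string.h>

static void *dbgMalloc(size_t len)
{
    void *p = malloc(len);
#ifdef DEBUG_MALLOC_FILL
    if (p != NULL)
        memset(p, 0xdd, len);   /* poison pattern; any non-zero value will do */
#endif
    return p;
}
```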
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
simplified and thus sped up the queue engine, and also fixed some
potential race conditions (in very unusual shutdown conditions)
along the way. The threading model has changed considerably, so there may
be some regressions.
NOTE: the code passed basic tests, but there is still more work
and testing to be done. This commit should be treated with care.
|
| |
| |
| |
| | |
... non-working version!
|
| |
| |
| |
| |
| |
| |
| |
| | |
- bugfix: solved a potential (temporary) stall of messages when the queue was
  almost empty and little new data was added (caused the testbench to sometimes hang!)
- fixed a race condition in the testbench
- added more elaborate diagnostics to parts of the testbench
- solved a potential race inside the queue engine
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
code review brought up a few places where we may have run into a race.
They have most probably been introduced during the recent set of changes. I
did not look at older versions: because of the changed architecture, one can
not simply backport this patch.
|
| |
| |
| |
| |
| | |
This did NOT leak based on message volume. Also did some cleanup during
the commit.
|
| |
| |
| |
| |
| | |
... greater performance and was able to remove a potential trouble spot
in a cancel cleanup handler.
|
| |
| |
| |
| |
| |
| | |
...if not running in direct mode. Previous versions could run without
any active workers. This simplifies the code at a very small expense.
See v5 compatibility note document for more in-depth discussion.
|
| |
| |
| |
| |
| | |
... by utilizing the fact that the state variable needs to be modified
only sequentially during shutdown.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
... could even remove one mutex by using a better algorithm. I think I also
spotted a situation in which a hang could have happened. As I can't fix it
in v4 and below without moving to the new engine, I made no effort to test
this out. Hangs occur only during shutdown (if at all). The code changes
should also result in some mild performance improvement. There is some bug
potential, but overall the bug potential should have been greatly reduced.
|
| |
| |
| |
| | |
reducing the number of thread cancellation state changes
|
| | |
|
|\ \
| | |
| | |
| | |
| | | |
Conflicts:
tests/nettester.c
|
| | | |
|
| | |
| | |
| | |
| | |
| | | |
based on now working with detached threads. This is probably the biggest
patch in this series and carries large bug potential.
|
|/ / |
|
| |
| |
| |
| |
| |
| | |
... first commit in a series of more to come. Makes worker threads detached. Needs more
testing (will be done soon) and, if it works as expected, we can further reduce
code.
|
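A minimal sketch of starting a detached worker (function name illustrative): the thread's resources are reclaimed automatically when it exits, so the engine never needs to pthread_join() it; shutdown coordination then has to happen via queue state (counters, condition variables) rather than joins.

```c
#include <pthread.h>

static int startDetachedWorker(void *(*entry)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int r;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    r = pthread_create(&tid, &attr, entry, arg);   /* 0 on success */
    pthread_attr_destroy(&attr);
    return r;
}
```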
| |
| |
| |
| |
| |
| |
| |
| | |
- bugfix: subtle (and usually irrelevant) issue in timeout processing:
  the timeout could be one second too early if the nanoseconds wrapped
- set a more sensible timeout for shutdown, now 1.5 seconds to complete
  processing (this also removes those cases where the shutdown message
  was not written because termination happened before it)
|
|\|
| |
| |
| |
| |
| | |
Conflicts:
runtime/atomic.h
runtime/wti.c
|
| |
| |
| |
| |
| | |
... as far as I can tell, this is mostly to keep the thread debuggers
happy
|
| |
| |
| |
| |
| | |
mostly to get the thread debugger output clean of errors (plus, of course, it
makes things more deterministic)
|
| | |
|
|\|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This was a complex manual merge, especially in action.c. So if
some problems occur, this would be a good point to start
troubleshooting. I ran a couple of tests before committing and
they all went well.
Conflicts:
action.c
action.h
runtime/queue.c
runtime/queue.h
runtime/wti.c
runtime/wti.h
|
| |
| |
| |
| |
| |
| |
| | |
we usually stay long enough inside the actions, so there should be
no problem with reaching a cancellation point. Actually, if we
really need to cancel, the thread is in an output action (otherwise
it would have willingly terminated).
|
| |
| |
| |
| |
| |
| | |
... as it was not even optimal on uniprocessors any longer ;) I keep
the config directive in; maybe we can utilize it again at some later
point in time (questionable).
|
| |
| |
| |
| | |
and another problem, both introduced today. Also did some general cleanup.
|
| |
| |
| |
| | |
The enhanced testbench now runs without failures again
|
| |
| |
| |
| |
| | |
also changed DA queue mode in that the regular workers now run
concurrently.
|
| |
| |
| |
| |
| |
| | |
... in preparation for some larger changes - I need to apply some
serious design changes, as the current system does not play well
at all with ultra-reliable queues. Will do that in a totally new version.
|
| |
| |
| |
| | |
slightly improved situation, would like to save it before carrying on
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
... and also improved the test suite. There is a design issue in the
v3 queue engine that manifested in some serious problems with the new
processing mode. In v3, shutdown may take forever if a queue
runs in DA mode, is configured to preserve data AND the action fails and
retries immediately. There is no cure available for v3; it would
require doing much of the work we have done on the new engine. The window
of exposure, as one might guess from the description, is very small. That
is probably the reason why we have not seen it in practice.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
so far, the last processed message was only freed when the next
one was processed. This has been changed now. More precisely, a
better algorithm has been selected for the queue worker process, which
also involves less overhead than the previous one. The fix for
"free last processed message" was then more or less a side-effect
(easy to do) of the new algorithm.
|
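A rough sketch of the difference, using hypothetical helper names (`queueDequeue`, `consumer`, `msgDestruct` are placeholders, not the actual rsyslog functions): before, the just-processed message survived until the next loop pass; now it is released as soon as the consumer returns.

```c
#include <stddef.h>

/* hypothetical helpers, declared only for illustration */
typedef struct msg msg_t;
extern int  queueDequeue(msg_t **ppMsg);   /* returns 0 once the queue is empty */
extern void consumer(msg_t *pMsg);
extern void msgDestruct(msg_t *pMsg);

/* old scheme (simplified): the previous message was freed one pass late */
static void workerOld(void)
{
    msg_t *pMsg, *pPrev = NULL;
    while (queueDequeue(&pMsg)) {
        if (pPrev != NULL)
            msgDestruct(pPrev);
        consumer(pMsg);
        pPrev = pMsg;
    }
    if (pPrev != NULL)
        msgDestruct(pPrev);
}

/* new scheme: destroy the message as soon as the consumer returns */
static void workerNew(void)
{
    msg_t *pMsg;
    while (queueDequeue(&pMsg)) {
        consumer(pMsg);
        msgDestruct(pMsg);
    }
}
```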
| | |
|
| | |
|
| | |
|