Bug #7162
capsule: synchronize command never times out/silently fails.
Description
Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1102763
Description of problem:
If a capsule runs into an issue that keeps a sync from completing, nothing indicates this to the user.
Version-Release number of selected component (if applicable):
Satellite-6.0.3-RHEL-6-20140528.4
How reproducible:
Unsure.
Steps to Reproduce:
1. Attempt to sync Satellite content to a capsule. It may (or may not) help to reproduce this if the two servers are in widely separated geographical locations.
2. Wait
3. View results.
Actual results:
During synchronization the user sees no feedback other than a progress bar that never moves (in my case it stayed at 50%).
The pulp logs on the Satellite server show:
May 29 15:48:44 ibm-x3550m3-07 pulp: pulp.server.async.scheduler:ERROR: Workers 'reserved_resource_worker-23@ibm-x3550m3-07.lab.eng.brq.redhat.com' has gone missing, removing from list of workers
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: 7461b684-4048-4e72-94dd-3b82956c6fab
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: Traceback (most recent call last):
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/gofer/transport/qpid/consumer.py", line 113, in get
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: return self.__receiver.fetch(timeout=timeout)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "<string>", line 6, in fetch
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 1030, in fetch
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: self._ecwait(lambda: self.linked)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: result = self._ewait(lambda: self.closed or predicate(), timeout)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 994, in _ewait
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: self.check_error()
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 983, in check_error
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: raise self.error
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: NotFound: no such queue: pulp.task
Expected results:
The log above may be related to the cause, but it is not the main issue. The main issue is that the synchronizer never "gives up" or reports a problem. The sync should eventually time out or indicate the failure.
Additional info:
Related issues
Associated revisions
Fixes #7162 - timeout capsule sync task
refs #7162 - fixing capsule sync timeout
History
#1
Updated by Brad Buckingham over 7 years ago
- Assignee set to Brad Buckingham
- Triaged changed from No to Yes
#2
Updated by The Foreman Bot over 7 years ago
- Status changed from New to Ready For Testing
- Target version set to 63
- Pull request https://github.com/Katello/katello/pull/4595 added
- Pull request deleted ()
#3
Updated by Eric Helms over 7 years ago
- Target version changed from 63 to 55
#4
Updated by Brad Buckingham over 7 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 0 to 100
Applied in changeset katello|f778714b1e934857301ec977c31ec3e9075a3c4a.
#5
Updated by Eric Helms over 7 years ago
- Legacy Backlogs Release (now unused) set to 13
#6
Updated by Steve Loranz about 7 years ago
- Related to Bug #10009: Trying to add a repo with a missing/unresponsive capsule hangs indefinitely/for a very long time. added
#7
Updated by Steve Loranz about 7 years ago
- Status changed from Closed to Assigned
- Assignee changed from Brad Buckingham to Steve Loranz
- Target version changed from 55 to 69
- % Done changed from 100 to 30
- Legacy Backlogs Release (now unused) changed from 13 to 23
The fix failed acceptance testing and was reopened.
#8
Updated by Eric Helms about 7 years ago
- Target version deleted (69)
#9
Updated by Eric Helms about 7 years ago
- Legacy Backlogs Release (now unused) changed from 23 to 51
#10
Updated by Eric Helms about 7 years ago
- Related to Bug #10295: Capsule syncing should timeout if it is not picked up within a certain amount of time added
#11
Updated by Eric Helms about 7 years ago
- Legacy Backlogs Release (now unused) changed from 51 to 55
#12
Updated by The Foreman Bot almost 7 years ago
- Status changed from Assigned to Ready For Testing
#13
Updated by dustin tsang almost 7 years ago
- Status changed from Ready For Testing to Closed
- % Done changed from 30 to 100
Applied in changeset katello|f6405f285efe6c40622ee1aa6ebc2c3d7307f4ba.
#14
Updated by Eric Helms almost 7 years ago
- Assignee changed from Steve Loranz to dustin tsang
#15
Updated by dustin tsang almost 7 years ago
New pull request to ensure that the task times out based on configuration (default 12 hours):
https://github.com/Katello/katello/pull/5278
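The timeout behaviour described above can be sketched roughly as follows. This is an illustration only, not the Katello implementation (which is Ruby): `wait_for_sync`, `SyncTimeout`, `poll_status` and the status strings are hypothetical names, and only the 12-hour default comes from this comment.

```python
import time

# 12 hours in seconds, matching the default mentioned in this comment.
DEFAULT_SYNC_TIMEOUT = 12 * 60 * 60


class SyncTimeout(Exception):
    """Raised when the sync task does not finish before the deadline."""


def wait_for_sync(poll_status, timeout=DEFAULT_SYNC_TIMEOUT, interval=10.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll a sync task until it leaves the "running" state or the timeout elapses.

    poll_status is a hypothetical callable that returns "running" while the
    task is in progress and any other string once it has finished.
    """
    deadline = clock() + timeout
    while True:
        status = poll_status()
        if status != "running":
            return status
        if clock() >= deadline:
            # Give up instead of waiting forever on a stuck task.
            raise SyncTimeout(
                "capsule sync did not finish within %s seconds" % timeout)
        sleep(interval)
```

The clock and sleep functions are passed in so the deadline logic can be exercised without actually waiting.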
fixes #7162 / BZ 1102763 - capsule - treat task as failed if sync times out with capsule
Without this change, when an error (such as timeout) occurs attempting to
'capsule content synchronize' an environment, the synchronize task is reported
as successful. E.g. from cli, this might look like:
hammer> capsule content synchronize --id 3 --environment-id 5
[.....................................................................] [100%]
Task 6c80df66-af10-44aa-9eca-c96992148811: success
With this change, if an error occurs that is reported back from pulp as
succeeded=false, we'll treat this as an error (because it is). In
this scenario, the cli might look like:
hammer> capsule content synchronize --id 3 --environment-id 5
[.....................................................................] [100%]
Task 2246bfb5-131f-4171-a7c3-6e16e3276ddd: warning
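The idea in the commit message, treating a result reported with succeeded=false as an error rather than a success, could look something like the sketch below. It assumes a hypothetical result shape (a list of dicts with a boolean `succeeded` key) and hypothetical names `check_sync_results` and `CapsuleSyncError`; the actual payload and the actual fix in Katello may differ.

```python
class CapsuleSyncError(Exception):
    """Raised when at least one reported sync result did not succeed."""


def check_sync_results(results):
    """Raise if any result is missing 'succeeded' or reports it as false.

    results is assumed to be a list of dicts such as
    {"repo": "...", "succeeded": True}; this shape is an assumption made
    for illustration.
    """
    failed = [r for r in results if not r.get("succeeded", False)]
    if failed:
        raise CapsuleSyncError(
            "%d of %d sync results did not succeed" % (len(failed), len(results)))
    return results
```

A missing `succeeded` key is counted as a failure here, so an incomplete response is not silently reported as success.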