Bug #7162

capsule: synchronize command never times out/silently fails.

Added by Brad Buckingham over 5 years ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: Foreman Proxy Content
Target version:
Difficulty:
Triaged: Yes
Bugzilla link:
Fixed in Releases:
Found in Releases:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1102763
Description of problem:
If a capsule runs into an issue that keeps syncs from completing, nothing indicates this to the user.

Version-Release number of selected component (if applicable):

Satellite-6.0.3-RHEL-6-20140528.4

How reproducible:

unsure.

Steps to Reproduce:
1. Attempt to sync Satellite content to a capsule. It may (or may not) help to reproduce this if the two servers are in widely separated geographic locations.
2. Wait
3. View results.

Actual results:

During the synchronize process the user sees essentially nothing, other than a progress bar that never moves (in my case it stayed at 50%).

In the pulp logs on sat server we see:

May 29 15:48:44 ibm-x3550m3-07 pulp: pulp.server.async.scheduler:ERROR: Workers '' has gone missing, removing from list of workers
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: 7461b684-4048-4e72-94dd-3b82956c6fab
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: Traceback (most recent call last):
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/gofer/transport/qpid/consumer.py", line 113, in get
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: return self.__receiver.fetch(timeout=timeout)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "<string>", line 6, in fetch
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 1030, in fetch
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: self._ecwait(lambda: self.linked)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: result = self._ewait(lambda: self.closed or predicate(), timeout)
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 994, in _ewait
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: self.check_error()
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 983, in check_error
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: raise self.error
May 29 15:48:52 ibm-x3550m3-07 pulp: gofer.transport.qpid.consumer:ERROR: NotFound: no such queue: pulp.task

Expected results:
The log above may be related to the cause, but it is not the main issue. The main issue is that the synchronizer never "gives up" or indicates a problem.

Additional info:


Related issues

Related to Katello - Bug #10009: Trying to add a repo with a missing/unresponsive capsule hangs indefinitely/for a very long time. (Closed, 2015-04-02)
Related to Katello - Bug #10295: Capsule syncing should timeout if it is not picked up within a certain amount of time (Resolved, 2015-04-28)

Associated revisions

Revision f778714b (diff)
Added by Brad Buckingham over 5 years ago

fixes #7162 / BZ 1102763 - capsule - treat task as failed if sync times out with capsule

Without this change, when an error (such as timeout) occurs attempting to
'capsule content synchronize' an environment, the synchronize task is reported
as successful. E.g. from cli, this might look like:

hammer> capsule content synchronize --id 3 --environment-id 5
[.....................................................................] [100%]
Task 6c80df66-af10-44aa-9eca-c96992148811: success

With this change, if an error occurs that is reported back from pulp as
succeeded=false, we'll treat this as an error (because it is). In
this scenario, the cli might look like:

hammer> capsule content synchronize --id 3 --environment-id 5
[.....................................................................] [100%]
Task 2246bfb5-131f-4171-a7c3-6e16e3276ddd: warning
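
The behavior described in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the Katello code: the `evaluate_sync` function, the `SyncOutcome` type, and the dictionary shape of a report are all hypothetical. Only the `succeeded` flag and the `success`/`warning` result names come from the text above. Treating a report with no `succeeded` key as failed is also an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Mapping


@dataclass
class SyncOutcome:
    """Hypothetical summary of a capsule sync task."""
    result: str                    # "success" or "warning", as shown by the CLI
    failed_reports: List[Mapping]  # reports that did not come back as succeeded


def evaluate_sync(reports: Iterable[Mapping]) -> SyncOutcome:
    """Report a warning if any report has succeeded == False.

    Assumption: a report without a 'succeeded' key counts as failed.
    """
    failed = [r for r in reports if not r.get("succeeded", False)]
    return SyncOutcome("warning" if failed else "success", failed)


if __name__ == "__main__":
    reports = [{"succeeded": True}, {"succeeded": False, "error": "timeout"}]
    outcome = evaluate_sync(reports)
    print(f"Task result: {outcome.result}")
```

The point of the sketch is that a single failed report is enough to change the overall task result, rather than the task reporting success whenever the call itself returns.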

Revision 0ded1754
Added by Brad Buckingham over 5 years ago

Merge pull request #4595 from bbuckingham/issue-7162

fixes #7162 / BZ 1102763 - capsule - treat task as failed if sync times out

Revision f6405f28 (diff)
Added by dustin tsang almost 5 years ago

Fixes #7162 - timeout capsule sync task

Revision c645991f
Added by dustin tsang almost 5 years ago

Merge pull request #5278 from dustints/timeout_cap_sync

Fixes #7162 - timeout capsule sync task

Revision 3a7238da (diff)
Added by Justin Sherrill almost 5 years ago

refs #7162 - fixing capsule sync timeout

Revision 8dd55d60
Added by Justin Sherrill almost 5 years ago

Merge pull request #5304 from jlsherrill/7162

refs #7162 - fixing capsule sync timeout

History

#1 Updated by Brad Buckingham over 5 years ago

  • Assignee set to Brad Buckingham
  • Triaged changed from No to Yes

#2 Updated by The Foreman Bot over 5 years ago

  • Status changed from New to Ready For Testing
  • Target version set to 63
  • Pull request https://github.com/Katello/katello/pull/4595 added
  • Pull request deleted ()

#3 Updated by Eric Helms over 5 years ago

  • Target version changed from 63 to 55

#4 Updated by Brad Buckingham over 5 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100

#5 Updated by Eric Helms over 5 years ago

  • Legacy Backlogs Release (now unused) set to 13

#6 Updated by Steve Loranz about 5 years ago

  • Related to Bug #10009: Trying to add a repo with a missing/unresponsive capsule hangs indefinitely/for a very long time. added

#7 Updated by Steve Loranz about 5 years ago

  • Status changed from Closed to Assigned
  • Assignee changed from Brad Buckingham to Steve Loranz
  • Target version changed from 55 to 69
  • % Done changed from 100 to 30
  • Legacy Backlogs Release (now unused) changed from 13 to 23

The fix failed acceptance testing and was reopened.

#8 Updated by Eric Helms almost 5 years ago

  • Target version deleted (69)

#9 Updated by Eric Helms almost 5 years ago

  • Legacy Backlogs Release (now unused) changed from 23 to 51

#10 Updated by Eric Helms almost 5 years ago

  • Related to Bug #10295: Capsule syncing should timeout if it is not picked up within a certain amount of time added

#11 Updated by Eric Helms almost 5 years ago

  • Legacy Backlogs Release (now unused) changed from 51 to 55

#12 Updated by The Foreman Bot almost 5 years ago

  • Status changed from Assigned to Ready For Testing

#13 Updated by dustin tsang almost 5 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 30 to 100

#14 Updated by Eric Helms almost 5 years ago

  • Assignee changed from Steve Loranz to dustin tsang

#15 Updated by dustin tsang almost 5 years ago

A new pull request ensures that the task times out based on configuration (default: 12 hours):
https://github.com/Katello/katello/pull/5278
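
A configurable timeout of this kind could look roughly like the sketch below. It is a generic Python illustration of polling against a deadline, not the code from the pull request: `wait_for_sync`, `SyncTimeout`, `DEFAULT_SYNC_TIMEOUT`, and the `is_done` callback are hypothetical names, and the 10-second poll interval is an arbitrary assumption. Only the 12-hour default comes from the comment above.

```python
import time
from typing import Callable

# Default taken from the comment above (12 hours), expressed in seconds.
DEFAULT_SYNC_TIMEOUT = 12 * 60 * 60


class SyncTimeout(Exception):
    """Raised when the sync does not finish before the deadline (hypothetical)."""


def wait_for_sync(is_done: Callable[[], bool],
                  timeout: float = DEFAULT_SYNC_TIMEOUT,
                  poll_interval: float = 10.0,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll is_done() until it returns True or the timeout elapses.

    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while not is_done():
        if clock() >= deadline:
            raise SyncTimeout(
                f"capsule sync did not finish within {timeout} seconds")
        sleep(poll_interval)
```

With a deadline in place, a sync that is never picked up or never completes ends in an explicit error instead of a progress bar that stays at 50% indefinitely.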
