Bug #20260

closed

A remote job runs multiple times on a single server that belongs to multiple Host Collections

Added by Adam Ruzicka almost 7 years ago. Updated almost 6 years ago.

Status: Closed
Priority: High
Assignee:
Category: -
Fixed in Releases:
Found in Releases:

Description

Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=1465628

Description of problem:

When you run a remote job on a single host collection and it contains a system that belongs to multiple host collections, the job will run multiple times on that system.

How reproducible:

100%

Steps to Reproduce:

[root@sat62 ~]# hammer job-invocation create --job-template "Run Command - SSH Default" --inputs command='date +"%H:%M:%S.%N" >> /tmp/lala.txt' --search-query "host_collection = hostcol01"
Job invocation 13 created
[........................................................................................................................................] [100%]
8 task(s), 6 success, 2 fail

^^Note: 8 tasks created.

[root@sat62 ~]# hammer host-collection info --id 3
ID: 3
Name: hostcol01
Limit: None
Description:
Total Hosts: 4

^^ Note: hostcol01 has 4 hosts only.

[root@member-of-hostcol01 ~]# cat /tmp/lala.txt
18:26:55.103882174
18:26:55.104630745
18:26:55.273895571

^^ This specific member of hostcol01 belongs to 3 host collections. It ran this job 3 times.

[root@member-of-hostcol01 ~]# journalctl -u sshd --since 18:20:55
-- Logs begin at Dom 2017-06-25 04:35:12 UTC, end at Ter 2017-06-27 18:45:30 UTC. --
Jun 27 18:26:54 member-of-hostcol01.example.com sshd[2269]: Accepted publickey for root from 172.18.0.1 port 45202 ssh2: RSA ee:a4:a8:67:65:b9:a3:
Jun 27 18:26:54 member-of-hostcol01.example.com sshd[2271]: Accepted publickey for root from 172.18.0.1 port 45204 ssh2: RSA ee:a4:a8:67:65:b9:a3:
Jun 27 18:26:54 member-of-hostcol01.example.com sshd[2270]: Accepted publickey for root from 172.18.0.1 port 45203 ssh2: RSA ee:a4:a8:67:65:b9:a3:

^^ All 3 SSH logins to this host happened within the same second.

[root@other-member-of-hostcol01 ~]# cat /tmp/lala.txt
18:26:54.671381525

^^ Note, this host belongs to a single Host Collection. It ran the remote job only once.

Actual results:

It created 8 tasks for a host collection that contains only 4 hosts.

Expected results:

4 tasks, with the job running only once on each host.
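
The mismatch between actual and expected results can be checked mechanically. The following is a minimal sketch, assuming the output formats shown in this report (the "N task(s), ..." summary line and the "Total Hosts:" field). The function name check_task_count is hypothetical and is not part of hammer.

```shell
# Hypothetical helper (not part of hammer): compare the number of tasks a
# job invocation created with the number of hosts in the host collection.
#   $1: summary line from `hammer job-invocation create`,
#       e.g. "8 task(s), 6 success, 2 fail"
#   $2: output of `hammer host-collection info`, containing "Total Hosts: N"
check_task_count() {
  # Extract the leading task count from the summary line.
  tasks=$(printf '%s\n' "$1" | sed -n 's/^\([0-9][0-9]*\) task(s).*/\1/p')
  # Extract the number that follows "Total Hosts:".
  hosts=$(printf '%s\n' "$2" | sed -n 's/^ *Total Hosts: *\([0-9][0-9]*\).*/\1/p')

  if [ -z "$tasks" ] || [ -z "$hosts" ]; then
    echo "could not parse input" >&2
    return 2
  fi

  if [ "$tasks" -eq "$hosts" ]; then
    echo "OK: $tasks tasks for $hosts hosts"
  else
    echo "MISMATCH: $tasks tasks for $hosts hosts"
    return 1
  fi
}
```

With the values from this report (8 tasks, Total Hosts: 4) the helper reports a mismatch and returns a non-zero status.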

Actions #1

Updated by The Foreman Bot almost 7 years ago

  • Status changed from New to Ready For Testing
  • Assignee set to Adam Ruzicka
  • Pull request https://github.com/theforeman/foreman_remote_execution/pull/259 added
Actions #2

Updated by Adam Ruzicka almost 7 years ago

  • Status changed from Ready For Testing to Closed
  • % Done changed from 0 to 100
Actions #3

Updated by Ivan Necas almost 7 years ago

  • Release set to 279
Actions #4

Updated by Will Darton over 6 years ago

Can you test this against a job scheduled for the future? The fix appears to work for a run-now invocation, but when you pass --start-at it still creates multiple tasks.

$ hammer job-invocation create --job-template "Run Command - SSH Default" --inputs command='date +"%H:%M:%S.%N" >> /tmp/lala.txt' --search-query "name=client.radagast.net" --start-at '2017-07-26 15:08:00'
Job invocation 2521 created
$ hammer job-invocation info --id 2521
ID: 2521
Description: Run date +"%H:%M:%S.%N" >> /tmp/lala.txt
Status: queued
Success: N/A
Failed: N/A
Pending: N/A
Total: N/A
Start: 2017-07-26 15:08:00 -0400
Job Category: Commands
Mode: future
Cron line:
Recurring logic ID:
Hosts:
- client.radagast.net

$ date
Wed Jul 26 15:06:52 EDT 2017
$ date
Wed Jul 26 15:08:18 EDT 2017
$ hammer job-invocation info --id 2521
ID: 2521
Description: Run date +"%H:%M:%S.%N" >> /tmp/lala.txt
Status: running
Success: 1
Failed: 0
Pending: 1
Total: 2
Start: 2017-07-26 15:08:00 -0400
Job Category: Commands
Mode: future
Cron line:
Recurring logic ID:
Hosts:
- client.radagast.net
- client.radagast.net
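
The duplicated entry in the Hosts list above can be detected with a small filter. This is a sketch only, assuming the "Hosts:" section layout shown in this comment. The function name duplicate_hosts is hypothetical and is not part of hammer.

```shell
# Hypothetical helper (not part of hammer): read `hammer job-invocation info`
# output on stdin and print "<count> <host>" for every host that appears
# more than once in the "Hosts:" list.
duplicate_hosts() {
  awk '
    /^Hosts:/ { in_hosts = 1; next }     # start of the host list
    in_hosts && /^[ \t]*- / {            # entry such as "- client.radagast.net"
      sub(/^[ \t]*- /, "")               # strip the list marker
      count[$0]++
      next
    }
    in_hosts && NF { in_hosts = 0 }      # any other non-empty line ends the list
    END {
      for (h in count)
        if (count[h] > 1)
          print count[h], h
    }
  '
}
```

It could be used as `hammer job-invocation info --id 2521 | duplicate_hosts`. No output means every host is listed once.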

Actions #5

Updated by Adam Ruzicka over 6 years ago

Could you please try it on a host that does not belong to multiple host collections? I believe you'll see duplicated tasks/hosts there as well, which would mean it is not related to this issue but probably to another one [1].

[1] - http://projects.theforeman.org/issues/20107

Actions #6

Updated by Will Darton over 6 years ago

Thanks for pointing that out. I applied the patch for the other issue and that seems to have resolved it. I'll search a little more before commenting.
Thanks
