Project

General

Profile

Feature #19389

Change TFTP filename pattern to include unique installation media ID

Added by Lukas Zapletal over 1 year ago. Updated 4 months ago.

Status:
Closed
Priority:
High
Assignee:
Category:
TFTP
Target version:

Description

Currently the TFTP file is named "OS_NAME-VERSION-vmlinuz" and the same for initramdisk. This does not work in scenarios where Installation Media gets changed as this causes redownloading of the boot files every new Host. Also there is a bug (associated) which is causing wget to corrupt files.

We need to include ID (ideally not database ID but some kind of short label) for each Media. Also we need to refactor Operating system, generating PXE names and TFTP filenames should not be responsibility of OS, but a host concern.

Once PR is filed, please create ticket for Katello, the project needs to change Synced Content source to match the changes and also include the ID.


Initially this Feature was called "*Change kernel/initram naming pattern*", for the record this is the original proposal:

Currently we download kernel and initramdisk and store it under Operating System name and version (e.g. RedHat-7.3-x86_64-initrd.img and RedHat-7.3-x86_64-vmlinuz). This does not work when Katello plugin is installed and several OS variants are in use (e.g. RHEL Server and Workstation) because Katello uses generic OS name (e.g. RedHat 7.3) with Content Source (e.g. Red_Hat_Enterprise_Linux_Workstation-Red_Hat_Enterprise_Linux_7_Workstation_Kickstart_x86_64_7_3 or Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_Kickstart_x86_64_7_3).

When user with multiple OS variants provisions one or then another, our current policy will redownload files using wget -c which will corrupt them.

This is a proposal to change download policy:

  • Store both files as sha sum of the source URL for example vmlinuz-cfa27af2f1fc0fb6f29fb75e97baaecae13b97d4
  • Create symlink to files named after host and not operating system so each one gets its own unique symlinks (after target is downloaded to prevent sending of unfinished files)
  • Implement defTftpFiles orchestration method and remove those symlinks upon host removal

Alternatively, the file name could be sha sum of URL and optionally last modification time, if the web server provides it. This way also updates could be seamlessly propagated into TFTP since most HTTP servers do provide this information.

echo http://capsule/pulp/kickstart/x/y/vmlinuz | sha1sum
cfa27af2f1fc0fb6f29fb75e97baaecae13b97d4

Related issues

Related to Smart Proxy - Bug #20823: Make -c (continue) wget flag optional when downloading kernel/initramRejected2017-09-01
Related to Smart Proxy - Feature #3034: TFTP file download should be synchronous and handle errorsReady For Testing
Related to Katello - Bug #24263: Create managed content medium providerClosed
Related to Foreman - Bug #24639: Medium URL provider assumes managed hostClosed
Related to Boot disk - Refactor #25120: use medium provider interface to get bootfiles for disk generationClosed
Related to Foreman - Bug #25567: Wrong provision method signature for RancherOSClosed
Related to Foreman - Bug #25569: Windows templates: undefined method `medium_uri' for nil:NilClassReady For Testing
Related to Foreman - Refactor #25635: fix deprecation message for OS methodsClosed

Associated revisions

Revision 2455b570 (diff)
Added by Shimon Shtein 5 months ago

Fixes #19389 - Added medium hash to TFTP files

Revision ef1bbc26 (diff)
Added by Shimon Shtein 5 months ago

Fixes #19389 - Add medium_uri provider

Revision 416eba85 (diff)
Added by Marek Hulán 5 months ago

Refs #19389 - fixes develop tests

Revision 911dcd85 (diff)
Added by Shimon Shtein 4 months ago

Refs #19389 - Use medium_provider framework

History

#1 Updated by Dominic Cleal over 1 year ago

  • Project changed from Smart Proxy to Foreman
  • Subject changed from Change downoad policy of kernel/initram to Change download policy of kernel/initram
  • Category changed from TFTP to TFTP

The file names and methods are implemented in Foreman, not the smart proxy. Can you provide more information on the OS configuration/implementation you used to cause this?

The OS name is used as the prefix, which is unique per name/major/minor - how are there multiple initrd/kernels for the same Operatingsystem object in Foreman? Are there multiple Operatingsystem objects, or one?

Does the plugin override #pxe_files or similar? If so, then it needs to provide unique prefixes. What does the OS#pxe_files method return?

#2 Updated by Lukas Zapletal over 1 year ago

This happens on latest stable Foreman/Katello or even develop. Typical use case is when you have RHEL 7 Server and Workstation, they are both associated to RedHat 7 - this association is done by Katello.

It is possible to reproduce this with Foreman - changing installation media with two different kernel/initram images back and forth for a single OS makes smart-proxy to corrupt the files eventually.

This is a proposal to change naming scheme from os-version to something more stable and more effective (causing less download attempts or whole downloads when installation media is changed).

#3 Updated by Dominic Cleal over 1 year ago

  • Status changed from New to Need more information

Lukas Zapletal wrote:

This happens on latest stable Foreman/Katello or even develop. Typical use case is when you have RHEL 7 Server and Workstation, they are both associated to RedHat 7 - this association is done by Katello.

I don't see how with any of the built in OS objects how you could have multiple initrds/kernels for a single OS. Can you provide the data requested above?

Please move this bug to Katello if it introduces changes to the OS or duplicates in the #pxe_files response, it should assign different names for the files.

It is possible to reproduce this with Foreman - changing installation media with two different kernel/initram images back and forth for a single OS makes smart-proxy to corrupt the files eventually.

This is already tracked under tickets #5069 and #12474 for installation media or content changes.

#4 Updated by Lukas Zapletal over 1 year ago

  • Status changed from Need more information to New

I don't see how with any of the built in OS objects how you could have multiple initrds/kernels for a single OS. Can you provide the data requested above?

Sure, create an OS with two installation media associated (rhel server and rhel workstation). Now every time you create new host and change installation media, smart proxy attempts to redownload the file (leading to corrupted files but this is a different issue).

This can be workarounded by creating two OSes like "RHEL x.y Workstation/Server" but since puppet does not report RHEL variant in the OS name (it reports just "RedHat x.y") some users prefer to keep with Puppet naming scheme. Until Foreman 1.15 Puppet always changed the OS back to "RedHat x.y" by the way, this was annoying.

Although I agree changing "Content Source" per variant is a bad practice and Katello should perhaps create OS for each individual RHEL variant, I think naming TFTP files after OS + version is simply not enough. I am not trying to fix Katello here, but Foreman core. Obviously you can associate multiple URLs and assumption we do today that both installation media must contain the very same kernel/ramdisk files is wrong. One can contain older version of updated kernel/image (as described in #5069) or totally different content if user chooses to do so.

For this reason I propose to change the naming pattern to something cleaner.

#5 Updated by Lukas Zapletal over 1 year ago

  • Subject changed from Change download policy of kernel/initram to Change kernel/initram naming pattern

Making the subject little bit more clear hopefully.

#6 Updated by Lukas Zapletal over 1 year ago

  • Description updated (diff)

Few updates to the proposal.

#7 Updated by Lukas Zapletal over 1 year ago

  • Bugzilla link set to 1447963

#8 Updated by Lukas Zapletal over 1 year ago

  • Project changed from Foreman to Smart Proxy
  • Category changed from TFTP to TFTP
  • Bugzilla link deleted (1447963)

This is really a smart proxy change, name can be passed unmodified from foreman, but smart proxy will name the downloaded files differently and present symlinks to them.

#9 Updated by Lukas Zapletal over 1 year ago

  • Related to Bug #20823: Make -c (continue) wget flag optional when downloading kernel/initram added

#10 Updated by Lukas Zapletal about 1 year ago

Ideally to solve the issue with "-c" flag in wget, the sha1sum should be calculated from contents instead of the URL. That would mean downloading it to temporary filename first, calculating it, renaming, symlinking.

Note this is an attempt to fix two issues:

  • wget -c corrupting files
  • Anaconda trying to load incomplete initramdisk (when network is slow)

Both should be fixed without changing proxy API contracts.

#11 Updated by Dmitri Dolguikh about 1 year ago

I don't see how creating per-host symlinks in the original proposal will prevent sending of not fully downloaded files. Other than that I largely agree with the assessment of the problem. The underlying problem seems to be that the combination of os-medium isn't unique. The most straightforward fix would be to introduce unique, non-mutable ids for both os and medium and use them to generate filenames for kernel files.

A better fix would involve fixing the relationship between os and medium objects: why is medium is a many-to-many with os? When is this valid?

Either way I do not think the fix belongs on smart-proxy, as this is a problem with how Foreman handles oses and media.

#12 Updated by Lukas Zapletal about 1 year ago

Dmitri Dolguikh wrote:

I don't see how creating per-host symlinks in the original proposal will prevent sending of not fully downloaded files. Other than that I largely agree with the assessment of the problem. The underlying problem seems to be that the combination of os-medium isn't unique. The most straightforward fix would be to introduce unique, non-mutable ids for both os and medium and use them to generate filenames for kernel files.

I agree, totally valid solution. I was trying to find something that does not require changes in both Foreman and Smart Proxy. I was hoping for something that does not even change the TFTP contract - we would reuse the input that Foreman sends (filename, source URL) and just make a symlink afterwards. Symlinks are indeed not solving the issue, it's basically dropping redownloading of content what does this.

A better fix would involve fixing the relationship between os and medium objects: why is medium is a many-to-many with os? When is this valid?

While I agree this is a bit of over-engineering, a valid use case would be mirrors in all your datacenters and during provisioning of a host or hostgroup you would select the closest mirror manually.

One important thing we need to keep in mind - when Katello is installed, Installation Media is not not used, Katello takes over and provides it's own source. When user select RHEL 7 Server or Workstation "flavor", installation media URL changes as well. So this is kinda many-to-many behavior of one individual OS (RHEL7). Even if we drop SQL relationship, Katello will still do many-to-many behavior.

Either way I do not think the fix belongs on smart-proxy, as this is a problem with how Foreman handles oses and media.

While I agree on refactoring Foreman part, if you look at current TFTP Smart Proxy API it is broken. You give it a filename and URL and it corrupts it every time you change the URL. You expect it to work, not to corrupt files at least. This is unexpected and very difficult error to track for some (random behaviour in the installer). I prefer fixing it, we will still have this API for legacy purposes for some time.

Now maybe we can figure out a different way, I don't insist on symlinks, this was just an idea. There was a PR to rewrite wget into pure Ruby (#3034) which was IMHO little bit of an overkill. Maybe there is something in the middle, I can imagine more simple pure Ruby implementation with correct handling of changed content (e.g. when URL changes and filename remains the same). One major problem of pure Ruby implementation I still have is that installation media can be anything, from ancient Debian mirror with super old Apache httpd without Content-Length support or even last modified headers to super-duper modern Pulp with everything in. For this reason I tend to find a solution based on wget or curl with symlinks ("temporary files").

#13 Updated by Lukas Zapletal about 1 year ago

  • Related to Feature #3034: TFTP file download should be synchronous and handle errors added

#14 Updated by Dmitri Dolguikh about 1 year ago

If a many-to-many relationship between os and media/source is a requirement, then information identifying the specific os-source pair has to be provided to smart-proxy (alongside with url and filename) in order for it to retrieve correct boot files. And this information has to come from foreman, it cannot be automatically generated (think hashsum) on the proxy. I don't think smart-proxy api has to be extended for this, as os-source pair id can be encoded in file paths.

#15 Updated by Lukas Zapletal about 1 year ago

So instead of RedHat-7.3 naming (OS name, version) we would do something like OS name, version media name. The problem with this is installation media is empty when Katello is installed, we can make Katello aware of this. This will end up with encoding Organization, Environment and Content View with Product Name into the filename because it needs to be unique. So I expect this to end up with something really crazy like:

MyOrganization-Library_RHEL_ATOMIC_CCV1_Red_Hat_Enterprise_Linux_Atomic_Host_7_4-RedHat-7.4.img

or

MyOrganization-Library_RHEL_SOE_CV_Red_Hat_Enterprise_Linux_Server_7_4-RedHat-7.4.img

Without Katello it will be little bit nicer incorporating only the medium name:

Debian_Main_Mirror-Debian-9.img

One more argument why I incorporated symlinks into this is that I would like to introduce more of a "atomicity" to avoid situations when Anaconda trying to load a init RAM disk which is still being downloaded. If we download into temporary file and then rename (or symlink it), then Anaconda will fail early which will be much more easier to troubleshoot than random freeze or error. But this can be as simple as a temporary download location with rename.

#16 Updated by Lukas Zapletal about 1 year ago

We could use media ID instead of name, Katello could use Content View ID as well.

I briefly looked into Foreman core codebase and this will require decent changes and refactoring, kernel/initram methods should have been moved from OperatingSystem class to Host where it belongs.

#17 Updated by Ivan Necas about 1 year ago

I also thing getting the responsibility from the operating system to something host specific is what we need to move the things forward, as currently, you don't have a way where to get the medium_id or content view id and we ran out of the hacks we can do without some decent change (and yes, the original code has been around for a while so it will require some effort to put it out or all the places, including the templates.

I'm thinking about having host.pxe_files, which could actually return different implementations based on what the host needs at the moment. So there could be OperatingSystem::PxeFilesProvider, as well as Katello::ContentView::PxeFilesProvider, and every provider would be able to provide the relative paths, as well as the urls where to get the files from. From the templates or orchestration, it would be transparent enough (just using @host.pxe_files.kernel_path or @host.pxe_files.kernel_url). Thoughts?

#18 Updated by Dmitri Dolguikh about 1 year ago

I second using IDs (probably generated, not db ids) instead of names for os-source pair matching.

#19 Updated by Lukas Zapletal about 1 year ago

  • Category changed from TFTP to TFTP
  • Description updated (diff)
  • Subject changed from Change kernel/initram naming pattern to Change TFTP filename pattern to include unique installation media ID
  • Project changed from Smart Proxy to Foreman

Sounds like a deal.

#20 Updated by The Foreman Bot 10 months ago

  • Assignee set to Shimon Shtein
  • Status changed from New to Ready For Testing
  • Pull request https://github.com/theforeman/foreman/pull/5244 added

#21 Updated by Lukas Zapletal 9 months ago

  • Bugzilla link set to 1447963

#22 Updated by Lukas Zapletal 8 months ago

  • Priority changed from Normal to High

This problem is worse than I initially thought. It's indeed a problem when changing flavor (Workstation vs Server), but more importantly when Katello is in use and upstream kickstart repository is created against RHEL "7Server" version or CentOS repo with different versioning scheme, Operating System created/associated can have incorrect version leading to this corruption bug. We are still investigating root cause, but it ends up with one Operating System associated with multiple kickstart trees leading to overwrite and eventual corruption.

The current patch will help in this case. I think we can also drop the architecture from the prefix, it's always present in the URL, thus the hash. Family and version are informative and nice to have to keep things sorted a bit in the TFTP dir.

#23 Updated by Shimon Shtein 5 months ago

  • Related to Bug #24263: Create managed content medium provider added

#24 Updated by Marek Hulán 5 months ago

  • Target version set to 1.20.0
  • Fixed in Releases 1.20.0 added

#25 Updated by Shimon Shtein 5 months ago

  • Status changed from Ready For Testing to Closed

#26 Updated by The Foreman Bot 5 months ago

  • Pull request https://github.com/theforeman/foreman/pull/5882 added

#27 Updated by Lukas Zapletal 4 months ago

  • Related to Bug #24639: Medium URL provider assumes managed host added

#28 Updated by The Foreman Bot 4 months ago

  • Pull request https://github.com/theforeman/community-templates/pull/491 added

#29 Updated by Timo Goebel 2 months ago

  • Related to Refactor #25120: use medium provider interface to get bootfiles for disk generation added

#30 Updated by Tomer Brisker 19 days ago

  • Related to Bug #25567: Wrong provision method signature for RancherOS added

#31 Updated by Tomer Brisker 14 days ago

  • Related to Bug #25569: Windows templates: undefined method `medium_uri' for nil:NilClass added

#32 Updated by Tomer Brisker 11 days ago

Also available in: Atom PDF