h1. InterServerSync 

 h3. What is ISS? 

 Inter-Server Sync is a feature designed to help users in two scenarios: 

  * users who have a connected Foreman/Katello instance and a disconnected Foreman/Katello, and want to propagate data from the connected side of their network to the disconnected side. 
  * users who have a "main" Foreman/Katello and want to propagate some data (but not all data) to other instances. One example is users who have "blessed" content views that are validated by the IT department, and want to propagate those down elsewhere. 

 More info on the original ISS is available at https://fedorahosted.org/spacewalk/wiki/InterSpacewalkServerSync. 

 h3. Goals 

 Roughly, phase 1 is the minimal "get it working" phase, where we want a demoable end-to-end scenario that works via hammer. After that, we will collect feedback and move on to phase 2, which moves some of the grunt work done by hammer to the server. 

 Later iterations add web UI support. Phase 3 enhances the "connected" scenario and reduces disk usage, and Phase 4 adds support for additional content types and new types of Foreman/Katello data. 

 All changes should be merged by the end of each phase in order to keep PR size down. 

 h4. Phase 1 "get it working" goals 

  * target is Katello 2.5 
  * allow exporting and importing of products and repos in CVs 
  * export/import can occur via hammer (web UI support for import/export comes in phase 2), but some filesystem-level access is needed for accessing yum repo exports (remote mount, scp, http, etc) 
  * only yum is supported 
  * incremental dumps are supported 

 Note: phase 1 replaces the katello-disconnected script, and solves the same problem that Spacewalk solves with Inter-Spacewalk Sync, in a minimal fashion (notably, without a web UI, which comes in phase 2, or space usage optimization, which comes in phase 3). Phase 1 will optionally recreate any custom products during the import, but RH products will not be re-created since they must be created via manifest. 

 We will also support date-based incremental exports, via an ISO 8601 "since" field on the export call. 
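
 As a rough illustration, an incremental export might look like the following from hammer (the --since flag name and its exact format are assumptions for illustration, not a final API):

 <pre>
 # full export of a repository to disk on the Katello server
 hammer repository export --id 5

 # hypothetical incremental export: only content added since the given ISO 8601 date
 # (the --since flag is illustrative only)
 hammer repository export --id 5 --since 2015-11-01T00:00:00Z
 </pre>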

 h4. Phase 2 "web UI and optimization" goals 

  * target is post-2.5 
  * API clean up to allow for bulk export/import with only a few server calls - this is needed for web UI support, since we can't rely on hammer 
  * entire export (both CSVs and repo contents) is written to local disk, allowing for a single tarball or iso to have all data 

 h4. Phase 3 "do it online and with less disk space" goals 

  * target is post-2.5 
  * stream export/import data from one Foreman/Katello to another without needing to write a full export to disk 
  * streaming must be able to be initiated from either the source or destination Foreman/Katello (see the sketch after this list) 
  * allow export and import without filesystem access to the machine (hammer or web UI can provide download of the export and upload of the import) 
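
 Very roughly, a streamed transfer might be initiated from either side with something like the commands below (the command and flag names are purely illustrative; the actual mechanism is an open design question):

 <pre>
 # hypothetical: push a repo from the source Katello directly to a destination server
 hammer repository export --id 5 --destination-server katello2.example.com

 # hypothetical: pull the same repo from the destination side, streaming from the source
 hammer repository import --repository-id 5 --source-server katello1.example.com
 </pre>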

 h4. Phase 4+ "rinse and repeat" goals 

  * allow exporting and importing of other data (environments, content views outside of the default CV for each repo in Library, etc) 
  * support for ostree, docker, puppet content 

 h3. Phase 1 Design 

 Development steps 
 # Step 1 - export 
 > # As a user, I would like an API to export specific repositories. 
 > > * "Feature #12446":http://projects.theforeman.org/issues/12446 : Add ability to export yum repositories (both custom and RH) to disk 
 > > * Specify lifecycle environment and content view. 
 > # As a user, I would like an API to export specific products. 
 > # As an admin, I would like a role to limit which users may export. 
 > > * Based on product read, repository read, export true? 
 > # "Feature #12446" As an admin, I would like a setting to specify where exported files are stored on the server. 
 > > * How are the files organized (per org, per user, per cv, per env)? 
 > # As a user, I want to choose export format of iso, tgz, or dir 
 # Step 2 - import 
 > # "Feature #12459": http://projects.theforeman.org/issues/12459 : As a user, I want to temporarily set where to sync a repository from. 
 > # As a user, I want to permanently set where to sync a repository from. 
 > # As a user, I want to temporarily/permanently set sync location for a product. 
 > # As a katello, I want to prevent import of Red Hat products into custom products. 
 > > * If the Red Hat product already exists, it may sync from the location. A custom product may not sync Red Hat content. 
 # Step 3 - repository enable 
 > # As a user, I want to change the CDN to point to an export location. 
 > # As a katello, I want the repository choices shown on the Red Hat Repositories page to be limited to what is in the export. 
 > > * Only the Red Hat repos that were exported should be displayed as available for enable. 
 > > * If a product was enabled previously, it should be shown as already enabled. 
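
 A minimal sketch of what the import and repository-enable stories above could look like from hammer, assuming the export is available at /mnt/imports (the option names and values here are illustrative, not confirmed hammer flags):

 <pre>
 # hypothetical: point the organization's CDN at the export location instead of cdn.redhat.com
 hammer organization update --id 1 --redhat-repository-url file:///mnt/imports/

 # hypothetical: override where a custom repository syncs from, then kick off the sync
 hammer repository update --id 5 --url file:///mnt/imports/myrepo/
 hammer repository synchronize --id 5
 </pre>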


 h4. Hammer design 

 Phase 1 uses hammer to do some of the heavy lifting. This gets addressed in phase 2. 

 <pre> 

 # export a repository to disk on the Katello server (need to do this for each repo but can be a repo in a CV) 
 hammer repository export --id <id> 

 # export products to CSV 
 hammer csv products --csv-export --csv-file products.csv  

 # replace URL with on-disk location for place to sync from on destination katello 
 sed -i 's#https://repos.fedorapeople.org/#file:///mnt/imports/#g' products.csv 

 # import steps 

 # ensure export is available, then run 
 hammer csv products --csv-import --csv-file products.csv  

 # kick off syncs for imported products 
 hammer repository synchronize --id <id> 
 </pre> 

 h3. Phase 2 Design 

 The main user-facing output of Phase 2 is the ability to do import/export from the web UI. An additional important feature is that data processing does not occur on the hammer client anymore. This greatly improves resiliency of the import/export process. 


 h4. API modifications 

 API calls will need to be added to kick off the product/repo export task, with an optional "start at" date for incremental exports. A call will also need to be added to start the import task. 
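
 As a rough sketch of the shape these calls could take (the paths, parameter names, and payloads below are assumptions for illustration, not the final API):

 <pre>
 # hypothetical: kick off the product/repo export task, with an optional "since" date for incrementals
 curl -u admin:changeme -H "Content-Type: application/json" -X POST \
   -d '{"content_view_id": 3, "since": "2015-11-01T00:00:00Z"}' \
   https://katello.example.com/katello/api/v2/exports

 # hypothetical: start the import task from a previously transferred export directory
 curl -u admin:changeme -H "Content-Type: application/json" -X POST \
   -d '{"path": "/mnt/imports/export-20151101"}' \
   https://katello.example.com/katello/api/v2/imports
 </pre>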

 h4. Web UI design/mockup 

 (need to fill in) 

 h4. User Stories 

  * as a hammer user, I would like product CSV creation and processing to occur server-side. Ideally, I will be able to make a call that kicks off a dynflow task to create or import the CSV. The CSV can be uploaded/downloaded, or written to disk in the export directory 
  * as a hammer user, I would like to run a single command that performs a product CSV export and writes any needed repo data to disk 
  * as a hammer user, I would like to run a single command that performs a product CSV import and reads any needed repo data from disk via repo sync. I will be able to override the sync URL (perhaps using something defined in katello.yml) in the call so I don't have to modify the product CSV with a sed statement (sketched below). 
  * as a web UI user, I would like to be able to perform the same two import/export actions as above, but without hammer. 
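
 For example, the single-command flows above might end up looking something like this (the command names and options are placeholders; only the behavior described in the stories is implied):

 <pre>
 # hypothetical: server-side export of a product CSV plus its repo content in one call
 hammer product export --name "blessed-product" --organization "ACME"

 # hypothetical: server-side import that overrides the sync URL, so no sed edit of the CSV is needed
 hammer product import --csv-file /mnt/imports/products.csv --sync-url file:///mnt/imports/
 </pre>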

 h3. Phase 3 Design 

 This phase is further optimization, focusing on the "connected" scenario where the two foreman/katello instances can talk to each other. Previous phases treated the connected scenario as being solved by the disconnected scenario, which meant that all data was written to disk and optionally transferred via scp or http. 

 NOTE: I added this area just to give an idea of roadmap, but it is complex enough to need its own design iteration. 

 h3. Phase 4 Design 

 Phase 4 adds additional content types and data to ISS. Detailed design will be added here after phase 2 or phase 3 is complete.