Integration in TOSCA

eFlows4HPC uses TOSCA to describe the high-level execution lifecycle of a workflow, enabling the orchestration of tasks of diverse nature. For the Pillar I use case, TOSCA is used to coordinate the creation of a container image, its transfer to a target cluster, the stage-in of input data, the PyCOMPSs computation, and the stage-out of the computation results.

An exhaustive list of the TOSCA components developed in the context of the eFlows4HPC project, together with their configurable properties, can be found in section eFlows4HPC TOSCA Components.

Section ROM Pillar I topology template describes how these components are assembled in a TOSCA topology template to implement the ROM Pillar I use case. In particular, Code 21 shows how the properties of the TOSCA components are used in this context.

ROM Pillar I topology template

The source code of the ROM Pillar I application template is available in the workflow-registry GitHub repository of the eFlows4HPC organization. Code 21 shows how the components are defined and how they are connected so that they run in sequence. Figure 21 shows the same topology graphically.

This topology template contains the TOSCA components that allow the deployment and execution of the ROM Pillar I workflow. First, the AbstractEnvironment component models the HPC cluster where the workflow will be executed. It contains the properties required to automate the deployment and execution, such as the address of the login node and the description of the machine. Every component that needs some of these HPC machine properties has a dependency on this component.

To model the deployment, the topology contains the ImageCreation and DLSDAGImageTransfer components. They model how the Image Creation Service is invoked to generate a workflow container image containing the required software, and how that image is transferred to the target HPC cluster using the transfer_image pipeline of the Data Logistic Service (DLS) (Section DAG Definition: Singularity image transfer). The DLSDAGImageTransfer component needs to know the generated image, so a dependency between these two components has been set.

To model the execution, three kinds of components are used: the Simulation_data_download component models the transfer of input data from an HTTP server to the HPC cluster using the plainhttp2ssh pipeline of the DLS; the PyCOMPSJob component models the execution of the PyCOMPSs application on the HPC cluster; and the ROM_upload_center and ROM_upload_outside components model the upload of the generated ROM models from the HPC cluster to the Model Repository using the mlflow_upload_model pipeline of the DLS.
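The dependencies between the components determine the order in which they can run. The following Python sketch is an illustration only and is not part of the eFlows4HPC code base: it copies by hand the component names and the dependencies declared in the requirements of Code 21 into a dictionary (the names `DEPENDS_ON` and `execution_order` are assumptions made for this example) and uses the standard-library `graphlib` module (Python 3.9 or later) to derive one valid order.

```python
from graphlib import TopologicalSorter

# Component -> components it depends on, transcribed by hand from the
# requirements in Code 21. DEPENDS_ON is a hypothetical name used only
# in this sketch.
DEPENDS_ON = {
    "AbstractEnvironment": set(),
    "ImageCreation": set(),
    "DLSDAGImageTransfer": {"ImageCreation", "AbstractEnvironment"},
    "Simulation_data_download": {"AbstractEnvironment"},
    # PyCOMPSJob waits for the container image and for the input data.
    "PyCOMPSJob": {
        "DLSDAGImageTransfer",
        "Simulation_data_download",
        "AbstractEnvironment",
    },
    "ROM_upload_center": {"PyCOMPSJob", "AbstractEnvironment"},
    "ROM_upload_outside": {"PyCOMPSJob", "AbstractEnvironment"},
}


def execution_order(depends_on):
    """Return one order in which every component follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


order = execution_order(DEPENDS_ON)
print(order)
```

Several orders are valid (for example, the two upload components are not ordered with respect to each other), so the exact output may differ between runs; only the relative position of dependent components is guaranteed.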

Code 21 Extract of the TOSCA topology template for ROM Pillar I workflow
topology_template:
  inputs:
    debug:
      type: boolean
      required: true
      default: false
      description: "Do not redact sensible information on logs"
    user_id:
      type: string
      required: false
      default: ""
      description: "User id to use for authentication may be replaced with workflow input"
    vault_id:
      type: string
      required: false
      default: ""
      description: "User id to use for authentication may be replaced with workflow input"
    container_image_transfer_directory:
      type: string
      required: false
      description: "path of the image on the remote host"
  node_templates:

    ImageCreation:
      type: imagecreation.ansible.nodes.ImageCreation
      properties:
        service_url: "https://bscgrid20.bsc.es/image_creation"
        insecure_tls: true
        username: { get_secret: [/secret/data/services_secrets/image_creation, data=user] }
        password: { get_secret: [/secret/data/services_secrets/image_creation, data=password] }
        machine:
          container_engine: singularity
          platform: "linux/amd64"
          architecture: sandybridge
        workflow: "rom_pillar_I"
        step_id: "reduce_order_model"
        force: false
        debug: { get_input: debug }
        run_in_standard_mode: true
    DLSDAGImageTransfer:
      type: dls.ansible.nodes.DLSDAGImageTransfer
      properties:
        target_path: { get_input: container_image_transfer_directory }
        run_in_standard_mode: true
        dls_api_username: { get_secret: [/secret/data/services_secrets/dls, data=username] }
        dls_api_password: { get_secret: [/secret/data/services_secrets/dls, data=password] }
        dag_id: "transfer_image"
        debug: { get_input: debug }
        user_id: { get_input: user_id }
        vault_id: { get_input: vault_id }
      requirements:
        - dependsOnImageCreationFeature:
            type_requirement: dependency
            node: ImageCreation
            capability: tosca.capabilities.Node
            relationship: tosca.relationships.DependsOn
        - dependsOnAbstractEnvironmentExec_env:
            type_requirement: environment
            node: AbstractEnvironment
            capability: eflows4hpc.env.capabilities.ExecutionEnvironment
            relationship: tosca.relationships.DependsOn
    AbstractEnvironment:
      type: eflows4hpc.env.nodes.AbstractEnvironment
    PyCOMPSJob:
      type: org.eflows4hpc.pycompss.plugin.nodes.PyCOMPSJob
      properties:
        submission_params:
          qos: debug
          python_interpreter: python3
          num_nodes: 2
          extra_compss_opts: "--cpus_per_task --env_script=/reduce_order_model/env.sh"
        application:
          container_opts:
            container_opts: "-e"
            container_compss_path: "/opt/view/compss"
          arguments:
            - "$(dirname ${staged_in_file_path})"
            - "/reduce_order_model/ProjectParameters_tmpl.json"
            - "${result_data_path}/RomParameters.json"
          command: "/reduce_order_model/src/UpdatedWorkflow.py"
        keep_environment: true
      requirements:
        - dependsOnDlsdagImageTransferFeature:
            type_requirement: img_transfer
            node: DLSDAGImageTransfer
            capability: tosca.capabilities.Node
            relationship: tosca.relationships.DependsOn
        - dependsOnAbstractEnvironmentExec_env:
            type_requirement: environment
            node: AbstractEnvironment
            capability: eflows4hpc.env.capabilities.ExecutionEnvironment
            relationship: tosca.relationships.DependsOn
        - dependsOnHttp2SshFeature:
            type_requirement: dependency
            node: Simulation_data_download
            capability: tosca.capabilities.Node
            relationship: tosca.relationships.DependsOn
    Simulation_data_download:
      type: dls.ansible.nodes.HTTP2SSH
      properties:
        dag_id: plainhttp2ssh
        url: "https://b2drop.bsc.es/index.php/s/fQ85ZLDztG2t5j3/download/GidExampleSwaped.mdpa"
        force: true
        input_name_for_url: url
        input_name_for_target_path: "staged_in_file_path"
        dls_api_username: { get_secret: [/secret/data/services_secrets/dls, data=username] }
        dls_api_password: { get_secret: [/secret/data/services_secrets/dls, data=password] }
        debug: { get_input: debug }
        user_id: ""
        vault_id: ""
        run_in_standard_mode: false
      requirements:
        - dependsOnAbstractEnvironmentExec_env:
            type_requirement: environment
            node: AbstractEnvironment
            capability: eflows4hpc.env.capabilities.ExecutionEnvironment
            relationship: tosca.relationships.DependsOn
    ROM_upload_center:
      metadata:
        a4c_edit_x: 139
        a4c_edit_y: "-410"
      type: dls.ansible.nodes.DLSDAGModelUpload
      properties:
        dag_id: "mlflow_upload_model"
        subfolder: center
        input_name_for_location: "rom_path"
        dls_api_username: { get_secret: [/secret/data/services_secrets/dls, data=username] }
        dls_api_password: { get_secret: [/secret/data/services_secrets/dls, data=password] }
        debug: { get_input: debug }
        user_id: ""
        vault_id: ""
        run_in_standard_mode: false
      requirements:
        - dependsOnAbstractEnvironmentExec_env:
            type_requirement: environment
            node: AbstractEnvironment
            capability: eflows4hpc.env.capabilities.ExecutionEnvironment
            relationship: tosca.relationships.DependsOn
        - dependsOnPyCompsJob2Feature:
            type_requirement: dependency
            node: PyCOMPSJob
            capability: tosca.capabilities.Node
            relationship: tosca.relationships.DependsOn
    ROM_upload_outside:
      metadata:
        a4c_edit_x: 444
        a4c_edit_y: "-410"
      type: dls.ansible.nodes.DLSDAGModelUpload
      properties:
        dag_id: "mlflow_upload_model"
        subfolder: outside
        input_name_for_location: "rom_path"
        dls_api_username: { get_secret: [/secret/data/services_secrets/dls, data=username] }
        dls_api_password: { get_secret: [/secret/data/services_secrets/dls, data=password] }
        debug: { get_input: debug }
        user_id: ""
        vault_id: ""
        run_in_standard_mode: false
      requirements:
        - dependsOnAbstractEnvironmentExec_env:
            type_requirement: environment
            node: AbstractEnvironment
            capability: eflows4hpc.env.capabilities.ExecutionEnvironment
            relationship: tosca.relationships.DependsOn
        - dependsOnPyCompsJobFeature:
            type_requirement: dependency
            node: PyCOMPSJob
            capability: tosca.capabilities.Node
            relationship: tosca.relationships.DependsOn

  workflows:
    exec_job:
      inputs:
        user_id:
          type: string
          required: true
        vault_id:
          type: string
          required: true
        data_oid:
          type: string
          required: true
        data_path:
          type: string
          required: true
        rom_path:
          type: string
          required: true
        heat_flux_parameters:
          type: string
          required: true
        num_nodes:
          type: integer
          required: false
          default: 1
      steps:
        StageOutData_executing:
          target: ROM_upload_center
          activities:
            - set_state: executing
          on_success:
            - StageOutData_run
        StageOutData_2_executing:
          target: ROM_upload_outside
          activities:
            - set_state: executing
          on_success:
            - StageOutData_2_run
        PyCOMPSJob_submitting:
          target: PyCOMPSJob
          activities:
            - set_state: submitting
          on_success:
            - PyCOMPSJob_submit
        PyCOMPSJob_submit:
          target: PyCOMPSJob
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.submit
          on_success:
            - PyCOMPSJob_submitted
        StageOutData_submitted:
          target: ROM_upload_center
          activities:
            - set_state: submitted
          on_success:
            - StageOutData_executing
        StageOutData_2_submitted:
          target: ROM_upload_outside
          activities:
            - set_state: submitted
          on_success:
            - StageOutData_2_executing
        StageOutData_submitting:
          target: ROM_upload_center
          activities:
            - set_state: submitting
          on_success:
            - StageOutData_submit
        StageOutData_2_submitting:
          target: ROM_upload_outside
          activities:
            - set_state: submitting
          on_success:
            - StageOutData_2_submit
        StageOutData_run:
          target: ROM_upload_center
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.run
          on_success:
            - StageOutData_executed
        StageOutData_2_run:
          target: ROM_upload_outside
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.run
          on_success:
            - StageOutData_2_executed
        PyCOMPSJob_submitted:
          target: PyCOMPSJob
          activities:
            - set_state: submitted
          on_success:
            - PyCOMPSJob_executing
        StageOutData_submit:
          target: ROM_upload_center
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.submit
          on_success:
            - StageOutData_submitted
        StageOutData_2_submit:
          target: ROM_upload_outside
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.submit
          on_success:
            - StageOutData_2_submitted
        StageOutData_executed:
          target: ROM_upload_center
          activities:
            - set_state: executed
        StageOutData_2_executed:
          target: ROM_upload_outside
          activities:
            - set_state: executed
        PyCOMPSJob_executing:
          target: PyCOMPSJob
          activities:
            - set_state: executing
          on_success:
            - PyCOMPSJob_run
        PyCOMPSJob_executed:
          target: PyCOMPSJob
          activities:
            - set_state: executed
          on_success:
            - StageOutData_submitting
            - StageOutData_2_submitting
        PyCOMPSJob_run:
          target: PyCOMPSJob
          operation_host: ORCHESTRATOR
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.run
          on_success:
            - PyCOMPSJob_executed
        Simulation_data_download_run:
          target: Simulation_data_download
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.run
          on_success:
            - PyCOMPSJob_submitting
        Simulation_data_download_submit:
          target: Simulation_data_download
          activities:
            - call_operation: tosca.interfaces.node.lifecycle.Runnable.submit
          on_success:
            - Simulation_data_download_run

Figure 21 Alien4Cloud ROM Pillar I topology
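
The steps of the exec_job workflow in Code 21 are not listed in execution order; the order is given by the on_success links between steps. The Python sketch below is an illustration only: the `ON_SUCCESS` dictionary is a hand transcription of the on_success entries of Code 21, and `entry_steps` and `walk` are hypothetical helper names introduced for this example. It finds the steps that no other step points to and follows the links breadth-first from there.

```python
from collections import deque

# Step -> steps triggered on success, transcribed by hand from the
# exec_job workflow in Code 21 (hypothetical name, illustration only).
ON_SUCCESS = {
    "Simulation_data_download_submit": ["Simulation_data_download_run"],
    "Simulation_data_download_run": ["PyCOMPSJob_submitting"],
    "PyCOMPSJob_submitting": ["PyCOMPSJob_submit"],
    "PyCOMPSJob_submit": ["PyCOMPSJob_submitted"],
    "PyCOMPSJob_submitted": ["PyCOMPSJob_executing"],
    "PyCOMPSJob_executing": ["PyCOMPSJob_run"],
    "PyCOMPSJob_run": ["PyCOMPSJob_executed"],
    "PyCOMPSJob_executed": [
        "StageOutData_submitting",
        "StageOutData_2_submitting",
    ],
    "StageOutData_submitting": ["StageOutData_submit"],
    "StageOutData_submit": ["StageOutData_submitted"],
    "StageOutData_submitted": ["StageOutData_executing"],
    "StageOutData_executing": ["StageOutData_run"],
    "StageOutData_run": ["StageOutData_executed"],
    "StageOutData_executed": [],
    "StageOutData_2_submitting": ["StageOutData_2_submit"],
    "StageOutData_2_submit": ["StageOutData_2_submitted"],
    "StageOutData_2_submitted": ["StageOutData_2_executing"],
    "StageOutData_2_executing": ["StageOutData_2_run"],
    "StageOutData_2_run": ["StageOutData_2_executed"],
    "StageOutData_2_executed": [],
}


def entry_steps(on_success):
    """Return the steps that are not the on_success target of any step."""
    targets = {t for nxt in on_success.values() for t in nxt}
    return [step for step in on_success if step not in targets]


def walk(on_success):
    """Return all reachable steps in breadth-first order from the entry steps."""
    queue = deque(entry_steps(on_success))
    seen, order = set(queue), []
    while queue:
        step = queue.popleft()
        order.append(step)
        for nxt in on_success[step]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


for step in walk(ON_SUCCESS):
    print(step)
```

With this transcription, the only entry step is Simulation_data_download_submit. The two upload chains (StageOutData_* for ROM_upload_center and StageOutData_2_* for ROM_upload_outside) both start once PyCOMPSJob_executed succeeds, so they are not ordered with respect to each other.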