Scaffold an ML model project

You can use a Workflow in Cortex to scaffold new Machine Learning (ML) projects with best practices built in from the start.

Step 1: Create an entity type called ML Model

In Cortex, create a custom entity type called ML Model (with a unique entity type identifier ml-model).

Step 2: Register the cookiecutter template in Cortex

Follow the Cortex documentation on registering a Scaffolder template.

Add the template

You must have the Configure Scaffolder templates permission to follow these steps.

  1. In Cortex, navigate to Workflows from the main nav. In the upper right corner of the page, click Register Scaffolder template, then configure the template details:

    • Provider: Select which Git provider you are using.

    • Name: Enter a name for the template, such as "MLOps cookiecutter"

    • Description: Add a description, such as "A standardized, flexible project structure for MLOps"

    • Tags: Optionally, add tags to describe the template. We recommend adding a tag such as "mlops".

  2. Under "Configuration," fill in the fields:

    • Git provider configuration: Enter the configuration, e.g., default

      • By default, the option Use this configuration for all git operations is enabled. When enabled, Cortex uses the selected configuration for all Git operations (e.g., fetching the template details, populating the form options, and creating the new repo or Pull Request). The form options will be filtered to only those organizations or repositories that the selected configuration has access to.

      • With this setting disabled, you can use a single Scaffolder template with multiple organizations/configurations. The selected configuration must still have access to the template's repository, but the remaining Git operations run dynamically based on the user's form selections. The Scaffolder form options are populated using data from all configurations. Cortex then automatically uses a configuration with access to the selected organization (if scaffolding a new repository) or repository (if scaffolding a new PR) for all remaining Git operations.

    • Git URL: Enter the git URL where your template lives.

    • Configuration requirements: Select Neither, because this Scaffolder creates entities of the ml-model type rather than the service type.

    • Visibility: Choose the default visibility for new repositories created from this template. If you do not specify a value, new repositories are created as private.

      • The available options depend on the Git provider. The Public setting only works if the org containing the repository allows it. The Internal setting only works for GitHub Enterprise accounts.

      • When running a Workflow that contains a Scaffolder block, you can configure an override to change the repo visibility.

    • Create cortex.yaml in git when creating a new service: Disable this setting, as the Workflow you run will include a block to add the new entity to your workspace.

    • Show README.md during Scaffolding: When enabled, the project's README.md will be displayed when inputting the variables to use while rendering the template. If there is no README.md, nothing will be shown.

  3. Click Register Scaffolder template.
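
The effect of the Use this configuration for all git operations setting can be modeled roughly as follows. This is only an illustrative Python sketch of the behavior described above, not Cortex's implementation; `GitConfig`, `pick_config`, and the alias and organization names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitConfig:
    """Hypothetical stand-in for one Git provider configuration."""
    alias: str
    orgs: frozenset  # organizations this configuration can access


def pick_config(selected, all_configs, target_org, use_for_all_operations=True):
    """Return the configuration that would handle operations on target_org."""
    if use_for_all_operations:
        # Setting enabled: only the selected configuration is ever used,
        # so the target must be something it can access.
        if target_org not in selected.orgs:
            raise LookupError(f"{selected.alias!r} cannot access {target_org!r}")
        return selected
    # Setting disabled: any configuration with access to the target may be used.
    for config in all_configs:
        if target_org in config.orgs:
            return config
    raise LookupError(f"no configuration can access {target_org!r}")


default = GitConfig("default", frozenset({"platform"}))
research = GitConfig("research", frozenset({"ml-research"}))

print(pick_config(default, [default, research], "platform").alias)
print(pick_config(default, [default, research], "ml-research",
                  use_for_all_operations=False).alias)
```

In this model, leaving the setting enabled restricts scaffolding to organizations the selected configuration can reach, while disabling it lets one template serve several organizations.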

Step 3: Create a Workflow with a Scaffolder block

You can use the Cortex CLI to add the example Workflow to your workspace. This lets you quickly set up the example configuration and then iterate on it for your own use case.

Import the Workflow via CLI
  1. Save the Workflow example YAML file below:

name: New ML Model
tag: scaffolder-workflow-a1d4d2d3-624c-473f-a9f2-b933a44a92df
description: "Workflow that leverages the [cookiecutter-mlops](https://github.com/Chim-SO/cookiecutter-mlops)\
  \ template to create new logical, reasonably standardized, but flexible project\
  \ structure for MLOps."
isDraft: true
filter:
  type: GLOBAL
runResponseTemplate: "Your repository is ready! \nPlease browse to {{actions.create-repo.outputs.response.gitURL}}\n\
  The Model is also being tracked in Cortex. Please browse to https://app.getcortexapp.com/admin/index?tag={{actions.slugify-project-name.outputs.result}}"
failedRunResponseTemplate: null
restrictActionCompletionToRunnerUser: false
actions:
- name: Gathering Model Details...
  slug: model-details
  schema:
    inputs:
    - name: Project Name
      description: null
      key: project-name
      required: false
      defaultValue: null
      placeholder: null
      validationRegex: null
      type: INPUT_FIELD
    - name: Description
      description: null
      key: description
      required: false
      defaultValue: null
      placeholder: null
      validationRegex: null
      type: TEXTAREA_FIELD
    - name: License
      description: null
      key: license
      required: false
      options:
      - MIT
      - BSD-3-Clause
      - No license file
      optionsLabels:
      - MIT
      - BSD-3-Clause
      - No license file
      defaultValue: null
      placeholder: null
      allowAdditionalOptions: false
      type: SELECT_FIELD
    - name: s3 Bucket
      description: null
      key: s3-bucket
      required: false
      defaultValue: null
      placeholder: "[Optional] do not include 's3://'"
      validationRegex: null
      type: INPUT_FIELD
    - name: AWS Profile
      description: null
      key: aws-profile
      required: false
      defaultValue: default
      placeholder: default
      validationRegex: null
      type: INPUT_FIELD
    - name: Python Interpreter
      description: null
      key: python-interpreter
      required: false
      options:
      - Python
      - Python3
      optionsLabels:
      - Python
      - python3
      defaultValue: null
      placeholder: null
      allowAdditionalOptions: false
      type: SELECT_FIELD
    inputOverrides: []
    jsValidatorScript: ""
    type: USER_INPUT
  outgoingActions:
  - slugify-project-name
  isRootAction: true
- name: Getting Ready to Scaffold New Repository...
  slug: slugify-project-name
  schema:
    expression: .actions."model-details".outputs."project-name" | ascii_downcase |
      gsub(" ";"-")|gsub("_";"-")
    type: JQ
  outgoingActions:
  - create-repo
  isRootAction: false
- name: Creating Repository in GitHub...
  slug: create-repo
  schema:
    scaffolderTemplateId: st38e74ce20dc23c29
    createNewRepository: true
    createService: false
    inputOverrides:
    - inputKey: project_name
      outputVariable: actions.model-details.outputs.project-name
      editable: false
      type: VALUE
    - inputKey: author_name
      outputVariable: context.initiatedBy.email
      editable: false
      type: VALUE
    - inputKey: description
      outputVariable: actions.model-details.outputs.description
      editable: false
      type: VALUE
    - inputKey: repo_name
      outputVariable: actions.slugify-project-name.outputs.result
      editable: false
      type: VALUE
    - inputKey: open_source_license
      outputVariable: actions.model-details.outputs.license
      editable: false
      type: VALUE
    - inputKey: s3_bucket
      outputVariable: actions.model-details.outputs.s3-bucket
      editable: false
      type: VALUE
    - inputKey: aws_profile
      outputVariable: actions.model-details.outputs.aws-profile
      editable: false
      type: VALUE
    - inputKey: python_interpreter
      outputVariable: actions.model-details.outputs.python-interpreter
      editable: false
      type: VALUE
    - inputKey: publisherOrg
      outputVariable: variables.org
      editable: false
      type: VALUE
    - inputKey: publisherRepoName
      outputVariable: actions.slugify-project-name.outputs.result
      editable: false
      type: VALUE
    - inputKey: publisherBranch
      outputVariable: variables.branch-name
      editable: false
      type: VALUE
    - inputKey: publisherCommitMessage
      outputVariable: variables.commit-msg
      editable: false
      type: VALUE
    - inputKey: publisherRepoVisibility
      outputVariable: variables.visibility
      editable: false
      type: VALUE
    type: SCAFFOLDER
  outgoingActions:
  - add-entity-to-catalog
  isRootAction: false
- name: Adding Entity to Catalog...
  slug: add-entity-to-catalog
  schema:
    inputs:
      body: "openapi: 3.0.1\ninfo:\n  title: {{actions.model-details.outputs.project-name}}\n\
        \  x-cortex-tag: {{actions.slugify-project-name.outputs.result}}\n  x-cortex-type:\
        \ ml-model\n  x-cortex-git:\n    github:\n      repository: {{variables.org}}/{{actions.slugify-project-name.outputs.result}}\n\
        \  x-cortex-groups:\n  - ml-model      \n      "
      dryRun: false
      appendArrays: false
      failIfEntityDoesNotExist: false
    integrationAlias: null
    actionIdentifier: cortex.createOrPatchEntity
    type: ADVANCED_HTTP_REQUEST
  outgoingActions:
  - notify-end-user
  isRootAction: false
- name: Notifying User...
  slug: notify-end-user
  schema:
    channel: demos
    message: "Your repository is ready! \nPlease browse to {{actions.create-repo.outputs.response.gitURL}}\n\
      The Model is also being tracked in Cortex. Please browse to https://app.getcortexapp.com/admin/index?tag={{actions.slugify-project-name.outputs.result}}"
    type: SLACK
  outgoingActions: []
  isRootAction: false
runRestrictionPolicies: []
iconTag: null
variables:
- slug: commit-msg
  type: STRING
  defaultValue: Initial Commit
- slug: branch-name
  type: STRING
  defaultValue: main
- slug: visibility
  type: STRING
  defaultValue: PUBLIC
- slug: org
  type: STRING
  defaultValue: <your-org>
  2. Use the Cortex CLI to run this command, using the path to your Workflow YAML file: cortex workflows create -f <path-to-your-workflow.yaml>
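
To see what the slugify-project-name block in the example produces, the jq expression (`ascii_downcase | gsub(" ";"-") | gsub("_";"-")`) can be mirrored in a few lines of Python. This is a sketch for checking expected repository names locally; the function name and the sample project name are hypothetical, and the Workflow itself evaluates the jq expression, not this code.

```python
import string

# jq's ascii_downcase only lowercases A-Z, so mirror that instead of str.lower().
_ASCII_DOWNCASE = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def slugify_project_name(project_name: str) -> str:
    """Mirror: ascii_downcase | gsub(" ";"-") | gsub("_";"-")."""
    lowered = project_name.translate(_ASCII_DOWNCASE)
    # Replace every space and underscore with a hyphen, as the two gsub calls do.
    return lowered.replace(" ", "-").replace("_", "-")


print(slugify_project_name("Churn Prediction_v2"))  # churn-prediction-v2
```

Note that the expression leaves any other characters untouched, so punctuation in the project name carries through to the repository name and entity tag.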

Step 4: Run the Workflow

While viewing the Workflow in Cortex, click Run.

When you run the Workflow, the following events happen:

  • The User Input block runs, collecting input from the user running the Workflow

  • A new ML project is scaffolded in your Git repository, using the information collected from the initial User Input block

  • A corresponding new entity (of the type ml-model) is created in your workspace

  • A notification is sent to the Slack channel you configured in the Slack block of the Workflow
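
The flow of data through these events can be traced with a small Python sketch. It assumes hypothetical sample inputs and an `example-org` organization, approximates the jq slug expression with plain string methods, and only imitates the template substitution that the Workflow performs; it does not call Cortex.

```python
# Hypothetical sample values standing in for the User Input block and variables.
inputs = {"project-name": "Fraud Detector", "description": "Scores transactions"}
variables = {"org": "example-org"}

# 1. JQ block: derive the repository name and entity tag from the project name.
slug = inputs["project-name"].lower().replace(" ", "-").replace("_", "-")

# 2. Scaffolder block: a subset of the values passed as input overrides.
scaffolder_inputs = {
    "project_name": inputs["project-name"],
    "description": inputs["description"],
    "repo_name": slug,
    "publisherOrg": variables["org"],
    "publisherRepoName": slug,
}

# 3. Entity block: the descriptor body after the placeholders are filled in.
descriptor = "\n".join([
    "openapi: 3.0.1",
    "info:",
    f"  title: {inputs['project-name']}",
    f"  x-cortex-tag: {slug}",
    "  x-cortex-type: ml-model",
    "  x-cortex-git:",
    "    github:",
    f"      repository: {variables['org']}/{slug}",
    "  x-cortex-groups:",
    "  - ml-model",
])

print(scaffolder_inputs["repo_name"])
print(descriptor)
```

The same slug is reused as the repository name and the entity tag, which is what ties the new entity to the scaffolded repository.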
