
Identity Provider configuration

Beta

The CDF Toolkit (cdf-tk) is currently in beta. It should be stable and mature enough to bootstrap and configure Cognite Data Fusion projects. However, if you use the tool to manage production projects, we recommend that you first test in a staging project so that you know what the tool will do. Part of the purpose of the beta is to get feedback on the tool and to improve how it is used for project lifecycle management, so please engage with us on hub.cognite.com.

This page describes in more detail the configuration needed to use the CDF Toolkit with an identity provider.

CDF Environment

Cognite has Cognite Data Fusion deployed to many geographic regions and data centers across Google Cloud, Azure, and AWS. Each CDF project will be uniquely identified by a short project name and the name of a cluster. The cluster name typically refers to a region, like westeurope-1. The project name is a short name that is unique within the cluster, and typically contains a reference to the customer name. An example is customerA-prod.

For the cdf-tk tool, the environment variables CDF_CLUSTER and CDF_PROJECT must be set to the CDF cluster and project you want to deploy to. The tool supports multiple environments, each referring to a project. You define each environment in a config.<env>.yaml file and refer to it when you use the tool, e.g. cdf-tk build --env=<environment>.
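As a rough illustration of how these two variables identify a deployment target, the sketch below reads them and derives an API base URL. The helper name resolve_target is hypothetical, and the https://<cluster>.cognitedata.com URL pattern is an assumption made for illustration; neither is something cdf-tk is documented here to expose.

```python
import os


def resolve_target(env=None):
    """Return (cluster, project, base_url) from CDF_CLUSTER and CDF_PROJECT.

    Hypothetical helper; the base URL pattern is an assumption.
    """
    env = os.environ if env is None else env
    # Collect every missing variable so the error lists them all at once.
    missing = [name for name in ("CDF_CLUSTER", "CDF_PROJECT") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    cluster, project = env["CDF_CLUSTER"], env["CDF_PROJECT"]
    return cluster, project, f"https://{cluster}.cognitedata.com"
```

Failing early with a list of the missing variables is usually easier to debug than an authentication error later in a deployment run.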

Identity Provider

Second, to specify the information needed for the identity provider (as configured for the project), you need to set IDP_TOKEN_URL. The token URL is used to get an access token for the CDF project from the identity provider. The token contains everything needed to access the CDF project, including the groups that the application/service principal is a member of.

If you use Entra and cdf-tk auth verify --interactive, this variable will be set for you based on your Entra tenant id. For Entra and most hosted identity providers like Auth0, the URL contains the tenant id or another part that identifies your tenant. With Auth0, the URL typically ends in /oauth/token; with Entra, it ends in /oauth2/v2.0/token. An Entra example is https://login.microsoftonline.com/your_tenant.onmicrosoft.com/oauth2/v2.0/token.
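To make the Entra URL shape concrete, here is a minimal sketch that builds the token URL from a tenant id or tenant domain. The function name entra_token_url is hypothetical; this illustrates the URL format only and is not a description of how cdf-tk derives the value.

```python
def entra_token_url(tenant):
    """Build the Entra v2.0 token endpoint for a tenant id or tenant domain."""
    tenant = tenant.strip().strip("/")
    if not tenant:
        raise ValueError("tenant must not be empty")
    return f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token"
```

The result is the value you would place in IDP_TOKEN_URL when setting the variable by hand.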

Service Account/Application

Third, the cdf-tk tool (or the deployment pipeline used in your CI/CD setup) needs a service account/service principal or application (they are called different things in different identity providers) with access rights to allow it to write configurations. You create an application/service principal in your identity provider with client credentials flow enabled (OAuth2). You will get a client id and a secret. These are the last two environment variables needed:

IDP_CLIENT_ID=<client_id>
IDP_CLIENT_SECRET=<secret>
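For orientation, the sketch below shows roughly what an OAuth2 client credentials request to the token URL looks like. It only builds the request and does not send it. The function name build_token_request is hypothetical, and the scope value is an assumption: Entra commonly expects a scope of the form https://<cluster>.cognitedata.com/.default, while other identity providers may require different parameters, such as an audience.

```python
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope):
    """Build (but do not send) a client credentials token request."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,  # assumption: provider-specific, see lead-in
        }
    ).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would perform the call; a successful response from a standard OAuth2 provider is a JSON document with an access_token field.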

Authorization Through CDF Groups

tip

If you run cdf-tk auth verify, you can either run it with --interactive or specify --create-group=<sourceId>, where sourceId is the id of the identity provider group that you need to be a member of to get read/write configuration access to the CDF project. For subsequent deployment runs, make sure that you do not overwrite the newly created group: either leave the cdf_auth_readwrite_all module out of your environments.yaml file, or set the readwrite_source_id variable in the cdf_auth_readwrite_all module's config.yaml to the id of the group in the identity provider.

Fourth and finally, you need to create a group in your identity provider that the application/service principal is a member of. This group will be used to grant access to the CDF project through the ./common/cdf_auth_readwrite_all module. This module loads two useful groups that can be used with the templates. One group is the read/write group for the cdf-tk tool or your CI/CD pipeline. The other is a read-only group that admin users can use to log into the Fusion UI and verify the configurations.

In the default recommended claims configuration, the identity provider's group memberships will be included in the token that is issued to the application/service principal. The group id of the group you have created for the application/service principal must thus be set in the readwrite_source_id variable in the ./common/cdf_auth_readwrite_all/default.config.yaml file.
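When troubleshooting access, it can help to check whether the group id actually appears in the issued token. The sketch below decodes a JWT payload without verifying its signature, so it is suitable for inspection only. The helper names are hypothetical, and the claim name "groups" is an assumption that depends on your identity provider's claims configuration.

```python
import base64
import json


def token_claims(access_token):
    """Decode the payload of a JWT without verifying the signature."""
    payload = access_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def has_group(access_token, source_id, claim="groups"):
    """Return True if source_id is listed in the token's group claim."""
    # The claim name is an assumption; adjust it to your claims configuration.
    return source_id in token_claims(access_token).get(claim, [])
```

If has_group returns False for the id you set in readwrite_source_id, the application/service principal is probably not a member of the group, or the group claim is not being included in the token.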

The .env.tmpl file can be used to set the necessary environment variables for local use of the scripts. When setting up a deployment pipeline, make sure that IDP_CLIENT_SECRET is not written to a file in the git repository, but is set as an environment variable in the execution environment (GitHub Actions or similar). The source ids in the config file are not sensitive from a security point of view.
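The split between a local .env file and pipeline-provided secrets can be sketched as follows: values from the file act as defaults, and CDF_ and IDP_ variables from the process environment take precedence. This is a simplified, hypothetical parser written for illustration; it is not how cdf-tk itself loads .env files.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def merged_settings(env_text, environ=None):
    """File values are defaults; CDF_/IDP_ variables from the environment win."""
    environ = os.environ if environ is None else environ
    settings = parse_env(env_text)
    for key, value in environ.items():
        if key.startswith(("CDF_", "IDP_")):
            settings[key] = value
    return settings
```

With this precedence, a CI job can inject IDP_CLIENT_SECRET as an environment variable while the committed template contains only non-sensitive values.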