Getting started
The cdf-tk tool is available as a Python package. To install it, you need a working Python installation, version 3.9 or later (3.11 recommended).
Here is a short summary of the command sequence that gets you started. The commands are explained further below and on the next page:
Step | Command | Description |
---|---|---|
1. | cdf-tk init <proj_dir> | Create and initialise a new configuration folder, then run cd <proj_dir>. |
2. | cdf-tk auth verify --interactive | Check that you have access to the project and create a .env file. This step can be skipped if you have configured environment variables. Alternatively, run cdf-tk auth verify just to verify that everything works. |
3. | Edit config.<env>.yaml in <proj_dir> | Specify the modules you want to deploy for each environment you deploy to. config.<env>.yaml also contains all variables the modules expect. Change the variables for the modules that are relevant for your deployments. |
4. | cdf-tk --verbose build --env=<env> | Build configurations to the build/ directory using the config.<env>.yaml configuration file. |
5. | cdf-tk deploy --dry-run --env=<env> | Test deploy the configurations to your CDF project from the build/ directory. Then remove --dry-run to actually push the configurations to the project. |
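If you want to script steps 4 and 5, the two invocations can be assembled programmatically. The following is a minimal sketch, not part of the tool: the function name toolkit_commands is hypothetical, and it only builds the command lines shown in the table for a given environment name. To actually run them, you could pass each list to subprocess.run.

```python
import shlex


def toolkit_commands(env, dry_run=True):
    """Return the build and deploy command lines from steps 4 and 5 as argv lists.

    Hypothetical helper for illustration; it only assembles the commands
    listed in the table above and does not execute anything.
    """
    build = ["cdf-tk", "--verbose", "build", f"--env={env}"]
    deploy = ["cdf-tk", "deploy", f"--env={env}"]
    if dry_run:
        # Keep --dry-run until the planned changes look right.
        deploy.insert(2, "--dry-run")
    return [build, deploy]


if __name__ == "__main__":
    for argv in toolkit_commands("dev"):
        print(shlex.join(argv))
```

Calling toolkit_commands("dev", dry_run=False) drops the --dry-run flag, which corresponds to the real deployment described in step 5.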
Step 1: Install the tool
To install the tool, run the following command in a terminal on your local machine.
pip install cognite-toolkit
Run cdf-tk --version to verify the version of the tool and the templates that come bundled with it. You can also run cdf-tk --help to see the available commands. Each command has comprehensive help accessible through the --help option, e.g. cdf-tk auth verify --help.
If your terminal reports command not found and you are using a virtual Python environment manager, make sure you have activated the virtual environment using source .venv/bin/activate, poetry, or similar.
If you are struggling with your Python version or installation, see the Pro-Tip: Managing Python versions and virtual environments section below.
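To check quickly whether the interpreter on your path meets the requirement stated above (3.9 or later, 3.11 recommended), a small script like the following can help. It is an illustrative sketch only: the function name python_status and the three labels it returns are assumptions made for this example, not something the tool provides.

```python
import sys

MINIMUM = (3, 9)       # lowest supported version according to the text above
RECOMMENDED = (3, 11)  # version the text recommends


def python_status(version=None):
    """Classify a (major, minor, ...) version tuple.

    Defaults to the running interpreter. "recommended" here means
    "at or above the recommended version".
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    if major_minor < MINIMUM:
        return "unsupported"
    if major_minor < RECOMMENDED:
        return "supported"
    return "recommended"


if __name__ == "__main__":
    print(sys.version.split()[0], python_status())
```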
The CDF Toolkit is in active development, and new features and improvements are added regularly. Running pip install cognite-toolkit only gives you the latest stable version. If you want to try out the latest features, you can install alpha and beta versions by pinning the version, for example pip install cognite-toolkit==0.2.0a4. Check pypi.org for the latest available versions.
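When browsing the version list, the suffix tells you what kind of release you are looking at: in 0.2.0a4 the a4 marks the fourth alpha. The sketch below classifies a version string by that suffix. It is a simplification that handles only plain a, b, and rc suffixes rather than the full Python version grammar, and the name release_kind is hypothetical.

```python
import re

# Simplified pattern: matches versions such as 0.2.0a4, 0.2.0b1, or 1.0.0rc2.
# It does not cover the full Python version specification.
_PRERELEASE = re.compile(r"^\d+(?:\.\d+)*(a|b|rc)\d+$")
_LABELS = {"a": "alpha", "b": "beta", "rc": "release candidate"}


def release_kind(version):
    """Return "alpha", "beta", or "release candidate" for a pre-release string.

    Anything that does not match the simplified pattern is treated as
    "stable" in this sketch.
    """
    match = _PRERELEASE.match(version)
    if match:
        return _LABELS[match.group(1)]
    return "stable"


if __name__ == "__main__":
    for candidate in ("0.2.0a4", "0.2.0b1", "0.2.0"):
        print(candidate, release_kind(candidate))
```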
Step 2: Initialise a project template directory
cdf-tk needs a local directory with configurations. Run this command to create the directory <proj_dir> and populate it with the available template modules.
cdf-tk init <proj_dir>
<proj_dir> will now contain a set of configuration files and template modules that you can pick from to configure your project.
From here on, it is easiest to run all commands from the <proj_dir> directory:
cd <proj_dir>
Step 3: Set up credentials for your project
Before using the cdf-tk tool, you need access to a CDF project. In this section, you will learn how to set up the necessary credentials to access your project.
- To use the cdf-tk tool, you need a client id and secret representing an application/service principal from the identity provider configured for the CDF project. This must be configured by somebody with administrator access to the CDF project. You can use any identity provider, such as Microsoft Entra ID (formerly Azure Active Directory), Auth0, or others supported by CDF. See Setting up an identity provider for more information.
- The standard CDF admin group has the access rights below. cdf-tk will help you create the required additional groups and access rights as long as the application/service principal has been granted these access rights (e.g. through being a member of the admin group in the identity provider):
  "projectsAcl": ["LIST", "READ"],
  "groupsAcl": ["LIST", "READ", "CREATE", "UPDATE", "DELETE"]
- The information in the table is needed by the cdf-tk tool:
  What? | Description | Environment variable |
  ---|---|---|
  Cluster name | The physical cluster where your CDF project is (e.g. westeurope-1). | CDF_CLUSTER |
  Project name | The CDF short project name (e.g. myproject). | CDF_PROJECT |
  Client id | The client id of the application/service principal you created in your identity provider. | IDP_CLIENT_ID |
  Client secret* | The secret of the application/service principal you created in your identity provider. | IDP_CLIENT_SECRET |
  Token URL** | The token URL of your identity provider. | IDP_TOKEN_URL |
  *Note: You can use an interactive login flow by setting LOGIN_FLOW=interactive. You will then be prompted to log in through a browser instead of using a client secret. This is available from version 0.2.0.
  **Note: If you use Microsoft Entra ID, the tool only needs your tenant id, not the full URL. You can thus replace IDP_TOKEN_URL with IDP_TENANT_ID, and the tool will create the full token URL from the tenant id. This is available from version 0.2.0.
- Once you have the above information, you can run the following command on a completely empty Cognite Data Fusion project (or see .env.tmpl if you want to fill in the information manually and then run cdf-tk auth verify without the --interactive flag):
cdf-tk auth verify --interactive
In the process, you will be prompted for the necessary information and asked whether you want to store it in a local .env file.
Remember that .env files can contain secrets and should never be checked into a version control repository like git. .env is already added to the .gitignore file created in your project directory.
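If you fill in the variables manually, it can be useful to check that nothing is missing before running cdf-tk auth verify. The sketch below is not how the tool itself validates its configuration. It is an illustration based on the table above, with two labeled assumptions: that the interactive login flow only replaces the client secret, and that a token URL derived from an Entra tenant id follows the standard Microsoft identity platform v2.0 format. The function names are hypothetical.

```python
import os

# Assumption: standard Microsoft identity platform v2.0 token endpoint format.
# The tool's own derivation of the URL is not confirmed here.
ENTRA_TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def resolve_token_url(env):
    """Return IDP_TOKEN_URL if set, else a URL built from IDP_TENANT_ID, else None."""
    if env.get("IDP_TOKEN_URL"):
        return env["IDP_TOKEN_URL"]
    if env.get("IDP_TENANT_ID"):
        return ENTRA_TOKEN_URL.format(tenant_id=env["IDP_TENANT_ID"])
    return None


def missing_variables(env):
    """List the variables from the table above that are not set in env."""
    required = ("CDF_CLUSTER", "CDF_PROJECT", "IDP_CLIENT_ID")
    missing = [name for name in required if not env.get(name)]
    # Assumption: LOGIN_FLOW=interactive removes only the need for a client secret.
    if env.get("LOGIN_FLOW") != "interactive" and not env.get("IDP_CLIENT_SECRET"):
        missing.append("IDP_CLIENT_SECRET")
    if resolve_token_url(env) is None:
        missing.append("IDP_TOKEN_URL or IDP_TENANT_ID")
    return missing


if __name__ == "__main__":
    problems = missing_variables(os.environ)
    print("All variables set" if not problems else f"Missing: {', '.join(problems)}")
```

The functions take a plain mapping rather than reading os.environ directly, so the same check can be applied to values parsed from a .env file.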
Step 4: Configure your project
Congratulations!
You are now ready to start configuring modules for your project. See Using templates for the next steps.
Extras and Pro-Tips
Extra: Setting up an identity provider
Cognite gives its enterprise customers full control of their CDF projects and their data, so each CDF project may be configured with a different identity provider that controls access to the project. The identity provider's role is to interactively log in users to the Cognite Data Fusion web application and to manage non-human clients (also called applications or service principals) that need to access the CDF project.
If you want to use a new Microsoft Entra ID instance, here are the steps to go through to set up Entra for your project:
- Create a new Entra tenant.
- Register Cognite API and core application.
- Create an Entra CDF full access/admin group for the cdf-tk service principal/application.
- Create an Entra application/service principal to be used by the cdf-tk toolkit.
- Add the new application/service principal to the admin group you created.
Alternative identity providers
There are slight variations for each identity provider, but the general steps are the same.
You need OAuth2 support and a token URL that ends in /oauth/token, in addition to the client id and secret for the application/service principal set up in the identity provider.
The Identity provider documentation gives a deeper overview of how cdf-tk interacts with the identity provider.
Pro-Tip: Using a token instead of a client id and secret
cdf-tk also supports the CDF_TOKEN environment variable. If you have created an OAuth2 token in some other way, e.g. using Postman, you can set the token in this variable. Only CDF_CLUSTER and CDF_PROJECT will then be needed.
Pro-Tip: Managing Python versions and virtual environments
Python is sensitive to the dependencies that are installed in your Python environment. The cdf-tk tool is built with a minimum of dependencies, but it expects e.g. the Cognite Python SDK to be close to the latest version.
When you have a working global Python version (try python --version; we recommend 3.11.x), we suggest that you install the cdf-tk tool in a virtual environment to ensure that its dependencies do not conflict with other tools you have installed.
Unless you are familiar with a virtual environment manager like poetry, we recommend using pipx.
Install pipx and then run: pipx install cognite-toolkit
This will install the cdf-tk tool in a virtual environment, but still make it available as a command line tool without you having to remember to activate the right virtual environment.
If you are comfortable with a virtual environment and package manager like poetry, you don't need pipx, but we still recommend using it to install the cdf-tk tool.
If you don't have a working global Python with the right version:
Many systems come with a system-wide Python installation that is used by default. This may or may not be the right version (3.11 recommended). Instead of trying to upgrade the global version, it is better to install additional versions of Python and control which one to use with a Python version manager like pyenv (or its equivalent for Windows).
Once you have a working version, you can install pipx and cdf-tk as described above.