Document summary

Document summary is an AI-powered assistant that helps you summarize between 1 and 100 documents in PDF format. Upload the PDF files to Cognite Data Fusion (CDF), then use the Document summary API to create the summaries.

You can also run the instructions and Python code in this article as a Jupyter Notebook in your CDF project.

Step 1. Upload PDF

You can upload a PDF file to CDF in one of the following ways:

  • Go to CDF > Industrial tools > Canvas and drag your PDF file to the canvas or upload existing files by selecting + Add data.
    If you don't have a good file to upload, try this test file.

  • Go to CDF > Industrial tools > Data explorer > Files and select Upload.

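  • Use Python code. The code in this article assumes that client is an authenticated CogniteClient instance from the Cognite Python SDK.

# Upload the PDF and keep its file ID for the following steps.
response1 = client.files.upload(path="./well_report.pdf")
document_id = response1.id
print(document_id)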

Step 2. Processing

Once you've uploaded the file, wait for it to pass through the retrieval-augmented generation (RAG) pipeline, which indexes the document. Use the Document status API to poll the indexing status.

import time

status_path = f"/api/v1/projects/{client.config.project}/documents/status"

body = {
    "items": [
        {
            "id": document_id
        }
    ]
}

# Poll every 5 seconds until the document is no longer waiting or in progress.
while True:
    response2 = client.post(status_path, json=body, headers={"cdf-version": "alpha"}).json()

    status = response2["items"][0]["semanticsearch"]["status"]
    print(f"status: {status}")

    if status in {"waiting", "progress"}:
        time.sleep(5)
        continue

    break
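
The loop above keeps polling for as long as the status stays at "waiting" or "progress". As a minimal sketch, assuming those are the only non-final status values, the same logic can be wrapped in a helper that gives up after a timeout. The names wait_for_indexing and fetch_status, and the default timeout, are hypothetical and not part of the Cognite SDK.

```python
import time
from typing import Callable

# Status values taken from the polling loop above; any other value is
# assumed to be final (this is an assumption, not documented behavior).
PENDING_STATUSES = {"waiting", "progress"}


def wait_for_indexing(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a non-pending status or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in PENDING_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Possible usage with the client, status_path, and body defined above:
# final_status = wait_for_indexing(
#     lambda: client.post(status_path, json=body, headers={"cdf-version": "alpha"})
#     .json()["items"][0]["semanticsearch"]["status"]
# )
```

Passing the status lookup in as a function keeps the helper independent of the client, so you can try it with a stub before pointing it at a real project.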

Step 3. Create summary

Once the document is fully indexed, create the document summary with the following Python code.

import json

summarize_path = f"/api/v1/projects/{client.config.project}/ai/tools/documents/summarize"

body = {
    "items": [
        {
            "id": document_id
        }
    ]
}

response3 = client.post(summarize_path, json=body, headers={"cdf-version": "beta"}).json()

print(json.dumps(response3, indent=2))

For the test file, the response looks like this:

{
    "items": [
        {
            "id": 6716860071641521,
            "summary": "This document is a well summary report for the Volve F well in Norway. It provides information on the site position, well position, wellpath survey depths, and wellpath plan depths. The report includes details such as latitude, longitude, position uncertainty, water depth, wellhead depth, and survey dates. It also mentions the survey tools used for each survey. The document is dated April 11, 2018."
        }
    ]
}
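
To summarize several documents, the request body can list more than one ID, and the summaries can be read out of the response by document ID. The following is a minimal sketch, assuming the limit of 100 documents applies per request and that a multi-document response has the same shape as the single-document response above. The helper names build_summarize_bodies and summaries_by_id are hypothetical and not part of the Cognite SDK.

```python
from typing import Dict, Iterable, List

# Limit stated at the top of this article; assumed here to apply per request.
MAX_DOCUMENTS = 100


def build_summarize_bodies(document_ids: Iterable[int], batch_size: int = MAX_DOCUMENTS) -> List[dict]:
    """Split document IDs into request bodies of at most batch_size items each."""
    ids = list(document_ids)
    return [
        {"items": [{"id": doc_id} for doc_id in ids[start:start + batch_size]]}
        for start in range(0, len(ids), batch_size)
    ]


def summaries_by_id(response: dict) -> Dict[int, str]:
    """Map each document ID in a summarize response to its summary text."""
    return {item["id"]: item["summary"] for item in response.get("items", [])}


# Possible usage with the client and summarize_path defined above:
# for request_body in build_summarize_bodies([document_id]):
#     response = client.post(summarize_path, json=request_body, headers={"cdf-version": "beta"}).json()
#     for doc_id, summary in summaries_by_id(response).items():
#         print(doc_id, summary)
```

Keeping body construction and response parsing in separate functions makes it possible to check both without calling the API.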