
Document question answering

Document question answering is an AI-powered assistant that uses the Semantic search API to retrieve the passages most relevant to your question and then passes them to an LLM, which interprets them into a natural language answer.

This article shows you how the Document question answering API is implemented, how to upload a PDF to Cognite Data Fusion (CDF), and how to use this API to ask questions about the uploaded document. All PDF documents uploaded to CDF automatically pass through a Retrieval Augmented Generation (RAG) pipeline. The documents are parsed and OCRed, and their contents are indexed for the Semantic search API.

You can also run the instructions and Python code in this article as a Jupyter Notebook in your CDF project.

Note: The Document question answering API builds on the Semantic search API. See the Semantic search API documentation for details on how passages are retrieved.

Implement Document question answering

You can implement the Document question answering API in the following way:

  1. Pass the question and the list of file IDs to the Semantic search API and get a list of passages back.

  2. Construct a prompt from the original question and the list of passages.

  3. Pass the prompt to an LLM and get a natural language answer.

  4. Return the answer from the LLM.
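Steps 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual implementation: build_prompt, answer_question, the prompt wording, and the llm callable are hypothetical names, and the passages are assumed to be plain strings already returned by the Semantic search API in step 1.

```python
from typing import Callable, Sequence


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Step 2: combine the question and the retrieved passages (hypothetical format)."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the passages below.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_question(
    question: str, passages: Sequence[str], llm: Callable[[str], str]
) -> str:
    """Steps 3 and 4: pass the prompt to the LLM and return its answer."""
    return llm(build_prompt(question, passages))
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular model provider; any function that maps a prompt string to an answer string will do.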

Step 1. Upload PDF

You can upload a PDF file to CDF in one of the following ways:

  • Go to CDF > Industrial tools > Canvas and drag your PDF file to the canvas or upload existing files by selecting + Add data.
    If you don't have a good file to upload, try this test file.

  • Go to CDF > Industrial tools > Data explorer > Files and select Upload.

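  • Use the Python SDK. The code below assumes that client is an authenticated CogniteClient instance.

# Upload the PDF and keep its internal file ID for the following steps.
response1 = client.files.upload(path="./well_report.pdf")
document_id = response1.id
print(document_id)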

Step 2. Processing

Once you've uploaded the file, wait for it to pass through the RAG pipeline. You can use the Document status API to poll the status.

import time

status_path = f"/api/v1/projects/{client.config.project}/documents/status"

body = {
    "items": [
        {
            "id": document_id
        }
    ]
}

# Poll every 5 seconds until the document has left the pending states.
while True:
    response2 = client.post(status_path, json=body, headers={"cdf-version": "alpha"}).json()

    status = response2["items"][0]["semanticsearch"]["status"]
    print(f"status: {status}")

    if status in {"waiting", "progress"}:
        time.sleep(5)
        continue

    break
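The loop above runs until the status changes, however long that takes. As a hedged variation, the sketch below adds a timeout. The helper name wait_until_indexed and its parameters are hypothetical and not part of the SDK; the pending states are the two used in the loop above, and any other value is treated as final.

```python
import time
from typing import Callable, FrozenSet


def wait_until_indexed(
    get_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    pending: FrozenSet[str] = frozenset({"waiting", "progress"}),
) -> str:
    """Poll get_status() until it leaves the pending states or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status not in pending:
            # Any non-pending value is returned as-is; the caller decides
            # whether it means success or failure.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is still '{status}' after {timeout_s} s")
        time.sleep(interval_s)
```

To use it with the request shown above, pass a function that performs the status call and returns response2["items"][0]["semanticsearch"]["status"].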

Step 3. Ask questions

Once the document is fully indexed, you can ask questions about it with the following Python code.

import json

ask_path = f"/api/v1/projects/{client.config.project}/ai/tools/documents/ask"

body = {
    "question": "Where is the Volve field located?",
    "fileIds": [
        {
            "id": document_id
        }
    ]
}

response3 = client.post(ask_path, json=body, headers={"cdf-version": "beta"}).json()

print(json.dumps(response3, indent=2))

The response for the test file looks like this:

{
  "content": [
    {
      "text": "The Volve field is located in the southern part of the North Sea, approximately eight kilometers north of Sleipner \u00d8st.",
      "references": [
        {
          "fileId": 7743081064762478,
          "locations": [
            {
              "pageNumber": 4,
              "left": 57.59,
              "right": 64.61,
              "top": 43.58,
              "bottom": 53.54
            }
          ]
        }
      ]
    }
  ]
}

The response is more than a simple textual answer. The response structure allows for a multi-part answer, where each part of the answer can have one or more references to the document locations that were used to build the answer. If you are not interested in showing these references, you can iterate over the content array and combine all the text fields.
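As a minimal sketch of that last point, the helpers below work on a response dictionary shaped like the example above. The function names answer_text and cited_pages are hypothetical, and joining the parts with a single space is an assumption about how a multi-part answer should read.

```python
from typing import Any, Dict, List, Tuple


def answer_text(response: Dict[str, Any]) -> str:
    """Join the text field of every answer part into one string."""
    return " ".join(part["text"] for part in response.get("content", []))


def cited_pages(response: Dict[str, Any]) -> List[Tuple[int, int]]:
    """Return sorted, de-duplicated (fileId, pageNumber) pairs from all references."""
    pages = {
        (ref["fileId"], loc["pageNumber"])
        for part in response.get("content", [])
        for ref in part.get("references", [])
        for loc in ref.get("locations", [])
    }
    return sorted(pages)
```

With the response from Step 3, answer_text(response3) gives the plain answer, while cited_pages(response3) lists the pages to show if you later decide to surface the references.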