Pinecone Assistant makes it easy to build knowledgeable chat and agent-based AI applications in minutes. Simply upload documents, ask questions about them, and receive context snippets or AI-generated responses that reference the uploaded documents.
Citations, or references, in Pinecone Assistant help ensure responses are explainable and grounded in your proprietary knowledge. Each citation links to one or more references, each pointing to a specific section of a document. With citation highlights, Pinecone Assistant can now pinpoint the exact section or sentence used to generate a response, providing even greater transparency and trust. In this technical guide, we’ll show you how to get started with Pinecone Assistant and use citation highlights.
Getting started
To get started, let’s create an assistant and load a document. Citation highlights are available in the Pinecone console and in API versions 2025-04 and later, so make sure you have the latest version of the SDK installed.
!pip install --upgrade pinecone pinecone-plugin-assistant
Now you’re ready to create a new assistant:
import os
from pinecone import Pinecone
# Replace the placeholder with your own Pinecone API key.
api_key = "YOUR_API_KEY"
os.environ["PINECONE_API_KEY"] = api_key
# Set Assistant name
assistant_name = "citations-examples"
# Pinecone() reads the key from the PINECONE_API_KEY environment variable.
pc = Pinecone()
# Reuse the assistant if it already exists; otherwise create it.
assistants_list = pc.assistant.list_assistants()
if assistant_name not in [a.name for a in assistants_list]:
    assistant = pc.assistant.create_assistant(assistant_name)
else:
    assistant = pc.assistant.Assistant(assistant_name=assistant_name)
assistant
Download Netflix’s 2023 10-K filing and upload it to your assistant. Note: Pinecone Assistant supports the following file types as input: PDF, JSON, Markdown, Text, and Docx.
!wget -O netflix-10k.pdf https://s22.q4cdn.com/959853165/files/doc_financials/2023/ar/Netflix-10-K-01262024.pdf
file_names = [f.name for f in assistant.list_files()]
file_name = "netflix-10k.pdf"
if file_name not in file_names:
    # Upload the file; timeout=None waits until processing finishes
    response = assistant.upload_file(
        file_path=file_name,
        timeout=None
    )
    print(response)
else:
    print(f"file {file_name} already uploaded")
assistant.list_files()
# [{'name': 'netflix-10k.pdf', ...}]
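Because the assistant accepts only certain file types, it can help to check a file's extension before uploading. The following is a minimal sketch: `is_supported_file` is a hypothetical helper (not part of the Pinecone SDK), and the extension set is an assumption derived from the file types listed above.

```python
from pathlib import Path

# Assumed mapping of the supported types listed above to file extensions.
SUPPORTED_EXTENSIONS = {".pdf", ".json", ".md", ".txt", ".docx"}


def is_supported_file(file_path: str) -> bool:
    """Return True if the file extension looks like a supported type."""
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_file("netflix-10k.pdf"))  # True
print(is_supported_file("slides.pptx"))      # False
```

A check like this only inspects the file name; the service still decides whether the file can actually be processed.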
Running queries and analyzing citations
Let’s now run a simple query on our documents:
from pinecone_plugins.assistant.models import Message
messages = [Message(role="user", content="Who is the Senior Vice President and Chief Financial Officer of Netflix?")]
response = assistant.chat(messages=messages, include_highlights=True)
The assistant returns a response message and citations (references) with citation highlights:
Response message
This is a plain string containing the direct answer to the question. It can be accessed as follows:
response.message.content
# The Senior Vice President and Chief Financial Officer of Netflix is Spencer Neumann.
Citations and citation highlights
Citations are structured as an array, with references mapping to specific locations in a document. Each reference includes a highlight object containing the precise excerpt used.
response.citations[0].position
# 83
response.citations[0].references[0].file.name
# netflix-10k.pdf
response.citations[0].references[0].pages[0]
# 78
response.citations[0].references[0].highlight.content
# CERTIFICATION OF CHIEF FINANCIAL OFFICER PURSUANT TO SECTION 302 OF THE SARBANES-OXLEY ACT OF 2002 I, Spencer Neumann, certify that
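A response can carry several citations, each with several references, so it is often convenient to flatten them into rows. The sketch below assumes only the attributes shown above (`position`, `references`, `file.name`, `pages`, `highlight.content`). The `summarize_citations` helper and the stand-in response are hypothetical, and treating a missing highlight as `None` is an assumption about responses requested without highlights.

```python
from types import SimpleNamespace as NS


def summarize_citations(response) -> list:
    """Flatten every citation/reference pair into a plain dict."""
    rows = []
    for citation in response.citations:
        for ref in citation.references:
            # Assumption: highlight may be absent when highlights are not requested.
            highlight = getattr(ref, "highlight", None)
            rows.append({
                "position": citation.position,
                "file": ref.file.name,
                "pages": list(ref.pages),
                "highlight": highlight.content if highlight is not None else None,
            })
    return rows


# Stand-in response that mirrors only the attributes used in this guide.
example = NS(citations=[NS(position=83, references=[NS(
    file=NS(name="netflix-10k.pdf"),
    pages=[78],
    highlight=NS(content="I, Spencer Neumann, certify that"),
)])])

for row in summarize_citations(example):
    print(row)
```

With a real chat response, the same function can be called as `summarize_citations(response)`.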
Inline citations
Inline citations embed relevant citations directly within the text, placing them exactly where the referenced information appears.
Since the citation structure is explicit and flexible, we need to write a small helper function that will insert citations into the text with [ ] around them:
def insert_citations(response) -> str:
    """
    Insert citation markers [i] at specified positions in the text.
    Processes citations in the order given, adjusting for previous insertions,
    so response.citations must be sorted by ascending position.
    Args:
        response: Pinecone Assistant Chat Response
    Returns:
        Modified text with citation markers inserted
    """
    result = response.message.content
    citations = response.citations
    offset = 0  # Keep track of how much we've shifted the text
    for i, cite in enumerate(citations, start=1):
        citation = f"[{i}]"
        position = cite.position
        adjusted_position = position + offset
        result = result[:adjusted_position] + citation + result[adjusted_position:]
        offset += len(citation)
    return result
With inline citations, the example above becomes:
insert_citations(response)
# The Senior Vice President and Chief Financial Officer of Netflix is Spencer Neumann[1].
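The running-offset approach produces correct output only when citations arrive in ascending position order. As a hedged alternative, the sketch below sorts the citations first and inserts markers from the end of the text backwards, so no offset bookkeeping is needed. `insert_citations_sorted` is a hypothetical variant, and the stand-in citation objects carry only a `position` attribute.

```python
from types import SimpleNamespace as NS


def insert_citations_sorted(text: str, citations) -> str:
    """Insert [i] markers, numbering citations by ascending position."""
    ordered = sorted(citations, key=lambda c: c.position)
    # Walk from the last position to the first so earlier positions stay valid.
    for i, cite in reversed(list(enumerate(ordered, start=1))):
        text = text[:cite.position] + f"[{i}]" + text[cite.position:]
    return text


text = "Alpha. Beta."
# Deliberately out of order to exercise the sort.
citations = [NS(position=11), NS(position=5)]
print(insert_citations_sorted(text, citations))
# Alpha[1]. Beta[2].
```

Note that the markers here are numbered by position in the text, which may differ from the order of `response.citations` if that list is not already sorted.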
Accessing a file on a specific page
Some file viewers (for example, Chrome's built-in PDF viewer) can open a file at a specific page. In our example, each file reference carries a digitally signed URL, so you can append a #page=N fragment to that URL to open the file at the cited page. The snippet below renders the result as a clickable link.
from IPython.display import display, Markdown
ref = response.citations[0].references[0]
url = f"{ref.file.signed_url}#page={ref.pages[0]}"
display(Markdown(f"Page cited: [link]({url})"))
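The same pattern extends to every cited page, not just the first one. The sketch below is a minimal illustration under stated assumptions: `page_links` is a hypothetical helper, the stand-in response mirrors only the `signed_url` and `pages` attributes used above, and the URL is a placeholder, not a real signed URL.

```python
from types import SimpleNamespace as NS


def page_links(response) -> list:
    """Build one '<signed_url>#page=<n>' link per cited page."""
    links = []
    for citation in response.citations:
        for ref in citation.references:
            for page in ref.pages:
                links.append(f"{ref.file.signed_url}#page={page}")
    return links


# Stand-in response; the URL is a placeholder, not a real signed URL.
example = NS(citations=[NS(references=[NS(
    file=NS(signed_url="https://example.com/netflix-10k.pdf?sig=abc"),
    pages=[78, 79],
)])])

print(page_links(example))
```

Whether the `#page=` fragment is honored depends on the viewer that opens the link.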
Start building today
Pinecone Assistant is now generally available for all users in the US and EU regions. For Standard and Enterprise users, usage starts at $0.05 per assistant per hour, and Context Processed Tokens cost $5 per 1M tokens. See our pricing page for more information, or check out the resources below to learn more: