A Code Implementation to Use Ollama Through Google Colab and Build a Local RAG Pipeline Using DeepSeek-R1 1.5B Through Ollama, LangChain, FAISS, and ChromaDB for Q&A


In this tutorial, we'll build a fully functional Retrieval-Augmented Generation (RAG) pipeline using open-source tools that run seamlessly on Google Colab. First, we will look at how to set up Ollama and use models through Colab. Integrating the DeepSeek-R1 1.5B large language model served through Ollama, the modular orchestration of LangChain, and the high-performance ChromaDB vector store allows users to query real-time information extracted from uploaded PDFs. With a combination of local language model reasoning and retrieval of factual information from PDF documents, the pipeline demonstrates a powerful, private, and cost-effective alternative.

!pip install colab-xterm
%load_ext colabxterm

We use the colab-xterm extension to enable terminal access directly within the Colab environment. By installing it with !pip install colab-xterm and loading it via %load_ext colabxterm, users can open an interactive terminal window inside Colab, making it easier to run commands like ollama serve or monitor local processes.

The %xterm magic command is used after loading the colab-xterm extension to launch an interactive terminal window inside the Colab notebook interface. This allows users to execute shell commands in real time, just like a regular terminal, making it especially useful for running background services like ollama serve, managing files, or debugging system-level operations without leaving the notebook.
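Opening the terminal is a single magic command in its own cell (a minimal sketch, assuming the extension was loaded as shown above):

%xterm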

Here, we install Ollama using curl https://ollama.ai/install.sh | sh.

Then, we start the Ollama server using ollama serve.

Finally, we download DeepSeek-R1:1.5B through Ollama locally so that it can be used for building the RAG pipeline, as sketched below.
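The commands below summarize what runs inside the xterm terminal; the install and serve commands come from the steps above, while the exact pull command is an assumption inferred from the model tag ("deepseek-r1:1.5b") used later in the notebook:

# Run inside the %xterm terminal
curl https://ollama.ai/install.sh | sh
# Start the Ollama server in the background
ollama serve &
# Download DeepSeek-R1 1.5B (assumed pull command based on the model tag used below)
ollama pull deepseek-r1:1.5b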

!pip install langchain langchain-community sentence-transformers chromadb faiss-cpu

To set up the core components of the RAG pipeline, we install the essential libraries, including langchain, langchain-community, sentence-transformers, chromadb, and faiss-cpu. These packages enable the document processing, embedding, vector storage, and retrieval functionality required to build an efficient and modular local RAG system.

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import Ollama
from langchain.chains import RetrievalQA
from google.colab import files
import os
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM

We import key modules from the langchain-community and langchain-ollama libraries to handle PDF loading, text splitting, embedding generation, vector storage with Chroma, and LLM integration via Ollama. The imports also include Colab's file upload utility and prompt templates, enabling a seamless flow from document ingestion to query answering using a locally hosted model.

print("Please upload your PDF file...") uploaded = files.upload() file_path = list(uploaded.keys())[0] print(f"File '{file_path}' successfully uploaded.") if not file_path.lower().endswith('.pdf'): print("Warning: Uploaded record is not a PDF. This whitethorn origin issues.")

To allow users to add their own knowledge sources, we prompt for a PDF upload using google.colab.files.upload(). The code verifies the uploaded file type and provides feedback, ensuring that only PDFs are processed for further embedding and retrieval.

!pip install pypdf
import pypdf

loader = PyPDFLoader(file_path)
documents = loader.load()
print(f"Successfully loaded {len(documents)} pages from PDF")

To extract content from the uploaded PDF, we install the pypdf library and use PyPDFLoader from LangChain to load the document. This process converts each page of the PDF into a structured format, enabling downstream tasks like text splitting and embedding.

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)
print(f"Split documents into {len(chunks)} chunks")

The loaded PDF is divided into manageable chunks using RecursiveCharacterTextSplitter, with each chunk sized at 1000 characters and a 200-character overlap. This preserves context across chunk boundaries, which improves the relevance of retrieved passages during question answering.

embeddings = HuggingFaceEmbeddings(
    model_name="all-MiniLM-L6-v2",
    model_kwargs={'device': 'cpu'}
)

persist_directory = "./chroma_db"
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory=persist_directory
)
vectorstore.persist()
print(f"Vector store created and persisted to {persist_directory}")

The text chunks are embedded using the all-MiniLM-L6-v2 model from sentence-transformers, running on CPU, to enable semantic search. These embeddings are then stored in a persistent ChromaDB vector store, allowing efficient similarity-based retrieval across sessions.
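Because the store is persisted to disk, it can be reopened in a later session without recomputing the embeddings. A minimal sketch, assuming the same persist directory and embedding model as above:

# Reopen the persisted Chroma store in a later session (sketch; directory and model assumed from above)
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2", model_kwargs={'device': 'cpu'})
vectorstore = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)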

llm = OllamaLLM(model="deepseek-r1:1.5b")

retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True
)
print("RAG pipeline created successfully!")

The RAG pipeline is finalized by connecting the local DeepSeek-R1 model (via OllamaLLM) with the Chroma-based retriever. Using LangChain's RetrievalQA chain with a "stuff" strategy, the model retrieves the top 3 most relevant chunks for a query and generates context-aware answers, completing the local RAG setup.
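Before running full question answering, the retriever can be sanity-checked on its own. A short sketch (the sample query is illustrative and not part of the original notebook; newer LangChain versions may prefer retriever.invoke over get_relevant_documents):

# Quick sanity check of the retriever alone (illustrative query)
sample_docs = retriever.get_relevant_documents("What topics does this document cover?")
for i, doc in enumerate(sample_docs):
    print(f"Chunk {i+1}: {doc.page_content[:100]}...")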

def query_rag(question):
    result = qa_chain({"query": question})
    print("\nQuestion:", question)
    print("\nAnswer:", result["result"])
    print("\nSources:")
    for i, doc in enumerate(result["source_documents"]):
        print(f"Source {i+1}:\n{doc.page_content[:200]}...\n")
    return result

question = "What is the main topic of this document?"
result = query_rag(question)

To test the RAG pipeline, the query_rag function takes a user question, retrieves relevant context using the retriever, and generates an answer using the LLM. It also displays the top source documents, providing transparency and traceability for the model's response.
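Further questions can be asked against the same pipeline in exactly the same way; the queries below are illustrative examples, not from the original notebook:

# Ask additional questions against the same pipeline (illustrative queries)
result = query_rag("Summarize the key points of this document.")
result = query_rag("What conclusions does the document draw?")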

In conclusion, this tutorial combines the retrieval power of ChromaDB, the orchestration capabilities of LangChain, and the reasoning abilities of DeepSeek-R1 served via Ollama. It showcases building a lightweight yet powerful RAG system that runs efficiently on Google Colab's free tier. The solution enables users to ask questions grounded in up-to-date content from uploaded documents, with answers generated through a local LLM. This architecture provides a foundation for building scalable, customizable, and privacy-friendly AI assistants without incurring cloud costs or compromising performance.


Here is the Colab Notebook. Also, don't forget to follow us on Twitter and join our Telegram Channel and LinkedIn Group. Don't forget to join our 85k+ ML SubReddit.


Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among readers.
