1 Hour to LangChain: Build Your First LLM App That Actually Does Something

By the end of this hour, you'll have a working AI research assistant that can fetch a web article, summarize it, and answer specific questions about its content using LangChain and OpenAI.

🎯 What You'll Build

A command-line research assistant that takes a URL, extracts the content, creates an intelligent summary, and lets you ask follow-up questions:

$ python research_assistant.py
Enter article URL: https://example.com/article
✅ Article processed: "The Future of AI"
📝 Summary: This article discusses emerging trends in artificial intelligence...

Ask a question (or 'quit'): What are the main benefits mentioned?
🤖 The article highlights three key benefits: automation of repetitive tasks...

⏱️ Time Breakdown

0–10 min: Environment setup and LangChain installation
10–25 min: Build web scraper and text processor
25–40 min: Create summarization chain
40–55 min: Add Q&A functionality with memory
55–60 min: Test and deploy your assistant

📋 Prerequisites

Python 3.8+ installed on your machine
OpenAI API key (API usage is billed separately from ChatGPT, so make sure the account has credit available)
Basic familiarity with Python and command line
Text editor or IDE of your choice

Step 1: Set Up Your LangChain Environment (0–10 min)

Create a new project directory and install the required packages:

mkdir langchain-assistant
cd langchain-assistant
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

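Install LangChain and dependencies. The import paths used in this tutorial (langchain.llms, langchain.vectorstores, and so on) are the legacy ones, so a pin along these lines is a reasonable starting point. faiss-cpu and tiktoken are needed by the vector store and embeddings in Step 4:

pip install "langchain<0.1" "openai<1" requests beautifulsoup4 python-dotenv faiss-cpu tiktoken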

Create your environment file:

echo "OPENAI_API_KEY=your_api_key_here" > .env

Replace your_api_key_here with your actual OpenAI API key from platform.openai.com.
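
Before moving on, it can help to confirm that the key is visible to Python without printing it in full. The sketch below is one way to do that. describe_key is a hypothetical helper (not part of LangChain or python-dotenv), and the dotenv import is guarded so the script still runs if the package is missing:

```python
import os


def describe_key(value):
    """Return a short, safe-to-print status for an API key value."""
    if not value or value == "your_api_key_here":
        return "missing or still the placeholder"
    # Show only the length and the last four characters, never the whole key
    return f"set ({len(value)} chars, ends with ...{value[-4:]})"


if __name__ == "__main__":
    try:
        # python-dotenv reads .env from the current directory into os.environ
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # fall back to whatever is already in the environment
    print("OPENAI_API_KEY:", describe_key(os.getenv("OPENAI_API_KEY")))
```

If this reports the placeholder, the .env file was created but not edited yet.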

✅ Checkpoint

Run python -c "import langchain; print('LangChain installed successfully!')" - what happens?
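
Because LangChain's import paths have changed between releases, it may also be worth recording which versions were actually installed. A minimal sketch using only the standard library; installed_version is a hypothetical helper name, and the package list is simply the one from the install step:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("langchain", "openai", "beautifulsoup4", "python-dotenv"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```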

Step 2: Build the Web Scraper (10–25 min)

Create scraper.py to fetch and clean web content:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

class WebScraper:
    def __init__(self):
        self.session = requests.Session()
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (compatible; ResearchBot/1.0)'
        })
    
    def scrape_article(self, url):
        """Extract main content from a web article"""
        try:
            response = self.session.get(url, timeout=10)
            response.raise_for_status()
            
            soup = BeautifulSoup(response.content, 'html.parser')
            
            # Remove unwanted elements
            for element in soup(['script', 'style', 'nav', 'footer', 'header']):
                element.decompose()
            
            # Try to find main content
            content = self._extract_main_content(soup)
            
            title = soup.find('title')
            title_text = title.get_text().strip() if title else "Unknown Title"
            
            return {
                'title': title_text,
                'content': content,
                'url': url
            }
            
        except requests.RequestException as e:
            # Chain the original error so the traceback keeps the root cause
            raise RuntimeError(f"Failed to scrape {url}: {e}") from e
    
    def _extract_main_content(self, soup):
        """Extract the main text content"""
        # Common content selectors
        selectors = ['article', 'main', '.content', '.post-content', '.entry-content']
        
        for selector in selectors:
            content_div = soup.select_one(selector)
            if content_div:
                return content_div.get_text(separator=' ', strip=True)
        
        # Fallback to body
        body = soup.find('body')
        return body.get_text(separator=' ', strip=True) if body else ""

Test your scraper:

# test_scraper.py
from scraper import WebScraper

scraper = WebScraper()
article = scraper.scrape_article("https://example.com")
print(f"Title: {article['title']}")
print(f"Content length: {len(article['content'])} characters")

✅ Checkpoint

Test scraping a simple article URL - does it return title and content without errors?
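
The scraper assumes it is handed a full http or https address. A small guard like the following could reject obviously malformed input before any request is sent. This is a sketch, and is_fetchable_url is a hypothetical helper that is not part of scraper.py:

```python
from urllib.parse import urlparse


def is_fetchable_url(url):
    """Return True for absolute http(s) URLs that include a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

One option is to call it at the top of scrape_article and raise a ValueError when it returns False, so a typo such as a missing "https://" produces a clear message instead of a network error.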

Step 3: Create the Summarization Chain (25–40 min)

Create summarizer.py to build your first LangChain chain:

import os
from dotenv import load_dotenv
from langchain.llms import OpenAI
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain.prompts import PromptTemplate

load_dotenv()

class ArticleSummarizer:
    def __init__(self):
        self.llm = OpenAI(
            temperature=0.3,
            openai_api_key=os.getenv("OPENAI_API_KEY")
        )
        
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=3000,
            chunk_overlap=200
        )
        
        # Custom prompt for better summaries
        self.prompt_template = """
        Summarize this article section in a clear, concise way:
        
        {text}
        
        Focus on:
        - Main points and key insights
        - Important facts and data
        - Conclusions or recommendations
        
        Summary:
        """
        
        self.prompt = PromptTemplate(
            template=self.prompt_template,
            input_variables=["text"]
        )
    
    def summarize_article(self, article_data):
        """Create an intelligent summary of the article"""
        content = article_data['content']
        
        if len(content) < 100:
            raise ValueError("Article content too short to summarize")
        
        # Split text into manageable chunks
        texts = self.text_splitter.split_text(content)
        docs = [Document(page_content=text) for text in texts]
        
        # Create summarization chain
        chain = load_summarize_chain(
            self.llm,
            chain_type="map_reduce",
            map_prompt=self.prompt,
            combine_prompt=self.prompt
        )
        
        # Generate summary
        summary = chain.run(docs)
        
        return {
            'title': article_data['title'],
            'url': article_data['url'],
            'summary': summary.strip(),
            'original_length': len(content),
            'chunks_processed': len(docs)
        }
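
To get a feel for what chunk_size and chunk_overlap control, here is a deliberately simplified fixed-window splitter. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to break on separators such as paragraphs and sentences); it only illustrates how consecutive chunks share their boundary text:

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap means a sentence that straddles a boundary appears in both neighbouring chunks, at the cost of sending some text to the model twice.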

Test the summarizer:

# test_summary.py
from scraper import WebScraper
from summarizer import ArticleSummarizer

scraper = WebScraper()
summarizer = ArticleSummarizer()

article = scraper.scrape_article("https://example.com/your-test-article")
summary = summarizer.summarize_article(article)

print(f"Original: {summary['original_length']} chars")
print(f"Summary: {summary['summary']}")
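
The map_reduce chain type used above summarizes each chunk on its own (the map step) and then combines the partial summaries into one (the reduce step). The sketch below mirrors that flow with plain functions in place of LLM calls. first_sentence and join_partials are hypothetical stand-ins, so the output is not a real summary:

```python
def map_reduce(chunks, map_fn, reduce_fn):
    """Apply map_fn to every chunk, then reduce_fn to the list of results."""
    partials = [map_fn(chunk) for chunk in chunks]  # one call per chunk
    return reduce_fn(partials)                      # one final combining call


def first_sentence(text):
    """Stand-in for the per-chunk LLM call: keep only the first sentence."""
    return text.split(".")[0].strip() + "."


def join_partials(partials):
    """Stand-in for the combining LLM call: concatenate the partial results."""
    return " ".join(partials)


if __name__ == "__main__":
    chunks = ["Cats sleep a lot. They also purr.", "Dogs bark. Loudly."]
    print(map_reduce(chunks, first_sentence, join_partials))
```

A practical consequence is that the number of model calls grows with the number of chunks, roughly one per chunk plus the combining step, so long articles take longer and cost more.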

Step 4: Add Q&A with Memory (40–55 min)

Create qa_system.py to handle questions about the article:

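from dotenv import load_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.llms import OpenAI
from langchain.docstore.document import Document
# Needed by setup_qa_chain below
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load OPENAI_API_KEY from .env so this module also works when imported on its own
load_dotenv()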

class QASystem:
    def __init__(self):
        self.llm = OpenAI(temperature=0.1)
        self.embeddings = OpenAIEmbeddings()
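        self.memory = ConversationBufferMemory(
            memory_key="chat_history",
            return_messages=True,
            # The chain returns both 'answer' and 'source_documents';
            # tell the memory which one to store
            output_key="answer"
        )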
        self.qa_chain = None
        self.vectorstore = None
    
    def setup_qa_chain(self, article_data):
        """Set up Q&A system for a specific article"""
        # Create document chunks
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=100
        )
        
        texts = text_splitter.split_text(article_data['content'])
        docs = [Document(page_content=text) for text in texts]
        
        # Create vector store for semantic search
        self.vectorstore = FAISS.from_documents(docs, self.embeddings)
        
        # Set up conversational chain
        self.qa_chain = ConversationalRetrievalChain.from_llm(
            self.llm,
            retriever=self.vectorstore.as_retriever(search_kwargs={"k": 3}),
            memory=self.memory,
            return_source_documents=True
        )
    
    def ask_question(self, question):
        """Ask a question about the article"""
        if not self.qa_chain:
            raise ValueError("QA system not initialized. Call setup_qa_chain first.")
        
        result = self.qa_chain({"question": question})
        
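        return {
            'answer': result['answer'],
            'sources': len(result['source_documents']),
            # Rough heuristic only: counts retrieved chunks (up to k=3),
            # not how well they support the answer
            'confidence': 'high' if len(result['source_documents']) >= 2 else 'medium'
        }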
    
    def reset_conversation(self):
        """Clear conversation history"""
        self.memory.clear()

✅ Checkpoint

Initialize the QA system with an article and ask "What is this article about?" - do you get a relevant answer?
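
The memory component is what lets a follow-up such as "Why does that matter?" make sense: earlier turns are stored and supplied to the chain along with the new question. A stripped-down illustration of that idea, using a hypothetical ChatHistory class rather than LangChain's ConversationBufferMemory:

```python
class ChatHistory:
    """Keep (question, answer) turns and render them as text for a prompt."""

    def __init__(self):
        self.turns = []

    def add(self, question, answer):
        self.turns.append((question, answer))

    def as_prompt_prefix(self):
        """Format all stored turns, oldest first, one speaker per line."""
        return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in self.turns)

    def clear(self):
        """Forget everything, like reset_conversation in qa_system.py."""
        self.turns.clear()


if __name__ == "__main__":
    history = ChatHistory()
    history.add("What is it about?", "AI trends.")
    print(history.as_prompt_prefix())
```

Since the buffer keeps every turn, very long sessions eventually add a lot of text to each request; clearing the history between unrelated topics keeps prompts short.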

Step 5: Ship It (55–60 min)

Create the main application research_assistant.py:

#!/usr/bin/env python3
from scraper import WebScraper
from summarizer import ArticleSummarizer
from qa_system import QASystem

def main():
    print("🔬 AI Research Assistant")
    print("=" * 40)
    
    # Initialize components
    scraper = WebScraper()
    summarizer = ArticleSummarizer()
    qa_system = QASystem()
    
    try:
        # Get article URL
        url = input("Enter article URL: ").strip()
        
        print("📥 Fetching article...")
        article = scraper.scrape_article(url)
        
        print("🤖 Generating summary...")
        summary_result = summarizer.summarize_article(article)
        
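        title = summary_result['title']
        # Truncate long titles; only add an ellipsis when something was cut
        shown = title if len(title) <= 50 else title[:50] + "..."
        print(f"\n✅ Article processed: \"{shown}\"")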
        print(f"📝 Summary ({summary_result['chunks_processed']} sections):")
        print(f"{summary_result['summary']}\n")
        
        # Set up Q&A
        print("🔧 Setting up Q&A system...")
        qa_system.setup_qa_chain(article)
        
        # Interactive Q&A loop
        print("💬 Ask questions about the article (type 'quit' to exit):")
        
        while True:
            question = input("\nYour question: ").strip()
            
            if question.lower() in ['quit', 'exit', 'q']:
                break
            
            if not question:
                continue
            
            try:
                response = qa_system.ask_question(question)
                print(f"🤖 {response['answer']}")
                print(f"   (Confidence: {response['confidence']}, Sources: {response['sources']})")
            
            except Exception as e:
                print(f"❌ Error answering question: {e}")
        
        print("\n👋 Thanks for using AI Research Assistant!")
        
    except KeyboardInterrupt:
        print("\n👋 Goodbye!")
    except Exception as e:
        print(f"❌ Error: {e}")

if __name__ == "__main__":
    main()

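Run it. The chmod line is optional and only matters on macOS or Linux if you want to launch the script directly as ./research_assistant.py:

chmod +x research_assistant.py
python research_assistant.py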

🎉 Your AI research assistant is ready! Test it with a news article or blog post to see how it summarizes content and answers your questions.

🎁 Bonus

Custom extraction: Add support for PDF research papers using pypdf (the maintained successor to PyPDF2)
Web interface: Build a simple Flask or Streamlit web UI instead of the command line
Export feature: Save summaries and Q&A sessions to markdown files for later reference
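
As a starting point for the export idea, the sketch below renders a session to Markdown using only the standard library. render_markdown and save_session are hypothetical helpers, and the layout (headings, bold Q and A labels) is just one possible choice:

```python
def render_markdown(title, url, summary, qa_pairs):
    """Build a Markdown document from a summary and a list of (question, answer) pairs."""
    lines = [f"# {title}", "", f"Source: {url}", "", "## Summary", "", summary, ""]
    if qa_pairs:
        lines += ["## Questions", ""]
        for question, answer in qa_pairs:
            lines += [f"**Q:** {question}", "", f"**A:** {answer}", ""]
    return "\n".join(lines)


def save_session(path, title, url, summary, qa_pairs):
    """Write the rendered session to path as UTF-8 text."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_markdown(title, url, summary, qa_pairs))
```

In research_assistant.py, one approach would be to append each question and response['answer'] to a list inside the loop and call save_session once the user types quit.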

📚 Next Steps

1 Hour to Pro Prompts: 10x your ChatGPT output with proven prompt patterns (60 min)

Master 8 battle-tested prompt patterns that turn basic ChatGPT conversations into precise, professional outputs

🔗 Resources

LangChain Documentation - Complete framework reference
OpenAI API Guide - API usage and best practices
Vector Databases Explained - Understanding embeddings and search
Prompt Engineering Guide - Writing better prompts for LLMs
LangChain Cookbook - Real-world examples and patterns