langchain
LangChain is a library designed to facilitate the development of context-aware reasoning applications. It provides a robust framework for engineers to build, manage, and deploy applications that leverage language models for complex reasoning tasks. The library addresses real-world problems where understanding and processing natural language is crucial, such as in conversational AI, document analysis, and data retrieval systems.
The most significant parts of the repo include the integration and orchestration of language models and agents, document loading and transformation, chain building for processing language data, and the creation of various tools for interaction with external services and APIs. For instance, the …/agents directory, with its substantial file count, indicates a rich set of functionality for managing agents that interact with language models to perform tasks.
Key functionalities of LangChain are built around the concept of "chains," which are sequences of components that process language data. These chains are constructed from components such as API calls, conversational retrieval, and database interactions, as detailed in the Chains and Chain Components section. For example, the APIChain class orchestrates API calls and summarizes responses, while ConversationalRetrievalChain manages conversational interactions with retrieval systems.
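The chain abstraction described above can be illustrated with a minimal, self-contained sketch. This is not LangChain's actual implementation; the class and method names below are simplified analogues of the pattern: a chain maps a dict of inputs to a dict of outputs, and chains compose sequentially.

```python
# Minimal sketch of the "chain" idea -- not LangChain's actual classes.
# A chain maps a dict of inputs to a dict of outputs; chains compose
# sequentially by feeding each chain's outputs into the next.

class Chain:
    input_keys: list = []
    output_keys: list = []

    def _call(self, inputs: dict) -> dict:
        raise NotImplementedError

    def __call__(self, inputs: dict) -> dict:
        missing = [k for k in self.input_keys if k not in inputs]
        if missing:
            raise KeyError(f"missing inputs: {missing}")
        return self._call(inputs)

class UppercaseChain(Chain):
    input_keys = ["text"]
    output_keys = ["upper"]
    def _call(self, inputs):
        return {"upper": inputs["text"].upper()}

class ExclaimChain(Chain):
    input_keys = ["upper"]
    output_keys = ["result"]
    def _call(self, inputs):
        return {"result": inputs["upper"] + "!"}

def run_sequential(chains, inputs):
    """Pipe each chain's outputs into the shared state, like a sequential chain."""
    state = dict(inputs)
    for chain in chains:
        state.update(chain(state))
    return state

print(run_sequential([UppercaseChain(), ExclaimChain()], {"text": "hello"})["result"])
# -> HELLO!
```

The key design choice mirrored here is that chains communicate through named keys rather than positional arguments, which is what lets heterogeneous components (API calls, retrievers, database queries) be wired together.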
The codebase relies on several key algorithms and technologies, including natural language processing models, graph databases, and vector space models for embeddings. It utilizes adapters like OpenAIEmbeddings and HuggingFaceEmbeddings to generate text embeddings, which are critical for semantic understanding and search capabilities.
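To see why embeddings enable semantic search, here is a toy illustration using a bag-of-words vector and cosine similarity. Real adapters like OpenAIEmbeddings return dense vectors from a model; the `embed` function below is a deliberately crude stand-in, not LangChain code.

```python
# Toy illustration of embedding-based search: documents and queries become
# vectors, and similarity is measured geometrically (cosine similarity).
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Crude stand-in for a real embedding model: word counts as a sparse vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = ["the cat sat on the mat", "stock prices fell sharply today"]
query = "a cat on a mat"
scores = [cosine(embed(query), embed(d)) for d in docs]
best = docs[scores.index(max(scores))]
print(best)  # -> the cat sat on the mat
```

Real embeddings capture semantic similarity beyond exact word overlap, but the retrieval mechanics (vectorize, score, rank) are the same.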
Design choices in LangChain prioritize modularity, extensibility, and integration with a wide range of services. This is evident in the structure of the …/tools directory, which houses a diverse set of tools for tasks such as file management, data retrieval, and user interaction. These tools enable LangChain to interact with APIs from Google, Bing, GitHub, and more, as well as manage files and handle user input.
In summary, LangChain serves as a versatile toolkit for engineers to build applications that require sophisticated language understanding and reasoning. It abstracts the complexity of integrating with language models and provides a suite of tools and components for building end-to-end language-based solutions.
Language Models and Agents
References: libs/langchain/langchain/adapters, libs/langchain/langchain/agents
Integration with various language models, such as OpenAI's API, is facilitated through adapters in …/adapters. These adapters handle tasks like chat completions, text completions, and model fine-tuning. For instance, ChatCompletions manages streaming responses from the OpenAI API, while utility functions like convert_dict_to_message() bridge the gap between OpenAI-specific message formats and the general Message object used in LangChain.
Agent Base Classes and Implementations
References: libs/langchain/langchain/agents/agent_toolkits/base.py, libs/langchain/langchain/agents/chat/base.py, libs/langchain/langchain/agents/conversational/base.py, libs/langchain/langchain/agents/mrkl/base.py, libs/langchain/langchain/agents/openai_assistant/base.py, libs/langchain/langchain/agents/react/base.py, libs/langchain/langchain/agents/structured_chat/base.py, libs/langchain/langchain/agents/xml/base.py
The BaseToolkit class serves as the foundational structure for developing agent toolkits within the LangChain framework. It establishes a uniform interface for extending the capabilities of LangChain agents, allowing for the integration of diverse functionalities.
Agent Execution and Output Parsing
References: libs/langchain/langchain/agents/agent_toolkits/base.py, libs/langchain/langchain/agents/output_parsers
Agent execution within LangChain is managed by a set of output parsers located in …/output_parsers. These parsers interpret the output from language models and determine the appropriate course of action: the raw text is parsed to extract either an actionable command or a final answer.
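The action-or-answer decision can be sketched as follows. The "Action:/Action Input:/Final Answer:" text format is a simplified analogue of the ReAct-style convention these parsers handle; the parser itself is illustrative, not LangChain's implementation.

```python
# Sketch of the core decision an agent output parser makes: given raw LLM
# text, return either a tool invocation (AgentAction) or a final answer
# (AgentFinish). The text format here is a simplified ReAct-style analogue.
import re
from dataclasses import dataclass

@dataclass
class AgentAction:
    tool: str
    tool_input: str

@dataclass
class AgentFinish:
    output: str

def parse(text: str):
    if "Final Answer:" in text:
        return AgentFinish(output=text.split("Final Answer:")[-1].strip())
    m = re.search(r"Action:\s*(.*?)\nAction Input:\s*(.*)", text, re.DOTALL)
    if m:
        return AgentAction(tool=m.group(1).strip(), tool_input=m.group(2).strip())
    raise ValueError(f"could not parse agent output: {text!r}")

print(parse("Action: search\nAction Input: weather in Paris"))
print(parse("I know this already.\nFinal Answer: 42"))
```

The executor loop then dispatches on the returned type: an AgentAction triggers a tool call whose result is fed back to the model, while an AgentFinish ends the run.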
Agent Toolkits and Utilities
References: libs/langchain/langchain/agents/agent_toolkits, libs/langchain/langchain/agents/agent_toolkits/azure_cognitive_services.py, libs/langchain/langchain/agents/agent_toolkits/openapi, libs/langchain/langchain/agents/agent_toolkits/powerbi, libs/langchain/langchain/agents/agent_toolkits/spark, libs/langchain/langchain/agents/agent_toolkits/spark_sql, libs/langchain/langchain/agents/agent_toolkits/sql, libs/langchain/langchain/agents/agent_toolkits/vectorstore, libs/langchain/langchain/agents/agent_toolkits/json, libs/langchain/langchain/agents/agent_toolkits/nla
Integration with external resources in LangChain is facilitated through a dynamic import system and a suite of agent toolkits. These toolkits enable LangChain agents to leverage external APIs, databases, and services, enhancing their capabilities beyond language processing.
Conversational Agents
References: libs/langchain/langchain/agents/conversational, libs/langchain/langchain/agents/conversational_chat
Conversational agents in the LangChain library, located within …/conversational and …/conversational_chat, are specialized in maintaining conversation history and generating responses using integrated tools. They leverage the ConversationalAgent and ConversationalChatAgent classes to manage interactions with users, ensuring a seamless conversational experience.
Structured Interaction Agents
References: libs/langchain/langchain/agents/structured_chat, libs/langchain/langchain/agents/self_ask_with_search
Structured interaction agents in LangChain facilitate user engagement through predefined interaction patterns, leveraging scratchpad construction, prompt creation, and output parsing to manage the flow of conversation. The structured chat agent, implemented in …/base.py, utilizes a scratchpad to maintain a record of previous actions and tool outputs. The scratchpad is constructed using the _construct_scratchpad() method, which organizes the agent's internal state for reference in subsequent interactions.
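The scratchpad idea can be sketched in a few lines: prior (action, observation) pairs are replayed as text so the model can see what it has already tried. The exact field names and formatting below are illustrative, not copied from the library.

```python
# Hedged sketch of scratchpad construction: the agent serializes its prior
# (action, observation) pairs into the prompt so the model has a record of
# previous steps. Format strings here are illustrative.

def construct_scratchpad(intermediate_steps):
    """intermediate_steps: list of ((tool, tool_input), observation) pairs."""
    thoughts = ""
    for (tool, tool_input), observation in intermediate_steps:
        thoughts += f"Action: {tool}\nAction Input: {tool_input}\n"
        thoughts += f"Observation: {observation}\nThought: "
    return thoughts

steps = [(("search", "capital of France"), "Paris")]
print(construct_scratchpad(steps))
```

Because the scratchpad is rebuilt from the step history on every turn, the agent itself stays stateless between model calls.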
Document and Environment Interaction Agents
References: libs/langchain/langchain/agents/react, libs/langchain/langchain/agents/react/textworld_prompt.py, libs/langchain/langchain/agents/react/wiki_prompt.py
Agents designed for interaction with document stores and environments like TextWorld are encapsulated within the ReAct agent framework. The primary classes facilitating these interactions are ReActDocstoreAgent and ReActTextWorldAgent, both extending the Agent class to provide specialized functionality.
Format-Specific Agents
References: libs/langchain/langchain/agents/xml, libs/langchain/langchain/agents/openai_functions_agent, libs/langchain/langchain/agents/format_scratchpad
In LangChain, format-specific agents like XMLAgent and OpenAIFunctionsAgent handle interactions with tools and language models using structured formats. The XMLAgent operates by embedding tool interactions within XML tags, while the OpenAIFunctionsAgent leverages OpenAI's function-calling capabilities for planning and execution.
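The XML convention can be sketched with a small parser. The tag names below mirror the tool-call format the XML agent embeds its interactions in, but the regex-based parser and the `<final_answer>` handling are illustrative assumptions, not the library's code.

```python
# Sketch of parsing an XML-formatted agent response: tool calls are wrapped
# in <tool>/<tool_input> tags, and a finished response is wrapped in a
# final-answer tag. The parser itself is an illustrative analogue.
import re

def parse_xml_action(text: str):
    tool = re.search(r"<tool>(.*?)</tool>", text, re.DOTALL)
    tool_input = re.search(r"<tool_input>(.*?)</tool_input>", text, re.DOTALL)
    final = re.search(r"<final_answer>(.*?)</final_answer>", text, re.DOTALL)
    if final:
        return ("finish", final.group(1).strip())
    if tool and tool_input:
        return (tool.group(1).strip(), tool_input.group(1).strip())
    raise ValueError("unrecognized XML agent output")

print(parse_xml_action("<tool>search</tool><tool_input>weather</tool_input>"))
# -> ('search', 'weather')
```

Structured tag formats like this make parsing robust for models that handle XML well, at the cost of requiring the model to emit the tags exactly.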
Document Loaders and Transformers
References: libs/langchain/langchain/document_loaders, libs/langchain/langchain/document_transformers
Document loaders in LangChain facilitate the ingestion of data from a multitude of sources, transforming them into a uniform Document object structure. These loaders are capable of handling various data types, including text, binary blobs, and structured data from databases or APIs. The loaders are designed to be extensible, allowing for custom implementations to suit specific data source requirements.
Document Loading
References: libs/langchain/langchain/document_loaders
Document loading in LangChain is facilitated through a variety of loader classes, each tailored to handle specific data sources. These loaders are essential for ingesting data from file systems, web pages, databases, and cloud storage services, transforming them into a uniform format for further processing.
Blob Loaders
References: libs/langchain/langchain/document_loaders/blob_loaders
The BlobLoader abstract base class provides a standardized interface for loading binary data, such as audio and video files, into Blob objects. Implementations of this class must define the load() method, which is responsible for fetching the data and returning it as a Blob object.
Parsers
References: libs/langchain/langchain/document_loaders/parsers
The BS4HTMLParser utilizes BeautifulSoup4 to parse HTML documents, extracting text and metadata. It initializes with HTML content and returns a list of Document objects with parsed data.
Specialized Document Loaders
References: libs/langchain/langchain/document_loaders/airtable.py, libs/langchain/langchain/document_loaders/apify_dataset.py, libs/langchain/langchain/document_loaders/arxiv.py
Individual loader classes within the LangChain framework are tailored to specific data sources, enabling the ingestion of documents from platforms like Airtable, Apify datasets, and arXiv. Each loader class transforms the data into a uniform Document object format, facilitating further processing within the LangChain library.
Document Transformation
References: libs/langchain/langchain/document_transformers
Transforming documents into formats suitable for language processing tasks is achieved through a variety of classes within the LangChain library. The DoctranPropertyExtractor extracts specific properties or metadata from documents, leveraging the Doctran library's capabilities. This class is a straightforward interface to the underlying Doctran functionality, requiring no additional methods or attributes beyond the Doctran object itself.
HTML and Text Transformation
References: libs/langchain/langchain/document_transformers/beautiful_soup_transformer.py, libs/langchain/langchain/document_transformers/html2text.py, libs/langchain/langchain/document_transformers/doctran_text_translate.py, libs/langchain/langchain/document_transformers/google_translate.py
The BeautifulSoupTransformer class utilizes the BeautifulSoup library to extract text from HTML documents. The class is initialized with an optional HTML parser argument, defaulting to "html.parser". The primary method, transform, accepts an HTML string and returns the plain text content, making it a critical component in the document processing pipeline located at …/beautiful_soup_transformer.py.
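The same HTML-to-text idea can be shown with only the standard library (so the example runs without BeautifulSoup installed). This is a stand-in for what the transformer does, not the transformer itself.

```python
# Standard-library sketch of HTML-to-text extraction, analogous to what an
# HTML transformer does: walk the markup and keep only the text nodes.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Collect non-whitespace text content, ignoring tags entirely.
        if data.strip():
            self.parts.append(data.strip())

def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

print(html_to_text("<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"))
# -> Title Hello world
```

Stripping markup before feeding text to a language model reduces token waste and removes noise that can confuse downstream processing.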
Document Filtering and Clustering
References: libs/langchain/langchain/document_transformers/embeddings_redundant_filter.py
Embeddings play a crucial role in the filtering and clustering of documents within the LangChain framework, specifically through the …/embeddings_redundant_filter.py file. The primary classes involved in this process are EmbeddingsRedundantFilter and EmbeddingsClusteringFilter.
Metadata Tagging and Long Context Reordering
References: libs/langchain/langchain/document_transformers/openai_functions.py, libs/langchain/langchain/document_transformers/long_context_reorder.py
The OpenAIMetadataTagger class utilizes the OpenAI API to enhance documents with metadata. It is instantiated through the create_metadata_tagger() function, which configures the tagger with the necessary API key, model, and token limits. The metadata tagging process enriches documents with additional context, aiding in the comprehension and processing by language models.
Chains and Chain Components
References: libs/langchain/langchain/chains
Chains in LangChain orchestrate the flow of language data through a sequence of specialized components. The ConversationChain in …/conversation manages conversational states, leveraging memory modules such as ConversationBufferMemory to maintain the context of the dialogue.
Base Chain Classes
References: libs/langchain/langchain/chains/base.py
The Chain class serves as an abstract base class for creating and using chains in the LangChain library, located at …/base.py. It establishes a standard interface and core functionalities that all concrete chain implementations must adhere to, including a common invocation interface, validation of input and output keys, and integration with callbacks and optional memory.
API Chains
References: libs/langchain/langchain/chains/api
API chains in LangChain facilitate the orchestration of API calls through a series of classes that manage the construction of API requests, the sending of these requests, and the summarization of the responses. The primary classes involved in this process are APIChain, OpenAPIEndpointChain, APIRequesterChain, and APIResponderChain.
Conversational Retrieval Chains
References: libs/langchain/langchain/chains/conversational_retrieval
The ConversationalRetrievalChain class orchestrates conversational interactions with retrieval systems, specifically designed to work with a knowledge base stored in a vector database. It leverages a VectorStoreRetriever to fetch relevant information based on user input, facilitating a dialogue where the system can provide contextually appropriate responses.
Database Interaction Chains
References: libs/langchain/langchain/chains/sql_database, libs/langchain/langchain/chains/elasticsearch_database
Interacting with SQL databases in LangChain is primarily handled by the SQLDatabaseChain class located at …/__init__.py. This class takes user input, generates a SQL query using predefined prompt templates, executes the query against the database, and returns the results.
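Those steps can be sketched end to end with sqlite3. A lookup table stands in for the LLM's prompt-driven query generation; the table schema and the question are illustrative.

```python
# End-to-end sketch of a SQL chain's steps: (1) turn the question into SQL,
# (2) execute it, (3) return the result. A dict stands in for the LLM's
# prompt-based query generation; the table is illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ada", "eng"), ("Grace", "eng"), ("Alan", "math")])

# Step 1: "generate" SQL (a real chain asks the LLM, guided by a prompt
# template that includes the table schema)
question_to_sql = {
    "How many engineers?": "SELECT COUNT(*) FROM employees WHERE dept = 'eng'",
}
sql = question_to_sql["How many engineers?"]

# Step 2: execute against the database
rows = conn.execute(sql).fetchall()

# Step 3: return the result (a real chain may pass it back through the LLM
# to phrase a natural-language answer)
print(rows[0][0])  # -> 2
```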
Question Answering Chains
References: libs/langchain/langchain/chains/graph_qa, libs/langchain/langchain/chains/qa_with_sources
GraphQAChain serves as the base class for question-answering over graph databases, utilizing an LLMChain for entity extraction and another LLMChain for generating the final answer. The _call() method orchestrates the process by extracting entities, retrieving context from the graph, and passing both to the qa_chain.
Router Chains
References: libs/langchain/langchain/chains/router
Routing within LangChain is managed by the RouterChain class, which serves as an abstract base class defining the interface for routing inputs to different destination chains. The route() and aroute() methods determine the destination chain and the next inputs to be passed to that chain, with the expected output being the keys "destination" and "next_inputs".
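The routing contract can be sketched directly from that description: route() returns the destination name plus the inputs to forward. Keyword matching stands in for the LLM-based router; the chain names are illustrative.

```python
# Sketch of the router contract: route() yields {"destination", "next_inputs"},
# and the caller dispatches to the named chain. Keyword matching stands in
# for an LLM-based routing decision.

def route(inputs: dict) -> dict:
    text = inputs["input"].lower()
    destination = "math" if any(w in text for w in ("sum", "plus", "+")) else "default"
    return {"destination": destination, "next_inputs": {"input": inputs["input"]}}

chains = {
    "math": lambda i: str(eval(i["input"].replace("plus", "+"))),
    "default": lambda i: f"echo: {i['input']}",
}

decision = route({"input": "2 plus 3"})
print(chains[decision["destination"]](decision["next_inputs"]))  # -> 5
```

Fixing the output to exactly those two keys is what lets any router implementation (rule-based, embedding-based, or LLM-based) plug into the same dispatch loop.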
Summarization and Verification Chains
References: libs/langchain/langchain/chains/summarize, libs/langchain/langchain/chains/qa_generation, libs/langchain/langchain/chains/llm_summarization_checker
Summarization chains in LangChain facilitate the condensation of documents into more manageable forms. Chains like StuffDocumentsChain, MapReduceDocumentsChain, and RefineDocumentsChain offer different strategies for summarization, which can be selected via load_summarize_chain() depending on the user's needs. For instance, StuffDocumentsChain concatenates documents before summarization, while MapReduceDocumentsChain applies a map-reduce pattern, summarizing each document before combining them.
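The two strategies above can be contrasted in a few lines. `first_sentence` is a trivial stand-in for the LLM summarizer; the point is the control flow, not the summarization quality.

```python
# Contrast of "stuff" vs "map-reduce" summarization control flow.
# first_sentence() is a toy stand-in for an LLM summarization call.

def first_sentence(text: str) -> str:
    return text.split(".")[0].strip() + "."

def stuff_summarize(docs):
    # "Stuff": concatenate everything, then summarize once.
    return first_sentence(" ".join(docs))

def map_reduce_summarize(docs):
    # "Map-reduce": summarize each document, then combine the partials.
    partials = [first_sentence(d) for d in docs]   # map
    return " ".join(partials)                      # reduce (combine)

docs = ["Cats sleep a lot. They also purr.", "Dogs are loyal. They bark."]
print(stuff_summarize(docs))
print(map_reduce_summarize(docs))
```

The practical trade-off: "stuff" needs the whole input to fit in one model context, while map-reduce scales to many documents at the cost of extra model calls.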
OpenAI Function Chains
References: libs/langchain/langchain/chains/openai_functions
Specialized chains in …/openai_functions utilize OpenAI's function-calling APIs to perform tasks such as text tagging, data extraction, and generating structured outputs. These chains are composed of various components that work together to enable complex natural language processing tasks.
Conversational Chains
References: libs/langchain/langchain/chains/conversation
Conversational chains in LangChain are managed by the ConversationChain class, which orchestrates the flow of dialogue by maintaining the state of the conversation and generating responses. This class leverages a memory object to store the conversation history, ensuring that context is preserved across interactions.
Document Combination Chains
References: libs/langchain/langchain/chains/combine_documents
Combining multiple documents into a single cohesive document is facilitated by a suite of classes within the …/combine_documents directory. These classes offer various strategies to merge documents, ensuring the process is both flexible and extensible to accommodate different use cases.
Query Constructor Chains
References: libs/langchain/langchain/chains/query_constructor
In …/query_constructor, structured queries are constructed from natural language input, with a focus on transforming these queries into an intermediate representation (IR). The process combines a prompt that instructs the model to emit a structured query, an output parser that converts the model's response into the IR, and logic for translating the IR into backend-specific filter syntax.
LLM Checker Chains
References: libs/langchain/langchain/chains/llm_checker
The LLMCheckerChain orchestrates a multi-step process to enhance the reliability of language model-generated answers. It iterates through drafting, checking, and revising responses based on self-verification. Located in …/llm_checker, the chain employs a SequentialChain to manage the workflow: an initial answer is drafted, the assertions it makes are listed, each assertion is checked, and the answer is revised in light of the check results.
Experimental Chains
References: libs/langchain/langchain/chains/flare, libs/langchain/langchain/chains/hyde
The FlareChain class orchestrates a novel approach to response generation by integrating a retriever, a question generator, and a response generator. It operates through an iterative process where responses are generated, low-confidence spans are identified, and additional information is retrieved to refine the response. This process leverages the QuestionGeneratorChain to pose questions based on uncertain parts of the initial response, which are then used to fetch more contextually relevant information to enhance the final output.
ERNIE Function Chains
References: libs/langchain/langchain/chains/ernie_functions
ERNIE function chains facilitate the execution of modular, composable functions within a language model-based system. These chains are sequences of ERNIE functions that can be executed in order, allowing for complex operations and structured outputs. The primary utilities for working with ERNIE function chains are found in …/ernie_functions.
Chat Vector Database Chains
References: libs/langchain/langchain/chains/chat_vector_db
In …/chat_vector_db, chains are designed to enhance chat-based applications by utilizing vector databases for efficient question rephrasing and answer generation. The directory includes prompts.py, which defines two PromptTemplate objects crucial for the chain's operations: one that condenses a follow-up question and the chat history into a standalone question, and one that answers the question using the retrieved context.
Callbacks and Run Management
References: libs/langchain/langchain/callbacks
LangChain's callback system facilitates the monitoring and management of language model and chain executions, as well as the orchestration of run processes, including asynchronous operations. The system is designed to provide extensibility and integration with various external platforms and services, offering a structured approach to event tracking and analysis.
Callback Handlers Integration
References: libs/langchain/langchain/callbacks
LangChain integrates with a variety of external platforms and services through callback handlers, which are essential for monitoring and analyzing the execution of language-based applications. These handlers are located in …/callbacks.
Tracing and Observability
References: libs/langchain/langchain/callbacks/tracers
LangChain provides a suite of tracing and observability tools within the …/tracers directory, enabling developers to monitor and record the execution of models and pipelines. Key components include tracer implementations that record run trees, log execution to the console, and forward traces to external services for analysis.
Streamlit Visualization
References: libs/langchain/langchain/callbacks/streamlit
The StreamlitCallbackHandler class provides a user interface for monitoring the execution of language models in real time within a Streamlit application. It captures various events such as the start and end of language model executions, new token generation, tool usage, and errors. The class updates the Streamlit application with this information, organizing it into a visually appealing format for the user.
Run Management
References: libs/langchain/langchain/callbacks/manager.py
LangChain's run management is facilitated through a hierarchy of manager classes, each tailored to handle callbacks for different execution contexts. The BaseRunManager serves as the foundational class, establishing a framework for managing callbacks during the execution of LangChain components. Derived from this base are several specialized manager classes that cater to synchronous and asynchronous operations.
Utility Classes and Functions
References: libs/langchain/langchain/callbacks
Utility classes and functions in the LangChain library provide foundational support for the callback system, facilitating the integration of environment variables and the management of callbacks across various components. The …/base.py file introduces abstract base classes and mixins that serve as the backbone for creating custom callback handlers. These include BaseCallbackHandler and AsyncCallbackHandler, which define the interface for synchronous and asynchronous callback handling, respectively.
Embeddings and Evaluation
References: libs/langchain/langchain/embeddings, libs/langchain/langchain/evaluation
Embedding generation and evaluation within the LangChain library are handled through a variety of classes designed to interface with different models and services. The Embeddings class serves as a foundational interface for generating vector representations of text inputs across various embedding models. Subclasses like OpenAIEmbeddings, HuggingFaceEmbeddings, and CohereEmbeddings implement this interface to provide embeddings from their respective services. For instance, OpenAIEmbeddings uses the OpenAI API to generate embeddings, while HuggingFaceEmbeddings leverages transformer models from Hugging Face.
Embedding Models and Services
References: libs/langchain/langchain/embeddings
The LangChain library provides a suite of classes for generating text embeddings, which are vector representations of text inputs. These embeddings are crucial for various natural language processing tasks as they capture the semantic meaning of the text.
Embedding Evaluation Chains
References: libs/langchain/langchain/evaluation/embedding_distance
Embedding-based evaluation in LangChain is facilitated through classes that compute semantic similarity between text inputs. The primary classes for this purpose are EmbeddingDistanceEvalChain and PairwiseEmbeddingDistanceEvalChain, both of which inherit shared functionality from _EmbeddingDistanceChainMixin.
String Evaluation Techniques
References: libs/langchain/langchain/evaluation/exact_match, libs/langchain/langchain/evaluation/regex_match, libs/langchain/langchain/evaluation/string_distance
Evaluating the accuracy of language model outputs involves various string-based comparison techniques. The LangChain library provides classes for exact match, regex match, and string distance evaluations.
Question Answering Evaluation
References: libs/langchain/langchain/evaluation/qa
Evaluating question-answering systems within LangChain involves a suite of classes that handle the generation of QA pairs and the assessment of responses. The QAGenerateChain is pivotal for creating question and answer examples, leveraging a PromptTemplate to structure the output. This chain is instantiated with a BaseLanguageModel and utilizes an output_parser, specifically a RegexParser, to extract the "query" and "answer" from the language model's output.
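The regex-based extraction step can be sketched as follows. The QUESTION/ANSWER markers and the pattern are an illustrative analogue of the format such a parser would expect, not the library's actual prompt or pattern.

```python
# Sketch of regex-based extraction of "query" and "answer" fields from an
# LLM's generated QA example. The marker format is illustrative.
import re

PATTERN = re.compile(r"QUESTION:\s*(?P<query>.*?)\nANSWER:\s*(?P<answer>.*)", re.DOTALL)

def parse_qa(text: str) -> dict:
    m = PATTERN.search(text)
    if not m:
        raise ValueError("output did not match the expected QA format")
    return {"query": m.group("query").strip(), "answer": m.group("answer").strip()}

out = parse_qa("QUESTION: What is 2 + 2?\nANSWER: 4")
print(out)  # -> {'query': 'What is 2 + 2?', 'answer': '4'}
```

Named groups keep the parser's contract explicit: downstream evaluation code can rely on the "query" and "answer" keys being present.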
Criteria and Scoring Evaluations
References: libs/langchain/langchain/evaluation/criteria, libs/langchain/langchain/evaluation/scoring
Evaluating language model outputs involves assessing them against a set of criteria or scoring them based on their alignment with reference answers. In LangChain, this functionality is encapsulated within two main classes: CriteriaEvalChain and LabeledCriteriaEvalChain, located in …/criteria.
Comparison and Preference Evaluation
References: libs/langchain/langchain/evaluation/comparison
In …/comparison, two primary classes, PairwiseStringEvalChain and LabeledPairwiseStringEvalChain, facilitate the evaluation of language model outputs by comparing two strings to determine similarity or preference. These classes leverage a language model to generate a comparison and provide results that include a verdict and, optionally, a label and comment.
Agent Trajectory Evaluation
References: libs/langchain/langchain/evaluation/agents
The TrajectoryEvalChain class is designed to assess ReAct-style agents by examining their action sequences and outcomes. It operates by leveraging a language model to reason about the agent's behavior and produces a score along with reasoning for the agent's performance. The evaluation process is supported by prompts defined in …/trajectory_eval_prompt.py, which structure the evaluation interaction.
Parsing and JSON Evaluation
References: libs/langchain/langchain/evaluation/parsing
Evaluator classes in …/parsing assess the quality of text predictions, focusing on JSON data. The classes handle validity, equality, edit distance, and schema compliance evaluations.
Tools and Utilities
References: libs/langchain/langchain/tools
The ShellTool class provides a standardized way to execute shell commands within the LangChain framework. It utilizes the ShellInput object to capture command text and arguments, executing the command and returning the output. The class offers both synchronous _run() and asynchronous _arun() methods for flexibility in command execution.
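The synchronous execution path can be sketched with subprocess. The ShellInput dataclass and run() function below are simplified analogues; the example invokes Python itself as the subprocess so it runs the same everywhere.

```python
# Sketch of a shell tool's synchronous path: capture commands, execute each
# one, and return the combined output. Uses the Python interpreter as the
# subprocess so the example is portable; a real shell tool invokes the
# system shell.
import subprocess
import sys
from dataclasses import dataclass

@dataclass
class ShellInput:
    commands: list  # each command is an argv list

def run(shell_input: ShellInput) -> str:
    outputs = []
    for cmd in shell_input.commands:
        result = subprocess.run(cmd, capture_output=True, text=True)
        outputs.append(result.stdout.strip())
    return "\n".join(outputs)

print(run(ShellInput(commands=[[sys.executable, "-c", "print('hello')"]])))
# -> hello
```

Capturing stdout as a string is what lets the output be fed straight back into an agent's scratchpad as an observation.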
API Interaction Tools
References: libs/langchain/langchain/tools/ainetwork, libs/langchain/langchain/tools/amadeus, libs/langchain/langchain/tools/arxiv, libs/langchain/langchain/tools/azure_cognitive_services, libs/langchain/langchain/tools/eleven_labs, libs/langchain/langchain/tools/office365, libs/langchain/langchain/tools/gmail, libs/langchain/langchain/tools/slack, libs/langchain/langchain/tools/steam, libs/langchain/langchain/tools/openapi, libs/langchain/langchain/tools/multion, libs/langchain/langchain/tools/tavily_search, libs/langchain/langchain/tools/wolfram_alpha, libs/langchain/langchain/tools/wikipedia, libs/langchain/langchain/tools/zapier, libs/langchain/langchain/tools/youtube
LangChain integrates with a variety of external APIs to enhance its language processing capabilities. These integrations allow for data retrieval, action execution, and service integration, expanding the range of applications that can be built using LangChain.
Search and Data Analysis Tools
References: libs/langchain/langchain/tools/bing_search, libs/langchain/langchain/tools/brave_search, libs/langchain/langchain/tools/dataforseo_api_search, libs/langchain/langchain/tools/ddg_search, libs/langchain/langchain/tools/e2b_data_analysis, libs/langchain/langchain/tools/google_finance, libs/langchain/langchain/tools/google_jobs, libs/langchain/langchain/tools/google_lens, libs/langchain/langchain/tools/google_places, libs/langchain/langchain/tools/google_scholar, libs/langchain/langchain/tools/google_search, libs/langchain/langchain/tools/google_serper, libs/langchain/langchain/tools/google_trends
The LangChain library provides a suite of tools for web-based searches and data analysis tasks, enabling interaction with various search engines and APIs. The tools are designed to facilitate the retrieval and processing of data from different sources, supporting a range of applications from job searches to academic research.
File and Cloud Service Management Tools
References: libs/langchain/langchain/tools/file_management, libs/langchain/langchain/tools/github, libs/langchain/langchain/tools/gitlab, libs/langchain/langchain/tools/google_cloud
File management within the LangChain framework is facilitated by a suite of tools located in …/file_management. These tools enable operations such as copying, deleting, searching, listing, moving, reading, and writing files.
User Interaction and Input Handling Tools
References: libs/langchain/langchain/tools/human, libs/langchain/langchain/tools/interaction, libs/langchain/langchain/tools/jira
In LangChain, user interaction and input handling are facilitated through the HumanInputRun and StdInInquireTool classes. The HumanInputRun class, located in …/tool.py, is designed to capture input from a human user. It operates by prompting the user and returning their response, which can then be integrated into LangChain workflows.
EdenAI Integration Tools
References: libs/langchain/langchain/tools/edenai
Integration with the EdenAI platform is facilitated through a suite of tools within the LangChain framework, located at …/edenai. These tools enable users to access a variety of AI-powered services provided by EdenAI, including speech-to-text, text-to-speech, image analysis, document parsing, and text moderation.
Language Interpretation and Processing Tools
References: libs/langchain/langchain/tools/bearly
The …/bearly directory is dedicated to the Bearly language interpreter, a tool within the LangChain framework for interpreting and executing commands in specific language constructs or domain-specific languages. The directory contains the tool.py file, which houses the main components of the Bearly interpreter's functionality.
Global Settings and Serialization
References: libs/langchain/langchain/globals, libs/langchain/langchain/load
Global settings in LangChain are managed through flags that control verbosity and debug mode. These flags are accessed using set_verbose(), get_verbose(), set_debug(), and get_debug(). These functions facilitate toggling the verbosity of output and the debug status, which are essential for monitoring and troubleshooting the library's operations.
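The pattern behind such accessors is a module-level flag read and written only through functions, so every component observes the same setting. A minimal sketch (the underscore-prefixed globals are illustrative, not the library's internals):

```python
# Sketch of module-level flag accessors: one shared global per setting,
# toggled and read through functions so all callers see a consistent value.

_verbose = False
_debug = False

def set_verbose(value: bool) -> None:
    global _verbose
    _verbose = value

def get_verbose() -> bool:
    return _verbose

def set_debug(value: bool) -> None:
    global _debug
    _debug = value

def get_debug() -> bool:
    return _debug

set_verbose(True)
print(get_verbose(), get_debug())  # -> True False
```

Funneling access through functions (rather than importing the variable directly) matters in Python because `from module import flag` copies the value at import time, while `get_verbose()` always reads the current one.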
Serialization and Deserialization
References: libs/langchain/langchain/load/dump.py, libs/langchain/langchain/load/load.py, libs/langchain/langchain/load/serializable.py
In LangChain, the transformation of Python objects to a serialized format and the reverse process are handled by a set of utility functions and classes designed to ensure consistency and ease of use across the library. The primary format for serialization is JSON, which is a widely used format for data interchange due to its text-based, human-readable nature and language independence.
Global Configuration
References: libs/langchain/langchain/globals
In …/globals, global settings for the LangChain library are managed through a set of functions that control verbosity, debug modes, and caching mechanisms for language model outputs. These settings are critical for tailoring the library's behavior to the needs of different environments and use cases.
Output Parsers and Prompts[Edit section][Copy link]
References: libs/langchain/langchain/output_parsers, libs/langchain/langchain/prompts
Language models within LangChain can produce outputs in various formats, which necessitates a robust system for parsing them. The StructuredOutputParser class is central to this process, converting language model outputs into structured formats based on predefined schemas. This parser uses functions like parse_and_check_json_markdown() to ensure outputs conform to expected JSON structures, aiding in the standardization and validation of data.
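The core idea can be shown with a simplified stand-in: extract a fenced JSON block from model output, parse it, and check that required keys are present. The function below is illustrative, not the library's implementation:

```python
import json
import re

def parse_json_markdown(text: str, expected_keys: list) -> dict:
    """Extract a ```json fenced block and validate expected keys:
    a simplified stand-in for parse_and_check_json_markdown."""
    match = re.search(r"```json\s*(.*?)```", text, re.DOTALL)
    raw = match.group(1) if match else text   # fall back to the whole text
    obj = json.loads(raw)
    missing = [k for k in expected_keys if k not in obj]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return obj

reply = 'Here is the answer:\n```json\n{"answer": "42", "source": "doc1"}\n```'
print(parse_json_markdown(reply, ["answer", "source"]))
```

Validating keys up front turns malformed model output into an explicit error that a chain can catch and retry, instead of a silent downstream failure.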
Output Parsers[Edit section][Copy link]
References: libs/langchain/langchain/output_parsers
…/output_parsers houses a suite of classes for interpreting the varied outputs of language model (LLM) calls. The base classes, BaseLLMOutputParser and BaseOutputParser, establish a framework for more specialized parsers that handle specific output formats.
Prompt Templates and Management[Edit section][Copy link]
References: libs/langchain/langchain/prompts
Prompts in LangChain are managed through a hierarchy of classes that define the structure and content of the input to language models. The BasePromptTemplate serves as the abstract foundation, requiring subclasses to implement the format() method that generates the prompt text. Concrete implementations such as StringPromptTemplate can render Jinja2 templates to construct prompts dynamically, using utility functions such as jinja2_formatter() to render the template with the provided variables.
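The template-plus-declared-variables pattern can be sketched in a few lines. This is an illustrative class using Python's built-in string formatting rather than Jinja2; the name and details are assumptions, not the library's API:

```python
# Illustrative prompt-template pattern: a template string with declared
# input variables, validated before rendering.
class SimplePromptTemplate:
    def __init__(self, template: str, input_variables: list):
        self.template = template
        self.input_variables = input_variables

    def format(self, **kwargs) -> str:
        missing = set(self.input_variables) - set(kwargs)
        if missing:
            raise KeyError(f"missing variables: {sorted(missing)}")
        return self.template.format(**kwargs)

prompt = SimplePromptTemplate(
    "Translate {text} into {language}.", ["text", "language"]
)
print(prompt.format(text="'hello'", language="French"))
# Translate 'hello' into French.
```

Declaring input_variables up front is what lets chains check, before any model call, that a prompt will receive every value it needs.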
Example Selectors[Edit section][Copy link]
References: libs/langchain/langchain/prompts/example_selector
In the LangChain library, example selectors enhance language model performance by curating relevant examples for prompts. The …/example_selector directory houses several strategies for example selection, each tailored to different criteria such as input length or semantic similarity.
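A length-based strategy is the simplest to illustrate: keep adding few-shot examples until a word budget is exhausted. The sketch below mirrors the idea behind a length-based selector; the class name and word-count heuristic are illustrative:

```python
# Sketch of a length-based example selector: pick few-shot examples in
# order until a word budget (shared with the query) runs out.
class LengthBasedSelector:
    def __init__(self, examples, max_words: int):
        self.examples = examples
        self.max_words = max_words

    def select_examples(self, query: str):
        budget = self.max_words - len(query.split())
        chosen = []
        for example in self.examples:
            cost = len(example.split())
            if cost > budget:
                break          # stop once the next example would not fit
            chosen.append(example)
            budget -= cost
        return chosen

examples = ["cat -> chat", "dog -> chien", "a very long example " * 10]
selector = LengthBasedSelector(examples, max_words=12)
print(selector.select_examples("house ->"))  # ['cat -> chat', 'dog -> chien']
```

Budgeting by length keeps prompts inside a model's context window; other strategies in the directory instead rank examples by semantic similarity to the query.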
Indexes and Memory Management[Edit section][Copy link]
References: libs/langchain/langchain/indexes, libs/langchain/langchain/memory
Indexes in LangChain are managed through functions like index() and aindex(), located in …/__init__.py, which handle the indexing process, including adding, updating, and deleting documents in a vector store. These functions work in conjunction with index manager classes such as SQLRecordManager and VectorstoreIndexCreator to store and retrieve indexed data. The SQLRecordManager manages records in a SQL database, while the VectorstoreIndexCreator interacts with vector stores such as Chroma or Pinecone for indexing and querying data.
Chat Message History Management[Edit section][Copy link]
References: libs/langchain/langchain/memory/chat_message_histories
Managing chat message histories is crucial for maintaining the context of conversations in language-based applications. LangChain provides a suite of classes designed to interface with various backend technologies for storing and retrieving conversation data. Each class is tailored to a specific storage solution, offering flexibility in deployment and scalability options.
Memory Management Base Classes[Edit section][Copy link]
References: libs/langchain/langchain/memory/__init__.py
BaseMemory and BaseChatMemory serve as the foundational classes for memory management in LangChain, enabling the storage and retrieval of context for conversational AI applications. These classes provide a standardized interface for memory operations, ensuring that different memory implementations can be used interchangeably within the framework.
Specialized Memory Management[Edit section][Copy link]
References: libs/langchain/langchain/memory/buffer.py, libs/langchain/langchain/memory/summary.py, libs/langchain/langchain/memory/entity.py, libs/langchain/langchain/memory/kg.py
ConversationBufferMemory and ConversationStringBufferMemory manage conversation histories by storing messages as a buffer or a string, respectively. The former exposes the buffer as both a string and a list of messages, while the latter supports only the string form. Both classes provide methods to append new context to the buffer and to clear it when necessary.
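The dual string/list view of a buffer can be sketched in a few lines. This is an illustrative class, not the library's implementation:

```python
# Illustrative buffer memory: history kept as a message list, also
# exposed joined into a single string, echoing the two views a
# buffer-style memory offers.
class BufferMemory:
    def __init__(self):
        self.messages = []  # list of (role, text) pairs

    def save_context(self, user_input: str, ai_output: str) -> None:
        self.messages.append(("Human", user_input))
        self.messages.append(("AI", ai_output))

    @property
    def buffer_as_str(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

    def clear(self) -> None:
        self.messages.clear()

memory = BufferMemory()
memory.save_context("Hi!", "Hello, how can I help?")
print(memory.buffer_as_str)
```

Keeping the list as the source of truth and deriving the string view on demand means chat models (which want message lists) and completion models (which want one string) can share the same memory object.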
Utility Classes and Functions for Memory Management[Edit section][Copy link]
References: libs/langchain/langchain/memory/combined.py, libs/langchain/langchain/memory/readonly.py, libs/langchain/langchain/memory/utils.py
CombinedMemory aggregates multiple memory objects, each an instance of BaseMemory, into a single entity. This aggregation simplifies the management of disparate memory sources while ensuring that memory variables do not overlap across the combined entities. The class provides methods to load, save, and clear memory contexts collectively across all included memory objects, and it includes a validation step that checks for unique memory variables and for the presence of an input_key when BaseChatMemory types are part of the combination.
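The uniqueness check and the merged load can be sketched as follows. The classes here are toy stand-ins, not the library's API:

```python
# Sketch of combining memories with a uniqueness check on their
# variable names; class names and methods are illustrative.
class Memory:
    def __init__(self, variables: dict):
        self.variables = variables

    def load(self) -> dict:
        return dict(self.variables)

class CombinedMemory:
    def __init__(self, memories):
        seen = set()
        for memory in memories:
            overlap = seen & memory.load().keys()
            if overlap:  # two memories exposing the same variable is ambiguous
                raise ValueError(f"duplicate memory variables: {sorted(overlap)}")
            seen |= memory.load().keys()
        self.memories = memories

    def load(self) -> dict:
        merged = {}
        for memory in self.memories:
            merged.update(memory.load())
        return merged

combined = CombinedMemory([Memory({"history": "..."}), Memory({"entities": "..."})])
print(sorted(combined.load()))  # ['entities', 'history']
```

Rejecting overlapping variable names at construction time prevents one memory from silently overwriting another's value when the contexts are merged into a single prompt.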
Indexing and Index Managers[Edit section][Copy link]
References: libs/langchain/langchain/indexes/__init__.py, libs/langchain/langchain/indexes/_sql_record_manager.py, libs/langchain/langchain/indexes/graph.py, libs/langchain/langchain/indexes/vectorstore.py
Indexing in LangChain is facilitated by the index() and aindex() functions, which handle the indexing of data to be stored in a vector store or other storage systems. These functions return an IndexingResult object encapsulating the outcome of the indexing operation. For managing the indexed data, LangChain provides two index manager classes: SQLRecordManager and VectorstoreIndexCreator.
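The interplay between a record manager and a vector store can be sketched with content hashing: documents already recorded are skipped rather than re-indexed. This loosely mirrors the deduplication idea; the function signature and result keys are illustrative:

```python
import hashlib

# Illustrative dedup indexing: a record manager (here a plain set of
# content hashes) decides whether each document is added or skipped.
def index(docs, record_manager: set, vector_store: list) -> dict:
    result = {"num_added": 0, "num_skipped": 0}
    for doc in docs:
        key = hashlib.sha256(doc.encode()).hexdigest()
        if key in record_manager:
            result["num_skipped"] += 1   # unchanged document: no rewrite
            continue
        record_manager.add(key)
        vector_store.append(doc)
        result["num_added"] += 1
    return result

records, store = set(), []
print(index(["doc a", "doc b"], records, store))  # {'num_added': 2, 'num_skipped': 0}
print(index(["doc a", "doc c"], records, store))  # {'num_added': 1, 'num_skipped': 1}
```

Persisting the hashes (as SQLRecordManager does in a SQL database) is what makes re-running an indexing pipeline cheap: only new or changed documents touch the vector store.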
Prompt Templates for Indexing[Edit section][Copy link]
References: libs/langchain/langchain/indexes/prompts
In LangChain, indexing is facilitated by prompt templates that guide language models in processing text for entity extraction, entity summarization, and knowledge triple extraction. These templates are crucial for constructing indexes that enable efficient data retrieval and knowledge management.
Community Contributions and Experimental Features[Edit section][Copy link]
References: libs/community/langchain_community, libs/experimental/langchain_experimental
LangChain's community contributions and experimental features showcase the collaborative efforts of its users, offering a range of tools and functionalities that extend the library's capabilities. The community has developed a suite of agent toolkits, each designed to facilitate interactions with a specific data source or service. The BaseToolkit class serves as the foundation for these toolkits, requiring subclasses to implement the get_tools() method that provides access to the respective tools.
Community-Driven Language Model Adapters[Edit section][Copy link]
References: libs/community/langchain_community/adapters
The …/adapters directory is designed to facilitate the integration of various language models into the LangChain framework. The adapters enable LangChain to communicate with external APIs through a uniform interface. The directory includes an adapter for the OpenAI API, a significant community contribution.
Blockchain Interaction Tools[Edit section][Copy link]
References: libs/community/langchain_community/tools/ainetwork
Interacting with the AINetwork blockchain is facilitated through a suite of tools located in …/ainetwork. These tools enable the management of blockchain applications, owner permissions, rules, and value operations, each encapsulated within a specific class.
Document Loaders and Parsers[Edit section][Copy link]
References: libs/community/langchain_community/document_loaders
The AsyncHtmlLoader class provides asynchronous loading of HTML content from web pages, configurable with headers, SSL verification, and proxies. It uses requests for initial fetching and aiohttp for asynchronous operations. The fetch_all() method fetches content for all URLs, with rate limiting controlled by an asyncio.Semaphore. The load() method converts the fetched content into Document objects, extracting metadata such as title and description.
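The semaphore-bounded fetch pattern can be shown without any network dependency. The sketch below simulates the I/O with asyncio.sleep; the function names are illustrative, not the loader's API:

```python
import asyncio

# Sketch of semaphore-bounded concurrent fetching, the rate-limiting
# idea used by async loaders; network I/O is simulated with sleep.
async def fetch(url: str, sem: asyncio.Semaphore) -> str:
    async with sem:                # at most `limit` fetches run concurrently
        await asyncio.sleep(0.01)  # stand-in for an aiohttp request
        return f"<html>{url}</html>"

async def fetch_all(urls, limit: int = 2):
    sem = asyncio.Semaphore(limit)
    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

pages = asyncio.run(fetch_all(["a.com", "b.com", "c.com"]))
print(pages)
```

The semaphore caps concurrent connections so that loading hundreds of URLs does not overwhelm the target servers, while gather still returns results in the original URL order.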
Experimental Language Model Extensions[Edit section][Copy link]
References: libs/experimental/langchain_experimental
LangChain's experimental features introduce a suite of innovative tools and extensions that expand the library's language model capabilities. The …/langchain_experimental directory serves as a hub for these experimental components, each tailored to a specific functionality.
Large Language Model (LLM) Integrations[Edit section][Copy link]
References: libs/community/langchain_community/llms
The LangChain community has developed a suite of integrations for Large Language Models (LLMs) that facilitate interaction with a variety of LLM APIs and services. These integrations are encapsulated in classes that extend the BaseLLM class, providing a unified interface for text generation and other language-related tasks.
Vector Store Implementations[Edit section][Copy link]
References: libs/community/langchain_community/vectorstores
The VectorStore class hierarchy serves as a foundation for the various vector store implementations, enabling the storage and retrieval of text documents and embeddings. These implementations facilitate semantic search and document ranking, leveraging different backend technologies and databases to optimize performance for specific use cases.
Agent Toolkits for Data Interaction[Edit section][Copy link]
References: libs/community/langchain_community/agent_toolkits
The create_csv_agent function, now housed in the LangChain experimental module, is the key entry point for constructing a CSV agent. It combines a language model with a CSV toolkit so that the agent can understand and manipulate CSV data.
Utility Classes for Extended Functionality[Edit section][Copy link]
References: libs/community/langchain_community/utilities
The LangChain community has contributed a variety of utility classes and functions that enhance the library's interaction capabilities with external APIs, manage sensitive data, and simulate Python REPL environments. These utilities are essential for extending LangChain's functionality to cover a broader range of language model applications.
Text Embedding Models and Services[Edit section][Copy link]
References: libs/community/langchain_community/embeddings
LangChain's community contributions extend its core functionality with a variety of text embedding models and services, provided by classes such as OpenAIEmbeddings, HuggingFaceEmbeddings, and JinaEmbeddings. These classes interface with external APIs or services to generate embeddings that capture the semantic meaning of text, which can be used for tasks like semantic search or document clustering.
Chat Model Extensions and Utilities[Edit section][Copy link]
References: libs/community/langchain_community/chat_models
LangChain's chat model extensions facilitate interaction with a variety of language models, enabling developers to build robust conversational AI applications. The community has contributed a diverse set of chat models, each tailored to specific language model APIs and providing unique functionalities.
Advanced Retrievers for Data Sourcing[Edit section][Copy link]
References: libs/community/langchain_community/retrievers
The LangChain framework is enriched by the community-developed advanced retrievers, which facilitate the sourcing of relevant documents from diverse databases and services. These retrievers are instrumental in enhancing the data retrieval process within LangChain workflows.
Document Parsing for Programming Languages[Edit section][Copy link]
References: libs/community/langchain_community/document_loaders/parsers/language
LangChain's community contributions extend to programming language parsing, where the library leverages a variety of segmenter classes to facilitate the analysis and processing of source code. These segmenters are specialized for different programming languages and are built upon the TreeSitterSegmenter base class, which provides common functionality using the tree-sitter library.
Callback Handlers and Tracers[Edit section][Copy link]
References: libs/community/langchain_community/callbacks
Callback handlers and tracers in LangChain facilitate the monitoring and logging of model and workflow executions, enabling integration with a variety of platforms for enhanced observability. The …/callbacks directory contains a suite of callback handlers, each tailored to interface with a different external service, providing real-time insights and analytics.
Chat Message History Management[Edit section][Copy link]
References: libs/community/langchain_community/chat_message_histories
In LangChain applications, managing chat message histories is facilitated by a variety of backend systems, each tailored to different storage requirements and environments. The community has implemented several classes that conform to the BaseChatMessageHistory interface, enabling consistent interaction patterns across different storage solutions.
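The shared-interface idea can be sketched as an abstract contract with one toy backend. Real implementations keep the same methods but swap the list for Redis, Postgres, MongoDB, and so on; the names below are illustrative:

```python
from abc import ABC, abstractmethod

# Sketch of a chat-message-history contract with an in-memory backend;
# concrete backends would persist to Redis, Postgres, etc.
class BaseHistory(ABC):
    @abstractmethod
    def add_message(self, role: str, content: str) -> None: ...

    @abstractmethod
    def clear(self) -> None: ...

class InMemoryHistory(BaseHistory):
    def __init__(self):
        self.messages = []

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def clear(self) -> None:
        self.messages = []

history = InMemoryHistory()
history.add_message("human", "What is LangChain?")
history.add_message("ai", "A framework for LLM applications.")
print(len(history.messages))  # 2
```

Because memory classes depend only on the abstract interface, an application can start with an in-memory history in development and switch to a persistent backend in production without touching chain code.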
Graph Database Integrations[Edit section][Copy link]
References: libs/community/langchain_community/graphs
To integrate with various graph databases, LangChain's community has developed a suite of classes that facilitate graph-based operations and querying. The GraphStore abstract class serves as a blueprint for creating consistent interfaces across different graph database systems. Implementations of this class, such as Neo4jGraph, ArangoGraph, GremlinGraph, HugeGraph, MemgraphGraph, NeptuneGraph, and others, provide tailored methods to interact with their respective databases.
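The blueprint pattern can be shown with a toy adjacency-list store. The abstract methods and the in-memory subclass below are illustrative; real subclasses issue Cypher, Gremlin, or AQL queries instead:

```python
from abc import ABC, abstractmethod

# GraphStore-style blueprint with a toy in-memory implementation;
# real subclasses translate these calls into database queries.
class GraphStore(ABC):
    @abstractmethod
    def add_triple(self, subj: str, rel: str, obj: str) -> None: ...

    @abstractmethod
    def query(self, subj: str): ...

class InMemoryGraph(GraphStore):
    def __init__(self):
        self.triples = []  # (subject, relation, object) tuples

    def add_triple(self, subj, rel, obj):
        self.triples.append((subj, rel, obj))

    def query(self, subj):
        return [(r, o) for s, r, o in self.triples if s == subj]

g = InMemoryGraph()
g.add_triple("LangChain", "written_in", "Python")
print(g.query("LangChain"))  # [('written_in', 'Python')]
```

A common triple-oriented interface lets chains that extract or traverse knowledge graphs stay agnostic about which database ultimately stores the triples.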
Autonomous Agent Systems[Edit section][Copy link]
References: libs/experimental/langchain_experimental/autonomous_agents
Autonomous agent systems within the LangChain library, such as AutoGPT and BabyAGI, represent experimental approaches to creating agents with autonomous decision-making capabilities. These agents are designed to interact with language models and perform tasks without direct human intervention, leveraging the LangChain framework for memory management, task execution, and response generation.
Agent Toolkits for Enhanced Interactions[Edit section][Copy link]
References: libs/experimental/langchain_experimental/agents/agent_toolkits
LangChain's experimental toolkit offers a suite of functions to create agents that can interact with various data sources and tools, enhancing the capabilities of LangChain agents. The toolkit includes functions such as create_csv_agent(), create_pandas_dataframe_agent(), create_python_agent(), create_spark_dataframe_agent(), and create_xorbits_agent(), each tailored to a specific data structure or environment.
Partner Integrations[Edit section][Copy link]
References: libs/partners
LangChain's integration with partner technologies extends its capabilities, allowing developers to leverage a variety of external services and APIs. These integrations facilitate a range of functionalities from language model interactions to database management, enhancing the LangChain library's versatility in language-based application development.
AI21 Integration[Edit section][Copy link]
References: libs/partners/ai21/langchain_ai21
Integration with AI21 language models in LangChain is facilitated through a series of wrapper classes located in …/langchain_ai21. These classes provide interfaces to AI21's language, chat, embeddings, and contextual answers models.
Cohere Integration[Edit section][Copy link]
References: libs/partners/cohere/langchain_cohere
Integration with Cohere language models enhances LangChain's capabilities, providing users with tools for chat-based interactions, text embeddings, document retrieval, and reranking.
OpenAI and Azure OpenAI Integration[Edit section][Copy link]
References: libs/partners/openai/langchain_openai
Integration with OpenAI and Azure OpenAI language models is facilitated through the LangChain library, which provides a structured approach to accessing and utilizing these models' capabilities. The integration encompasses chat models, embeddings, language models, and output parsers, each serving a distinct purpose within the LangChain framework.
MongoDB Integration[Edit section][Copy link]
References: libs/partners/mongodb/langchain_mongodb
LangChain's MongoDB integration, located at …/langchain_mongodb, facilitates interactions with MongoDB databases, particularly MongoDB Atlas. It provides classes and utilities for vector-based search and storage, chat message history management, and caching.
Fireworks Integration[Edit section][Copy link]
References: libs/partners/fireworks/langchain_fireworks
Integration with Fireworks language models is facilitated through the …/langchain_fireworks directory, which houses the components necessary for using Fireworks' capabilities within the LangChain ecosystem. The integration includes chat models, embeddings, and language model wrappers.
Anthropic Integration[Edit section][Copy link]
References: libs/partners/anthropic/langchain_anthropic
Integration with Anthropic's language models is facilitated through classes that handle chat interactions and language model wrapping. The …/langchain_anthropic directory is central to this integration, providing classes like ChatAnthropic and AnthropicLLM.
Together AI Integration[Edit section][Copy link]
References: libs/partners/together/langchain_together
Integration with the Together AI platform is achieved within the …/langchain_together directory. This integration enables LangChain to leverage Together AI's capabilities for embedding generation and text completion.
VoyageAI Integration[Edit section][Copy link]
References: libs/partners/voyageai/langchain_voyageai
Integration with the VoyageAI platform enhances LangChain's capabilities through two primary components: VoyageAIEmbeddings and VoyageAIRerank. These components leverage VoyageAI services to generate embeddings and rerank documents, respectively.
Robocorp Integration[Edit section][Copy link]
References: libs/partners/robocorp/langchain_robocorp
Integration with Robocorp's Action Server API is facilitated through the ActionServerToolkit class, the primary interface for LangChain applications to interact with Robocorp services. The toolkit dynamically fetches and creates tools based on the available endpoints of the Action Server's API specification.
MistralAI Integration[Edit section][Copy link]
References: libs/partners/mistralai/langchain_mistralai
Integration with MistralAI services is achieved through two main classes within the …/langchain_mistralai directory: ChatMistralAI and MistralAIEmbeddings.
Exa Integration[Edit section][Copy link]
References: libs/partners/exa/langchain_exa
Integration with Exa's search capabilities is achieved through the ExaSearchRetriever and associated tools within the …/langchain_exa directory. The ExaSearchRetriever class, extending BaseRetriever, interacts with the Exa Search API to retrieve documents based on user queries. It offers a range of configurable parameters, such as domain inclusion/exclusion, date-range filtering, and search type, which can be neural or keyword-based.
Groq Integration[Edit section][Copy link]
References: libs/partners/groq/langchain_groq
The ChatGroq class serves as the primary interface for LangChain's interaction with Groq's chat models. Located in …/chat_models.py, this class extends BaseChatModel and communicates with Groq's API, translating LangChain's internal message formats to those expected by Groq and vice versa.
Pinecone Integration[Edit section][Copy link]
References: libs/partners/pinecone/langchain_pinecone
The PineconeVectorStore class in …/vectorstores.py serves as the interface for LangChain's integration with Pinecone's vector store, enabling vector-based search and storage.
PostgreSQL Integration[Edit section][Copy link]
References: libs/partners/postgres/langchain_postgres
The PostgresChatMessageHistory class, located at …/chat_message_histories.py, interfaces with a PostgreSQL database to manage chat message histories. It extends the BaseChatMessageHistory class, providing both synchronous and asynchronous methods for interacting with the database.
Command-Line Interface (CLI)[Edit section][Copy link]
References: libs/cli
LangChain CLI facilitates the creation and management of LangChain-based applications. It streamlines setting up new projects, adding or removing dependencies, and launching application servers. The CLI leverages the typer library to handle user input and command execution, providing a user-friendly interface for developers.
CLI Core Functionality[Edit section][Copy link]
References: libs/cli/langchain_cli/namespaces
The LangChain CLI provides a set of commands for managing LangChain applications from the command line. The CLI is built using the typer library, which simplifies the creation of command-line interfaces.
CLI Integration and Package Templates[Edit section][Copy link]
References: libs/cli/langchain_cli/integration_template, libs/cli/langchain_cli/package_template
Templates for integrating custom components into LangChain are provided through the LangChain CLI, facilitating the creation of new packages. These templates serve as a starting point for developers to incorporate chat models, embeddings, language models, and vector stores into the LangChain framework.
CLI Project Setup and Management[Edit section][Copy link]
References: libs/cli/langchain_cli/project_template
Setting up a new LangChain-based application involves initializing the project structure, configuring the Docker environment, and following the README instructions for management tasks. The …/project_template directory provides the necessary scaffolding for this setup.
CLI Utilities[Edit section][Copy link]
References: libs/cli/langchain_cli/utils
Utility modules within the LangChain CLI enhance its functionality by providing event tracking, text replacement, Git repository management, and manipulation of dependencies within the pyproject.toml file.
CLI Testing and Scripts[Edit section][Copy link]
References: libs/cli/langchain_cli/integration_template/tests, libs/cli/langchain_cli/integration_template/scripts
The LangChain CLI provides a suite of scripts and testing frameworks to maintain the integrity of the codebase. Located within …/tests, these tools facilitate both integration and unit testing of key components such as chat models, language models, embeddings, and vector stores.
CLI Development and Contribution[Edit section][Copy link]
References: libs/cli/langchain_cli/dev_scripts.py, libs/cli/langchain_cli/constants.py
In the development of LangChain CLI tools, the script …/dev_scripts.py provides a create_demo_server() function that automates the setup of a FastAPI application. This function reads the LangChain chain configuration from a package's pyproject.toml and dynamically creates routes for the defined chain.
Templates and Documentation[Edit section][Copy link]
Reusable templates within the LangChain library facilitate the creation of language-based applications by providing pre-built functionalities for common use cases. These templates serve as starting points for developers to implement features such as question-answering systems, chatbots, and database integrations. Each template is designed with key components that define its core functionality and use case.
Template Implementations[Edit section][Copy link]
References: templates/anthropic-iterative-search, templates/basic-critique-revise, templates/bedrock-jcvd, templates/cassandra-entomology-rag, templates/cassandra-synonym-caching, templates/chain-of-note-wiki, templates/chat-bot-feedback, templates/cohere-librarian, templates/csv-agent, templates/docs, templates/elastic-query-generator, templates/extraction-anthropic-functions, templates/extraction-openai-functions, templates/gemini-functions-agent, templates/neo4j-cypher, templates/rag-chroma, templates/rag-redis, templates/rag-pinecone, templates/neo4j-semantic-ollama, templates/rag-chroma-multi-modal, templates/openai-functions-agent, templates/neo4j-semantic-layer, templates/rag-conversation
LangChain templates facilitate the creation of language-based applications by providing pre-built functionalities for various use cases. For instance, the template located at …/anthropic-iterative-search implements Anthropic iterative search: a language model iteratively searches for information to answer a user's query using a Wikipedia retriever tool. The core components include a ChatPromptTemplate for generating prompts, a ChatAnthropic model instance for language processing, and a WikipediaRetriever tool for document retrieval.
Documentation Resources[Edit section][Copy link]
References: docs/api_reference, docs/docs, docs/docs/integrations, docs/api_reference/themes, docs/scripts, docs/docs/modules, docs/api_reference/themes/scikit-learn-modern, docs/src/theme, docs/src, docs/docs/integrations/callbacks, docs/docs/integrations/document_loaders/example_data, docs/docs/integrations/memory/remembrall.md
The LangChain library's documentation is structured to facilitate developers' understanding and use of its extensive features. The API reference, located at …/api_reference, is a comprehensive resource detailing the public API, including classes, functions, and modules. It is generated automatically by the create_api_rst.py script, ensuring up-to-date and accurate documentation, and is styled with themes such as scikit-learn-modern for a visually consistent and interactive documentation experience.