Parker-Hannifin Chatbot

At Parker-Hannifin, I built an AI agent that lets support service technicians chat with the contents of catalogues and service manuals, getting quick answers to questions while assisting clients. The chatbot was accessed through MS Teams: users uploaded documents directly in the chat window, and a series of Azure tools automatically processed them for interaction.

I owned the entire development process, from technical architecture to DevOps practices for deployment and iteration. I collaborated with a product manager on feature prioritization and led one junior engineer who helped with implementation. I handled project scoping, data pipeline design, vector retrieval logic, and all Azure infrastructure.

Documents are uploaded through the Teams interface, which calls an Azure Functions pipeline to handle ingestion. PDFs are parsed and OCR-processed with Azure Document Intelligence, which labels paragraph text, tables, headers, and footers. Small tables are embedded directly as markdown; larger tables are substituted with table descriptions for better contextual retrieval later. The resulting text is split into one-page chunks before computing embeddings with an API call to OpenAI. Headers, footers, and file metadata are stored as chunk metadata, and chunks containing large tables are flagged for use with the SQL tooling. With the document chunks, embeddings, and metadata prepared, everything is inserted into a MongoDB vector database.
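The chunk-building step above can be sketched roughly as follows. This is a hedged illustration, not the production code: the `Element` type, the `SMALL_TABLE_CHARS` threshold, and the `embed()` stub (standing in for the OpenAI embeddings call) are all assumptions.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Element:
    kind: str   # "paragraph", "table", "header", or "footer" (assumed labels)
    text: str
    page: int

SMALL_TABLE_CHARS = 500  # assumed cutoff between "small" and "large" tables

def embed(text: str) -> list[float]:
    """Deterministic stand-in for the OpenAI embeddings API call."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]

def build_chunks(elements: list[Element], filename: str) -> list[dict]:
    """Group labeled elements into one-page chunks with metadata and table flags."""
    pages: dict[int, list[Element]] = {}
    for el in elements:
        pages.setdefault(el.page, []).append(el)

    chunks = []
    for page, els in sorted(pages.items()):
        body, has_large_table = [], False
        headers = [e.text for e in els if e.kind == "header"]
        footers = [e.text for e in els if e.kind == "footer"]
        for el in els:
            if el.kind == "paragraph":
                body.append(el.text)
            elif el.kind == "table":
                if len(el.text) <= SMALL_TABLE_CHARS:
                    body.append(el.text)  # small table: inline as markdown
                else:
                    # large table: substitute a description, flag for SQL tooling
                    body.append(f"[table: {el.text[:80]}...]")
                    has_large_table = True
        text = "\n".join(body)
        chunks.append({
            "text": text,
            "embedding": embed(text),
            "metadata": {"file": filename, "page": page,
                         "headers": headers, "footers": footers},
            "has_large_table": has_large_table,
        })
    return chunks  # ready for insertion into the MongoDB vector collection
```

Each returned dict maps directly onto a vector-database record: the embedding for similarity search, the metadata for provenance, and the flag to route the chunk through the table tooling at query time.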

When a user sends a message to the chatbot, the message is logged to a CosmosDB chat history database and then embedded. Document chunks with similar embeddings are retrieved from the vector database, along with the chunks immediately preceding and following each hit to aid context and continuity. The retrieved chunks are added to the chat context and sent to the LLM. Finally, the LLM's response is logged in the chat history database and forwarded to the Teams interface for the user to read.
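The retrieval-with-neighbors step can be illustrated with a minimal in-memory sketch; here a plain Python list stands in for the MongoDB vector index, and consecutive `chunk_id` values stand in for document order, which are assumptions rather than the actual schema.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_with_neighbors(query_emb: list[float],
                            chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the top-k most similar chunks, expanded with each hit's
    immediately preceding and following chunk for continuity."""
    ranked = sorted(chunks,
                    key=lambda c: cosine(query_emb, c["embedding"]),
                    reverse=True)
    hit_ids = {c["chunk_id"] for c in ranked[:k]}
    wanted = set()
    for cid in hit_ids:
        wanted.update({cid - 1, cid, cid + 1})  # neighbor expansion
    # preserve document order so the LLM sees pages in sequence
    return [c for c in chunks if c["chunk_id"] in wanted]
```

In production this neighbor expansion would be a second lookup against the vector database by chunk ID, but the ranking-then-expansion logic is the same.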

If any retrieved chunks contain flagged large tables, the system invokes a lightweight SQL-style query tool before final context assembly. The table content is loaded from blob storage into a Pandas DataFrame, and an LLM-generated query expression is run against it (e.g., df.query("valve_size == '2-inch'")). Results are serialized and added to the LLM context alongside the regular text chunks.
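A minimal sketch of that tool, assuming the blob-storage load has already produced a list of row dicts and that `query_expr` is the expression the LLM generated (both stand-ins, not the actual interfaces):

```python
import json
import pandas as pd

def run_table_query(records: list[dict], query_expr: str,
                    max_rows: int = 20) -> str:
    """Load table rows into a DataFrame, apply an LLM-generated pandas
    query expression, and serialize the result for the LLM context."""
    df = pd.DataFrame(records)
    result = df.query(query_expr).head(max_rows)  # cap rows added to context
    return json.dumps(result.to_dict(orient="records"))

# Hypothetical valve-sizing table and an expression the LLM might produce.
rows = [
    {"valve_size": "2-inch", "pressure_psi": 3000},
    {"valve_size": "3-inch", "pressure_psi": 5000},
]
```

Capping the row count keeps a large table from crowding relevant text chunks out of the context window.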

One of the biggest challenges identified in user testing was the occasional loss of context when relevant information sat near the beginning or end of a page: the LLM lacked the context of the previous or following page, which sometimes degraded its answers. This was overcome by including neighboring document chunks in the chunk retrieval process.

All in all, this was a fun, challenging project. I am excited to dive deeper into AI workflows; currently I am working on an MCP server to use for some of my side projects at home.