Custom LLM-based chatbot as a learning buddy: helping every Data and AI practitioner understand the role of Functional Programming, the foundation for Spark Distributed Processing & Databricks
Published:
December 6, 2024
Ruma Arabatti, Abhishek Huddar
The mass adoption of large language models (LLMs) has been possible because several factors came together. The advent of high-performance CPUs and GPUs provided enough computing power for distributed processing of petabyte-scale datasets. Fully exploiting such distributed computing platforms with the object-oriented approach and threading constructs of traditional imperative programming languages such as Java and C/C++ is challenging even for the best programmers.
Functional programming offers a different take on these complexities. It helps programmers think in terms of simpler distributed processing constructs that scale computations across cores without worrying about side effects. Thinking in functions leads us to Haskell, a pure functional language that inspired many modern languages, including Scala, the backbone of the Spark distributed engine. Today, Haskell is used primarily in academia. This raises the question: why should an object-oriented programmer learn it?
Understanding Haskell makes a good programmer even better at understanding the Spark framework, and thus more productive in solving complex Data Engineering and Machine Learning problems. How best to learn it, given its quirky syntax? Databricks offers practitioners a shortcut by providing the frameworks to create a RAG-based chatbot.
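Before turning to the chatbot, the side-effect-free style mentioned above can be made concrete with a minimal sketch. It is written in Python rather than Haskell or Scala, it does not use any Spark API, and the names `word_lengths` and `total_length` are hypothetical, chosen only for illustration. The point it tries to show is that when each step is a pure function, the way the data is split across workers does not change the result.

```python
from functools import reduce
from operator import add


def word_lengths(partition):
    """Pure function: the result depends only on the input partition."""
    return [len(word) for word in partition]


def total_length(partitions):
    """Map a pure function over each partition, then fold the partial sums."""
    partial_sums = map(lambda p: sum(word_lengths(p)), partitions)
    return reduce(add, partial_sums, 0)


words = ["spark", "haskell", "scala", "databricks"]

# Two hypothetical ways of splitting the same data across workers.
one_partition = [words]
two_partitions = [words[:2], words[2:]]

# No shared state is mutated, so the split does not change the answer.
assert total_length(one_partition) == total_length(two_partitions)
print(total_length(two_partitions))  # prints 27
```

Because nothing here writes to shared state, each partition could in principle be processed on a different core without locks, which is the property that distributed engines rely on.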
Implementing a Haskell Help Chatbot in Databricks to demonstrate how simple it is...
Learning Haskell is not trivial, even for the most seasoned programmers. Combing through pages upon pages of dense text to find the answer to one specific question is time-consuming, and traditional web search may not be very helpful. ChatGPT offers a powerful Generative AI model, but when faced with a specific domain or a question on a niche topic, it may give an outdated, irrelevant, or even incorrect answer. How can one help programmers quickly learn advanced language constructs in Haskell?
What if we could build a specialized chatbot to answer questions about Haskell, and nothing else? Retrieval Augmented Generation (RAG) allows us to do just that. The paper provides step-by-step guidance for implementing such a solution entirely on Databricks...
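The general shape of RAG can be sketched in a few lines. This is not the paper's implementation and it uses no Databricks API. It assumes a tiny in-memory list of notes (`HASKELL_NOTES`, hypothetical) and swaps real vector search for a simple word-overlap score. The helper names `tokenize`, `retrieve`, and `build_prompt` are illustrative assumptions, and the final call to a language model is left out.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Rank documents by word overlap with the question.

    A toy stand-in for embedding-based vector search.
    """
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, context):
    """Put the retrieved passages in front of the question for the model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical knowledge base; a real one would hold chunks of Haskell documentation.
HASKELL_NOTES = [
    "map applies a function to every element of a list.",
    "foldr reduces a list from the right using a function and a starting value.",
    "Maybe represents an optional value: either Just x or Nothing.",
]

question = "How does foldr reduce a list?"
context = retrieve(question, HASKELL_NOTES, k=1)
prompt = build_prompt(question, context)
print(prompt)
```

The prompt that reaches the model now carries the most relevant note alongside the question, which is what keeps the answers tied to the chosen domain rather than to whatever the model happens to remember.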