Cloudera to Databricks migration
Fortune 100 MedTech Company
Problem we found
A division of a major MedTech company had no way to increase the capacity of its existing data platform to support enhanced services and new use cases.
Customer Vision
- All on-prem and legacy data platforms were being reassessed against cloud-first and price/performance goals. Multi-cloud support was a requirement, so the choice came down to Databricks vs. Snowflake, and Databricks was selected.
- Replace the legacy Cloudera estate, which was on-prem, manual and had a relatively high total cost of ownership (TCO), with a modern data architecture that allows near-real-time data processing and analysis using SAS and Tableau.
Technical Pain
- No real-time availability of data and reporting, with the analytics team relying on old technology and incomplete data
- Data was stuck in different systems that could not communicate with each other
- Processing data was slow and manual
- No ability to add new use cases or enhanced services with the existing legacy technology
Solution we implemented
- End-to-end implementation of the entire migration from Cloudera to Databricks, completed on time and on budget
- Reviewed the current architecture and Cloudera jobs
- Pilot migration of 10 Cloudera transformation jobs to serve as a reference for the remaining migrations
- Full migration of 58 Cloudera transformation jobs in total
- Sanity testing of the converted scripts and data comparison testing
- Highly collaborative communication, allowing our teams to work transparently and deliver the project in less than 3 months
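The case study does not describe how the data comparison testing was carried out. As a rough illustration only, the sketch below shows one common way to compare the output of a legacy job with the output of its migrated counterpart: match rows on a key, then report rows that are missing, extra, or different. The function name `compare_datasets`, the `id` key and the sample rows are all hypothetical and are not taken from the project; on Databricks the same idea would more likely be applied to Spark DataFrames than to in-memory lists.

```python
from typing import Any, Dict, Hashable, Iterable, List, Mapping


def compare_datasets(
    legacy_rows: Iterable[Mapping[str, Any]],
    migrated_rows: Iterable[Mapping[str, Any]],
    key: str,
) -> Dict[str, Any]:
    """Compare two row sets by key (hypothetical helper, not project code)."""
    # Index both outputs by the key column so rows can be matched one-to-one.
    legacy: Dict[Hashable, Dict[str, Any]] = {r[key]: dict(r) for r in legacy_rows}
    migrated: Dict[Hashable, Dict[str, Any]] = {r[key]: dict(r) for r in migrated_rows}

    # Keys produced by the legacy job but absent after migration.
    missing: List[Hashable] = sorted(k for k in legacy if k not in migrated)
    # Keys that only the migrated job produced.
    extra: List[Hashable] = sorted(k for k in migrated if k not in legacy)
    # Keys present on both sides whose column values differ.
    mismatched: List[Hashable] = sorted(
        k for k in legacy if k in migrated and legacy[k] != migrated[k]
    )

    return {
        "legacy_count": len(legacy),
        "migrated_count": len(migrated),
        "missing_in_migrated": missing,
        "extra_in_migrated": extra,
        "mismatched": mismatched,
    }


# Made-up sample data, purely for illustration.
legacy_output = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 20.0},
    {"id": 3, "amount": 30.0},
]
migrated_output = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 25.0},
    {"id": 4, "amount": 40.0},
]

report = compare_datasets(legacy_output, migrated_output, key="id")
print(report)
```

A migrated job would pass this kind of check only when all three difference lists are empty and the row counts agree.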
100% of data silos removed
Positive Outcomes
- Removed 100% of data silos
- Integrated the Vision Data Analytics application with the customer’s Central Data Layer, allowing easy integration with new data sources
- Enabled the wholesale migration of other apps using Cloudera and Teradata
- Enabled the launch of new use cases without the need for upfront planning for hardware capacity
- Near-real-time availability of data and reporting
- Significant performance improvement due to faster processing
- Significantly lower TCO
“I am delighted to announce the successful completion of the Cloudera Migration to Databricks. A special thank you goes out to the Team, for their dedication and expertise. Their role was pivotal in ensuring smooth communication and cooperation throughout the project: Sandeep Arabatti, Aaditya Mishra, Jayraj Perumal and Jerry Lee. Thank you all for your hard work, dedication, and contributions to making this project a success. I look forward to more opportunities for collaboration and success in the future.” Snr Director