Waimak launch

Cox Automotive open sources new framework to simplify Apache Spark development

Cox Automotive is delighted to announce the release of Waimak, a new open-source framework that helps data teams specialise more efficiently. This new framework makes it easier to build, test and deploy complex data flows in Apache Spark, by abstracting away the more complex parts of Spark application development (such as orchestration) from the business logic.

According to Allison Nau, managing director of Cox Automotive Data Solutions, traditional approaches to data engineering tend to process large volumes of data in highly dependent waves, meaning each wave must finish before the next can begin.

“This creates a problem on distributed big data systems as it leaves valuable resources sitting idle but locked,” commented Nau. “The more complex the flows of data, the worse the problem gets. And with an increasing number of interdependent data models and data flows, we were finding it took much longer to change data models, as the complexity had grown exponentially.”

The new open-source framework alleviates this problem by providing Spark functions that allow a complex data flow to be more easily broken up into independent blocks within an application. These blocks are labelled, which makes reuse without repetition easier and deployment to another environment, such as development or production, much simpler.
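To illustrate the idea of labelled, independent blocks, here is a minimal sketch in plain Python. It is not Waimak's API and does not use Spark: the `Block` class, the `execute` function and the sample data are all hypothetical names invented for this example. It only shows the general pattern, in which each block declares the labels it reads and the label it produces, and a block runs as soon as its inputs exist rather than waiting for a fixed wave.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Block:
    """Hypothetical unit of work: reads labelled inputs, writes one labelled output."""
    output: str
    inputs: Tuple[str, ...]
    run: Callable[..., object]


def execute(blocks: List[Block], sources: Dict[str, object]) -> Dict[str, object]:
    """Run every block whose labelled inputs are available until none remain."""
    results = dict(sources)
    pending = list(blocks)
    while pending:
        ready = [b for b in pending if all(i in results for i in b.inputs)]
        if not ready:
            # Nothing can make progress: a label is missing or there is a cycle.
            raise ValueError(f"unresolvable blocks: {sorted(b.output for b in pending)}")
        for block in ready:
            results[block.output] = block.run(*(results[i] for i in block.inputs))
        pending = [b for b in pending if b.output not in results]
    return results


def drop_missing(rows):
    """Remove rows with no sale amount."""
    return [(dealer, amount) for dealer, amount in rows if amount is not None]


def total_by_dealer(rows):
    """Sum sale amounts per dealer."""
    totals = {}
    for dealer, amount in rows:
        totals[dealer] = totals.get(dealer, 0) + amount
    return totals


def count_dealers(rows):
    """Count distinct dealers."""
    return len({dealer for dealer, _ in rows})


# Blocks are deliberately declared out of order; labels decide the run order.
flow = [
    Block("sales_by_dealer", ("sales_clean",), total_by_dealer),
    Block("dealer_count", ("sales_clean",), count_dealers),
    Block("sales_clean", ("sales_raw",), drop_missing),
]

# Made-up sample data standing in for a real source table.
sources = {"sales_raw": [("dealer_a", 100), ("dealer_b", None), ("dealer_a", 50)]}

results = execute(flow, sources)
print(results["sales_by_dealer"])
print(results["dealer_count"])
```

Because the blocks refer to each other only through labels, the same `flow` could be pointed at a different `sources` dictionary for development or production without changing the business logic.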

“Waimak has enabled us to automate much of the ‘rinse-and-repeat’ data pipeline and data model implementations, giving us time to focus on more critical data engineering tasks,” added Nau. “This makes collaboration between teams that use data, such as business intelligence and data science, and those that provide data, such as data engineering, less burdensome by encouraging compromise on a common set of Big Data tools.”

Data users give up some of the freedom afforded by pure SQL interfaces to Hadoop, but gain the ability to string together sets of data objects defined by Spark SQL and to use more native Spark over time. Data engineers give up some amount of ‘optimal computation’ but have the business logic (owned by those who understand it) on a platform where optimisation is easier to manage.

Waimak also helps organisations use compute resources to process more data, more cost-effectively, by modularising the steps in a data flow. This approach is especially powerful when combined with other tools that allow you to spin up compute clusters dynamically on demand and then spin them down once processing has completed.

“Waimak allows us to maximise resource utilisation throughout a job’s lifecycle. Combined with tooling for on-demand clusters, we can free-up cloud resources quicker, reducing overall compute hours and therefore costs. We’ve also been able to significantly reduce the time it takes to go from prototype to production plus deliver maintainable production code faster. This therefore makes it easier for us to extract value from data in a cost-efficient manner, and we hope that other data teams will realise the same benefits and contribute to the framework,” concluded Nau.

Waimak is released under the Apache 2.0 open source license. You can find out more, download and contribute here.
