Along with the rise of Data Science as the “Most Glamorous Job Of The 21st Century” came the realization that roughly 80% of a data scientist’s time was being spent collecting, cleaning, and storing the information they needed to do their job. The emerging role of the Data Engineer was created to offload that work onto a separate team, reducing duplicated effort and wasted time. This separation between managing an organization’s data reliably and exploring and interpreting that information has led to the same tensions that exist between developers and technical operations, tensions we have been working to ease for the past decade.
As practitioners and teams concerned with the engineering and science of data have gained experience and maturity, they have begun re-learning the lessons that the DevOps transformation has been imparting to the teams that create and deliver software. Along the way, data engineers have begun building more processes and tools around automation, testing, monitoring, and alerting for the systems they are responsible for.
There is a lot to be learned in both directions as data becomes increasingly critical to any successful software system and more complex systems are required to manage all of the moving parts. I’m here to discuss the areas where data engineers and operations teams overlap at the technical and social level, the types of tools that can be adopted to improve effectiveness in both directions, and how we can extend the impact of DevOps transformations to more units of the business.
By the time you leave, you will have a better appreciation of what data engineers do, of the lessons we can teach them, and of the lessons they can teach us. This talk will also help data engineers identify their blind spots and learn how to address them.