There’s been a lot of talk in data science circles about techniques like AutoML, which are dramatically reducing the time it takes for data scientists to train and tune models and to create reliable experiments. But that trend towards increased automation, robustness and reliability doesn’t end with machine learning: increasingly, companies are focusing their attention on automating earlier parts of the data lifecycle, including the critical task of data engineering.
Today, many data engineers are unicorns: they not only have to understand the needs of their customers, but also how to work with data, and what software engineering tools and best practices to use to set up and monitor their pipelines. Pipeline monitoring in particular is time-consuming and, just as importantly, isn’t a particularly fun thing to do. Luckily, people like Sean Knapp, a former Googler turned founder of data engineering startup Ascend.io, are leading the charge to make automated data pipeline monitoring a reality.
We had Sean on this latest episode of the Towards Data Science podcast to talk about data engineering: where it’s at, where it’s going, and what data scientists should really know about it to be prepared for the future.