DataTalks.Club

By DataTalks.Club

DataTalks.Club - the place to talk about data!
Currently playing episode: Product Management Essentials for Data Professionals - Greg Coquillo (DataTalks.Club, Feb 04, 2022 | 53:10)
Data Strategy: Key Principles and Best Practices - Boyan Angelov

We talked about:


Boyan's background
What is data strategy?
Due diligence and establishing a common goal
Designing a data strategy
Impact assessment, portfolio management, and DataOps
Data products
DataOps, Lean, and Agile
Data Strategist vs Data Science Strategist
The skills one needs to be a data strategist
How does one become a data strategist?
Data strategist as a translator
Transitioning from a Data Strategist role to a CTO
Using ChatGPT as a writing co-pilot
Using ChatGPT as a starting point
How ChatGPT can help in data strategy
Pitching a data strategy to a stakeholder
Setting baselines in a data strategy
Boyan's book recommendations


Links:


LinkedIn: https://www.linkedin.com/in/angelovboyan/
Twitter: https://twitter.com/thinking_code
Github: https://github.com/boyanangelov
Website: https://boyanangelov.com/


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

May 26, 2023 | 55:49
Practical Data Privacy - Katharine Jarmul

We talked about:

Katharine's background
Katharine's ML privacy startup
GDPR, CCPA, and the “opt-in as the default” approach
What is data privacy?
Finding Katharine's book – Practical Data Privacy
The various definitions of data privacy and “user profiles”
Privacy engineering and privacy-enhancing technologies
Why data privacy is important
What is differential privacy?
The importance of keeping privacy in mind when designing systems
Data privacy on the example of ChatGPT
Katharine's resource suggestions for learning about data privacy


Links:

LinkedIn: https://www.linkedin.com/in/katharinejarmul/
Twitter: https://twitter.com/kjam

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html

May 19, 2023 | 57:44
Building Scalable and Reliable Machine Learning Systems - Arseny Kravchenko

We talked about:

Arseny's background
Working on machine learning in startups
What is Machine Learning System Design?
Constraints and requirements
Known unknowns vs unknown unknowns (Design stage)
Writing a design document
Technical problems vs product-oriented problems
The solution part of the Design Document
What motivated Arseny to write a book on ML System Design
Examples of a Design Document in the book
The types of readers for ML System Design
Working with the co-author
Reacting to constraints and feedback when writing a book
Arseny's favorite chapter of the book
Other resources where you can learn about ML System Design
Twitter Giveaway


Links:

Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter
Discount: poddatatalks21 (35% off)


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 12, 2023 | 50:59
Building an Open-Source NLP Tool - Johannes Hötter

We talked about:

Johannes’s background
Johannes’s Open Source Spotlight demos – Refinery and Bricks
The difficulties of working with natural language processing (NLP)
Incorporating ChatGPT into a process as a heuristic
What is Bricks?
The process of starting a startup – Kern
Making the decision to go with open source
Pros and cons of launching as open source
Kern’s business model
Working with enterprises
Johannes as a salesperson
The team at Kern
Johannes’s role at Kern
How Johannes and Henrik separate responsibilities at Kern
Working with very niche use cases
The short story of how Kern got its funding
Johannes’s resource recommendation


Links:

Refinery's GitHub repo: https://github.com/code-kern-ai/refinery
Bricks' GitHub repo: https://github.com/code-kern-ai/bricks
Bricks Open Source Spotlight demo: https://www.youtube.com/watch?v=r3rXzoLQy2U
Refinery Open Source Spotlight demo: https://www.youtube.com/watch?v=LlMhN2f7YDg
Discord: https://discord.com/invite/qf4rGCEphW
Kern's Website: https://www.kern.ai


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Apr 21, 2023 | 56:27
Navigating Industrial Data Challenges - Rosona Eldred

We talked about:

Rosona’s background
How mathematics knowledge helps in industry
What is industrial data?
Setting up an industrial process using blue paint
Internet companies’ data vs industrial data
Explaining industrial processes using packing peanuts
Why productive industry needs data
Measuring product qualities
How data specialists use industrial data
Defining and measuring sustainability
Using data in reactionary measures to changing regulations
Types of industrial data
Solving problems and optimizing with industrial data
Industrial solvers
Tiny data vs Big data in productive industry
The advantages of coming from academia into productive industry
Materials and resources for industrial data
Women in industry
Why Rosona decided to shift to industrial data


Links:

Kaggle dataset: https://www.kaggle.com/datasets/paresh2047/uci-semcom






Apr 14, 2023 | 53:22
Mastering Self-Learning in Machine Learning - Aaisha Muhammad

We talked about:

Aaisha’s background
How homeschooling affects self-study
Deciding on what to learn about
Establishing whether a resource is good
How Aaisha focuses on learning
Deciding on what kind of project to build
Finding research materials
Aaisha’s experience with the DataTalks.Club ML Zoomcamp
ML Zoomcamp projects
Aaisha’s interest in bioinformatics
Keeping motivated with deadlines
Notes and time-tracking tools
Drawbacks to self-studying
Aaisha’s interest in machine learning
Aaisha’s least favorite part of ML Zoomcamp
Helping people as a way to learn
Using ChatGPT as a “study group”
Is it possible to use self-studying to learn high-level topics?
Switching topics to avoid burnout
Aaisha’s resource recommendations


Links:

LinkedIn: https://www.linkedin.com/in/aaisha-muhammad/
Twitter: https://twitter.com/ZealousMushroom
Github: https://github.com/AaishaMuhammad
Website: http://www.aaishamuhammad.co.za/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Apr 07, 2023 | 51:02
The Secret Sauce of Data Science Management - Shir Meir Lador

We talked about:

Shir’s background
Debrief culture
The responsibilities of a group manager
Defining the success of a DS manager
The three pillars of data science management
Managing up
Managing down
Managing across
Managing data science teams vs business teams
Scrum teams, brainstorming, and sprints
The most important skills and strategies for DS and ML managers
Making sure proof of concepts get into production


Links:

The secret sauce of data science management: https://www.youtube.com/watch?v=tbBfVHIh-38
Lessons learned leading AI teams: https://blogs.intuit.com/2020/06/23/lessons-learned-leading-ai-teams/
How to avoid conflicts and delays in the AI development process (Part I): https://blogs.intuit.com/2020/12/08/how-to-avoid-conflicts-and-delays-in-the-ai-development-process-part-i/
How to avoid conflicts and delays in the AI development process (Part II): https://blogs.intuit.com/2021/01/06/how-to-avoid-conflicts-and-delays-in-the-ai-development-process-part-ii/
Leading AI teams deck: https://drive.google.com/drive/folders/1_CnqjugtsEbkIyOUKFHe48BeRttX0uJG
Leading AI teams video: https://www.youtube.com/watch?app=desktop&v=tbBfVHIh-38


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 31, 2023 | 48:43
SE4ML - Software Engineering for Machine Learning - Nadia Nahar

We talked about:

Nadia’s background
Academic research in software engineering
Design patterns
Software engineering for ML systems
Problems that people in industry have with software engineering and ML
Communication issues and setting requirements
Artifact research in open source products
Product vs model
Nadia’s open source product dataset
Failure points in machine learning projects
Finding solutions to issues using Nadia’s dataset and experience
The problem of siloing data scientists and other structure issues
The importance of documentation and checklists
Responsible AI
How data scientists and software engineers can work in an Agile way


Links:

Model Card: https://arxiv.org/abs/1810.03993
Datasheets: https://arxiv.org/abs/1803.09010
Factsheets: https://arxiv.org/abs/1808.07261
Research Paper: https://www.cs.cmu.edu/~ckaestne/pdf/icse22_seai.pdf
Arxiv version: https://arxiv.org/pdf/2110.


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 24, 2023 | 53:40
Starting a Consultancy in the Data Space - Aleksander Kruszelnicki

We talked about:

Aleksander’s background
The difficulty of selling data stack as a service
How Aleksander got into consulting
The Mom Test – extracting feedback from people
User interviews
Why Aleksander’s data stack as a service startup was not viable
How Aleksander decided to switch to consulting
Finding clients to consult
Figuring out how to position your services
Geographical limitations
Figuring out your target audience
The importance of networking and marketing
Pricing your services
The pitfalls of daily and hourly pricing and how to balance incentives
Is Germany a good place to found a company?
Aleksander’s book recommendations


Links:

LinkedIn: https://www.linkedin.com/in/alkrusz/
Twitter: https://twitter.com/alkrusz
Website: www.leukos.io


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 17, 2023 | 52:28
Biohacking for Data Scientists and ML Engineers - Ruslan Shchuchkin

We talked about:

Ruslan’s background
Fighting procrastination and perfectionism
What is biohacking?
The role of dopamine and other hormones in daily life
How meditation can help
The influence light has on our bodies
Behavioral biohacking
Daylight lamps and using light to wake up
Sleep cycles
How nutrition affects productivity
Measuring productivity
Examples of unsuccessful biohacking attempts
Stoicism, voluntary discomfort, and self-challenges
Biohacking risks and ways to prevent them
Coffee and tea biohacking
Using self-reflection and tracking to measure results
Mindset shifting
Stoicism book recommendation
Work/life balance
Ruslan’s biohacking resource recommendation


Links:

LinkedIn: https://www.linkedin.com/in/ruslanshchuchkin/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Mar 10, 2023 | 52:58
Analytics for a Better World - Parvathy Krishnan

We talked about:

Parvathy’s background
Brainstorming sessions with nonprofits to establish data maturity
Example of an Analytics for a Better World project
The overall data maturity situation of nonprofits vs private sector
Solving the skill gap
Publicly available content
The Analytics for a Better World Academy
The Academy’s target audience
How researchers can work with Analytics for a Better World
Improving data maturity in nonprofit organizations
People, processes, and technology
Typical tools that Analytics for a Better World recommends to nonprofits
Profiles in nonprofits
Does Analytics for a Better World have a need for data engineers?
The Analytics for a Better World team
Factors that help organizations become more data-driven
Parvathy’s resource recommendations


Links:

LinkedIn: https://www.linkedin.com/in/parvathykrishnank/
Twitter: https://twitter.com/ABWInstitute
Github: https://github.com/Analytics-for-a-Better-World
Website: https://analyticsbetterworld.org/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Mar 03, 2023 | 54:35
Accelerating the Adoption of AI through Diversity - Dânia Meira

We talked about: 

Dânia’s background
Founding the AI Guild
Datalift Summit
Coming up with meetup topics
Diversity in Berlin
Other types of diversity besides gender
The pitfalls of lacking diversity
Creating an environment where people can safely share their experiences
How the AI Guild helps organizations become more diverse
How the AI Guild finds women in the fields of AI and data science
Advice for people in underrepresented groups
Organizing a welcoming environment and creating a code of conduct
AI Guild’s consulting work and community
AI Guild team
Dânia’s resource recommendations
Upcoming Datalift Summit


Links:

Call for Speakers for the #datalift summit (Berlin, 14 to 16 June 2023): https://eu1.hubs.ly/H02RXvX0
Coded Bias documentary on Netflix: https://www.netflix.com/de/title/81328723#:~:text=This%20documentary%20investigates%20the%20bias,flaws%20in%20facial%20recognition%20technology.
Book Weapons of Math Destruction by Cathy O'Neil: https://en.wikipedia.org/wiki/Weapons_of_Math_Destruction
Book Lean In by Sheryl Sandberg: https://en.wikipedia.org/wiki/Lean_In


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 24, 2023 | 57:01
Staff AI Engineer - Tatiana Gabruseva

We talked about:

Tatiana’s background
Going from academia to healthcare to the tech industry
What staff engineers do
Transferring skills from academia to industry and learning new ones
The importance of having mentors
Skipping junior and mid-level straight into the staff role
Convincing employers that you can take on a lead role
Seeing failure as a learning opportunity
Preparing for coding interviews
Preparing for behavioral and system design interviews
The importance of having a network and doing mock interviews
How much do staff engineers work with building pipelines, data science, ETL, MLOps, etc.?
Context switching
Advice for those going from academia to industry
The most exciting thing about working as an AI staff engineer
Tatiana’s book recommendations


Links:

LinkedIn: https://www.linkedin.com/in/tatigabru/
Twitter: https://twitter.com/tatigabru
Github: https://github.com/tatigabru
Website: http://tatigabru.com/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 17, 2023 | 55:24
The Journey of a Data Generalist: From Bioinformatics to Freelancing - Jekaterina Kokatjuhha

We talked about:

Jekaterina’s background
How Jekaterina started freelancing
Jekaterina’s initial ways of getting freelancing clients
How being a generalist helped Jekaterina’s career
Connecting business and data
How Jekaterina’s LinkedIn posts helped her get clients
Jekaterina’s work in fundraising
Cohorts and KPIs
Improving communication between the data and business teams
Motivating every link in the company’s chain
The cons of freelancing
Balancing projects and networking
The importance of enjoying what you do
Growing the client base
In-office work vs working remotely
Jekaterina’s advice for people who feel stuck
Jekaterina’s resource recommendations

Links:

Jekaterina's LinkedIn: https://www.linkedin.com/in/jekaterina-kokatjuhha/

Join DataTalks.Club: https://datatalks.club/slack.html

Feb 11, 2023 | 52:18
Navigating Career Changes in Machine Learning - Chris Szafranek

We talked about:

Chris’s background
Switching careers multiple times
Freedom at companies
Chris’s role as an internal consultant
Chris’s sabbatical
ChatGPT
How being a generalist helped Chris in his career
The cons of being a generalist and the importance of T-shaped expertise
The importance of learning things you’re interested in
Tips to enjoy learning new things
Recruiting generalists
The job market for generalists vs for specialists
Narrowing down your interests
Chris’s book recommendations


Links:

Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman
Andrej Karpathy, former Senior Director of AI at Tesla, who's now focused on teaching and sharing his knowledge: https://www.youtube.com/@AndrejKarpathy
Beautifully done videos on engineering of things in the real world: https://www.youtube.com/@RealEngineering
Chris' website: https://szafranek.net/
Zalando Tech Radar: https://opensource.zalando.com/tech-radar/
Modal Labs, new way of deploying code to the cloud, also useful for testing ML code on GPUs: https://modal.com
Excellent Twitter account to follow to learn more about prompt engineering for ChatGPT: https://twitter.com/goodside
Image prompts for Midjourney: https://twitter.com/GuyP
Machine Learning Workflows in Production - Krzysztof Szafanek: https://www.youtube.com/watch?v=CO4Gqd95j6k
From Data Science to DataOps: https://datatalks.club/podcast/s11e03-from-data-science-to-dataops.html


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Feb 03, 2023 | 55:36
Preparing for a Data Science Interview - Luke Whipps

We talked about:

Luke’s background
Luke’s podcast - AI Game Changers
How Luke helps people get jobs
What’s changed in the recruitment market over the last 6 months
Getting ready for the interview process
Stage “zero” – the filter between the candidate and the company
Preparing for the introduction stage – research and communication
Reviewing the fundamentals during preparation
Preparing for the technical part of the interview
Establishing the hiring company’s expectations
Depth vs breadth
Overly theoretical and mathematical questions in interviews
Bombing (failing) in the middle of an interview
Applying to different roles within the same company
Luke’s resource recommendations


Links:

Luke's LinkedIn: https://www.linkedin.com/in/lukewhipps/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jan 27, 2023 | 54:17
Indie Hacking - Pauline Clavelloux

We talked about:

Pauline’s background
Pauline’s work as a manager at IBM
What is indie hacking?
Pauline’s initial indie hacking projects
Getting ready for launch
Responsibilities and challenges in indie hacking
Pauline’s latest indie hacking project
Going live and marketing
Challenges with Unreal Me
Staying motivated with indie hacking projects
Skills Pauline picked up while doing indie hacking projects
Balancing a day job and indie hacking
Micro SaaS and AboutStartup.io
How Pauline comes up with ideas for projects
Going from an idea on paper to building a project
Pauline’s Twitter success
Connecting with Pauline online
Pauline’s indie hacking inspiration
Pauline’s resource recommendation


Links:

Website: https://wintopy.io/
Pauline's Twitter: https://twitter.com/Pauline_Cx
Pauline's LinkedIn: https://www.linkedin.com/in/paulineclavelloux/
Blog about Indiehacking: https://aboutstartup.io


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 20, 2023 | 51:03
Doing Software Engineering in Academia - Johanna Bayer

We talked about:

Johanna’s background
Open science course and reproducible papers
Research software engineering
Convincing a professor to work on software instead of papers
The importance of reproducible analysis
Why academia is behind on software engineering
The problems with open science publishing in academia
The importance of standard coding practices
How Johanna got into research software engineering
Effective ways of learning software engineering skills
Providing data and analysis for your project
Johanna’s initial experience with software engineering in a project
Working with sensitive data and the nuances of publishing it
How often Johanna does hackathons, open source, and freelancing
Social media as a source of repos and Johanna’s favorite communities
Contributing to Git repos
Publishing in the open in academia vs industry
Johanna’s book and resource recommendations
Conclusion


Links:

The Society of Research Software Engineering, plus regional chapters: https://society-rse.org/
The RSE Association of Australia and New Zealand: https://rse-aunz.github.io/
Research Software Engineers (RSEs), the people behind research software: https://de-rse.org/en/index.html
The Software Sustainability Institute: https://www.software.ac.uk/
The Carpentries (beginner git and programming courses): https://carpentries.org/
The Turing Way Book of Reproducible Research: https://the-turing-way.netlify.app/welcome


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 13, 2023 | 49:49
Data-Centric AI - Marysia Winkels

We talked about:

Marysia’s background
What data-centric AI is
Data-centric Kaggle competitions
The mindset shift to data-centric AI
Data-centric does not mean you should not iterate on models
How to implement the data-centric approach
Focusing on the data vs focusing on the model
Resources to help implement the data-centric approach
Data-centric AI vs standard data cleaning
Making sure your data is representative
Knowing when your data is good enough
The importance of user feedback
“Shadow Mode” deployment
What to do if you have a lot of bad data or incomplete data
Marysia’s role at PyData
How Marysia joined PyData
The difference between PyData and PyCon
Finding Marysia online


Links:

Embetter & Bulk Demo: https://www.youtube.com/watch?v=L---nvDw9KU


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 06, 2023 | 53:08
Business Skills for Data Professionals - Loris Marini

We talked about:

Loris’ background
Transitioning from physics to data
Aligning people on concepts
Lead indicators and stickiness
Context, semantics, and meaning
Communication and being memorable
Making data digestible for business and building trust
The importance of understanding the language of business
Stakeholder mapping
Attending business meetings as a data professional
Organizing your stakeholder map
Prioritizing
How to support the business strategy
Learning to speak online
Resource recommendations from Loris


Links:

Discovering Data Discord server: https://bit.ly/discovering-data-discord
Loris' LinkedIn: https://www.linkedin.com/in/lorismarini/
Loris' Twitter: https://twitter.com/LorisMarini
Dec 16, 2022 | 54:13
From Software Engineer to Data Science Manager - Sadat Anwar

We talked about:

Sadat’s background
Sadat’s backend engineering experience
Sadat’s pivot point as a backend engineer
Sadat’s exposure to ML and Data Science
Sadat’s Act Before you Think approach (with safety nets)
Sadat’s street cred and transition into management
The hiring process as an internal candidate
The importance of people management skills
The Brag List
The most difficult part of transitioning to management
Focusing on projects and setting milestones
Sadat’s transition from EM to data science management
How much domain knowledge is needed for management?
The main difference between engineering and management
How being an EM helped Sadat transition to DS management
Transitioning to DS management from other roles
How to feel accomplished as a manager
Sadat’s book recommendations
Sadat’s meetups


Links:

Sadat's Meetup page: https://www.meetup.com/berlin-search-technology-meetup/
Meetup event "Bias in AI: how to measure it and how to fix it": https://www.meetup.com/data-driven-ai-berlin-meetup/events/289927565/




ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Dec 09, 2022 | 52:52
Teaching and Mentoring in Data Analytics - Irina Brudaru

We talked about:

Irina’s background
Irina as a mentor
Designing curriculum and program management at AI Guild
Other things Irina taught at AI Guild
Why Irina likes teaching
Students’ reluctance to learn cloud
Irina as a manager
Cohort analysis in a nutshell
How Irina started teaching formally
Irina’s diversity project in the works
How DataTalks.Club can attract more female students to the Zoomcamps
How to get technical feedback at work
Antipatterns and overrated/overhyped topics in data analytics
Advice for young women who want to get into data science/engineering
Finding Irina online
Fundamentals for data analysts
Suggestions for DataTalks.Club collaborations
Conclusions


Links:

LinkedIn Account: https://www.linkedin.com/in/irinabrudaru/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Dec 02, 2022 | 53:46
Technical Writing and Data Journalism - Angelica Lo Duca

We talked about:

Angelica’s background
Angelica’s books
Data journalism
How Angelica got into data journalism
The field of digital humanities and Angelica’s data journalism course
Technical articles vs data journalism articles
Transforming reports into data storytelling
Are reports to stakeholders considered technical writing?
Data visualization in articles
Article length
The process of writing an article
Finding writing topics
How Angelica got into writing a book (communication with publishers)
The process for writing a book
Brainstorming
Reviews and revisions
Conclusion


Links:

Data Journalism examples (FENCED OUT): https://www.washingtonpost.com/graphics/world/border-barriers/europe-refugee-crisis-border-control/??noredirect=on
Data Journalism examples (La tierra esclava): https://latierraesclava.eldiario.es/
Small Medium publication aiming to be the Stack Overflow of Medium: https://medium.com/syntaxerrorpub
Example of a self-published book on Data Visualization: https://www.amazon.com/Introduction-Data-Visualization-Storytelling-Scientist-ebook/dp/B07VYCR3Z6/ref=sr_1_4?crid=4JRJ48O7K8TK&keywords=joses+berengueres&qid=1668270728&sprefix=joses+beremguere%2Caps%2C273&sr=8-4
My novels (in Italian) La bambina e il Clown: https://www.amazon.it/Bambina-Clown-Angelica-Lo-Duca/dp/1500984515/ref=sr_1_9?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=2KGK9GMN0FAHI&keywords=la+bambina+e+il+clown&qid=1668270769&sprefix=la+bambina+e+il+clown%2Caps%2C88&sr=8-9
My novels (in Italian) Il Violinista: https://www.amazon.it/Violinista-1-Angelica-Lo-Duca/dp/1501009672/ref=sr_1_1?__mk_it_IT=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=12KTF9EF5UKIG&keywords=il+violinista+lo+duca&qid=1668270791&sprefix=il+violinista+lo+duca%2Caps%2C81&sr=8-1
Course on Data Journalism: https://www.coursera.org/learn/visualization-for-data-journalism


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 25, 2022 | 50:59
From Digital Marketing to Analytics Engineering - Nikola Maksimovic

We talked about:

Nikola’s background
Making the first steps towards a transition to BI and Analytics Engineering
Learning the skills necessary to transition to Analytics Engineering
The in-between period – from Marketing to Analytics Engineering
Nikola’s current responsibilities
Understanding what a Data Model is
Tools needed to work as an Analytics Engineer
The Analytics Engineering role over time
The importance of DBT for Analytics Engineers
Where can one learn about data modeling theory?
Going from Ancient Greek and Latin to understanding Data (Just-In-Time Learning)
The importance of having domain knowledge to analytics engineering
Suggestion for those wishing to transition into analytics engineering
The importance of having a mentor when transitioning
Finding a mentor
Helpful newsletters and blogs
Finding Nikola online


Links:

Nikola's LinkedIn account: https://www.linkedin.com/in/nikola-maksimovic-40188183/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 18, 2022 | 46:51
Product Owners in Data Science - Anna Hannemann

We talked about:

About Anna and METRO
Anna’s background
The importance of a technical background for data product owners
What are product owners?
Product owners vs product managers
Anna’s work on recommender systems at METRO
Expanding the data team
Types of algorithms used for recommender systems
What kind of knowledge and skills data product owners need to have
Problems and ideas should come from the business
How Anna handles all her responsibilities
The process for starting work on new domains
Product portfolio management
ProductTank and Anna’s role in it
Anna’s resource recommendations


Links:

Data Science for Business Book: https://www.amazon.de/-/en/Foster-Provost/dp/1449361323/ref=sr_1_1?keywords=data+science+for+business&qid=1666404807&qu=eyJxc2MiOiIxLjg3IiwicXNhIjoiMS41MiIsInFzcCI6IjEuNDYifQ%3D%3D&sr=8-1
Article on Data Science Products: https://www.linkedin.com/pulse/way-create-data-science-products-lessons-learnt-anna-hannemann-phd/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 11, 2022 54:03
Building Data Science Practice - Andrey Shtylenko

We talked about:

Audience Poll Andrey’s background What data science practice is Best DS practice in a traditional company vs IT-centric companies Getting started with building data science practice (finding out who you report to) Who the initiative comes from Finding out what kind of problems you will be solving (Centralized approach) Moving to a semi-decentralized approach Resources to learn about data science practice Pivoting from the role of a software engineer to data scientist The most impactful realization from data science practice Advice for individual growth Finding Andrey online


Links: 

Data Teams book: https://www.amazon.com/Data-Teams-Management-Successful-Data-Focused/dp/1484262271/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 04, 2022 49:49
Large-Scale Entity Resolution - Sonal Goyal

We talked about:

Sonal’s background How the idea for Zingg came about What Zingg is The difference between entity resolution and identity resolution How duplicate detection relates to entity resolution How Sonal decided to start working on Zingg How Zingg works What Zingg runs on Switching from consultancy to working on a new open source solution Why Zingg is open source Open source licensing Working on Zingg initially vs now Zingg’s current and future team Sonal’s biggest current challenge Avoiding problems with entity/identity resolution through database design Identity resolution vs basic joins, data fusions, and fuzzy joins Deterministic matching vs probabilistic machine learning Identity and entity resolution applications for fraud detection Graph algorithms vs classic ML in entity resolution Identity resolution success stories What Sonal would do differently given the chance to start over with Zingg Advice for those seeking to realize their own solution to a data problem Reading suggestion from Sonal Conclusion


Links:

Open-Source Spotlight demo "Zingg": https://www.youtube.com/watch?v=zOabyZxN9b0 Creative Selection: Inside Apple's Design Process During the Golden Age of Steve Jobs book: https://www.amazon.com/Creative-Selection-Inside-Apples-Process/dp/1250194466


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 28, 2022 53:28
From Data Science to DataOps - Tomasz Hinc

We talked about:

Tomasz’s background What Tomasz did before DataOps (Data Science) Why Tomasz made the transition from Data Science to DataOps What is DataOps? How is DataOps related to infrastructure? How Tomasz learned the skills necessary for DataOps Becoming comfortable with the terminal The overlap between DataOps and Data Engineering Suitable/useful skills for DataOps Minimal operational skills for DataOps Similarities between DataOps and Data Science Managers Tomasz’s interesting projects Confidence in results and avoiding going too deep with edge cases Conclusion


Links:

Terminal setup video, 19 minutes long: https://www.youtube.com/watch?v=D2PSsnqgBiw Command line videos, one and a half hours to become somewhat comfy with the terminal: https://www.youtube.com/playlist?list=PLIhvC56v63IKioClkSNDjW7iz-6TFvLwS Course from MIT talking about just that (command line, git, storing secrets): https://missing.csail.mit.edu/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 21, 2022 51:09
Data Science Career Development - Katie Bauer

We talked about:

Katie’s background What is a data scientist? What is a data science manager? Quality of the craft How data leaders promote career growth Supporting senior data professionals Choosing the IC route vs the management route Managing junior data professionals Talking to senior stakeholders and PMs as a junior The importance of hiring juniors What skills do data scientist managers need to get hired? How juniors that are just starting out can set themselves apart from the competition Asking senior colleagues for help and the rubber duck channel The challenges of the head of data Conclusion


Links:

Jobs at Gloss Genius: https://boards.greenhouse.io/glossgenius


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 14, 2022 53:37
From Testing Phones to Managing NLP Projects - Alvaro Navas Peire

We talked about:

Alvaro’s background Working as a QA (Quality Assurance) engineer Transitioning from QA to Machine Learning Gathering knowledge about the ML field Searching for an ML job (improving soft skills and CV) Data science interview skills Zoomcamp projects Zoomcamp project deployment How to not undersell yourself during interviews Alvaro’s experience with interviews during his transition Alvaro’s Zoomcamp notes Alvaro’s coach The importance of mathematical knowledge to a transition into ML Preparing for technical interviews Alvaro’s typical workday Alvaro’s team’s tech stack The importance of a technical background to transitioning into ML


Links:

Alvaro's CV: https://www.dropbox.com/s/89hkt3ug0toqa2n/CV%20nou%20-%20angl%C3%A8s.pdf?dl=0 Github profile: https://github.com/ziritrion LinkedIn profile: https://www.linkedin.com/in/alvaronavas/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 07, 2022 48:37
Responsible and Explainable AI - Supreet Kaur

We talked about:

Supreet’s background Responsible AI Example of explainable AI Responsible AI vs explainable AI Explainable AI tools and frameworks (glass box approach) Checking for bias in data and handling personal data Understanding whether your company needs a certain type of data Data quality checks and automation Responsibility vs profitability The human touch in AI The trade-off between model complexity and explainability Is completely automated AI out of the question? Detecting model drift and overfitting How Supreet became interested in explainable AI Trustworthy AI Reliability vs fairness Bias indicators The future of explainable AI About DataBuzz The diversity of data science roles Ethics in data science Conclusion


Links:

LinkedIn: https://www.linkedin.com/in/supreet-kaur1995/ Databuzz page: https://www.linkedin.com/company/databuzz-club/ Medium Blog Page: https://medium.com/@supreetkaur_66831


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 30, 2022 53:01
Building Data Science Practice - Andrey Shtylenko

We talked about:

Audience Poll Andrey’s background What data science practice is Best DS practice in a traditional company vs IT-centric companies Getting started with building data science practice (finding out who you report to) Who the initiative comes from Finding out what kind of problems you will be solving (Centralized approach) Moving to a semi-decentralized approach Resources to learn about data science practice Pivoting from the role of a software engineer to data scientist The most impactful realization from data science practice Advice for individual growth Finding Andrey online

Links:

Data Teams book: https://www.amazon.com/Data-Teams-Management-Successful-Data-Focused/dp/1484262271/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 30, 2022 49:49
No episode this week

Have a great weekend!

Sep 23, 2022 00:18
Leading Data Research - David Bader

We talked about:

David’s background A day in the life of a professor David’s current projects Starting a school The different types of professors David’s recent papers Similarities and differences between research labs and startups Finding (or creating) good datasets David’s lab Balancing research and teaching as a professor David’s most rewarding research project David’s most underrated research project David’s virtual data science seminars on YouTube Teaching at universities without doing research Staying up-to-date in research David’s favorite conferences Selecting topics for research Convincing students to stay in academia and competing with industry Finding David online

Links: 

David A. Bader: https://davidbader.net/ NJIT Institute for Data Science: https://datascience.njit.edu/ Arkouda: https://github.com/Bears-R-Us/arkouda NJIT Data Science YouTube Channel: https://www.youtube.com/c/NJITInstituteforDataScience


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 16, 2022 58:42
Dataset Creation and Curation - Christiaan Swart

We talked about:

Christiaan’s background Usual ways of collecting and curating data Getting the buy-in from experts and executives Starting an annotation booklet Pre-labeling Dataset collection Human level baseline and feedback Using the annotation booklet to boost annotation productivity Putting yourself in the shoes of annotators (and measuring performance) Active learning Distance supervision Weak labeling Dataset collection in career positioning and project portfolios IPython widgets GDPR compliance and non-English NLP Finding Christiaan online


Links:

My personal blog: https://useml.net/ Comtura, my company: https://comtura.ai/ LI: https://www.linkedin.com/in/christiaan-swart-51a68967/ Twitter: https://twitter.com/swartchris8/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 09, 2022 56:19
Data Mesh 101 - Zhamak Dehghani

We talked about:

Zhamak’s background What is Data Mesh? Domain ownership Determining what to optimize for with Data Mesh Decentralization Data as a product Self-serve data platforms Data governance Understanding Data Mesh Adopting Data Mesh Resources on implementing Data Mesh


Links:

Free 30-day code from O'Reilly: https://learning.oreilly.com/get-learning/?code=DATATALKS22 Data Mesh book: https://learning.oreilly.com/library/view/data-mesh/9781492092384/ LinkedIn: https://www.linkedin.com/in/zhamak-dehghani


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 02, 2022 54:09
Growing Data Engineering Team in a Scale-Up - Mehdi OUAZZA

We talked about:

Mehdi’s background The difference between startup, scale-up and enterprise Hypergrowth Data platform engineers in a scale-up environment What a data platform is and who builds it Managing the fast pace of a scale-up while ensuring personal growth Should a senior data person consider a scale-up or an enterprise? Should a junior data person consider a scale-up or an enterprise? Sourcing talent for hyper-growth companies and developing a community culture Generating content and getting feedback Generalization vs specialization for data engineers in a scale-up The ratio of work between platform building and use case pipelines Being proactive in order to progress to mid or senior level Caps and bass guitars MehdiO DataTV and DataCreators.Club (Mehdi’s YouTube Channel and podcast)


Links:

Mehdi's YouTube channel: https://www.youtube.com/channel/UCiZxJB0xWfPBE2omVZeWPpQ Mehdi's LinkedIn: https://linkedin.com/in/mehd-io/ Mehdi's Medium Blog: https://medium.com/@mehdio Mehdi's data creators club: https://datacreators.club/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 26, 2022 53:13
Lessons Learned About Data & AI at Enterprises - Alexander Hendorf

We talked about:

Alexander’s background The role of Partner at Königsweg Being part of the data and AI community How Alexander became chair at PyData Alexander’s many talks and advice on giving them Explaining AI to managers Why being able to explain machine learning to managers is important The experimental nature of AI and why it’s not a cure-all Innovation requires patience Convincing managers not to use AI or ML when there are better (simpler) solutions The role of MLOps in enterprises Thinking about the mid- and long-term when considering solutions Finding Alexander online


Links: 

Alexander's Twitter: https://twitter.com/hendorf Alexander's LinkedIn: https://www.linkedin.com/in/hendorf/ Königsweg: https://www.koenigsweg.com PyData Südwest: https://www.meetup.com/pydata-suedwest/ PyData Frankfurt: https://www.meetup.com/pydata-frankfurt/ PyConDE & PyData Berlin: https://pycon.de


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 19, 2022 54:17
MLOps Architect - Danny Leybzon

We talked about:

Danny’s background What an MLOps Architect does The popularity of MLOps Architect as a role Convincing an employer that you can wear many different hats Interviewing for the role of an MLOps Architect How Danny prioritizes work with data scientists Coming to WhyLabs when you’ve already got something in production vs nothing in production Market awareness regarding the importance of model monitoring How Danny (WhyLabs) chooses tools ONNX Common trends in tooling setups The most rewarding thing for Danny in ML and data science Danny’s secret for staying sane while wearing so many different hats T-shaped specialist, E-shaped specialist, and the horizontal line The importance of background for the role of an MLOps Architect Key differences for WhyLogs free vs paid Conclusion and where to find Danny online


Links:

Matt Turck: https://mattturck.com/data2021/ AI Observability Platform: https://whylabs.ai/observability Danny's LinkedIn: https://www.linkedin.com/in/dleybz/ Whylabs' website: https://whylabs.ai/ AI Infrastructure Alliance: https://ai-infrastructure.org/


ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 12, 2022 53:31
Decoding Data Science Job Descriptions - Tereza Iofciu

We talked about:

DataTalks.Club intro Tereza’s background Working as a coach Identifying the mismatches between your needs and those of a company How to avoid misalignments Considering what’s mentioned in the job description, what isn’t, and why Diversity and culture of a company Lack of a salary in the job description Ways of doing research about the company where you will potentially work How to avoid a mismatch with a company other than learning from your mistakes Before data, during data, after data (a company’s data maturity level) The company’s tech stack Finding Tereza online


Links: 

Decoding Data Science Job Descriptions (talk): https://www.youtube.com/watch?v=WAs9vSNTza8 Talk at ConnectForward: https://www.youtube.com/watch?v=WAs9vSNTza8 Slides: https://www.slideshare.net/terezaif/decoding-data-science-job-descriptions-250687704 Talk at DataLift: https://www.youtube.com/watch?v=pCtQ0szJiLA Slides: https://www.slideshare.net/terezaif/lessons-learned-from-hiring-and-retaining-data-practitioners


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 05, 2022 49:14
Data Science for Social Impact - Christine Cepelak

We talked about:

Christine’s background Private sector vs Public sector Public policy The challenges of being a community organizer How public policy relates to political science Programs that teach data science for public policy Data science for public policy vs regular data science The importance of ethical data science in public policy How data science in social impact projects differs from other projects Other resources to learn about data science for public policy Challenges with getting data in data science for public policy The problems with accessing public datasets about recycling Christine’s potential projects after Master’s degree Gender inequality in STEM fields Corporate responsibility and why organizations need social impact data scientists What you need to start making a social impact with data science 80,000 hours Other use cases for public policy data science Coffee, Ethics & AI Finding Christine online


Links:

Explore some Data Science for Social Good projects: http://www.dssgfellowship.org/projects/ Bi-weekly Ethics in AI Coffee Chat: https://www.meetup.com/coffee-ethics-ai/ Make a Social Impact with your Job: https://tinyurl.com/80khours Course in Data Ethics: https://ethics.fast.ai/ Data Science for Social Good Berlin: https://dssg-berlin.org/ CorrelAid: https://correlaid.org/ DataKind: https://www.datakind.org/ Christine's LinkedIn: https://www.linkedin.com/in/christinecepelak/ Christine's Twitter: https://twitter.com/CLcep 


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 29, 2022 48:23
Hiring Data Science Talent - Olga Ivina

We talked about:

Olga’s career journey Hiring data scientists now vs 7 years ago The two qualities of an excellent data scientist What makes Alexey do this podcast How Alexey gets the latest information on data science How Olga checks a candidate’s technical skills How to make an answer stand out (showing your depth of knowledge) A strong mathematical background vs a strong engineering background When AutoML will replace the need to have data scientists Should data scientists transition into management? (the importance of communication in an organization) Switching from a data analyst role to a data scientist Attracting female talent in data science Changing a job description to find talent Long gaps in the CV Eierlegende Wollmilchsau


Links:

Olga's LinkedIn: https://www.linkedin.com/in/olgaivina/ Olga's Twitter: https://twitter.com/olgaivina


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 22, 2022 52:45
From Open-Source Maintainer to Founder - Will McGugan

We talked about: 

Will’s background Will’s open source projects S3FS and PyFilesystem Inspiration for open source projects Will as a freelancer Starting a company from a tweet (Rich and Textual) Building in public (Will’s approach to social media) The workforce and roadmap of Textualize.io The importance of working on open source for Textualize employees The workflow of and contributions to Textualize Getting your first thousand GitHub Stars (going viral) Suggestions for those who wish to start in the open-source space Finding Will online


Links: 

Twitter: https://twitter.com/willmcgugan Textualize website: https://www.textualize.io/ Textualize GitHub: https://github.com/textualize


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 15, 2022 49:34
Designing a Data Science Organization - Lisa Cohen

We talked about:

Lisa’s background Centralized org vs decentralized org Hybrid org (centralized/decentralized) Reporting your results in a data organization Planning in a data organization Having all the moving parts work towards the same goals Which approach Twitter follows (centralized vs decentralized) Pros and cons of a decentralized approach Pros and cons of a centralized approach Finding a common language with all the functions of an org Finding the right approach for companies that want to implement data science How many data scientists does a company need? Who do data scientists report huge findings to? The importance of partnering closely with other functions of the org The role of Product Managers in the org and across functions Who does analytics at Twitter (analysts vs data scientists) The importance of goals, objectives and key results Conflicting objectives The importance of research Finding Lisa online


Links:

LinkedIn: https://www.linkedin.com/in/cohenlisa/ Twitter: https://twitter.com/lisafeig Medium: https://medium.com/@lisa_cohen Lisa Cohen's YouTube videos: https://www.youtube.com/playlist?list=PLRhmnnfr2bX7-GAPHzvfUeIEt2iYCbI3w


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 08, 2022 51:23
Developer Advocacy Engineer for Open-Source - Merve Noyan

We talked about:

Merve’s background Merve’s first contributions to open source What Merve currently does at Hugging Face (Hub, Spaces) What it means to be a developer advocacy engineer at Hugging Face The best way to get open source experience (Google Summer of Code, Hacktoberfest, and sprints) The peculiarities of hiring as it relates to code contributions Best resources to learn about NLP besides Hugging Face Good first projects for NLP The most important topics in NLP right now NLP ML Engineer vs NLP Data Scientist Project recommendations and other advice to catch the eye of recruiters Merve on Twitch and her podcast Finding Merve online Merve and Mario Kart


Links:

Hugging Face Course: https://hf.co/course Natural Language Processing in TensorFlow: https://www.coursera.org/learn/natural-language-processing-tensorflow Github ML Poetry: https://github.com/merveenoyan/ML-poetry Tackling multiple tasks with a single visual language model: https://www.deepmind.com/blog/tackling-multiple-tasks-with-a-single-visual-language-model Hugging Face bigscience/T0pp: https://huggingface.co/bigscience/T0pp Pathways Language Model (PaLM) blog: https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 01, 2022 50:58
Data Scientists at Work - Mısra Turp

We talked about:

Mısra’s background What data scientists do Consultant data scientists vs in-house data scientists (and freelancers) Expectations for data scientists The importance of keeping up to date with AI developments (FOMA) How does DALL·E 2 work and should you care? Going to conferences to stay up to date The most pressing issue for data scientists Fighting FOMA and imposter syndrome Knowing when you have enough knowledge of a framework The “best” type of data scientist Being a generalist vs a specialist Advice for entry-level data scientists entering an oversaturated market Catching the eye of big AI companies Choosing a project for your portfolio The importance of having a Ph.D. or Master’s degree in data science Finding Mısra online


Links:

Mısra's YouTube channel: https://www.youtube.com/channel/UCpNUYWW0kiqyh0j5Qy3aU7w Twitter: https://twitter.com/misraturp Hands-on Data Science: Complete Your First Portfolio Project: https://www.soyouwanttobeadatascientist.com/hods 


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 24, 2022 58:02
Freelancing and Consulting with Data Engineering - Adrian Brudaru

We talked about:

Adrian’s background Freelancing vs Employment Risk and occupancy rate in freelancing The scariest part of freelancing Adrian’s first projects Freelancing 5 years later Pay rates in freelancing Acquiring skills while freelancing Working with recruitment agencies and networking Looking for projects and getting clients Freelancing vs consulting Clarity in clients’ expectations (scope of work) Building your network Freelancing platforms Adrian’s data loading prototype Going from freelancing to making your own product (and other investments) The usefulness of a portfolio Introverts in freelancing Is it possible to work for 3 months a year in freelancing? Choosing projects and skill-building strategy (focusing on interests) Freelancing in Berlin Clients’ expectations for freelancers vs employees Working with more than one client at the same time Adrian’s freelance cooperative on Slack Other advice for novice freelancers (networking) Finding Adrian online


Links:

Github: https://github.com/scale-vector Slack Community: https://join.slack.com/t/berlindatacol-szn7050/shared_invite/zt-19dp8msp0-pP4Av3_fVFBbsdrzPROEAg


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jun 17, 2022 52:02
Getting a Data Engineering Job (Summary and Q&A) - Jeff Katz
Jun 10, 2022 48:05
Using Data for Asteroid Mining - Daynan Crull

We talked about:

Daynan’s background Astronomy vs cosmology Applications of data science and machine learning in astronomy Determining signal vs noise What the data looks like in astronomy Determining the features of an object in space Ground truth for space objects Why water is an important resource in the space economy Other useful resources that can be found in asteroids Sources of asteroids The data team at an asteroid mining company Open datasets for hobbyists Mission and hardware design for asteroid mining Partnerships and hires


Links: 

LinkedIn: https://www.linkedin.com/in/daynan/ We're looking for a Sr Data Engineer: https://boards.eu.greenhouse.io/karmanplus/jobs/4027128101?gh_jid=4027128101 Minor Planet Center: https://minorplanetcenter.net/ JPL Horizons has a nice set of APIs for accessing data related to small bodies (including asteroids): https://ssd.jpl.nasa.gov/api.html ESA has NEODyS: https://newton.spacedys.com/neodys IRSA catalog that contains image and catalog data related to the WISE/NEOWISE data (and other infrared platforms): https://irsa.ipac.caltech.edu/frontpage/ NASA also has an archive of data collected from their various missions, including a node related to small bodies: https://pds-smallbodies.astro.umd.edu/ Sub-node directly related to asteroids: https://sbn.psi.edu/pds/ Size, Mass, and Density of Asteroids (SiMDA) is a nice catalog of observed asteroid attributes (and an indication of how small our sample size is!): https://astro.kretlow.de/?SiMDA The source survey data, several are useful for asteroids: Pan-STARRS (https://outerspace.stsci.edu/display/PANSTARRS)


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 03, 2022 53:22
Machine Learning in Marketing - Juan Orduz

We talked about:

Juan’s background Typical problems in marketing that are solved with ML Attribution model Media Mix Model – detecting uplift and channel saturation Changes to privacy regulations and their effect on user tracking User retention and churn prevention A/B testing to detect uplift Statistical approach vs machine learning (setting a benchmark) Does retraining MMM models often improve efficiency? Attribution model baselines Choosing a decay rate for channels (Bayesian linear regression) Learning resource suggestions Bayesian approach vs Frequentist approach Suggestions for creating a marketing department Most challenging problems in marketing The importance of marketing domain knowledge for data scientists Juan’s blog and other learning resources Finding Juan online


Links: 

Juan's PyData talk on uplift modeling: https://youtube.com/watch?v=VWjsi-5yc3w Juan's website: https://juanitorduz.github.io Introduction to Algorithmic Marketing book: https://algorithmic-marketing.online Preventing churn like a bandit: https://www.youtube.com/watch?v=n1uqeBNUlRM


MLOps Zoomcamp: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 27, 2022 52:52
From Academia to Data Analytics and Engineering - Gloria Quiceno
May 20, 2022 48:41
Teaching Data Engineers - Jeff Katz
May 13, 2022 52:33
From Roasting Coffee to Backend Development - Jessica Greene

We talked about: 

Jessica’s background Giving a talk at a tech conference about coffee Jessica’s transition into tech (How to get started) Going from learning to actually making money Landing your first job in tech Does your age matter when you’re trying to get a job? Challenges that Jessica faced in the beginning of her career Jessica’s role at PyLadies Fighting the Imposter Syndrome Generational differences in digital literacy and how to improve it Events organized by PyLadies Jessica’s beginnings at PyLadies (organizing events) Jessica’s experience with public speaking The impact of public speaking on your career Tips for public speaking Jessica’s work at Ecosia Discrimination in the tech industry (and in general) Finding Jessica online


Links:

Ecosia's website: https://www.ecosia.org/ Ecosia's blog: https://blog.ecosia.org/ecosia-financial-reports-tree-planting-receipts/ PyLadies Berlin: https://berlin.pyladies.com/ PyLadies' Meetup: https://meetup.com/PyLadies-Berlin Codecademy: https://www.codecademy.com/ freeCodeCamp: https://www.freecodecamp.org/ Coursera Machine Learning: https://www.coursera.org/learn/machine-learning ML Bookcamp code: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp Google Summer of Code: https://summerofcode.withgoogle.com/ Outreachy website: https://www.outreachy.org/ Alumni Interview: https://railsgirlssummerofcode.org/blog/2020-03-17-alumni-interview-jessica Python Pizza: https://python.pizza/ PyCon: https://pycon.it/en PyCon 2022: https://2022.pycon.de/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 06, 2022 52:56
Recruiting Data Engineers - Nicolas Rassam
Apr 29, 2022 49:43
Storytime for DataOps - Christopher Bergh

We talked about:

Christopher’s background The essence of DataOps Also known as Agile Analytics Operations or DevOps for Data Science Defining processes and automating them (defining “done” and “good”) The balance between heroism and fear (avoiding deferred value) The Lean approach Avoiding silos The 7 steps to DataOps Wanting to become replaceable DataOps is doable Testing tools DataOps vs MLOps The Head Chef at DataKitchen What’s grilling at DataKitchen? The DataOps Cookbook


Links:

DataOps Manifesto website: https://dataopsmanifesto.org/en/ DataOps Cookbook: https://dataops.datakitchen.io/pf-cookbook Recipes for DataOps Success: https://dataops.datakitchen.io/pf-recipes-for-dataops-success DataOps Certification Course: https://info.datakitchen.io/training-certification-dataops-fundamentals DataOps Blog: https://datakitchen.io/blog/ DataOps Maturity Model: https://datakitchen.io/dataops-maturity-model/ DataOps Webinars: https://datakitchen.io/webinars/


Join DataTalks.Club: https://datatalks.club/slack.html  

Our events: https://datatalks.club/events.html

Apr 22, 2022 · 52:11
Machine Learning and Personalization in Healthcare - Stefan Gudmundsson

We talked about:

Stefan’s background Applications of machine learning in healthcare Sidekick Health – gamified therapeutics How is working for King different from Sidekick Health? The rewards systems in gamified apps The importance of building a strong foundation for a data science team The challenges of building an app in the healthcare industry Dealing with ethics issues Sidekick Health’s personalized recommendations and content The importance of having the right approach in A/B tests (strong analytics and good data) The importance of having domain knowledge to work as a data professional in the healthcare industry Making a data-driven company Risks for Sidekick Health Sidekick Health growth strategy Using AI to help people live better lives


Links:

LinkedIn: https://www.linkedin.com/in/stefanfreyrgudmundsson/  Job listings: https://sidekickhealth.bamboohr.com/jobs/

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Apr 15, 2022 · 51:59
Innovation and Design for Machine Learning - Liesbeth Dingemans
Apr 08, 2022 · 55:49
Hacking Your Data Career - Marijn Markus
Apr 01, 2022 · 55:56
Visualising Machine Learning - Meor Amer
Mar 25, 2022 · 52:06
From Math Teacher to Analytics Engineer - Juan Pablo
Mar 18, 2022 · 50:16
From Data Science to Data Engineering - Ellen König
Mar 11, 2022 · 54:15
Becoming a Data Engineering Manager - Rahul Jain
Mar 04, 2022 · 51:24
A/B Testing - Jakob Graff
Feb 25, 2022 · 54:16
Machine Learning System Design Interview - Valerii Babushkin

We talked about:

Valerii’s background Who goes through an ML system design interview System design vs ML system design Preparing for ML system design interviews Machine learning project checklist The importance of defining a goal and ways of measuring it What to do after you set a goal Typical components of an ML system Applying ML systems to real-world problems System design and coding in interviews for new graduates Humans in the validation of model performance


Links:

Valerii's Telegram channel (in Russian): t.me/cryptovalerii

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 18, 2022 · 54:40
Career Coaching - Lindsay McQuade

We talked about:

Lindsay’s background Spiced Academy Career coaching role Reframing your experience Helping with career problems Finding what interests you Tailoring a CV and “spray and pray” Career coaching outside a bootcamp Imposter syndrome After bootcamp Internships Working with recruiters Networking on LinkedIn


Links:

Lindsay's LinkedIn: https://www.linkedin.com/in/lindsay-mcquade/ Impostor questionnaire: http://impostortest.nickol.as/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 11, 2022 · 52:28
Product Management Essentials for Data Professionals - Greg Coquillo

We talked about:

Greg’s background Responsibilities of Data Product Manager Understanding customer journey Interviewing business partners and decision-makers Product sense, product mindset, and product roadmap Working backwards Driving the roadmap Building a roadmap in Excel Measuring success Advice for teams that don’t have a product manager


Links:

Greg's LinkedIn: https://www.linkedin.com/in/greg-coquillo/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 04, 2022 · 53:10
Recruiting Data Professionals - Alicja Notowska

We talked about:

Alicja’s background The hiring process Sourcing and recruiting Managing expectations Making the job description attractive Selecting profiles during sourcing Profile keywords The importance of a Master’s vs a Bachelor’s degree vs a PhD Improving CV Interview with the recruiter Salary expectations Advice for “career changers” Cover letters Data analysts Double Bachelor’s degrees The most difficult part of hiring Coursera courses on the CV Making a good impression on recruiters


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 28, 2022 · 57:01
DataTalks.Club Behind the Scenes - Eugene Yan, Alexey Grigorev

We talked about:

Alexey’s background Being a principal data scientist DataTalks.Club The beginning and growth of DataTalks.Club Sustaining the pace Types of talks Popular and favorite talks Making DataTalks.Club self-sufficient Alexey’s book and course Advice for people starting in data science and staying motivated Not keeping up to date with new tools Staying productive Learning technical subjects and keeping notes Inspiration and idea generation for DataTalks.Club


Links:

https://eugeneyan.com/writing/informal-mentors-alexey-grigorev/ 


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 21, 2022 · 50:29
DTC's minis - From Data Engineering to MLOps - Sejal Vaidya

We don't have a new episode this week, but we have an amazing conversation with Sejal Vaidya from August.


We talked about:

Sejal's background Why transition to ML engineering Three phases of development of a project Why data engineers should get involved in ML Technologies Tips for people who want to transition Soft skills and understanding requirements Helpful resources


Resources:

ML checklist (https://twolodzko.github.io/ml-checklist.html) Machine Learning Bookcamp (https://mlbookcamp.com/) Made with ML course (https://madewithml.com) Full-stack deep learning (https://fullstackdeeplearning.com) Newsletters: mlinproduction, huyenchip.com, jeremyjordan.me, mihaileric.com Sejal's "Production ML" Twitter list (https://twitter.com/i/lists/1212819218959351809)


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 14, 2022 · 16:52
Becoming a Data Science Manager - Mariano Semelman

We talked about:

Mariano’s background Typical day of a manager Becoming a manager Preparing for the transition Balancing projects and assumptions Search and recommendations Dealing with unfamiliar domains Structuring projects Connecting product and data science Rules of Machine Learning CRISP-DM and deployment Giving feedback Dealing with people leaving the team Doing technical work as a manager Dealing with bad hires Keeping up with the industry


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 07, 2022 · 01:05:51
Leading NLP Teams - Ivan Bilan

We talked about:

Ivan’s role at Personio Ivan’s background Studying technical management Managing a software team NLP teams NLP engineers Becoming an NLP engineer Computer vision NLP engineer vs ML engineer Conversational designers Linguistics outside of chatbots When does a team need an NLP engineer or a linguist? The future of NLP NLP pipelines GPT-3 Problems of GPT-3 Does GPT-3 make everything obsolete? What is NLP, actually? Does NLP solve problems better than humans? State of language translation NLP Pandect

Links:

https://github.com/ivan-bilan/The-NLP-Pandect https://github.com/ivan-bilan/The-Engineering-Manager-Pandect https://github.com/ivan-bilan/The-Microservices-Pandect Ivan's presentation about NLP: https://www.youtube.com/watch?v=VRur3xey31s


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Dec 24, 2021 · 59:10
Product Management for Machine Learning - Geo Jolly

We talked about:

Geo’s background Technical Product Manager Building ML platform Working on internal projects Prioritizing the backlog Defining the problems Observability metrics Avoiding jumping into “solution mode” Breaking down the problem Important skills for product managers The importance of a technical background Data Lead vs Staff Data Scientist vs Data PM Approvals and rollout Engineering/platform teams Data scientists’ role in the engineering team Scrum and Agile in data science Transitioning from Data Scientist to Technical PM Books to read for the transition Transitioning for non-technical people Doing user research Quality assurance in ML Advice for supporting an ML team as a Scrum master


Links:

Geo's LinkedIn: https://www.linkedin.com/in/geojolly/ Product School community: https://productschool.com/ http://theleanstartup.com/  Netflix CPO Medium blog: https://gibsonbiddle.medium.com/ Glovo is hiring: https://jobs.glovoapp.com/en/?d=4040726002


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Dec 17, 2021 · 01:02:46
Moving from Academia to Industry - CJ Jenkins
Dec 10, 2021 · 59:03
Advancing Big Data Analytics: Post-Doctoral Research - Eleni Tzirita Zacharatou

We talked about:

Eleni’s background Spatial data analytics Responsibilities of a postdoc Publishing papers Best places for data management papers Differences between postdoc and PhD Helping students become successful Research at the DIMA group Identifying important research directions Reviewing papers Underrated topics in data management Research in data cleaning Collaborating with others Choosing the field for Master’s students Choosing the topic for a Master’s thesis Should I do a PhD? Promoting computer science to female students


Links:

https://www.user.tu-berlin.de/tzirita/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Dec 03, 2021 · 01:00:45
Becoming a Data Product Manager - Sara Menefee

We talked about:

Sara’s background Product designer’s responsibilities Data product manager’s responsibilities Planning with the team Design thinking and product design Data PMs vs regular PMs Skill requirements for Data PMs Going from a product designer to a data product manager Case studies Resources for learning about product management Data PM’s biggest challenge Multitasking and context switching Insights from user interviews Using new, unfamiliar tools Documentation Idea generation Do Data PMs need to know ML?


Links:

Product Management Courses: https://www.lennyrachitsky.com/course and https://www.reforge.com/mastering-product-management Product Management Reading: https://svpg.com/inspired-how-to-create-products-customers-love/ and https://steveblank.com/category/customer-development/ Data Engineering for Noobs: https://www.datacamp.com/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 26, 2021 · 59:02
Data Science Manager vs Data Science Expert - Barbara Sobkowiak

We talked about:

Barbara’s background Do you need a manager or an expert? Technical and non-technical requirements for managers Importance of technical skills for managers Responsibilities and skills of a manager Importance of technical background for managers Getting involved in business development and sales Developing the team Checking team’s work Data science expert Hiring experts Who should we hire first? Can an expert build a team? Data science managers in startups Project management Ensuring that projects provide value Questions before starting a project Women in data science Finding Barbara online General advice


Link:

Barbara's LinkedIn: https://www.linkedin.com/in/barbara-sobkowiak-1a4a9568


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 19, 2021 · 59:39
Ace Non-Technical Data Science Interviews - Nick Singh

We talked about:

Nick’s background Being a career coach Overview of the hiring process Behavioral interviews for data scientists Preparing for behavioral interviews Handling "tricky" questions Project deep dive Business context Pacing, rambling, and honesty “What’s your favorite model?” What if I haven’t worked on a project that brought $1 mln? Different questions for different levels Product-sense interviews Identifying key metrics in unfamiliar domains Tech blogs Cold emailing



Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 12, 2021 · 01:01:45
Becoming a Solopreneur in Data - Noah Gift

We talked about:

Noah’s background Solopreneurship A day of a solopreneur Exponential vs linear work Escaping the office work - digging the tunnel Structuring goals Staying motivated Publishing books Planning out books Writing a book is like preparing to run a marathon Distributed income Getting started as a solopreneur Lowering expenses and adding time The right time to quit full-time Building a network Teaching at universities



Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Nov 05, 2021 · 59:19
Building Business Acumen for Data Professionals - Thom Ives

Links:

https://join.slack.com/t/integratedmlai/shared_invite/zt-r3hpj44k-gfhf1pzIt3jixrATyXCWnQ https://www.linkedin.com/in/thomives/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 29, 2021 · 01:05:30
Conquering the Last Mile in Data - Caitlin Moorman
Oct 22, 2021 · 01:02:02
Similarities and Differences between ML and Analytics - Rishabh Bhargava

We talked about:

Rishabh's background Rishabh’s experience as a sales engineer Prescriptive analytics vs predictive analytics The problem with the term ‘data science’ Is machine learning a part of analytics? Day-to-day of people that work with ML Rule-based systems to machine learning The role of analysts in rule-based systems and in data teams Do data analysts know data better than data scientists? Data analysts’ documentation and recommendations Iterative work - data scientists/ML vs data analysts Analyzing results of experiments Overlaps between machine learning and analytics Using tools to bridge the gap between ML and analytics Do companies overinvest in ML and underinvest in analytics? Do companies hire data scientists while forgetting to hire data analysts? The difficulty of finding senior data analysts Is data science sexier than data analytics? Should ML and data analytics teams work together or independently? Building data teams Rishabh’s newsletter – MLOpsRoundup


Links:

https://mlopsroundup.substack.com/ https://twitter.com/rish_bhargava


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 15, 2021 · 59:39
Building and Leading Data Teams - Tammy Liang

We talked about:

Tammy’s background Being the chief of data First projects as the first data person in a company Initial resistance Expanding the team Role of business analyst Platanomelon’s stack Order for growing the data team Demand forecasting Should analysts know machine learning? Qualifications for the first data person in a company Providing accurate results Receiving insights in a timely manner Providing useful insights Giving ownership to the team Starting as the first data person in a company Data For Future podcast Supporting team members that are stuck Finding Tammy online


Links: 

Tammy's podcast: https://dataforfuture.org/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 08, 2021 · 59:10
What Researchers and Engineers Can Learn from Each Other - Mihail Eric

We talked about:

Mihail’s background NLP and self-driving vehicles Transitioning from academia to the industry Machine learning researchers Finding open-ended problems Machine learning engineers Is data science more engineering or research? What can engineers and researchers learn from one another? Bridging the disconnect between researchers and engineers Breaking down silos Fluid roles Full-stack data scientists Advice to machine learning researchers Advice to machine learning engineers Reading papers Choosing between engineering or research if you’re just starting Confetti.ai


Links:

https://twitter.com/mihail_eric http://confetti.ai/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 01, 2021 · 01:01:44
Introducing Data Science in Startups - Marianna Diachuk

We talked about:

Marianna’s background Being the only data scientist What should already be in the company How much experience do you need Identifying problems Prioritization What should the company already know? First week First month First quarter Managing expectations Solving problems without ML Project timelines Finding the best solution Evaluating performance Getting stuck Communicating with analysts Transitioning from engineering to data science Growing the team Stopping projects Questions for the company From research to production Wrapping up


Links:

Marianna's LinkedIn: https://www.linkedin.com/in/marianna-diachuk-53ba60116/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 24, 2021 · 58:33
Defining Success: Metrics and KPIs - Adam Sroka

We talked about:

Adam’s background Adam’s laser and data experience Metrics and why we care about them Examples of metrics KPIs KPI examples Derived KPIs Creating metrics — grocery store example Metric efficiency North Star metrics Threshold metrics Health metrics Data team metrics Experiments: treatment and control groups Accelerate metrics and timeboxing


Links:

Domino's article about measuring value: http://blog.dominodatalab.com/measuring-data-science-business-value Adam's article about skills useful for data scientists: https://towardsdatascience.com/how-to-apply-your-hard-earned-data-science-skillset-812585e3cc06 Adam's article about standing out: https://towardsdatascience.com/how-to-stand-out-as-a-great-data-scientist-in-2021-3b7a732114a9


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 17, 2021 · 01:02:51
Making Sense of Data Engineering Acronyms and Buzzwords - Natalie Kwong

We talked about:

Natalie’s background Airbyte What is ETL? Why ELT instead of ETL? Transformations How does ELT help analysts be more independent? Data marts and Data warehouses Ingestion DB ETL vs ELT Data lakes Data swamps Data governance Ingestion layer vs Data lake Do you need both a Data warehouse and a Data lake? Airbyte and ELT Modern data stack Reverse ETL Is drag-and-drop killing data engineering jobs? Who is responsible for managing unused data? CDC – Change Data Capture Slowly changing dimension Are there cases where ETL is preferable over ELT? Why is Airbyte open source? The case of Elasticsearch and AWS


Links:

Natalie's LinkedIn: https://www.linkedin.com/in/nataliekwong/ https://airbyte.io/blog/why-the-future-of-etl-is-not-elt-but-el



Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 11, 2021 · 01:00:21
Mastering Algorithms and Data Structures - Marcello La Rocca

We talked about:

Learning algorithms and data structures Resources for learning algorithms and data structures Most important data structures Learning the abstractions Learning algorithms if they aren’t needed at work Common mistakes when using wrong data structures Importance of data structures for data scientists Marcello’s book - Advanced Algorithms and Data Structures Bloom filters Where Bloom filters are useful Approximate nearest neighbours Searching for most similar vectors Knowing frameworks vs knowing internals of data structures Serializing Bloom filters Algorithmic problems in job interviews Important data structures for data scientists and data engineers Learning by doing Importance of compiled languages for data scientists


Links:

Marcello's book: Advanced Algorithms and Data Structures http://mng.bz/eP79 (promo code for 35% discount: poddatatalks21) MIT, Introduction to Algorithms: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/ Algorithms specialization by Tim Roughgarden: https://www.coursera.org/specializations/algorithms


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 03, 2021 · 01:02:11
Chief Data Officer - Marco De Sa

We talked about:

Marco’s background Role of CDO Keeping track of many things Becoming a CDO Strategy vs tactics VP of Data vs CDO How many VPs of Data could there be? Splitting the work between VP and CDO Difference between CTO, CPO, and CDO Breaking down the goals and working backwards from them Assessing if we’re moving in the right direction Dealing with many meetings Being more effective Building the data-driven culture Challenges of working remotely Does CDO need deep technical skills? Importance of MBA The key skills for becoming a CDO Biggest challenges within OLX so far Demonstrating the CDO skills in a job interview Overcoming resistance


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 27, 2021 · 01:01:55
Freelancing in Machine Learning - Mikio Braun

We talked about:

Mikio’s background What Mikio helps with Moving from a full-time job to freelancing Finding clients and importance of a strong network Building a network Initial meetings with clients Understanding what clients need Template for the offer (Million dollar consulting) Deciding on rate type: hourly, daily, per project Taking vacations (and paying twice for them) Avoiding overworking Specializing: consulting as a product Working full-time as a principal vs being a consultant Is the overhead worth it? Getting a new client when you already have a project After freelancing: what’s next? Output of Mikio’s work Learning new things Lessons learned after finding clients Registering as a freelancer in Germany Personal liability of a freelancer Effect of globalization and remote work on consulting Advice for people who want to start freelancing Working full-time and freelancing at the same time


Books: 

Million Dollar Consulting by Alan Weiss Built to Sell by John Warrillow


Links:

Mikio's Twitter: https://twitter.com/mikiobraun Mikio's LinkedIn: https://www.linkedin.com/in/mikiobraun/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 20, 2021 · 01:02:05
Launching a Startup: From Idea to First Hire - Carmine Paolino

We talked about:

Carmine’s background Carmine’s startup FreshFlow Doing user research Design thinking Entrepreneur First Finding co-founders: the “expertise edges” framework The structure of the EF program Coming up with the idea How important is going through a startup accelerator? Finding your first client Finding investors Consequences of having a bad investor Splitting responsibilities between co-founders Hiring The importance of delegating Making work attractive to hires Plans for the future Just-in-time supply chain What would you have done differently? Advice for people starting a startup Don’t focus on skills only Getting motivation Am I ready for a startup? Importance of a business school Advice on finding a co-founder Do I need EF if I already have an idea? Having a prototype before the pitch


Books:

The Mom Test by Rob Fitzpatrick Design Thinking by Robert Curedale

Links:

FreshFlow: https://freshflow.ai/ Carmine's LinkedIn: https://www.linkedin.com/in/carminepaolino Carmine's Twitter: https://twitter.com/paolino


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Aug 13, 2021 · 01:07:28
Approach Learning as ML Project - Vladimir Finkelshtein [mini]

We don't have an episode lined up for this week, but we recorded a small chat with Vladimir some time ago. Enjoy it! 

We talked about:

Vladimir's background Learning by answering questions Don't be afraid of being wrong Winning books Learning random things Approach learning as a machine learning project


Links:

Vladimir on LinkedIn: https://www.linkedin.com/in/vladimir-finkelshtein/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Aug 06, 2021 · 13:56
Humans in the Loop - Lina Weichbrodt

We talked about:

Lina’s background What we need to remember when starting a project (checklists) Make sure the problem is formalized and close to the core business Get the buy-in with stakeholders Building trust with stakeholders Don’t just focus on upsides – ask about concerns Turning a concern into a metric What happens when something goes wrong? Post mortem reporting Apply the 5 whys If a lot of users say it’s a bug – it’s worth investigating Post mortem format Action points Debugging vs explaining the model Are there online versions of checklists? Make sure to log your inputs Talking to end-users and using your own service Your ideas vs Stakeholder ideas Should data practitioners educate the team about data? People skills and ‘dirty’ hacks Where to find Lina


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 30, 2021 · 57:55
Running from Complexity - Ben Wilson

We talked about:

Ben’s background Building solutions for customers Why projects don’t make it to production Why do people choose overcomplicated solutions? The dangers of isolating data science from the business unit The importance of being able to explain things Maximizing chances of making it into production The IKEA effect Risks of implementing novel algorithms If it can be done simply – do that first Don’t become the guinea pig for someone’s white paper The importance of stat skills and coding skills Structuring an agile team for ML work Timeboxing research Mentoring Ben’s book ‘Uncool techniques’ at AI-First companies Should managers learn data science? Do data scientists need to specialize to be successful?


Links:

Ben's book: https://www.manning.com/books/machine-learning-engineering-in-action (get 35% off with code "ctwsummer21")


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jul 23, 2021 · 01:11:43
I Want to Build a Machine Learning Startup! - Elena Samuylova

We talked about:

Elena’s background Why do a startup instead of being an employee? Where to get ideas for your startup Finding a co-founder What should you consider before starting a startup? Vertical startup vs infrastructure startup ‘AI First’ startups Building tools for engineers What skills do you need to start a startup? Startup risks How to be prepared to fail Work-life balance The part-time startup approach Startup investment models No resources and no technical expertise – what to do? Productionizing your services When to hire an expert Talking to people with a problem before solving the problem Starting Elena’s startup, Evidently Elena’s role at Evidently Why is Evidently open source? “People will just copy my open source code. Should I be concerned?” Bottom-up adoption Creating value so that clients engage with your product Is there a difference between countries when creating a startup? Does open source mean the data is safer? When should you hire engineers? Following the market Startups out of genuine interest vs Just for money and for fun


Links:

EvidentlyAI: https://evidentlyai.com/ Elena's LinkedIn: https://www.linkedin.com/in/elenasamuylova/ Elena's Twitter: https://twitter.com/elenasamuylova/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jul 16, 2021 · 58:26
Big Data Engineer vs Data Scientist - Roksolana Diachuk

Links:

Twitter: https://twitter.com/dead_flowers22 LinkedIn: https://www.linkedin.com/in/roksolanadiachuk/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 09, 2021 · 01:01:30
Build Your Own Data Pipeline - Andreas Kretz

We talked about:

Andreas’s background Why data engineering is becoming more popular Who to hire first – a data engineer or a data scientist? How can I, as a data scientist, learn to build pipelines? Don’t use too many tools What is a data pipeline and why do we need it? What is ingestion? Can just one person build a data pipeline? Approaches to building data pipelines for data scientists Processing frameworks Common setup for data pipelines — car price prediction Productionizing the model with the help of a data pipeline Scheduling Orchestration Start simple Learning DevOps to implement data pipelines How to choose the right tool Are Hadoop, Docker, Cloud necessary for a first job/internship? Is Hadoop still relevant or necessary? Data engineering academy How to pick up Cloud skills Avoid huge datasets when learning Convincing your employer to do data science How to find Andreas


Links:

LinkedIn: https://www.linkedin.com/in/andreas-kretz Data engineering cookbook: https://cookbook.learndataengineering.com/ Course: https://learndataengineering.com/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 02, 2021 · 01:01:53
From Software Engineering to Machine Learning - Santiago Valdarrama

We talked about:

Santiago’s background “Transitioning to ML” vs “Adding ML as a skill” Getting over the fear of math for software developers Learning by explaining Seven lessons I learned about starting a career in machine learning Lesson 1 – Take the first step Lesson 2 – Learning is a marathon, not a sprint Lesson 3 – If you want to go quickly, go alone. If you want to go far, go together. Lesson 4 – Do something with the knowledge you gain Lesson 5 – ML is not just math. Math is not scary. Lesson 6 – Your ability to analyze a problem is the most important skill. Coding is secondary. Lesson 7 – You don’t need to know every detail Tools and frameworks needed to transition to machine learning Problem-based learning vs Top-down learning Learning resources Santiago’s favorite books Santiago’s course on transitioning to machine learning Improving coding skills Building solutions without machine learning Becoming a better engineer What is the difference between machine learning and data science? Getting into machine learning - Reiteration Getting past the math


Links:

Santiago's Twitter: https://twitter.com/svpino Santiago's course: https://gumroad.com/svpino#kBjbC Pinned tweet with a roadmap: https://twitter.com/svpino/status/1400798154732212230


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 25, 2021 · 59:43
Analytics Engineer: New Role in a Data Team - Victoria Perez Mola

Links:

https://www.notion.so/Analytics-Engineer-New-Role-in-a-Data-Team-9decbf33825c4580967cf3173eb77177 https://www.linkedin.com/in/victoriaperezmola/


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Conference: https://datatalks.club/conferences/2021-summer-marathon.html

Jun 18, 2021 · 59:55
Data Governance - Jessi Ashdown, Uri Gilad
Jun 11, 2021 · 57:59
What Data Scientists Don’t Mention in Their LinkedIn Profiles - Yury Kashnitsky

We talked about:

Yury’s background Failing fast: Grammarly for science Not failing fast: Keyword recommender Four steps to epiphany Lesson learned when bringing XGBoost into production When data scientists try to be engineers Joining a fintech startup: Doing NLP with thousands of GPUs Working at a Telco company Having too much freedom The importance of digital presence Work-life balance Quantifying impact of failing projects on our CVs Business trips to Perm: don’t work on the weekend What doesn’t kill you makes you stronger


Links:

Yury's course: https://mlcourse.ai/ Yury's Twitter: https://twitter.com/ykashnitsky


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 04, 2021 · 59:56
Becoming a Data-led Professional - Arpit Choudhury

We talked about:

Data-led academy Arpit’s background Growth marketing Being data-led Data-led vs data-driven Documenting your data: creating a tracking plan Understanding your data Tools for creating a tracking plan Data flow stages Tracking events — examples Collecting the data Storing and analyzing the data Data activation Tools for data collection Data warehouses Reverse ETL tools Customer data platforms Modern data stack for growth Buy vs build People we need in the data flow Data democratization Motivating people to document data Product-led vs data-led


Links:

https://dataled.academy/


Join our Slack: https://datatalks.club/slack.html

May 28, 2021 · 01:00:20
How to Market Yourself (without Being a Celebrity) - Shawn Swyx Wang

We talked about:

Shawn’s background and his book Marketing ourselves Components of personal marketing Personal brand for an average developer Picking a domain: what to write about? Being too niche Finding a good niche Learning in public Borrowed platforms vs own platform Starting on social media: Picking what they put down Career transitioning: mutual exchange of value Personal marketing for getting a new job Getting hired through the back door Finding content ideas Marketing yourself in public — summary Open-source knowledge Internal marketing: promoting ourselves at work Signature initiative Public speaking Wrapping up Discount for the coding career book 75% of the engineering ladder criteria are not technical

Links:

Shawn's personal page: https://www.swyx.io/ Twitter: https://twitter.com/swyx Book of the week page: https://datatalks.club/books/20210510-the-coding-career-handbook.html (with a discount for DTC members!)


Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 21, 2021 · 01:02:57
From Physics to Machine Learning - Tatiana Gabruseva

We talked about:

Tatiana’s background 12 career hacks and changing career Hack #1: Change your social circle Hack #2: Forget your fears and stereotypes Hack #3: Forget distractions Hack #4: Don’t overestimate others and don’t underestimate yourself Hack #5: Attention genius Hack #6: Make a team Hack #7: Less is more. Forget about perfectionism Hack #8: Initial creation Hack #9: Find mentors Hack #10: Say “no” Hack #11: Look for failures Hack #12: Take care of yourself Kaggle vs internships and pet projects Resources for learning machine learning Starting with Kaggle Improving focus Astroinformatics How background in Physics is helpful for transitioning Leaving academia Preparing for interviews


Links:

Mock interviews: https://www.pramp.com/ Learning ML: https://www.coursera.org/learn/machine-learning and https://www.coursera.org/specializations/deep-learning Python: https://www.coursera.org/learn/machine-learning-with-python  SQL: https://www.sqlhabit.com/  Practice: https://www.kaggle.com/ MIT 6.006: https://courses.csail.mit.edu/6.006/fall11/notes.shtml Coding: https://leetcode.com/ System design: https://www.educative.io/courses/grokking-the-system-design-interview Ukrainian telegram groups for interview preparation: https://t.me/FaangInterviewChannel,  https://t.me/FaangTechInterview, https://t.me/FloodInterview


Join DataTalks.Club: https://datatalks.club/slack.html

May 14, 2021 · 01:06:33
What I Learned After Interviewing 300 Data Scientists - Oleg Novikov

We talked about:

Oleg’s background Standing out in recruitment process NextRound — a service for free mock interviews Why rejections are generic Starting NextRound — preparing a list of situations Steps in the interview process Read the job description! CV is your landing page Take-home assignments Questions about your past experience Hypothetical case questions Technical rounds Handling rejections What to do after receiving an offer? Do recruiters pay attention to age? Getting a job with a PhD — it’s a cold start problem Should I answer rejection emails? Negotiating when my salary is low Should I apply for jobs that require 5 years of experience? Tricking applicant tracking systems What else Oleg learned after interviewing 300 data scientists How a horse's ass determined the design of a space shuttle


Links:

Oleg's service for interviews: https://nextround.cc/ LinkedIn: https://www.linkedin.com/in/olegnovikov/


Join DataTalks.Club: https://datatalks.club/slack.html

May 07, 2021 · 01:08:36
Effective Communication with Business for Data Professionals - Lior Barak

We talked about:

DataTalks.Club intro Lior’s background Who is a data strategist? Improving communication between business and tech Building trust Putting data and business people together Dealing with pushbacks Building things in the lean way (and growing tomatoes) Starting with ugly code Convincing others to take our code MVP vs development and Hummus Talking to people who can’t code Break down the silos Hummus Hummus places in Berlin Lior’s book: Data is Like a Plate of Hummus Data chaos


Links:

Book: https://www.amazon.com/-/en/Sarah-Mayor/dp/B086L277LZ (can be found on any Amazon store) Company: https://www.taleaboutdata.com/ Podcast: https://podcast.whatthedatapodcast.com/ LinkedIn: https://www.linkedin.com/in/liorbarak/ Twitter: https://twitter.com/liorb


Hummus places in Berlin:

Azzam: https://goo.gl/maps/uCkb3ATc5CVKapDa6 Akkawy: https://g.page/akkawy The Eatery Berlin: https://g.page/theeateryberlin


Join DataTalks.Club: https://datatalks.club/slack.html

Apr 30, 2021 · 57:23
Data Observability - Barr Moses

We covered:

Barr’s background Market gaps in data reliability Observability in engineering Data downtime Data quality problems and the five pillars of data observability Example: job failing because of a schema change Three pillars of observability (good pipelines and bad data) Observability vs monitoring Finding the root cause Who is accountable for data quality? (the RACI framework) Service level agreements Inferring the SLAs from the historical data Implementing data observability Data downtime maturity curve Monte Carlo: data observability solution Open source tools Test-driven development for data Is data observability cloud agnostic? Centralizing data observability Detecting downstream and upstream data usage Getting bad data vs getting unusual data


Links:

Learn more about Monte Carlo: https://www.montecarlodata.com/ The Data Engineer's Guide to Root Cause Analysis: https://www.montecarlodata.com/the-data-engineers-guide-to-root-cause-analysis/ Why You Need to Set SLAs for Your Data Pipelines: https://www.montecarlodata.com/how-to-make-your-data-pipelines-more-reliable-with-slas/ Data Observability: The Next Frontier of Data Engineering: https://www.montecarlodata.com/data-observability-the-next-frontier-of-data-engineering/ To get in touch with Barr, ping her in the DataTalks.Club group or use barr@montecarlodata.com


Join DataTalks.Club: https://datatalks.club/slack.html

Apr 23, 2021 · 01:01:44
Shifting Career from Analytics to Data Science - Andrada Olteanu

We talked about:

Andrada’s background Recommended courses Kaggle and StackOverflow Doing notebooks on Kaggle Projects for learning data science Finding a job and a mentor with Kaggle’s help The process for looking for a job Main difficulties of getting a job Project portfolio and Kaggle Helpful analytical skills for transitioning into data science Becoming better at coding Learning by imitating Is doing masters helpful? Getting into data science without a masters Kaggle is not just about competitions The last tip: use social media


Links:

https://www.kaggle.com/andradaolteanu  https://twitter.com/andradaolteanuu https://www.linkedin.com/in/andrada-olteanu-3806a2132/


Join DataTalks.Club: https://datatalks.club/slack.html

Apr 16, 2021 · 01:02:34
Transitioning from Project Management to Data Science - Ksenia Legostay

We talked about:

Ksenia’s background Data analytics vs data science Skills needed for data analytics and data science Benefits of getting a masters degree Useful online courses How project management background can be helpful for the career transition Which skills do PMs need to become data analysts? Going from working with spreadsheets to working with Python Kaggle Productionizing machine learning models Getting experience while studying Looking for a job Gap between theory and practice Learning plan for transitioning Last tips and getting involved in projects


Links:

Notes prepared by Ksenia with all the info: https://www.notion.so/ksenialeg/DataTalks-Club-7597e55f476040a5921db58d48cf718f


Join DataTalks.Club: https://datatalks.club/slack.html

Apr 09, 2021 · 01:03:32
Building Online Tech Communities - Demetrios Brinkmann
Apr 02, 2021 · 01:13:52
DataOps 101 - Lars Albertsson

We talked about:

Lars’ career Doing DataOps before it existed What is DataOps Data platform Main components of the data platform and tools to implement it Books about functional programming principles Batch vs Streaming Maturity levels Building self-service tools MLOps vs DataOps Data Mesh Keeping track of transformations Lake house


Links:

https://www.scling.com/reading-list/ https://www.scling.com/presentations/


Join DataTalks.Club: https://datatalks.club/slack.html

Mar 26, 2021 · 01:09:26
The Essentials of Public Speaking for Career in Data Science - Ben Taylor

We talked about:

Ben’s background AI evangelism Ben’s first experiences speaking in public Becoming a great speaker  Key Takeaways and Call to Action Making a good introduction Being Remembered Writing a talk proposal for conferences Landing a keynote Good topics to start talks on Pitching a solution talk to meetup organizers Top public speaking skill to acquire Book recommendations


Join DataTalks.Club: https://datatalks.club/slack.html

Mar 19, 2021 · 01:08:48
New Roles and Key Skills to Monetize Machine Learning - Vin Vashishta

We discussed monetization roles and the capabilities people need to move into those roles.

The key roles are ML Researcher, ML Architect, and ML Product Manager.


We talked about:

Vin's career journey

What does it mean to "monetize machine learning" Important monetization metrics Who should we have on the team to make a project successful Machine Learning Researcher (applied and scientist) - background, responsibilities, and needed skills Developing new categories  The best recipe for a startup: angry users + data scientists What research actually is ML Product Manager - background, responsibilities, and needed skills How product managers can actually manage all their responsibilities (and they have a lot of them!) ML Architect - background, responsibilities, and needed skills Path to becoming an architect  How should we change education to make it more effective  Important product metrics


And more! 


Links:

https://twitter.com/v_vashishta https://linkedin.com/in/vineetvashishta https://databyvsquared.com/



Join DataTalks.Club: https://datatalks.club/slack.html

Mar 12, 2021 · 01:19:52
Personal Branding - Admond Lee Kin Lim

We talked about: 

Admond's career journey What is a personal brand How Admond started being active online Publishing on Medium and LinkedIn Idea generation process and tools Other platforms Podcasts Offline presence 1x1 meetings Speaking at conferences Having confidence to publish Selling online courses Personal values Admond's course

And many other things

Links:

https://twitter.com/admond1994 https://linkedin.com/in/admond1994 https://buzzsumo.com https://feedly.com/ https://lunchclub.com/ https://thelead.io/data-scientist-personal-brand-toolkit?utm_medium=instructor&utm_source=admond


Join DataTalks.Club: https://datatalks.club/slack.html

Mar 05, 2021 · 01:13:14
The ABC’s of Data Science - Danny Ma

Did you know that there are 3 different types of data scientists? A for analyst, B for builder, and C for consultant - we discuss the key differences between each one and some learning strategies you can use to become A, B, or C.


We talked about:


Inspirations for memes  Danny's background and career journey The ABCs of data science - the story behind the idea Data scientist type A - Analyst  Skills, responsibilities, and background for type A Transitioning from data analytics to type A data scientist (that's the path Danny took) How can we become more curious? Data scientist B - Builder  Responsibilities and background for type B Transitioning from type A to type B Most important skills for type B Why you have to learn more about cloud  Data scientist type C - consultant Skills, responsibilities, and background for type C Growing into the C type Ideal data science team Important business metrics Getting a job - easier as type A or type B? Looking for a job without experience Two approaches for job search: "apply everywhere" and "apply nowhere" Are bootcamps useful? Learning path to becoming a data scientist Danny's data apprenticeship program and "Serious SQL" course  Why SQL is the most important skill R vs Python Importance of Masters and PhD


Links:


Danny's profile on LinkedIn: https://linkedin.com/in/datawithdanny Danny's course: https://datawithdanny.com/ Trailer: https://www.linkedin.com/posts/datawithdanny_datascientist-data-activity-6767988552811847680-GzUK/ Technical debt paper: https://proceedings.neurips.cc/paper/2015/hash/86df7dcfd896fcaf2674f757a2463eba-Abstract.html


Join DataTalks.Club: https://datatalks.club/slack.html

Feb 26, 2021 · 01:25:49
Translating ML Predictions Into Better Real-World Results with Decision Optimization - Dan Becker

We talked about:

How we make decisions with machine learning What is decision optimization  Specifying the decision function Emulation for making the best decisions Decision optimization and reinforcement learning Getting started with decision optimization Trends in the industry


Links:

https://datatalks.club/people/danbecker.html https://www.decision.ai/


Join DataTalks.Club: https://datatalks.club/slack.html

Feb 19, 2021 · 55:44
Feature Stores: Cutting through the Hype - Willem Pienaar

We covered:

What is a feature store Problems it solves When to use a feature store  When not to use a feature store The main components When a team should start using a feature store 


Links:

Feast: https://feast.dev/ https://www.tecton.ai/blog/what-is-a-feature-store/  https://docs.greatexpectations.io/en/latest/reference/core_concepts.html


Join DataTalks.Club: https://datatalks.club

Feb 12, 2021 · 01:01:06
The Rise of MLOps - Theofilos Papapanagiotou

We covered:

What is MLOps The difference between MLOps and ML Engineering Getting into MLOps Kubeflow and its components, ML Platforms Learning Kubeflow DataOps 

And other things


Links:

Microsoft MLOps maturity model: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-maturity-model Google MLOps maturity levels: https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning MLOps roadmap 2020-2025: https://github.com/cdfoundation/sig-mlops/blob/master/roadmap/2020/MLOpsRoadmap2020.md Kubeflow website: https://www.kubeflow.org/ TFX Paper: https://research.google/pubs/pub46484/


Join DataTalks.Club: https://datatalks.club

Feb 05, 2021 · 01:02:51
Getting Started with Open Source - Vincent Warmerdam

We talked about 

open source getting started with open source convincing your employer to contribute to open source public speaking the checklist for open source projects the role of research advocate

And many more things!


Links from Vincent:

https://www.youtube.com/watch?v=68ABAU_V8qI&t=975s&ab_channel=PyData https://www.youtube.com/watch?v=kYMfE9u-lMo&t=958s&ab_channel=PyData https://koaning.io/projects.html https://calmcode.io/ https://makenames.io/ https://koaning.github.io/clumper/api/clumper.html


Join DataTalks.Club: https://datatalks.club

Jan 29, 2021 · 01:02:47
Developer Advocacy for Data Science - Elle O'Brien

We talked about developer advocacy for data science.


We covered

The role of a developer advocate The skills needed for the job and the responsibilities How to become a developer advocate


You can find Elle on:

Twitter: https://twitter.com/DrElleOBrien LinkedIn: https://linkedin.com/in/drelleobrien DVC's YouTube channel: https://www.youtube.com/channel/UC37rp97Go-xIX3aNFVHhXfQ


Join DataTalks.Club: https://datatalks.club

Jan 23, 2021 · 55:36
The Importance of Writing in a Tech Career - Eugene Yan

We talk about blogging and technical writing. We cover:

Why should we write online? What should we write about? Writing at work: Design documents, wikis, etc. The writing process (also at work)


Eugene's website:  eugeneyan.com 

Follow Eugene on Twitter: https://twitter.com/eugeneyan

Suggest topics: https://eugeneyan.com/topic-poll/


Join DataTalks.Club: https://datatalks.club

Jan 15, 2021 · 57:24
Mentoring - Rahul Jain
Dec 25, 2020 · 56:12
Standing out as a Data Scientist - Luke Whipps
Dec 18, 2020 · 01:09:26
Building a Data Science Team - Dat Tran
Dec 11, 2020 · 58:45
Processes in a Data Science Project - Alexey Grigorev
Dec 04, 2020 · 31:33
Roles in a data team - Alexey Grigorev

We talked about:

- different roles in a data team: product managers, data analysts, data engineers, data scientists, ML engineers, MLOps engineers
- their responsibilities
- the skills they need


DataTalks.Club is the place to talk about data. Join our community: https://datatalks.club

Nov 21, 2020 · 42:45