
127. Matthew Stewart - The emerging world of ML sensors
Towards Data Science
09/21/22
•41m
Today, we live in the era of AI scaling. It seems like everywhere you look, people are pushing to make large language models larger or more multimodal, and leveraging ungodly amounts of processing power to do it.
But although that’s one of the defining trends of the modern AI era, it’s not the only one. At the far opposite extreme from the world of hyperscale transformers and giant dense nets is the fast-evolving world of TinyML, where the goal is to pack AI systems onto small edge devices.
My guest today is Matthew Stewart, a deep learning and TinyML researcher at Harvard University, where he collaborates with the world’s leading IoT and TinyML experts on projects aimed at getting small devices to do big things with AI. Recently, along with his colleagues, Matt co-authored a paper that introduced a new way of thinking about sensing.
The idea is to tightly integrate machine learning and sensing on one device. For example, today we might have a sensor like a camera embedded on an edge device, and that camera would have to send data about all the pixels in its field of view back to a central server that might take that data and use it to perform a task like facial recognition. But that’s not great because it involves sending potentially sensitive data — in this case, images of people’s faces — from an edge device to a server, introducing security risks.
So instead, what if the camera’s output was processed on the edge device itself, so that all that had to be sent to the server was much less sensitive information, like whether or not a given face was detected? These systems — where edge devices harness onboard AI, and share only processed outputs with the rest of the world — are what Matt and his colleagues call ML sensors.
ML sensors really do seem like they’ll be part of the future, and they introduce a host of challenging ethical, privacy, and operational questions that I discussed with Matt on this episode of the TDS podcast.
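The ML sensor idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the design from Matt's paper: the `MLSensor` class, the `SensorReading` type, and the brightness-based `toy_detector` are all hypothetical names invented here, and the "detector" is a stand-in for a real on-device model. The point it shows is the interface boundary: raw pixels go into the sensor, and only a small processed result comes out.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical frame type: a grid of grayscale pixel values (0-255).
Frame = List[List[int]]


@dataclass(frozen=True)
class SensorReading:
    """The only data that leaves the device: a single processed result."""
    face_detected: bool


class MLSensor:
    """Illustrative wrapper that runs a model on-device.

    The raw frame is passed to the detector and then discarded; callers
    only ever receive a SensorReading, never the pixels themselves.
    """

    def __init__(self, detector: Callable[[Frame], bool]) -> None:
        self._detector = detector

    def read(self, frame: Frame) -> SensorReading:
        return SensorReading(face_detected=self._detector(frame))


def toy_detector(frame: Frame, threshold: int = 128) -> bool:
    """Placeholder for a real model: flags frames with high mean brightness."""
    pixels = [p for row in frame for p in row]
    if not pixels:
        return False
    return sum(pixels) / len(pixels) > threshold


sensor = MLSensor(toy_detector)
reading = sensor.read([[200, 220], [210, 190]])
print(reading.face_detected)
```

In a real deployment the detector would be a compact model running on the device's own hardware, but the shape of the interface is the same: the server sees `face_detected`, not the image.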
***
Intro music:
Artist: Ron Gelinas
Track Title: Daybreak Chill Blend (original mix)
Link to Track: https://youtu.be/d8Y2sKIgFWc
***
Chapters:
3:20 Special challenges with TinyML
9:00 Most challenging aspects of Matt’s work
12:30 ML sensors
21:30 Customizing the technology
24:45 Data sheets and ML sensors
31:30 Customers with their own custom software
36:00 Access to the algorithm
40:30 Wrap-up
Previous Episode

126. JR King - Does the brain run on deep learning?
September 14, 2022
•55m
Deep learning models — transformers in particular — are defining the cutting edge of AI today. They’re based on an architecture called an artificial neural network, as you probably already know if you’re a regular Towards Data Science reader. And if you are, then you might also already know that as their name suggests, artificial neural networks were inspired by the structure and function of biological neural networks, like those that handle information processing in our brains.
So it’s a natural question to ask: how far does that analogy go? Today, deep neural networks can master an increasingly wide range of skills that were historically unique to humans — skills like creating images, or using language, planning, playing video games, and so on. Could that mean that these systems are processing information like the human brain, too?
To explore that question, we’ll be talking to JR King, a CNRS researcher at the Ecole Normale Supérieure, affiliated with Meta AI, where he leads the Brain & AI group. There, he works on identifying the computational basis of human intelligence, with a focus on language. JR is a remarkably insightful thinker, who’s spent a lot of time studying biological intelligence, where it comes from, and how it maps onto artificial intelligence. And he joined me to explore the fascinating intersection of biological and artificial information processing on this episode of the TDS podcast.
***
Chapters:
- 2:30 What is JR’s day-to-day?
- 5:00 AI and neuroscience
- 12:15 Quality of signals within the research
- 21:30 Universality of structures
- 28:45 What makes up a brain?
- 37:00 Scaling AI systems
- 43:30 Growth of the human brain
- 48:45 Observing certain overlaps
- 55:30 Wrap-up
Next Episode

128. David Hirko - AI observability and data as a cybersecurity weakness
September 28, 2022
•49m
Imagine you’re a big hedge fund, and you want to go out and buy yourself some data. Data is really valuable for you — it’s literally going to shape your investment decisions and determine your outcomes.
But the moment you receive your data, a cold chill runs down your spine: how do you know your data supplier gave you the data they said they would? From your perspective, you’re staring down 100,000 rows in a spreadsheet, with no way to tell if half of them were made up, or maybe more, for that matter.
This might seem like an obvious problem in hindsight, but it’s one most of us haven’t even thought of. We tend to assume that data is data, and that 100,000 rows in a spreadsheet is 100,000 legitimate samples.
The challenge of making sure you’re dealing with high-quality data, or at least that you have the data you think you do, is called data observability, and it’s surprisingly difficult to solve for at scale. In fact, there are now entire companies that specialize in exactly that — one of which is Zectonal, whose co-founder Dave Hirko will be joining us for today’s episode of the podcast.
Dave has spent his career understanding how to evaluate and monitor data at massive scale. He did that first at AWS in the early days of cloud computing, and now through Zectonal, where he’s working on strategies that allow companies to detect issues with their data — whether they’re caused by intentional data poisoning, or unintentional data quality problems. Dave joined me to talk about data observability, data as a new vector for cyberattacks, and the future of enterprise data management on this episode of the TDS podcast.
***
Chapters:
- 0:00 Intro
- 3:00 What is data observability?
- 10:45 “Funny business” with data providers
- 12:50 Data supply chains
- 16:50 Various cybersecurity implications
- 20:30 Deep data inspection
- 27:20 Observed direction of change
- 34:00 Steps the average person can take
- 41:15 Challenges with GDPR transitions
- 48:45 Wrap-up