Andrea Burbank - Data Communicator at Pinterest
ABSTRACT
Four years at Pinterest show how building a data-driven culture both arises from and contributes to software engineering practices. The talk will touch on building data infrastructure.
Data Science
- Who is coming to our services?
- Kafka logging
- Daily active users
- Weekly Active Users / Monthly Active Users / Yearly Active Users
- Instead of holding all the data in a database for a whole year, create derived tables: e.g. a DAU table that holds one row for each user who visited the site that day. Also create WAU and MAU tables.
- Spam filtering
- Prefetch of pages will mess with your data.
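
A minimal sketch of the derived-table idea above, not Pinterest's actual pipeline. It assumes raw events arrive as hypothetical (day, user_id, is_prefetch) tuples, drops prefetch hits before counting, and uses a 7-day window for WAU and a 28-day window for MAU; the function names, field names, window lengths, and sample data are all illustrative assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def build_dau(events):
    """Collapse raw events into a DAU table: one entry per user per day.

    `events` is an iterable of (day, user_id, is_prefetch) tuples
    (a hypothetical schema). Prefetch hits are skipped so they do not
    inflate the active-user counts.
    """
    dau = defaultdict(set)
    for day, user_id, is_prefetch in events:
        if is_prefetch:
            continue
        dau[day].add(user_id)  # a set keeps one entry per user per day
    return dict(dau)


def rolling_active(dau, end_day, window_days):
    """Union the daily sets over the window ending on `end_day` (inclusive)."""
    users = set()
    for offset in range(window_days):
        users |= dau.get(end_day - timedelta(days=offset), set())
    return users


# Made-up sample events, for illustration only.
events = [
    (date(2015, 2, 20), "alice", False),
    (date(2015, 2, 20), "alice", False),  # repeat visit, same day
    (date(2015, 3, 1), "bob", True),      # prefetch, ignored
    (date(2015, 3, 5), "bob", False),
    (date(2015, 3, 7), "carol", False),
]

dau = build_dau(events)
wau = rolling_active(dau, date(2015, 3, 7), 7)    # assumed 7-day window
mau = rolling_active(dau, date(2015, 3, 7), 28)   # assumed 28-day window
print(len(wau), len(mau))
```

WAU and MAU are computed from the small per-day sets rather than by rescanning the raw event log, which is the point of keeping derived tables.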
Data chaos: table proliferation
- Data clarity: Thrift
Avoid data chaos: well-structured, well-understood data.
Spectrum of certainty: correlation vs causation
- A/B testing
Doing the right thing should be easy. Doing the wrong thing should be hard.
Counting is hard