Plugging the metadata leaks
in the Ethereum ecosystem
2 November 2018, Devcon 4, Prague
Péter Szilágyi
Ethereum Team Lead
Web 2.0
Diving into Etherscan
Easily check your balance / tokens
Balance, conversion, transactions
Tokens, token-contracts, events
Comment on popular accounts
What happens behind the scenes?
You 🠒 Cloudflare 🠒 Etherscan 🠒 Google Analytics, Disqus
Etherscan + HTTP referrers + external services = 💔
HTTP referrer headers track web request origins
Etherscan 🠒 Google Analytics and Disqus
Disqus 🠒 External integrated services
Disqus social services: Facebook, Google, Twitter
Disqus tracking services: Arbor (marketplace for people-data), AppNexus (marketplace for advertising), LiveRamp (identity resolution service), Narrative I/O (data trading platform), ZetaHub (AI marketing platform)
Disqus embedded services: YouTube, Vimeo, Twitter (tweets), Facebook (status, video, photo), Instagram (photo only), Giphy, Imgur, Google Maps, Soundcloud, Vine
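The referrer leak can be made concrete with a minimal sketch of what any embedded third party could do with the Referer header it receives. The function name and the sample address below are illustrative assumptions, not anything a specific service is known to run.

```python
import re
from urllib.parse import urlparse

# An Ethereum address is "0x" followed by 40 hexadecimal characters.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def address_from_referrer(referrer):
    """Return the first address-like string in the referrer URL path, or None.

    Hypothetical helper: it models what a third-party service could extract
    from the Referer header sent along with an embedded resource request.
    """
    path = urlparse(referrer).path
    match = ADDRESS_RE.search(path)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Placeholder address, used only for illustration.
    sample = "https://etherscan.io/address/0x" + "ab" * 20
    print(address_from_referrer(sample))
```

Combined with the requester's IP address and any existing tracking cookie, this is enough to tie an account to a browsing profile.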
Plug the referral leaks
Providers must protect their users
Do not integrate legacy web 2.0 services
Do not use identifying information in URLs
Do use strict Referrer-Policy HTTP restrictions
Users must protect themselves
Use ad, social and tracker blocking extensions*
Use mobile browsers that permit blocking trackers*
* YouTube & co. are neither, but will still happily track you
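On the provider side, a strict Referrer-Policy can be attached to every response. The following is a minimal sketch as WSGI middleware, assuming the site is served by a WSGI application; the middleware and demo application names are hypothetical. The `no-referrer` value tells the browser to omit the Referer header entirely.

```python
def referrer_policy_middleware(app, policy="no-referrer"):
    """Wrap a WSGI app so every response carries a Referrer-Policy header."""

    def wrapped(environ, start_response):
        def start_with_policy(status, headers, exc_info=None):
            # Drop any existing policy header, then append the strict one.
            kept = [(k, v) for k, v in headers if k.lower() != "referrer-policy"]
            kept.append(("Referrer-Policy", policy))
            return start_response(status, kept, exc_info)

        return app(environ, start_with_policy)

    return wrapped


def demo_app(environ, start_response):
    """Tiny illustrative application that serves a fixed page."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>account page</body></html>"]


application = referrer_policy_middleware(demo_app)
```

Less strict values such as `same-origin` keep referrers for internal navigation while still hiding URLs from external services.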
Plug the geolocation leaks
Diagram: You; Cloudflare; Etherscan; MyCrypto / MyEtherWallet; Infura; Metamask / DApp
Would you publicize your location and Ethereum address?
Cloudflare can link your IP and Ethereum wallet
Etherscan can link your IP and Ethereum wallet
Infura can link your IP and Ethereum wallet
Anonymize your IP address with Tor!
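Why a hosted endpoint can make this link is visible in the request itself: a balance query carries the account in its body, while the connection carries the IP address. A minimal sketch of such a JSON-RPC payload follows; the address is a placeholder and nothing is sent over the network.

```python
import json


def get_balance_request(address, block_number, request_id=1):
    """Build the JSON-RPC body for an eth_getBalance call.

    The account address travels in plain sight inside "params", so whoever
    terminates the connection sees it next to the caller's IP address.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_getBalance",
        # Block numbers are encoded as hexadecimal quantities.
        "params": [address, hex(block_number)],
    })


if __name__ == "__main__":
    placeholder_address = "0x" + "ab" * 20  # illustrative only
    print(get_balance_request(placeholder_address, 6585725))
```

Routing the connection through Tor hides the IP address from the endpoint, but the address in the payload stays visible.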
Web 3.0
Diving into discovery
Phases of connecting to Ethereum
Discover peers via public bootnodes
Randomly connect peers to stabilize
Nodes maintain a Kademlia routing table
Every node picks a random ID (pubkey)
Track ID-to-IP mappings in prefix groups
Do random lookups while table is sparse
Integrate any unknown contacting nodes
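The routing table idea can be sketched in a few lines. This is a simplified model, assuming raw 256-bit integers as node IDs and unbounded buckets; it is not the actual Ethereum discovery implementation, and all names are illustrative.

```python
import os

ID_BITS = 256


def random_node_id():
    """Pick a random 256-bit identifier (stand-in for a public key)."""
    return int.from_bytes(os.urandom(ID_BITS // 8), "big")


def bucket_index(own_id, other_id):
    """Return the number of leading bits the two IDs share.

    Kademlia measures distance as XOR; the longer the shared prefix,
    the smaller the distance.
    """
    distance = own_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not store itself")
    return ID_BITS - distance.bit_length()


class RoutingTable:
    """Track ID-to-IP mappings grouped by shared prefix length."""

    def __init__(self, own_id):
        self.own_id = own_id
        self.buckets = {}  # shared prefix length -> {node_id: ip}

    def add(self, node_id, ip):
        index = bucket_index(self.own_id, node_id)
        self.buckets.setdefault(index, {})[node_id] = ip


if __name__ == "__main__":
    table = RoutingTable(random_node_id())
    table.add(random_node_id(), "203.0.113.7")  # documentation-range IP
    print(sorted(table.buckets))
```

Note that every entry pairs a long-lived ID with a current IP address, which is exactly the data the following slides are concerned about.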
Portable devices + Ethereum nodes + DHT = 💔
Ethereum DHT
You - Last month - San Francisco
You - Last week - Berlin
You - This week - Prague
You - Next week - Lisbon
You - Next month - Shanghai
Random node 1, Random node 2, Random node 3, ... Random node N
Node IDs and node IPs are public knowledge
Accurate historical movement tracking 😱
By anyone, anywhere in the world 😭😭😭
Nodes should switch to ephemeral IDs (?)
What about bootnodes? Stable nodes? Vipnode?
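How little effort the tracking takes can be shown with a minimal sketch. It assumes an observer has collected (timestamp, node ID, IP) sightings by crawling the DHT; the data and names are made up for illustration.

```python
from collections import defaultdict


def movement_history(sightings):
    """Group sightings by node ID into a time-ordered list of (timestamp, ip).

    With a stable node ID, the result is a movement log for that node.
    With a fresh ID per session, every history has a single entry and
    the sightings can no longer be linked through the ID.
    """
    history = defaultdict(list)
    for timestamp, node_id, ip in sorted(sightings):
        history[node_id].append((timestamp, ip))
    return dict(history)


if __name__ == "__main__":
    # Fabricated example data using documentation-range IP addresses.
    stable = [
        ("2018-10-01", "node-a", "192.0.2.10"),
        ("2018-10-25", "node-a", "198.51.100.20"),
        ("2018-11-02", "node-a", "203.0.113.30"),
    ]
    print(movement_history(stable))
```

Mapping each IP address to a city with a geolocation database would then turn the log into a travel history.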
Diving into light clients
Sync and verify only headers ⇔ state root cannot be verified
Assume good connectivity ⇔ converges on the correct chain
State retrieved on-demand ⇔ limit traffic to interesting data
Light clients + on-demand retrieval + limited servers = 💔
Mist - eth_getBalance(your_addr, 6585725)
Mist - eth_getBalance(your_addr, 6585726)
Mist - eth_getBalance(your_addr, 6585727)
Mist - eth_getBalance(your_addr, 6585728)
Mist - eth_getBalance(your_addr, 6585729)
Mist - eth_getBalance(your_addr, 6585730)
Light client
Light server #1, Light server #2, Light server #3
Light clients only retrieve useful data
Servers can map IPs to Ethereum accounts
Stable servers can track geographical movements
Anonymize your IP address with Tor (?)
What about embedded devices? Connectivity and validity guarantees?
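The server-side view can be sketched as follows. This is a simplified model, assuming a light server keeps a log of (client IP, requested account, block number) entries; it does not reproduce the light client wire protocol, and the threshold and names are illustrative.

```python
from collections import Counter, defaultdict


def likely_accounts(request_log, min_requests=3):
    """Map each client IP to the accounts it asks about repeatedly.

    A light client only fetches data its user cares about, so an account
    requested block after block is very likely the user's own.
    """
    counts = defaultdict(Counter)
    for client_ip, account, _block_number in request_log:
        counts[client_ip][account] += 1
    return {
        client_ip: [acc for acc, n in per_ip.most_common() if n >= min_requests]
        for client_ip, per_ip in counts.items()
    }


if __name__ == "__main__":
    your_addr = "0x" + "ab" * 20  # placeholder address
    log = [("198.51.100.4", your_addr, block) for block in range(6585725, 6585731)]
    print(likely_accounts(log))
```

A full node never produces such a log on anyone else's machine, because it downloads all state regardless of which accounts its user looks at.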
Takeaway nuggets
Full nodes are the most powerful anonymizers in crypto, because they make everyone look the same and act the same. Every shortcut is an exchange of privacy for convenience!
Privacy on Ethereum is currently worse than the surveillance paradise of the legacy web. We have the knowledge to fix it (Tor, I2P); let's prioritize and fund it before inertia sets in!
Privacy is not up to users to get right, because they won't know any better. It is our job as platform-, dapp- and decentralized system developers to protect them from ourselves!
Thank you