Screens are not a Panacea
Today, adopting self-custody is a journey, not something you casually pick up over a morning coffee -- and it's far from the default way to hold bitcoin. Self-custody tools can be unfamiliar and complicated, often favoring configurability and details for experienced users over usability for newcomers. Among the commonly-accepted practices that contribute to this: using tiny screens and buttons to mitigate attacks, often ineffectively.
In practice, using a screen today means squinting and pecking through prompt after prompt, comparing alphanumeric strings and other details between the screen on the hardware wallet and the screen on whatever device you're using with it. When this process results in a final decision to approve a transaction, the hardware wallet uses its private key to sign the transaction so that it will be accepted by the bitcoin network onto the public ledger.
The irreversible nature of these transactions makes verification more important than with typical custodial banking transactions. After all, if you send the wrong amount of money or send it to the wrong address, only the recipient can give it back to you - and only if they have the keys to the wallet you sent it to.
With this in mind, we want to help people avoid mistakes and theft as they adopt self-custody. Are you sure this is the right amount? Did you copy and paste the right address? Where did you get this address in the first place? How do you know malware on your phone didn't modify the transaction details or ask the hardware wallet to do anything that you didn't intend?
Screens on hardware wallets can help with some aspects of these questions -- but we think in practice they are often used in a way that doesn't achieve the protections customers hope to get from them. And, we think there are other ways Bitkey can help people protect themselves from common - and not so common - mistakes and threats.
Background: Using the Keys
No matter where they’re stored, secret keys need to be used in order to actually move bitcoin. And using the bitcoin protocol requires an internet connection: inputs from the bitcoin network are part of constructing a transaction and submitting it for inclusion in the public ledger.
This means that many sensitive operations are conducted on internet-connected computers – and that even if you store your keys on a hardware wallet, you must transfer information to your hardware wallet from an internet-connected device in order to actually use the keys. To know what money to move, even an offline device like a hardware wallet needs information from the internet about which not-yet-spent funds (UTXOs) on the public ledger are yours, what address you want to send the funds to (the ‘destination address’), and sometimes other inputs like what address any remaining change should go to (the ‘change address’).
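To make those inputs concrete, here is a minimal Python sketch of the information an internet-connected device might hand to an offline signer, along with the arithmetic that determines the change. The class and field names (`Utxo`, `SpendRequest`, `change_sats`) and the placeholder addresses are illustrative assumptions, not Bitkey's or any particular wallet's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utxo:
    txid: str        # transaction that created this unspent output
    vout: int        # index of the output within that transaction
    value_sats: int  # amount held by the output, in satoshis


@dataclass(frozen=True)
class SpendRequest:
    # Hypothetical container for what the offline signer needs to be told.
    utxos: tuple
    destination_address: str
    amount_sats: int
    fee_sats: int
    change_address: str

    def change_sats(self) -> int:
        # Whatever is not sent to the destination or paid as a fee
        # must go back to the sender's change address.
        total_in = sum(u.value_sats for u in self.utxos)
        change = total_in - self.amount_sats - self.fee_sats
        if change < 0:
            raise ValueError("selected UTXOs do not cover amount plus fee")
        return change


request = SpendRequest(
    utxos=(Utxo("aa" * 32, 0, 60_000), Utxo("bb" * 32, 1, 50_000)),
    destination_address="<destination address>",
    amount_sats=100_000,
    fee_sats=1_500,
    change_address="<change address>",
)
print(request.change_sats())  # 8500
```

Every field in this structure originates on, or passes through, an internet-connected device, which is why the integrity of each one matters.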
We expect that a broad audience of bitcoin owners will often rely on electronic means of communication to share addresses, like sharing a destination address via SMS or copying one off of a website. So these critical inputs to a transaction will transit the internet and multiple computer platforms before a hardware wallet could display them to the user. What threats can compromise the integrity of a transaction along the way? And how can these users protect transaction information from malicious modification? Let’s take a look.
What does a screen actually do?
Offline platforms give customers more control over when keys are used — and in an ideal setting, over how they are used, too.
Hardware wallets typically achieve the first property - giving customers control over when keys are used - by remaining offline and by limiting the mechanisms by which they can communicate with internet-connected devices.
Hardware wallets often aim for the second property - giving customers control over how keys are used - by incorporating a small screen and asking customers to verify the internals of a bitcoin transaction or a smart contract. Unfortunately, in many practical situations, we think the way people use hardware wallets does not provide the protection from malware that they are hoping for, and we think this gap will widen as more people move to self-custody. Let’s take a look at why.
Sending Money: Screens are Hard to Use Correctly
The destination address - the address to which funds will be sent in a transaction - is generated by the recipient’s device and must transit many systems without manipulation: from the recipient’s wallet software, through the recipient’s device platform, across the internet, through the sender’s device platform, to the sender’s wallet software, and ultimately to the hardware wallet. The journey the destination address travels typically looks like this:
How can a customer protect against manipulation of the destination address between their wallet (the sender) and the recipient’s wallet? A common response is to compare the destination address shown on the sender device (e.g. a phone or desktop) to the one shown on the screen of a hardware wallet, like this:
In practice, this protection won’t shield a broad audience from threats like mobile malware or desktop malware. Why? If the sender’s internet-connected device is sufficiently compromised, then the destination address can be poisoned before it ever gets to the hardware wallet. When that happens, comparing the address shown on the hardware wallet screen to the address shown on the sender’s phone or desktop will always result in a match – because the process is comparing garbage to garbage. The same holds for (A), (B), (E), and most cases of (C) in the figure below:
Simple address comparison between sender device and hardware wallet screens can reliably detect (D), which is manipulation of the address in transit between phone and hardware wallet using physical access and expertise (e.g. over Bluetooth, USB, NFC, QR code, SD card, etc). We think the risk of leaving coins on a custodial platform outweighs the risk posed by (D). Plus, we think a stronger protection is to use modern cryptography to secure the NFC channel, rather than asking customers to do it themselves – and that’s something we’re interested in building.
That leaves us with scenario (C) in Figure 3 above. Let’s break this category down:
We can see in the table above that a hardware wallet screen will not help you if you're comparing it to something that is already poisoned. To protect against these attacks, we need something stronger: a comparison to an independent source. Read on to learn about how we're considering providing this comparison for Bitkey customers when they're sending funds -- but first, let's take a look at how screens are typically used in the process of receiving funds.
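The difference between a naive screen-to-screen comparison and a comparison to an independent source can be sketched in a few lines of Python. This is a toy simulation under stated assumptions: the address strings are placeholders, and `compromised_phone` is a hypothetical stand-in for malware that swaps the address before it is displayed or forwarded.

```python
RECIPIENT_ADDRESS = "address-generated-by-recipient"
ATTACKER_ADDRESS = "address-controlled-by-attacker"


def compromised_phone(received_address: str) -> str:
    # Hypothetical malware: replaces the address as soon as it arrives,
    # before it is shown on the phone or passed to the hardware wallet.
    return ATTACKER_ADDRESS


# The sender's phone receives the real address, but malware poisons it.
phone_shows = compromised_phone(RECIPIENT_ADDRESS)

# The hardware wallet can only display what the phone gave it.
hardware_shows = phone_shows

# Naive check: phone screen vs. hardware wallet screen.
naive_match = phone_shows == hardware_shows

# Independent check: hardware wallet screen vs. a copy of the address
# obtained without passing through the compromised phone.
independent_match = RECIPIENT_ADDRESS == hardware_shows

print(naive_match)        # True: garbage matches garbage
print(independent_match)  # False: the substitution is detected
```

The naive comparison passes because both screens are fed by the same poisoned source; only the comparison against an independently obtained copy notices the swap.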
Receiving Money: Screens Help, But Aren’t the Only Option
We've covered why address comparisons for sending funds over the internet often aren't providing protection. What about the other direction - receiving funds? Consider a customer transferring to their hardware wallet from another wallet, such as an exchange-hosted one – or providing an address to another person who’s going to send them money. The customer needs to generate a receive address to transfer or receive the money to. If that address comes from a mobile application, including the hardware wallet’s companion app, then mobile malware on the customer's phone could replace this receive address with an attacker-controlled address, in order to steal funds when the transfer occurs:
How can we help customers gain confidence that mobile malware hasn’t replaced the receive address they generated in order to send money to themselves?
One answer is to take the address from a screen on a hardware wallet. For single-signature wallets, this generally means that a new address must be transferred from the hardware wallet screen every time the customer wants to receive funds or that a previous, trusted address must be reused, reducing privacy. Once the customer has generated the receive address on their hardware wallet, they must provide it securely to the sender, who must enter it into some system in order to sign a transaction.
In what ways might that happen in practice?
- Receiver provides the address directly from their hardware wallet's screen to the sender, e.g. in person or over a trusted communication channel
- Receiver transfers the address from their hardware wallet (e.g. via QR code, USB, NFC, typing it in) to their phone or desktop computer, and then provides the address electronically to the sender
In the first case, the receiver can know with high confidence that the sender has the correct address. In the second case, the receiver can verify the receive address against the hardware wallet screen. But if the mobile or desktop computer they transferred it to is sufficiently compromised, the address can still be poisoned afterwards! To defend against this, customers could share the receive address by transferring it to multiple platforms which can each be used to provide it to the sender, who can check that they match.
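The multi-platform check described above amounts to the sender accepting an address only when every copy they received agrees. A minimal Python sketch, assuming a hypothetical `addresses_agree` helper and placeholder channel names and addresses:

```python
def addresses_agree(copies_by_channel: dict) -> bool:
    # Require at least two channels, and require that every copy is identical.
    distinct = set(copies_by_channel.values())
    return len(copies_by_channel) >= 2 and len(distinct) == 1


# Both channels delivered the same address: the sender can proceed.
clean = {"sms": "receive-address-A", "email": "receive-address-A"}

# One channel was poisoned on a compromised device: the copies disagree.
poisoned = {"sms": "receive-address-A", "email": "attacker-address-X"}

print(addresses_agree(clean))     # True
print(addresses_agree(poisoned))  # False
```

This only helps if the channels run through different platforms; two channels on the same compromised device could both be poisoned in the same way.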
This isn't the only answer, though. Let's take a look at how we could enable customers to verify their receive addresses and transaction details using one of the other components in Bitkey's 2-of-3 multi-signature setup: the server.
Potential Approach: Using the Server as a Screen
Bitkey hardware can cryptographically sign information, the customer’s phone can forward that signature to Bitkey servers, and Bitkey servers can verify the signature in order to guarantee that the information was not modified in transit by the customer's phone, even if their phone is compromised by malware.
The reverse is also true -- Bitkey servers can sign information that can be verified by Bitkey hardware, ensuring that a compromised phone didn't tamper with information sent from server to hardware.
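The tamper-detection property can be illustrated with a short Python sketch. Important assumption: the sketch uses an HMAC with a shared demo secret from the standard library as a stand-in for the signature scheme; a real design would presumably use asymmetric signatures so that only the hardware holds the signing key. All function names, the key, and the message format are hypothetical.

```python
import hashlib
import hmac

# ASSUMPTION: demo secret standing in for the hardware's signing key.
# With real asymmetric signatures the server would hold only a public key.
HARDWARE_KEY = b"demo-key-for-illustration-only"


def hardware_sign(message: bytes) -> bytes:
    # Runs on the hardware: authenticates the exact bytes of the message.
    return hmac.new(HARDWARE_KEY, message, hashlib.sha256).digest()


def server_verify(message: bytes, tag: bytes) -> bool:
    # Runs on the server: recomputes the tag and compares in constant time.
    expected = hmac.new(HARDWARE_KEY, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def honest_phone(message: bytes, tag: bytes):
    # Forwards the message and tag unchanged.
    return message, tag


def compromised_phone(message: bytes, tag: bytes):
    # Hypothetical malware: rewrites the message but cannot produce a new tag.
    return message.replace(b"recipient", b"attacker"), tag


message = b"receive-address:recipient-address-0001"
tag = hardware_sign(message)

honest_ok = server_verify(*honest_phone(message, tag))
tampered_ok = server_verify(*compromised_phone(message, tag))

print(honest_ok)    # True
print(tampered_ok)  # False
```

The phone still carries the message, but it can no longer change it without the server noticing. The same pattern applies in the other direction, with the server signing and the hardware verifying.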
With the ability to send data securely between hardware and server, we can potentially use the server to do something the hardware cannot: communicate detailed transaction information like destination address, fees, and amounts directly to users.
That is, after receiving information like transaction details and receive addresses from the hardware, the server could communicate it, along with guidance about how to safely verify it, to a customer via channels like email, a webpage, or even to one of their Trusted Contacts in the Social Recovery feature. Some ways we might build this are outlined below.
We’re considering how we could provide an optional protection that wouldn't require a broad audience to use small hardware wallet screens or take their hardware out every time they want to receive money. Specifically, one optional protection we could provide is to have Bitkey hardware involved in generating the address, and Bitkey servers involved in verifying and communicating it back to the customer to verify:
How does the above work? Bitkey hardware maintains knowledge of which mobile application key and which server key it can collaborate with to sign bitcoin transactions, allowing the hardware to generate a receive address without depending on inputs from the mobile application. The hardware can't communicate addresses directly to a human, but when used with the phone, it can communicate with Bitkey servers, which can in turn communicate with the wallet owner or whoever they’re providing the address to. Customers who want the additional confidence of a hardware-verified receive address could share a link with their intended recipient, who could view the receive address on a Bitkey-hosted page and check that it is indeed the address they want to be paid to ahead of the money being sent. In addition to providing strong protection against mobile malware, we think this approach could also help facilitate sharing addresses safely. If we built this feature, would you use it?
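To illustrate why knowing the three public keys is enough to produce a receive address, here is a hedged Python sketch of a 2-of-3 multisig output script. It assumes a pay-to-witness-script-hash (P2WSH) output with lexicographically sorted keys (the ordering described in BIP 67); the source does not state which script type Bitkey uses, the keys below are placeholder bytes rather than valid public keys, and real wallets would also derive fresh child keys per address, which is omitted here.

```python
import hashlib

OP_2, OP_3, OP_CHECKMULTISIG = 0x52, 0x53, 0xAE


def two_of_three_witness_script(pubkeys) -> bytes:
    # Builds: OP_2 <pk> <pk> <pk> OP_3 OP_CHECKMULTISIG
    if len(pubkeys) != 3 or any(len(pk) != 33 for pk in pubkeys):
        raise ValueError("expected three 33-byte compressed public keys")
    script = bytes([OP_2])
    for pk in sorted(pubkeys):            # deterministic order (assumed)
        script += bytes([len(pk)]) + pk   # 0x21 pushes the next 33 bytes
    return script + bytes([OP_3, OP_CHECKMULTISIG])


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    # Witness version 0, then a push of the 32-byte SHA-256 of the script.
    return bytes([0x00, 0x20]) + hashlib.sha256(witness_script).digest()


# Placeholder keys for illustration only (not valid curve points).
hardware_pk = b"\x02" + b"\x11" * 32
app_pk = b"\x03" + b"\x22" * 32
server_pk = b"\x02" + b"\x33" * 32

from_hardware = p2wsh_script_pubkey(
    two_of_three_witness_script([hardware_pk, app_pk, server_pk])
)
from_server = p2wsh_script_pubkey(
    two_of_three_witness_script([server_pk, hardware_pk, app_pk])
)

print(from_hardware == from_server)  # True
print(len(from_hardware))            # 34
```

Because the output is a deterministic function of the three keys, the hardware and the server can each compute it independently and compare results, without trusting whatever the mobile application claims the address is.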
The goal of verifying transaction details when sending funds is to gain confidence that the offline key will be used only to move the funds you want to move and only to the intended address - that is, that the transaction details were not modified by the internet-connected computer you had to use to interact with the hardware wallet. By including Bitkey servers in the process of verification, we could enable customers to view these details on an independent device, with very high confidence that the details are exactly what the hardware will sign if subsequently authorized by the customer using their hardware. Let’s take a look at how this concept could work:
One limitation of this approach is that it cannot be used if Bitkey servers are unreachable. In order to ensure that customers can always move money in that situation, the hardware would need to allow customers to unlock their hardware and use a specific gesture (e.g. long press) to bypass the protection, if they'd opted into it or if we had enabled it by default.
Furthermore, this type of protection requires careful technical design and review, which we will engage in and publish if we take this approach and find that it can meet our bar for user experience. In the meantime, what pros and cons do you see with the concept?
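As a rough illustration of the sending flow described in this section, the following Python sketch models the hardware's decision to sign. It is a sketch under assumptions, not a design: the function names, the JSON canonicalization, and the field names are all hypothetical, and a real implementation would carry a server signature that the hardware verifies rather than a bare digest.

```python
import hashlib
import json


def details_digest(details: dict) -> str:
    # Canonical encoding so the same details always hash to the same value.
    canonical = json.dumps(details, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hardware_may_sign(tx_details, confirmed_digest, unlocked, bypass_gesture=False):
    if not unlocked:
        return False
    if bypass_gesture:
        # Escape hatch for when servers are unreachable (e.g. a long press).
        return True
    # Sign only if the details presented by the phone are exactly the
    # details the customer reviewed on an independent device.
    return confirmed_digest is not None and confirmed_digest == details_digest(tx_details)


intended = {"destination": "recipient-address", "amount_sats": 100000, "fee_sats": 1500}
tampered = dict(intended, destination="attacker-address")

# Digest of the details the customer reviewed via the server.
confirmed = details_digest(intended)

print(hardware_may_sign(intended, confirmed, unlocked=True))   # True
print(hardware_may_sign(tampered, confirmed, unlocked=True))   # False
print(hardware_may_sign(intended, None, unlocked=True, bypass_gesture=True))  # True
```

The bypass branch shows the trade-off named above: it keeps funds movable when servers are down, but a transaction signed that way does not get the independent check.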
Easier to Use Means More Self-Custody Owners
In earlier sections, we covered how a screen can be useful in several cases:
- For providing receive addresses to a sender either in person or via multiple platforms/channels that the sender can compare against
- For verifying outgoing transaction details, specifically on a device that was not used to supply the hardware wallet with the transaction details in the first place
We think the above aren’t likely to happen for a broad audience - it's hard to know how to verify correctly and very easy to do a naive comparison between screens that doesn't mitigate most threats. We also think that reducing the complexity involved in hardware wallet management is one of the biggest levers we can use to bring more people to self-custody and put them in control of their money. Here's why:
- Small screens are hard to use, and often accompanied by user input mechanisms that are hard to use, too. We want to make self-custody more accessible -- and enabling customers to use their familiar phone screen as much as possible is a key enabler.
- Screens add cost and complexity to hardware wallets, including introducing more ways for the hardware to fail. We want to drive the cost of self-custody down, and the reliability up.
- Using a screen correctly in order to provide protection is hard – customers have to know to compare their hardware wallet screen to an independent source on outgoing transfers, and to share receive addresses through multiple channels. When many customers don’t do this in practice, they incur a user experience cost that doesn’t come with the benefits they were hoping for. And the user experience cost also means that it’s more likely the feature will be misused, or won’t be used at all. We want large security benefits for low user experience cost – not the other way around.
- Correctly verifying properties of transactions is likely to become more technical, not less (e.g. as the community builds on top of Bitcoin over time, for example to provide functionality over the Lightning network). We want to build solutions that hide this complexity from people where we can, while still giving them a set of options that allow them to add more verification friction where they really want it.
We'd love your feedback on anything in this post, and specifically on two big questions:
- This post suggests that the practical benefit of tiny screens doesn't outweigh their complexity, and that this complexity is one of the factors that keeps a broader audience from adopting self-custody. As a result, if we provide an alternative to a hardware wallet screen, we would do so in an optional, server-assisted way. Do you think we're making the right choice? Why or why not?
- We also outlined a 'server as a screen' concept that relies on secure communication from Bitkey hardware to Bitkey servers (via the Bitkey mobile app) to enable customers to more safely share the receive addresses the hardware generates, and more safely verify what transactions the hardware key is used to sign.
- What strengths and weaknesses do you see with this ‘server as a screen’ concept?
- What channels (webpage, email, in-app push notifications, SMS, other) would you want us to use to communicate with you about transaction details and receive addresses - and why?
- Should we build this? Why or why not?
We think the most important next step toward an open, global payments network is to get more coins off of exchanges and into customers' hands, and we’re focused on how to make self-custody a safe and easy alternative to giving up ownership and control to a third party. We’re much more likely to be successful at this mission with your help - so please consider giving us feedback on our direction and the questions above at email@example.com, on Twitter, or on nostr.