Building a browser is hard; building a good browser inevitably requires gathering a lot of data to make sure that things that work in the lab also work in the field. But as soon as you gather data, you have to make sure you protect user privacy. We're always looking at ways to improve the privacy of our data collection, and lately we've been experimenting with a really promising technique called Prio.
Currently, all the major browsers do more or less the same thing for data reporting: the browser collects a bunch of statistics and sends them back to the browser maker for analysis; at Mozilla, we call this system Telemetry. The challenge with building a Telemetry system is that the data is sensitive. In order to make sure that we are safeguarding our users' privacy, Mozilla has built a set of transparent data practices which determine what we can collect and under what conditions. For particularly sensitive kinds of data, we ask users to opt in to the collection and make sure that the data is handled safely.
We recognize that this requires users to trust Mozilla: to trust that we won't misuse their data, that the data won't be exposed in a breach, and that Mozilla won't be compelled by another party to provide access to the data. Ideally, we would prefer that users not have to just trust Mozilla, especially when we're collecting data that is sufficiently sensitive to require an opt-in. That's why we're exploring new ways to preserve users' privacy and security without giving up access to the data we need to build the best products and services.
Obviously, not collecting any data at all would be ideal for privacy, but it also blinds us to real issues in the field, making it difficult for us to build features, including privacy features, that we know our users want. This is a common problem, and there has been quite a bit of work on what's called "privacy-preserving data collection", including systems developed by Google (RAPPOR, PROCHLO) and Apple. Each of these techniques has advantages and disadvantages that are beyond the scope of this post, but suffice it to say that this is an area of very active work.
Recently, we've been experimenting with one such system: Prio, developed by Professor Dan Boneh and PhD student Henry Corrigan-Gibbs of Stanford University's Computer Science department. The basic insight behind Prio is that for most purposes we don't need to collect individual data, but rather only aggregates. Prio, which is open source, lets Mozilla collect aggregate data without collecting anyone's individual data. It does this by having the browser break the data up into two "shares", each of which is sent to a different server. Individually the shares don't tell you anything about the data being reported, but together they do. Each server collects the shares from all the clients and adds them up. If the servers then take their summed values and put them together, the result is the sum of all the users' values. As long as at least one server is honest, there's no way to recover the individual values.
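The share-splitting idea above can be sketched in a few lines. This is a toy illustration of additive secret sharing only, with made-up values; the real Prio protocol additionally uses zero-knowledge proofs so the servers can reject malformed submissions, which is omitted here.

```python
import random

# Arithmetic is done modulo a large prime, so a single share is
# indistinguishable from random noise.
PRIME = 2**61 - 1

def split_into_shares(value):
    """Split one client's value into two shares, one per server.

    share_a is uniformly random; share_b is chosen so that
    (share_a + share_b) mod PRIME recovers the original value.
    """
    share_a = random.randrange(PRIME)
    share_b = (value - share_a) % PRIME
    return share_a, share_b

# Each client reports a private value, e.g. a 0/1 "feature used" flag.
client_values = [1, 0, 1, 1, 0]

server_a_total = 0
server_b_total = 0
for v in client_values:
    a, b = split_into_shares(v)
    # Each server only ever sees its own share of each report.
    server_a_total = (server_a_total + a) % PRIME
    server_b_total = (server_b_total + b) % PRIME

# Combining the two running totals reveals the aggregate sum,
# but never any individual client's value.
aggregate = (server_a_total + server_b_total) % PRIME
print(aggregate)  # 3, the sum of client_values
```

Note that neither server's total on its own leaks anything: each is a sum of uniformly random field elements, which is why compromising a single server reveals nothing about individual reports.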
We've been working with the Stanford team to test Prio in Firefox. In the first stage of the experiment we want to make sure that it works efficiently at scale and produces the expected results. This is something that should just work, but as I said above, building systems is a lot harder in practice than in theory. In order to test our integration, we're doing a simple deployment in which we take non-sensitive data that we already collect via Telemetry and collect it via Prio as well. This lets us prove out the technology without interfering with our existing, careful handling of sensitive data. This piece is in Nightly now and already reporting back. In order to process the data, we've integrated support for Prio into our Spark-based telemetry analysis pipeline, so it automatically talks to the Prio servers to compute the aggregates.
Our initial results are promising: we've been running Prio in Nightly for six weeks, collected over 3 million data values, and after fixing a small bug where we were getting bogus results, our Prio results match our Telemetry results perfectly. Processing time and bandwidth also look good. Over the next few months we'll be doing further experiments to verify that Prio continues to produce the right answers and works well with our existing data pipeline.
Most importantly, in a production deployment we need to ensure that user privacy doesn't depend on trusting a single party. This means distributing trust by selecting a third party (or parties) that users can trust. This third party would never see any individual user data, but they would be responsible for keeping us honest by ensuring that we never see any individual user data either. To that end, it's important to select a third party that users can trust; we'll have more to say about this as we firm up our plans.
We don't yet have concrete plans for what data we'll protect with Prio and when. Once we've validated that it's working as expected and provides the privacy guarantees we require, we can move forward with applying it where it's needed most. Expect to hear more from us in the future, but for now it's exciting to take the first step towards privacy-preserving data collection.
*Testing Privacy-Preserving Telemetry with Prio*, by Robert Helmer