Our DNA can tell us a lot about who our relatives are. Recently, several companies including 23andMe and AncestryDNA now provide services where they collect DNA from individuals and then match the DNA to a database of the DNA of other people to identify relatives. Relatives are then informed by the company that their DNAs match. Our lab was interested if we can perform this same type of service but without involving a company and more generally without involving any third party. One way to do this would be to have individuals obtain their own DNA sequences and then share their DNA sequences directly with each other. Unfortunately, DNA sequences are considered medical information and it is inappropriate to share them in this way.
Through a collaboration between our lab and the UCLA cryptography group, we recently published a paper that combines cryptography and genetics which describes an approach for identifying relatives without compromising privacy. Our paper was published in the April 2014 issue of Genome Research. The key ideas is that individuals release an encrypted version of their DNA information. Another individual can download this encrypted version and then use their own DNA information to try to decrypt it. If the are related to each other, their DNA sequences will be close enough that the decryption will work telling the individual that they are related. While if they are unrelated, the decryption will fail. What is important in this approach is that individuals who are not related do not obtain any information about each other’s DNA sequences.
The intuitive idea behind the approach is the following. Individuals each release a copy of their own genomes encrypted with a key that is based on the genome itself. Other users then download this encrypted information and try to decrypt it using their own genomes as the key. The encryption scheme is designed to allow for decryption if the encrypting key and decrypting key are “close enough”. Since related individuals share a portion of their genomes, we set the threshold for “close enough” to be exactly the threshold of relatedness that we want to detect.
Our approach uses a relatively new type of cryptographic technique called Fuzzy Extractors which were pioneered by our co-authors on this study, Amit Sahai and Rafail Ostrovsky. This type of technique allows for encryption and decryption with keys that match inexactly. Students in our group who were involved are Dan He, Nick Furlotte, Farhad Hormozdiari, and Jong Wha (Joanne) Joo. This research was supported by National Science Foundation grant 1065276.
The full citation of our paper is here:
Identifying genetic relatives without compromising privacy. Journal Article
In: Genome Res, 2014, ISSN: 1549-5469.