Christian Covington of Pfafftown, NC; Karl Knopf of London, Ontario, Canada; Shubhankar Mohapatra of Kolkata, India; and Shufan Zhang of Anhui, China, are all graduate students at the David Cheriton School of Computer Science at the University of Waterloo. Together they form team Goose, which is interested in practical applications of privacy research and in gaining experience developing solutions to real-world problems.
To create usable data while guaranteeing data contributor privacy, team Goose designed a synthetic data generation pipeline that seeks to maintain key data characteristics while making only low-sensitivity queries to private records. To do this, they construct profiles of representative record distributions, which they call archetypes, from the public data. Using these archetypes together with a small number of private queries, the team constructs a set of partial synthetic records. With additional pre- and post-processing that draws on the public data and on domain knowledge, Goose fills in the missing attribute values to produce a complete synthetic data set. The chart below depicts this process:
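As a rough illustration of the idea in code, the sketch below builds archetypes from public records, asks one low-sensitivity private query (a noisy count per archetype), and then samples synthetic records from the public profiles. This is not team Goose's implementation: the function names, the choice of a single grouping attribute (`key`) as the archetype, the Laplace mechanism, and the toy records are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def build_archetypes(public_records, key):
    # Hypothetical archetype definition: one archetype per value of `key`,
    # profiled by the remaining attributes seen in the PUBLIC data only.
    profiles = defaultdict(list)
    for rec in public_records:
        rest = tuple(sorted((k, v) for k, v in rec.items() if k != key))
        profiles[rec[key]].append(rest)
    return dict(profiles)


def noisy_counts(private_records, key, archetypes, epsilon, rng):
    # The only private query: a histogram over archetypes. Adding or removing
    # one record changes one count by 1, so Laplace noise with scale
    # 1/epsilon is used (assumed mechanism, not confirmed by the source).
    true = Counter(rec[key] for rec in private_records)
    return {a: true.get(a, 0) + laplace_noise(1.0 / epsilon, rng)
            for a in archetypes}


def synthesize(archetypes, counts, key, rng):
    # Fill in the remaining attributes by sampling from the public profile.
    out = []
    for a, c in counts.items():
        for _ in range(max(0, round(c))):
            out.append({key: a, **dict(rng.choice(archetypes[a]))})
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy records, invented purely for this example.
    public = [{"region": "A", "job": "nurse"}, {"region": "A", "job": "clerk"},
              {"region": "B", "job": "farmer"}]
    private = [{"region": "A", "job": "nurse"}] * 5 + [{"region": "B", "job": "farmer"}] * 3
    arch = build_archetypes(public, "region")
    counts = noisy_counts(private, "region", arch, epsilon=1.0, rng=rng)
    for rec in synthesize(arch, counts, "region", rng):
        print(rec)
```

In this sketch the private data is touched only by the per-archetype count, so the privacy cost is a single histogram query; every other attribute value in the output comes from public records.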
To contact this team, please email Christian Covington at ccovington [at] uwaterloo.ca.