handouts/ho07.tex
changeset 444 aea1d40cf1ba
parent 443 67d7d239c617
child 448 48d0a9890adc
443:67d7d239c617 444:aea1d40cf1ba
   324 also looked up diseases that her friends had, which had
   324 also looked up diseases that her friends had, which had
   325 nothing to do with her. Any rational analysis of her query
   325 nothing to do with her. Any rational analysis of her query
   326 data must therefore have concluded that the lady was on her
   326 data must therefore have concluded that the lady was on her
   327 deathbed, while she was actually very much alive and kicking.
   327 deathbed, while she was actually very much alive and kicking.
   328 
   328 
       
   329 In 2016, Yahoo released what is so far the largest machine learning
       
   330 dataset to the research community. It includes approximately
       
   331 13.5 TB of data representing around 100 billion events from
       
   332 anonymized user-news item interactions, collected by recording
       
   333 the activity of about 20 million users from February 2015 to May
       
   334 2015. Yahoo's gracious goal is to promote independent research
       
   335 in the fields of large-scale machine learning and recommender
       
   336 systems. It remains to be seen whether this data will really
       
   337 be used only for that purpose.
       
   338 
   329 \subsubsection*{Differential Privacy}
   339 \subsubsection*{Differential Privacy}
   330 
   340 
   331 Differential privacy is one of the few methods that try to
   341 Differential privacy is one of the few methods that try to
   332 achieve forward privacy. The basic idea is to add appropriate
   342 achieve forward privacy. The basic idea is to add appropriate
   333 noise, or errors, to any query of the dataset. The intention
   343 noise, or errors, to any query of the dataset. The intention