Thursday, March 29, 2018

Google DoubleClick Mozilla essay third draft


Web advertising in general has many problems. Some ads annoy users (autoplay video ads and pop-ups, for example), and problems like click fraud matter to advertisers. But the ethical issues are the most important, including malware such as exploit kits and tracking ads. I will focus on the ethical issues with some of the kinds of ads that Google produces, and on why Larry/Sergey didn’t consider them when buying DoubleClick, for example. This essay will also talk about Mozilla and how it is involved (as in the Google/Mozilla search deal), including Brendan Eich, who created JavaScript and eventually left Mozilla to found Brave. I will also talk about how difficult these issues are to solve.

Google was founded in 1998 by Larry Page and Sergey Brin while they were at Stanford, and took VC funding from Kleiner Perkins and other partners. Eric Schmidt was brought in as CEO in 2001; he has since left that role but is still on the board. Google IPOed in 2004 with a dual-class stock structure.

The first kind of ad that Google offered was AdWords, dating back to 2000. AdWords was based on search keywords; the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was also taking a stand against pop-up ads.
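To make the "highest bidder, pay per click" idea concrete, here is a minimal sketch in Python. It is only my simplification of the description above, not how AdWords was actually implemented; the `Bid` class, the function names and the example advertisers and prices are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str        # hypothetical advertiser name
    keyword: str           # search keyword the advertiser bids on
    cost_per_click: float  # what the advertiser offers to pay per click


def pick_ad(bids, query_keyword):
    """Return the highest bid on the searched keyword, or None if nobody bid on it."""
    matching = [b for b in bids if b.keyword == query_keyword]
    if not matching:
        return None
    return max(matching, key=lambda b: b.cost_per_click)


def charge_for_click(winner, clicked):
    """Pay per click: the advertiser owes money only if the user clicked the ad."""
    return winner.cost_per_click if clicked else 0.0


# Made-up example data.
bids = [
    Bid("shoe-shop", "running shoes", 0.40),
    Bid("sports-store", "running shoes", 0.55),
    Bid("bookseller", "novels", 0.30),
]

winner = pick_ad(bids, "running shoes")
print(winner.advertiser)                # sports-store
print(charge_for_click(winner, False))  # 0.0 (ad shown but not clicked)
print(charge_for_click(winner, True))   # 0.55
```

Notice that nothing in this sketch needs to know who the user is; only the search keyword matters, which is why this kind of ad needs so little tracking.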

AdSense shows ads on webpages themselves, using JavaScript. It was launched in 2003. At least initially, AdSense was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but its malware problems will be described here.

Google bought DoubleClick in a deal announced in 2007 and completed in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

Google bought Urchin in 2005 and turned it into Google Analytics. Initially Urchin’s product analyzed web server log files; JavaScript tags were added later.

One of the problems with ads is malware. Typically websites take the highest bidder for each ad slot and fill as much space as possible with ads, which makes malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder to make enough revenue to support themselves, and websites often even use multiple ad networks. Flash was famous for its many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are exploits for the browsers themselves too, such as Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero-day exploits get delivered by ads, as in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero-days are particularly valuable (obviously because they are unpatched).

One of the most famous cases of ads containing malware was at Forbes in 2016, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off their ad blockers (discovered by Brian Baskin). Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place, and the incident illustrated some of the flaws discussed here. This exploit kit also hit sites like MSN.

Douglas Crockford tried to prevent malicious JavaScript in ads, including cross-site scripting attacks, at Yahoo with ADsafe. Of course, JavaScript is a Turing-complete language, which makes this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved. I think ADsafe worked by confining ad code to a limited sandbox to prevent things like XSS attacks.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via their user account, which any website can do, but it illustrated the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

According to http://adage.com/article/digital/google-turns-behavioral-targeting-beef-display-ads/135152/, “In December 2008 Google added DoubleClick cookies to AdSense ads”, tying DoubleClick’s cookie-based tracking (which dates from long before Google bought it) to AdSense. I assume that AdSense tracking probably did not exist before Google bought DoubleClick. Google Analytics added AdWords and AdSense support in 2009. In 2012, Google changed its privacy policy to allow data to be consolidated, which was also very controversial. In 2014, Google Analytics integrated with DoubleClick, allowing things like remarketing lists to be shared. Remarketing lists for search ads (tied to Google Analytics) were introduced in 2015. Remarketing lists are basically lists of website visitors who can be uniquely identified by things like cookies, and they are one of the ways of targeting ads to users. I would assume that sharing remarketing lists basically ties the tracking together.
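A rough way to picture why sharing remarketing lists ties the tracking together is to treat each list as a set of cookie IDs. This is only a sketch of my assumption above, in Python; the cookie IDs, list names and functions are hypothetical, and it assumes both products can see the same cookie ID for the same browser.

```python
# Hypothetical cookie IDs; each list records which browsers visited a site.
analytics_list = {"cookie-a1", "cookie-b2", "cookie-c3"}   # built by an analytics tag
ad_network_list = {"cookie-b2", "cookie-c3", "cookie-d4"}  # built by an ad tag


def should_target(cookie_id, remarketing_list):
    """A visitor gets the targeted ad if their cookie ID is on the list."""
    return cookie_id in remarketing_list


def share_lists(*lists):
    """Sharing is modelled as a union: every product can now target every visitor."""
    combined = set()
    for remarketing_list in lists:
        combined |= remarketing_list
    return combined


# Before sharing, the ad network has never seen cookie-a1.
print(should_target("cookie-a1", ad_network_list))  # False

# After sharing, the same visitor can be targeted from data the other product collected.
shared = share_lists(analytics_list, ad_network_list)
print(should_target("cookie-a1", shared))  # True
print(len(shared))                         # 4
```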

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others, through AdChoices for example.

So why didn’t Larry/Sergey consider the ethical and other issues when buying DoubleClick for example?

One reason, I assume, is that no one cared as much about security when AdSense added Flash ads, for example, and exploits were not as common as they are now. Among the first common exploits (dating back to the late 1990s) were stack-based and sometimes heap-based buffer overruns (using null-terminated C string copies that don’t limit the length copied, for example). Exploits then became more sophisticated and complex (such as use-after-free bugs, and information leaks that disclose addresses to defeat ASLR), especially as mitigations like stack canaries, NX and ASLR became common in response. I assume that the market for exploit kits, zero-day exploits and the like also took time to develop (though some of them were made famous by recent NSA leaks, for example).

Before the Google-DoubleClick acquisition, DoubleClick merged with Abacus Direct in 1999. Abacus Direct was a market research company focused on consumer buying behavior, so it had a lot of personal information about consumers, and there were concerns that this data could be combined with DoubleClick’s data (especially concerns about deanonymizing users). After privacy complaints and an FTC investigation, DoubleClick backed away from linking the two sets of data.

The Google-DoubleClick acquisition was also controversial, with EPIC, for example, filing complaints with the FTC. There was also a Senate hearing on Sept 27, 2007 with testimony from a variety of sources on the issue. One of the concerns back then was the aggregation of tracking data and the lack of control by users.

In 2012, Jonathan Mayer discovered that Google used some JavaScript tricks to allow tracking in Safari. Google bypassed Safari’s cookie blocking policy by using an invisible form to fool Safari into allowing cookies. The FTC fined Google $22.5 million over this behaviour, and more recently there have been lawsuits about it in the UK. Google argued at the time that the tracking was unintentional and that it was related to Google+ “Plus” buttons on DoubleClick ads (for logged-in users, I believe).

It is probably worth mentioning here that a lot of these kinds of buttons (like Facebook’s Like buttons, to name another example) do their own tracking too (they generally work by loading an IFRAME from the website involved), and this has been well known for years. For example, according to https://www.technologyreview.com/s/541351/facebooks-like-buttons-will-soon-track-your-web-browsing-to-target-ads/, Facebook started using its tracking Like buttons to target ads in 2015. I think the Facebook WhatsApp acquisition story is also famous by now, including how they eventually allowed data sharing between the two. It is worth mentioning that even WhatsApp co-founder Brian Acton now recommends deleting Facebook (especially after the Cambridge Analytica debacle).

Now, let’s talk about Mozilla. Brendan Eich created JavaScript and was the CTO of Mozilla Corporation from 2005 to 2014. After he stepped down from Mozilla in 2014, he started Brave, with its Basic Attention Token (BAT). Andreas Gal joined Mozilla in 2008 and was the CTO from 2014 until 2015, when he left Mozilla.

Mozilla signed the Google search deal in 2004, before Google even IPOed (let alone bought DoubleClick). Mozilla switched to a Yahoo search deal in late 2014. Recently Mozilla switched back to Google as the default.

Brendan Eich mentioned in https://twitter.com/BrendanEich/status/932747825833680897 that “It's not a simple Newtonian-physics (or fake economics based on same) problem.” This was about the history of the Google search deal with Mozilla and the fact that it was signed before Google IPOed (when it was still funded by VCs). It is worth mentioning here that Google was founded in 1998, when the now-famous dot-com bubble was building toward its peak and VC funding was common (allowing many startups to grow fast, which was considered more important than profits). Many other dot-com startups of the time ran into problems and failed when the bubble collapsed around 2001. It is also worth mentioning that the DoubleClick acquisition dates back to 2007, just before the housing bubble famously collapsed and led to another recession, and that bubble probably started just after the dot-com bubble.

It was mentioned on Twitter that Firefox OS enabled tracking protection by default, unlike desktop Firefox. Andreas Gal replied in https://twitter.com/andreasgal/status/932757853504339968: “Yup. I was able to sneak that past management”. I then asked “I wonder if you ever talked to Larry/Sergey.” and Brendan answered that of course Andreas hadn’t. I wonder what would have happened if he had.

https://pagefair.com/blog/2017/gdpr_risk_to_the_duopoly/ has some information on the effect of the EU GDPR on Google ads. Notice that AdWords complies only if all “personalization” features are removed, for example. This includes things like “remarketing”. I suspect that AdWords did not have these features when it was first created in 2000. Other features like “remarketing lists for search ads” are also listed as not compliant, and they were probably added later too.

One of the first types of blocking was the pop-up blocker, and Google was taking a stand against pop-ups in the early days (they were well known to be annoying). Pop-up blockers became common in browsers by the mid-2000s (even IE6 in XP SP2 had one). At one point circa 2002, AOL/Netscape was disabling the pop-up blocker in Netscape-branded Mozilla releases (the original Netscape 7 release, I think). Of course, this was long before Google bought DoubleClick. Later, more sophisticated ad and cookie blockers like AdBlock Plus and uBlock Origin came out as add-ons for browsers like Firefox, and one is built into Brave, of course (along with BAT as a replacement for the lost ad revenue). Many other browsers, including Firefox and IE, also have similar tracking protection, but it is disabled by default.

Of course, it is worth noting that Google/DoubleClick isn’t the only one involved in the ad bubble (though DoubleClick was one of the first to do ad tracking, I think). I think Taboola is often considered even worse than Google, for example. However, the same fundamental problems with tracking, malware ads, the ad bubble and so on tend to apply to all of the ad networks.

Recently, Google’s ad blocking and “better ads” effort (including the so-called Coalition for Better Ads) target annoying ads, but don’t fix the fundamental issues described here. Apple’s ad blocking targets retargeting by limiting the lifetime of cookies, for example (making them less effective for tracking), but does not change the display of ads or make ads less annoying (autoplay video ads are pretty famous as well, especially with Flash).
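To illustrate how limiting cookie lifetime makes cookies less useful for tracking, here is a small sketch in Python. It is not Apple’s actual rule set: the one-day cap, the `Cookie` class, the domain name and the `is_usable` function are all assumptions I made up for the example.

```python
from dataclasses import dataclass

# Hypothetical cap on tracker cookies, in seconds (not Apple's real value).
MAX_TRACKER_COOKIE_AGE = 24 * 60 * 60


@dataclass
class Cookie:
    domain: str
    value: str
    set_at: float             # time the cookie was set, in seconds
    requested_max_age: float  # lifetime the site asked for, in seconds


def is_usable(cookie, now, is_tracker):
    """Return True if the browser would still send this cookie at time `now`."""
    limit = cookie.requested_max_age
    if is_tracker:
        # The browser overrides the long lifetime the tracker asked for.
        limit = min(limit, MAX_TRACKER_COOKIE_AGE)
    return (now - cookie.set_at) < limit


one_year = 365 * 24 * 60 * 60
two_days = 2 * 24 * 60 * 60
cookie = Cookie("tracker.example", "id=123", set_at=0.0, requested_max_age=one_year)

print(is_usable(cookie, two_days, is_tracker=False))  # True: still valid for a year
print(is_usable(cookie, two_days, is_tracker=True))   # False: capped at one day
```

The tracker’s ID is gone after the cap, so a visit two days later looks like a new user, but the ad itself is still displayed either way.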

Now, fixing the problems might be difficult. One example is that both Microsoft and Novell used CALs. CALs (called node licenses by Novell, I think) are per-user or per-computer licenses common in server software like NetWare and Windows Server. Of course, when Novell moved to Linux (by buying SUSE), it was selling open source software that didn’t have CALs (the customer only pays for support), meaning that Novell could not expect the same level of revenue as in the NetWare days. The story of Sun’s open source projects (OpenSolaris, OpenOffice and OpenJDK, for example) under Jonathan Schwartz (the former “ponytail” CEO), and how Sun eventually had to sell itself to Oracle, is probably pretty famous as well. The ad bubble will probably not last forever though. This is part of the problem with the current debt-based economy (which allows almost infinite amounts of money to be printed), especially how it encourages extracting as much money as possible from so-called “consumers” (another example is Adobe Creative Cloud subscriptions and how Adobe’s stock price rose).

Google was famous for offering large amounts of storage in Gmail since its launch in 2004, not to mention that the size of the search index probably also grows over time. This obviously means that the revenue Google makes always has to grow (since the total cost of storing an ever-growing amount of data keeps increasing), or eventually profit margins would decline. This is particularly hard during recessions like the one in 2007-2008. According to https://www.economist.com/node/14140373, Internet advertising declined the least in Q1 2009 but still declined. This is still an issue with cloud providers offering “unlimited” storage, which users abuse to store excessive amounts of data (the most recent example is Amazon, where some users were touting being able to store more than 1PB, leading Amazon to end unlimited storage).

Saturday, March 24, 2018

Google DoubleClick Mozilla essay second draft

This essay will describe the history of Internet advertising at Google. I will also talk about the ethical issues with some of the kinds of ads that Google produces, and why Larry/Sergey didn’t consider them, for example. Of course, it is worth noting that Google isn’t the only one involved in the ad bubble. This essay will also talk about Mozilla, including Brendan Eich, who created JavaScript.

Google was founded in 1998 by Larry Page and Sergey Brin while they were at Stanford, and took VC funding. Eric Schmidt was brought in as CEO in 2001; he has since left that role but is still on the board. Google IPOed in 2004.

The first kind of ad that Google offered was AdWords, dating back to 2000. AdWords was based on search keywords; the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was also taking a stand against pop-up ads.

AdSense shows ads on webpages themselves, using JavaScript. It was launched in 2003. At least initially, AdSense was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but its malware problems will be described here.

Google bought DoubleClick in a deal announced in 2007 and completed in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

Google bought Urchin in 2005 and turned it into Google Analytics. Initially Urchin’s product analyzed web server log files; JavaScript tags were added later.

One of the problems with ads is malware. Typically websites take the highest bidder for each ad slot and fill as much space as possible with ads, which makes malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder, and sometimes websites even use multiple ad networks. Flash was famous for its many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are exploits for the browsers themselves too, such as Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero-day exploits get delivered by ads, as in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero-days are particularly valuable.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off their ad blockers. Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place.

Douglas Crockford tried to prevent malicious JavaScript in ads, including cross-site scripting attacks, at Yahoo with ADsafe. Of course, JavaScript is a Turing-complete language, which makes this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via their user account, which any website can do, but it illustrated the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

Google Analytics added AdWords and AdSense support in 2009. In 2012, Google changed its privacy policy to allow data to be consolidated, which was also very controversial. In 2014, Google Analytics integrated with DoubleClick, allowing things like remarketing lists to be shared. Remarketing lists for search ads (tied to Google Analytics) were introduced in 2015. Remarketing lists are basically lists of website visitors who can be uniquely identified by things like cookies, and they are one of the ways of targeting ads to users. Sharing remarketing lists basically ties the tracking together.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others.

So why didn’t Larry/Sergey consider the issues when buying DoubleClick for example?

One reason, I assume, is that no one cared as much about security when AdSense added Flash ads, for example, and exploits were not as common as they are now. I assume that the market for exploit kits, zero-day exploits and the like took time to develop.

Before the Google-DoubleClick acquisition, DoubleClick merged with Abacus Direct in 1999. The plan to combine the two companies’ data raised privacy problems (especially problems with deanonymizing users) and led to an FTC investigation, and DoubleClick backed away from linking the data.

The Google-DoubleClick acquisition was controversial, with EPIC, for example, filing complaints with the FTC. There was also a Senate hearing on Sept 27, 2007 with testimony from a variety of sources on the issue. One of the concerns was the aggregation of tracking data and the lack of control by users.

Now, let’s talk about Mozilla. Brendan Eich created JavaScript and was the CTO of Mozilla Corporation from 2005 to 2014. After he stepped down from Mozilla in 2014, he started Brave, with its Basic Attention Token (BAT). Andreas Gal joined Mozilla in 2008 and was the CTO from 2014 until 2015, when he left Mozilla.

Mozilla signed the Google search deal in 2004, before Google even IPOed (let alone bought DoubleClick). Mozilla switched to a Yahoo search deal in late 2014. Recently Mozilla switched back to Google as the default.

Brendan Eich said in https://twitter.com/BrendanEich/status/932747825833680897, about the Google search deal and the history of Google, that “It's not a simple Newtonian-physics (or fake economics based on same) problem.”

It was mentioned that Firefox OS enabled tracking protection by default, unlike desktop Firefox. Andreas Gal replied in https://twitter.com/andreasgal/status/932757853504339968: “Yup. I was able to sneak that past management”.

Google’s ad blocking and “better ads” effort target annoying ads, but don’t fix the issues described here. Apple’s ad blocking targets retargeting, but does not change the display of ads or make ads less annoying.

Tuesday, March 13, 2018

Google DoubleClick essay first draft

Note: This is the first draft. Many issues like Mozilla and Google Analytics are not covered in detail yet. The final essay will be posted in April. Thanks to Brendan Eich for the inspiration for the essay.

This essay will describe the history of Internet advertising at Google. I will also talk about the ethical issues with some of the kinds of ads that Google produces, and why Larry/Sergey didn’t consider them, for example. Of course, it is worth noting that Google isn’t the only one involved in the ad bubble.

The first kind of ad that Google offered was AdWords, dating back to 2000. AdWords was based on search keywords; the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was taking a stand against pop-up ads.

AdSense shows ads on webpages themselves, using JavaScript. It was launched in 2003. At least initially, AdSense was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking, at least initially, but its malware problems will be described here.

Google bought DoubleClick in a deal announced in 2007 and completed in 2008. DoubleClick was founded in 1995. It made more sophisticated ad tracking via cookies and the like famous (often called “retargeting”), and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

One of the problems with ads is malware. Typically websites take the highest bidder for each ad slot and fill as much space as possible with ads, which makes malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder, and sometimes websites even use multiple ad networks. Flash was famous for its many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are exploits for the browsers themselves too, such as Firefox and Chrome.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero-day exploits get delivered by ads, as in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero-days are particularly valuable.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off their ad blockers. Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place.

Douglas Crockford tried to prevent malicious JavaScript in ads, including cross-site scripting attacks, at Yahoo with ADsafe. Of course, JavaScript is a Turing-complete language, which makes this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via their user account, which any website can do, but it illustrated the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others.

So why didn’t Larry/Sergey consider the issues when buying DoubleClick for example?

One reason, I assume, is that no one cared as much about security when AdSense added Flash ads, for example, and exploits were not as common as they are now.

Google’s ad blocking targets annoying ads, but doesn’t fix the issues described here. Apple’s ad blocking targets retargeting, but does not make ads less annoying.