Tuesday, March 13, 2018

Google DoubleClick essay first draft

Note: This is the first draft. Many issues like Mozilla and Google Analytics are not covered in detail yet. The final essay will be posted in April. Thanks to Brendan Eich for the inspiration for the essay.

This essay will describe the history of Internet advertising at Google. I will also talk about the ethical issues with some of the kinds of ads that Google serves, and why Larry/Sergey apparently didn’t consider them. Of course, it is worth noting that Google isn’t the only one involved in the ad bubble.

The first kind of ads that Google did was AdWords, dating back to 2000. AdWords was based on search keywords, and the text ads were displayed at the top of the search results (labelled as ads) and were relatively simple. Typically the highest bidder was shown, and the advertiser paid Google when the user clicked on the ad. AdWords involved relatively little tracking, at least initially, and will not be mentioned much here. At this time Google was taking a stand against popup ads.

AdSense showed ads on webpages themselves, using JavaScript. It was introduced in 2003. At least initially, AdSense was based on keywords on the webpages themselves (which Google fetched from its cache, for example), which advertisers could bid on. As with AdWords, Google and the websites get paid when users click on the ads. It also involved little tracking at least initially, but the malware problems will be described here.

Google bought DoubleClick in 2008. DoubleClick dates back to 1995. It made more sophisticated ad tracking via cookies and the like (often called “retargeting”) famous, and the problems will be described here. DoubleClick itself called its product “Dynamic Advertising Reporting and Targeting” at one point, for example. Initially DoubleClick was mostly banner ads, and many users developed so-called banner blindness from these ads.

One of the problems with ads is malware. Typically an ad slot goes to the highest bidder and as much space as possible gets filled with ads, making malware like exploit kits difficult to prevent. To make things worse, companies can only spend a limited amount of money on ads, so sites often have to take the highest bidder, and sometimes websites even use multiple ad networks. Flash was famous for its many exploits, for example, and these days plug-ins in general are dying off (Java was even worse). Of course, there are exploits for browsers like Firefox and Chrome too.

Though the vast majority of exploits in kits are typically already patched, sometimes unpatched zero-day exploits get delivered by ads, as in the case of https://www.trendmicro.com/vinfo/us/security/news/zero-day-exploit. There is a market for exploit kits in general, and zero days are particularly valuable.

One of the most famous cases of ads containing malware was at Forbes, where the Angler exploit kit was served via pop-under ads after the site asked users to turn off ad blockers. Of course, asking users to turn off ad blockers or otherwise fighting against them is not a good idea in the first place.

Douglas Crockford tried to prevent malicious JavaScript in ads, including cross-site scripting attacks, at Yahoo with ADsafe. Of course, JavaScript is a Turing-complete language, which makes this more difficult, and Flash is even more complex. This is especially an issue when browser exploits are involved.

Another problem is tracking. The current economy is a debt-based economy based on consumption. The more money advertisers can extract from consumers, the more they are willing to spend on ads. This results in tracking getting creepier and creepier. Most of the tracking is called “retargeting” and it is often based on cookies and JavaScript.

For example, DoubleClick introduced cross-device retargeting in 2015. Of course, at least initially it was limited to tracking logged-in users via their user accounts, which any website can do, but it illustrates the trend. Google changed its privacy policy in 2016 to allow Google accounts to be used for such logged-in user tracking.

Of course, users often have little control over, and get little benefit from, the storage of user data and ad retargeting by trackers, especially when many parties are involved. Some trackers provide more control than others.

So why didn’t Larry/Sergey consider the issues when buying DoubleClick for example?

One reason, I assume, is that no one cared as much about security when AdSense added Flash ads, for example, since exploits were not as common as they are now.

Google’s ad blocking targets annoying ads, but doesn’t fix the issues described here. Apple’s ad blocking targets retargeting, but does not make ads less annoying.

Thursday, December 14, 2017

Google, Mozilla, and the debt-based economy

Thread 1:
From https://twitter.com/BrendanEich/status/932020295384178688:
"From me to you (I'm still bound by non-disclosure agreements with Mozilla), when we started the first Google/Firefox search deal in 2004, it was a better world. Pre-Doubleclick/YouTube Google, for one. Pre-programmatic ad tech too. Important not to equate threat from then to now."
From https://twitter.com/yuhong2/status/932456215048724481:
"There is a reason why I asked about your meeting with Google founders in 2005 around the time they acquired Urchin."
From https://twitter.com/yuhong2/status/932463189547278337:
"I mean, did you care about Google Analytics back then?"
From https://twitter.com/BrendanEich/status/932463682822422528:
"No, it was not tied into search ads or anything like what doubleclick brought to the table. We've been over this. That was a different era. Maybe you saw farther than I did -- if so, good for you!"
From https://twitter.com/yuhong2/status/932463850237997056:
"And do you think it is the founder's fault?"
From https://twitter.com/yuhong2/status/932470729144020992:
"AFAIK they were against popup ads in the early days."
From https://twitter.com/BrendanEich/status/932473969625595904:
"A friend said in 2003 that Sergey declared G would not acquire display ads & arb. Search vs. Display as that would be “evil”. But going public in 2004 inevitably meant growth, arb-opptys, monopoly power. Capitalism 101, I said recently.

Could search-only G have become a utility?"
From https://twitter.com/yuhong2/status/932474138450542592:
"Yea, part o the problem is that it took VC funding so it had to IPO or sell to exit."
From https://twitter.com/yuhong2/status/932475230748008448:
"I said before that VC might not quite be debt, but it is close enough for this discussion."

Thread 2:
From https://twitter.com/jwajsberg/status/932746958703349761 :
"Totally agree. I don't get why we don't do that now, in this time where we want to take risks again. Note we enabled tp by default in Firefox os..."
From https://twitter.com/andreasgal/status/932757853504339968 :
"Yup. I was able to sneak that past management"
From https://twitter.com/yuhong2/status/932760376294359040 :
"I wonder if you ever talked to Larry/Sergey."
From https://twitter.com/BrendanEich/status/932761563617837057:
"He didn’t - will you give it a rest?! It wasn’t that explicit back in day, and more recently it was Sundar’s people not Sergey or Larry!"
From https://twitter.com/yuhong2/status/932761950848557057:
"I wonder what the discussion would be like if they did."

Thread 3:
From https://twitter.com/yuhong2/status/932747119009546240 :
"Thinking about it, the Firefox/Google search deal was probably before or during the IPO and they were VC funded before that, right?"
From https://twitter.com/BrendanEich/status/932747375986163712 :
"It was pre-IPO. They were definitely thinking about that but they were also naive. Both founders said things like "we can defy public markets and take losses doing what is right". Lol."
From https://twitter.com/yuhong2/status/932747653963702273 :
"I am mainly talking about where the funding came from though."
From https://twitter.com/BrendanEich/status/932747825833680897 :
"It's not a simple Newtonian-physics (or fake economics based on same) problem."
From https://twitter.com/yuhong2/status/932747980272103424 :
"Yea, part of the problem is how the current debt based economy works in the first place."

A few more:
From https://twitter.com/yuhong2/status/933611862268174336:
"Thinking about it, if Google showed no growth, it is not the end of the world but things like stock options would worth less, right?"
From https://twitter.com/yuhong2/status/934647951518920705:
"It probably doesn't help that things like storage costs scale with the size of the index not the number of searches per day or the like."
From https://twitter.com/yuhong2/status/934655735366959104:
"Imagine the search revenue don't grow but the size of the search index still grows."
From https://twitter.com/yuhong2/status/934656082214993923:
"Though per-GB storage cost is cheaper today than it was last decade."

Monday, December 26, 2016

NT 4.0, .NET 1.1, and INTLFXSR.SYS problems

Here is the code from a disassembly of INTLFXSR.SYS with symbols:
.text:000102A0 ; __stdcall FxsrGetProcessorFeatures()
.text:000102A0                 public _FxsrGetProcessorFeatures@0
.text:000102A0 _FxsrGetProcessorFeatures@0 proc near   ; CODE XREF: DriverEntry(x,x)+61 p
.text:000102A0                 push    edi
.text:000102A1                 push    esi
.text:000102A2                 push    ebx
.text:000102A3                 pushf
.text:000102A4                 pop     eax
.text:000102A5                 push    eax
.text:000102A6                 mov     ecx, eax
.text:000102A8                 xor     eax, 40000h
.text:000102AD                 push    eax
.text:000102AE                 popf
.text:000102AF                 pushf
.text:000102B0                 pop     eax
.text:000102B1                 cmp     ecx, eax
.text:000102B3                 jz      short cpu_is_i386
.text:000102B5                 mov     eax, ecx
.text:000102B7                 xor     eax, 200000h
.text:000102BC                 push    eax
.text:000102BD                 popf
.text:000102BE                 pushf
.text:000102BF                 pop     eax
.text:000102C0                 cmp     ecx, eax
.text:000102C2                 jz      short other_cpu
.text:000102C4                 mov     eax, 0
.text:000102C9                 cpuid
.text:000102CB                 cmp     eax, 3
.text:000102CE                 jg      short cpu_identified
.text:000102D0                 mov     _VerifyIntel, ebx
.text:000102D6                 mov     dword_106C4, edx
.text:000102DC                 mov     dword_106C8, ecx
.text:000102E2                 lea     esi, _VerifyIntel
.text:000102E8                 lea     edi, _GenuineIntel ; "GenuineIntel"
.text:000102EE                 mov     ecx, 0Ch
.text:000102F3                 repe cmpsb
.text:000102F5                 jnz     short other_cpu
.text:000102F7                 mov     eax, 1
.text:000102FC                 cpuid
.text:000102FE                 mov     eax, edx
.text:00010300                 jmp     short cpu_identified
.text:00010302 ; ---------------------------------------------------------------------------
.text:00010302 other_cpu:                              ; CODE XREF: FxsrGetProcessorFeatures()+22 j
.text:00010302                                         ; FxsrGetProcessorFeatures()+55 j
.text:00010302                 mov     eax, 0
.text:00010307                 jmp     short cpu_identified
.text:00010309 ; ---------------------------------------------------------------------------
.text:00010309 cpu_is_i386:                            ; CODE XREF: FxsrGetProcessorFeatures()+13 j
.text:00010309                 mov     eax, 0
.text:0001030E cpu_identified:                         ; CODE XREF: FxsrGetProcessorFeatures()+2E j
.text:0001030E                                         ; FxsrGetProcessorFeatures()+60 j ...
.text:0001030E                 popf
.text:0001030F                 pop     ebx
.text:00010310                 pop     esi
.text:00010311                 pop     edi
.text:00010312                 retn
.text:00010312 _FxsrGetProcessorFeatures@0 endp
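If you know x86 assembly, you will notice that it only reads the feature flags on a GenuineIntel CPU, and only when CPUID leaf 0 returns a value in EAX no greater than 3. If the value is greater than 3, the jg branch returns early with that leaf 0 value still in EAX instead of the leaf 1 feature flags.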
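For readers who do not read x86 assembly, the control flow of the listing can be modelled roughly in Python. This is only a sketch of the logic as I read it from the disassembly, not code from the driver: the function names, the cpuid callable, the two boolean parameters (standing in for the EFLAGS AC and ID toggle tests) and the feature flag value are all made up for illustration.

```python
import struct

FXSR_BIT = 1 << 24  # CPUID leaf 1, EDX bit 24 (FXSAVE/FXRSTOR support)


def fxsr_get_processor_features(ac_flag_toggles, id_flag_toggles, cpuid):
    """Return what the routine in the listing would leave in EAX.

    cpuid(leaf) is a hypothetical callable returning (eax, ebx, ecx, edx).
    """
    if not ac_flag_toggles:   # EFLAGS bit 40000h cannot be flipped: cpu_is_i386
        return 0
    if not id_flag_toggles:   # EFLAGS bit 200000h cannot be flipped: other_cpu
        return 0
    eax, ebx, ecx, edx = cpuid(0)
    if eax > 3:               # cmp eax, 3 / jg: leaf 0's EAX is returned as is
        return eax            # (assumes a small value, so signedness is moot)
    # The listing stores EBX, EDX, ECX in that order and compares 12 bytes.
    vendor = struct.pack("<III", ebx, edx, ecx)
    if vendor != b"GenuineIntel":
        return 0
    _, _, _, edx = cpuid(1)
    return edx                # leaf 1 feature flags


def fake_cpuid(max_leaf, vendor, features):
    """Build a pretend CPU for the model above (illustrative only)."""
    ebx, edx, ecx = struct.unpack("<III", vendor)

    def cpuid(leaf):
        if leaf == 0:
            return (max_leaf, ebx, ecx, edx)
        return (0, 0, 0, features)

    return cpuid


# 0x0383F9FF is just an example value with the FXSR bit set.
older_intel = fake_cpuid(2, b"GenuineIntel", 0x0383F9FF)
newer_intel = fake_cpuid(0xD, b"GenuineIntel", 0x0383F9FF)

print(hex(fxsr_get_processor_features(True, True, older_intel)))
print(hex(fxsr_get_processor_features(True, True, newer_intel)))
```

In this model the pretend CPU with a maximum leaf of 2 gets its feature flags back, while the one with a maximum leaf of 0Dh gets back 0Dh, which does not have the FXSR bit set.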
As for the .NET Framework 1.1 problems, the way to determine if SSE is supported is to first use CPUID to determine if the SSE bit is set. But there is also an extra step. Without CR4.OSFXSR set, SSE instructions will cause #UD. This can be caught on Windows as a SEH exception. My guess is that .NET 1.1 is not doing that, which is why it crashes without INTLFXSR.SYS properly loaded.
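The two-step check described above can be sketched like this. It is a model only: Python cannot execute CPUID or SSE instructions directly, so the probe functions and the InvalidOpcode exception are hypothetical stand-ins for running a real SSE instruction and catching the resulting #UD. This is also not a claim about what .NET 1.1 actually does.

```python
SSE_BIT = 1 << 25  # CPUID leaf 1, EDX bit 25


class InvalidOpcode(Exception):
    """Stand-in for #UD as it would surface through SEH on Windows."""


def sse_usable(cpuid1_edx, run_sse_probe):
    """Return True only if the CPU reports SSE and an SSE instruction runs."""
    if not cpuid1_edx & SSE_BIT:
        return False              # step 1: CPUID says no SSE at all
    try:
        run_sse_probe()           # step 2: actually try an SSE instruction
    except InvalidOpcode:
        return False              # CR4.OSFXSR not set, so the OS has not enabled SSE
    return True


def probe_without_osfxsr():
    # Models an SSE instruction faulting because CR4.OSFXSR is clear.
    raise InvalidOpcode()


def probe_with_osfxsr():
    # Models an SSE instruction executing normally.
    return None


print(sse_usable(SSE_BIT, probe_with_osfxsr))
print(sse_usable(SSE_BIT, probe_without_osfxsr))
```

A program that stops after step 1 would go on to use SSE in the second case and crash, which is the behavior I am guessing at.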

Tuesday, June 16, 2015

Why your Core 2 processor appears to not have CMPXCHG16B

From http://download.intel.com/design/processor/specupdt/318733.pdf :
"AW67. Enabling PECI via the PECI_CTL MSR Does Not Enable PECI and May Corrupt the CPUID Feature Flags
Problem: Writing PECI_CTL MSR (Platform Environment Control Interface Control Register) will not update the PECI_CTL MSR (5A0H), instead it will write to the VMM Feature Flag Mask MSR (CPUID_FEATURE_MASK1, 478H).
Implication: Due to this erratum, PECI (Platform Environment Control Interface) will not be enabled as expected by the software. In addition, due to this erratum, processor features reported in ECX following execution of leaf 1 of CPUID (EAX=1) may be masked. Software utilizing CPUID leaf 1 to verify processor capabilities may not work as intended.
Workaround: It is possible for the BIOS to contain a workaround for this erratum. Do not initialize PECI before processor update is loaded. Also, load processor update as soon as possible after RESET as documented in the RS – Wolfdale Processor Family Bios Writers Guide, Section 14.8.3 Bootstrap Processor Initialization Requirements. "
The CMPXCHG16B feature flag is one of the flags that is reported in ECX.
This erratum only affects E0/R0 steppings of 45nm Core 2, as you can see in the Summary Table of Changes.
Generally a BIOS update will contain the needed microcode update mentioned above.
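To illustrate how a stray write to the mask can hide the flag, here is a small sketch. The CMPXCHG16B flag is bit 13 of ECX for CPUID leaf 1. Treating the mask as a simple AND over the reported flags is my assumption, and both the ECX value and the mask value are made-up examples, not values read from real hardware.

```python
CMPXCHG16B_BIT = 1 << 13  # CPUID leaf 1, ECX bit 13


def reported_ecx(real_ecx, feature_mask):
    """Model of masked feature reporting (assumes the mask is simply ANDed)."""
    return real_ecx & feature_mask


def has_cmpxchg16b(ecx):
    """Check the CMPXCHG16B flag in a leaf 1 ECX value."""
    return bool(ecx & CMPXCHG16B_BIT)


real_ecx = 0x0008E3FD       # example value with bit 13 set
intact_mask = 0xFFFFFFFF    # nothing masked
stray_mask = 0x00000001     # example of a value meant for PECI_CTL landing in the mask

print(has_cmpxchg16b(reported_ecx(real_ecx, intact_mask)))
print(has_cmpxchg16b(reported_ecx(real_ecx, stray_mask)))
```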
For those who have Intel motherboards, from https://communities.vmware.com/message/1765787 :
"I got fed up and went to Intel on this one.  One of their second level people finally gave me the suggestion that I should again flash the BIOS update, but use the method for full bios refresh, rather than the windows-based update process.  I suspect that the microcode fix referred to in AV69 is in a part of the bios core that is not updated unless you do the full refresh."

Thursday, October 30, 2014

The history of the MS C runtime DLL

In the earliest days, there was the Win32 SDK shipped with the NT betas and the final release of NT 3.1. The CRT DLL was called CRTDLL.DLL.

Visual C++ 1.0 for NT shipped around the time of NT 3.1 release, and it used MSVCRT10.DLL. This was followed by MSVCRT20.DLL (for 2.x) and MSVCRT40.DLL (for 4.0).

Visual C++ 4.2 introduced the now famous MSVCRT.DLL, which was also used by 5.0 and 6.0. The 6.0 MSVCRT had a new heap allocator that exposed bugs in existing apps, forcing MS to issue the Microsoft Libraries Update.

As a result, starting with Win2000 the MSVCRT.DLL was now part of Windows. Future versions of Visual C++ used MSVCR70.DLL etc. For 7.x the DLLs were supposed to go into the application directory. 8.0 and 9.0 used SxS (with the exception of Win2000 and older, where they were supposed to be placed in System32, if I remember correctly). 10.0 abandoned SxS and always used System32. This is also true for 11.0 and 12.0.

14.0 will split the CRT into two parts: one is the version-specific vcruntime140.dll etc., and the other is the non-version-specific, backward-compatible appcrt.dll and desktopcrt.dll. See MS's blog article for more details.

Tuesday, August 19, 2014

My wishlist for Satya

I originally posted this as a comment on the hal2020.com blog, but I think it is important enough that I posted it here too.
From http://hal2020.com/2014/03/03/satya-shuffles-his-leadership/#comment-14856:
"I agree, but I do have several items on my wishlist for Satya, including ending the Yahoo-Bing and the MS-Novell deal, ending the Android patent attacks, putting an end to the SCO lawsuit,"
From http://hal2020.com/2014/03/03/satya-shuffles-his-leadership/#comment-14918:
"For example, the MS-Novell deal is so bad that FSF put a provision in GPLv3 against it. I don’t know how much power MS has right now to end the SCO lawsuit, but it was quite famous. So is the FAT/exFAT patents and how it has been used to attack Android and other things that uses them (I am thinking that existing patents should go to the public domain and any remaining exFAT patent applications withdrawn from USPTO if possible)."
If you don't remember, the patent part of the Microsoft-Novell deal was discriminatory, which means that it was limited to specific customers. The point of free software, including licenses like the GPL, is that it allows free distribution of software without any royalty-based patent licensing requirements. This is why the GPLv3 had a provision against it. Another problem is the $100 million worth of vouchers, to get customers to buy SUSE. Why should MS help a competitor like this? This deal was renewed in 2011 for four years, and it still has the same problems. Since then Novell has abandoned Mono, making the deal less valuable for MS than it was before.
Also, the FAT patents are not the only patents MS used to attack Android, and ChromeOS is also attacked using similar patents. Most of these patent attacks are based on FUD.
From http://hal2020.com/2014/03/03/satya-shuffles-his-leadership/#comment-14919:
"And I forgot to mention OOXML. I just realized that Office for Windows don’t use the “Open XML” term that much inside the software. I am thinking of a proposal where the standard would be withdrawn from ISO, the “Office Open XML” term would be depreciated, and the “Strict Open XML” option would be removed (I doubt it is catching on). Note this don’t change the file format itself in anyway, the contents of the ISO standard would be merged with MS-DOCX/XLSX/PPTX."

Friday, December 20, 2013

MS12-034, keyboard layouts, and a bug

I have reverse engineered this patch and its effects on keyboard layouts a bit. The patch works by shipping a new version of win32k for XP and Server 2003 that pays attention to a registry key. When this registry key is added, it restricts loading of keyboard layouts to the System32 folder (already done in Vista and later). This prevents further exploits of the keyboard layout loading code. This is the first part, shipped in KB2676562.

The second part is a patch (KB2686509) that adds this registry key. Before this registry key is added, a DLL called kblchecker.dll, shipped inside the patch, is loaded. This DLL is supposed to enumerate all the keyboard layouts on the system and make sure they are all in the System32 folder, because any other keyboard layout DLL is going to be disabled by this update. What I found out by black box testing this patch is that any registry value (not subkeys or any value inside a subkey) in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Keyboard Layout key, regardless of name, is going to make this check fail with no FaultyKeyboards.log being created, which looks like a bug. The reason MS is not fixing this bug is probably that all it does is make the installation of this patch fail.
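If you want to see whether a machine has such values, something like the following sketch could be used. It only models the behavior I observed from the outside, not the actual logic inside kblchecker.dll, and the helper names are my own. The part that reads the live registry uses Python's winreg module and only runs on Windows; the rest works anywhere with a fake enumerator.

```python
import sys

KEY_PATH = r"System\CurrentControlSet\Control\Keyboard Layout"


def list_direct_values(enum_value):
    """Collect the names of values stored directly in a key.

    enum_value(index) must return (name, data, type) and raise OSError
    past the last value, the way winreg.EnumValue does for an open key.
    """
    names = []
    index = 0
    while True:
        try:
            name, _data, _type = enum_value(index)
        except OSError:
            break
        names.append(name)
        index += 1
    return names


def check_would_fail(value_names):
    # Observed behavior only: any direct value, whatever its name, trips the check.
    return len(value_names) > 0


def make_fake_enumerator(values):
    """Fake enumerator over a list of (name, data, type) tuples, for trying this out."""
    def enum_value(index):
        if index >= len(values):
            raise OSError("no more values")
        return values[index]
    return enum_value


if sys.platform == "win32":
    import winreg

    def read_live_key():
        """List the values sitting directly in the real Keyboard Layout key."""
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            return list_direct_values(lambda i: winreg.EnumValue(key, i))
```

On an affected machine, calling check_would_fail(read_live_key()) should come back True, and removing or relocating the listed values would be the thing to try before installing the patch again.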