Browser Window column #4
Blame It on the Cookie
by Michael Macrone
What’s become of the word information in the ’90s must be some sort of cultural indicator. Once, it meant “enlightenment”; now it means something closer to “data.” An atmosphere we can’t help breathing, a verbal and intellectual fog of claims, factoids, pitches, news bites, documentation and FAQs, information is something we live, not something we get.
Buzzwords like information highway and information overload echo our feeling that information has become a lifestyle. And what the TV was to our last big paradigm, “leisure,” the Internet is to our new one. But as any amount of exposure to the Net will tell you, there’s information, and then there’s information.
Data really isn’t the word. Machines can produce data, but only people produce information. It’s a social act, between someone who knows and someone who doesn’t. But an act isn’t the same as a truth, and its result may not be equal knowledge. Seeming to inform can be as persuasive as actually informing, and rhetoric turns into a kind of reality. This is why we’ve invented words like misinformation, for authoritative and convincing lies. Or disinformation, for insidious mass misinformation.
The Internet has made a new contribution to this roll of pseudo-information. I call it uninformation: the brazen presentation of uninformed beliefs as if they were true facts. Unlike misinformers and disinformers, uninformers don’t know they’re spreading fictions. Unlike mere lies or episodes of ignorance, uninformation is proudly broadcast to the world, where it is rebroadcast, sometimes often enough and in the right places to be taken for common knowledge.
Uninformation does have a history. In the seventeenth century they called it vulgar error, which translates today to “common misperceptions.” Legends and myths are obvious kin, but their origins are mystified, and they don’t make the same type of factual claims. Early Net users, seeking a label for what they recognized as a new phenomenon, adopted the term urban legend. But it doesn’t fit well, as uninformation is everywhere, not urban, and it’s a form of stupidity, not legend.
I’ve been guilty of it myself. Just yesterday I read a story on a reasonably trusted Web site, citing another trusted site, about withdrawn software and a reported security issue. I didn’t bother to check the source. Minutes later, I read several posts to a Usenet group claiming the software had disappeared from its company’s FTP servers. Eager to enlighten, I posted a follow-up reporting that the software had been pulled because of the security issue. As it turns out, the software hadn’t been pulled at all. I had uninformed the world.
The original site eventually admitted its error, and it doesn’t look like anyone else has adopted or spread my falsehood. Other uninformation enjoys what you might call a happier fate. The annals of the urban folklore newsgroup (alt.folklore.urban) tell mighty tales of Craig Shergold and the “Good Times virus,” and they speak of memes, which are like infectious information spores, except that it doesn’t really matter whether or not they’re true. Even the more spectacular delusions tend, when provable, to collapse when exposed. But when folly and delusion hook into a larger system of beliefs, especially paranoid beliefs, they’re much harder to dislodge.
A lot of uninformation thrives because it has to do with alleged secrets, which by definition can’t be disproved. There’s a secret plot by the government to hush up the story of a downed aircraft; there are secret tapes proving the murder of a prominent political figure. To a degree, this is garden variety conspiracy theory, but it is conspiracy theory elevated to the level of information.
The Net also has theories, which are its favorites, about itself. They grow from an increasingly widespread fear that the culture of information is a culture in which people, too, have been turned into data. Invisibly and secretly, our phone numbers, addresses, credit histories, medical records, buying patterns, movie rentals, tastes, predilections, and personalities are being collected, compared, analyzed, and stored. Somewhere on the global information network, we have a double, a statistical self. Through this other self, we, oblivious, are tracked, targeted, and manipulated. Naturally enough, the medium of manipulation is, once again, information: direct mail, targeted banners, spam—all promising new facts, great opportunities, and products we need.
There’s been a fair amount of alarm about this lately, most of it focused on the Net. We fear that the Internet has become the latest frontier of a vast surveillance system. Web sites give information with one hand, and take it with the other. The browser is our friendly guide, and it is also a spy. And what is its chief instrument of surveillance? The fearsome cookie.
And what is this insidious tool? Writing for Brill’s Content, Esther Dyson, one of the media’s approved Internet experts, informs us that “Every time you log on [to the Web], a digital record of your movements (a ‘cookie’ in tech-speak) is created.” And that’s not all. “Thereafter, it resides on your hard drive, invisible to you, but not to the site.” A secret, potentially malicious agent on my hard drive—is this some kind of virus?
Actually, you could call it a meme. It’s the collective notion of those who’ve used the Net enough to have heard of cookies, but not enough to know any better. It’s been refuted in detail on the Web and the Usenet, but the belief system is strong. Somewhere deep down we want to believe Dyson when she says that on-line advertisers not only “determine where you came from,” but even “what you were doing before you arrived at their site”—all by “examining your cookies.” They do this, she riffs, so that they can discriminate against people who browse the “wrong” sorts of sites. (Though you wonder what they were doing advertising there in the first place.)
Unfortunately, Dyson is on to something. There are sites that do track users, sometimes hoping to better deceive you, and sometimes hoping to gauge their own effectiveness. CNET, for example (www.cnet.com), records users’ navigation around the site, but to measure the clicks, not the users. They want to know what tempts users to click, but they don’t necessarily care what tempts you to click. More to the point, they achieve this feat without cookies.
And there’s no reason they’d want to. Mind you, CNET, in all its guises (cnet.com, download.com, computers.com, snap.com, etc.), is one of the most cookie-happy sites on the Net. I know this because, in fact, it’s not true that every site is digitally recording your every move all the time—in which case all sites would be equally cookie-happy. And I know it because I found my Netscape cookie file, which turns out not to be invisible after all, and which listed a good dozen CNET cookies (just beating Netscape’s own 11). And when you see them, cookies really don’t look all that fearsome. Here’s one:
.cnet.com TRUE / FALSE 946684827 u_vid_0_0 00012005
That’s the whole thing, verbatim. Only the “00012005” part sets my CNET cookie apart from anybody else’s CNET cookie. And while I don’t know what it means, I know it’s not very important.
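For the curious, here is a minimal sketch of how a line like that could be pulled apart. It assumes the usual Netscape cookie-file layout of seven fields (domain, a subdomain flag, path, a secure flag, expiry in seconds since 1970, name, value); the field labels and the function name are my own, not anything Netscape or CNET publishes.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CookieRecord(NamedTuple):
    # Field labels are illustrative; the file itself carries no labels.
    domain: str
    include_subdomains: bool
    path: str
    secure_only: bool
    expires: datetime
    name: str
    value: str


def parse_cookie_line(line: str) -> CookieRecord:
    """Split one cookie-file line into its seven fields."""
    # Real cookie files separate fields with tabs; splitting on any
    # whitespace also handles the space-separated line quoted above.
    domain, subdomains, path, secure, expires, name, value = line.split()
    return CookieRecord(
        domain=domain,
        include_subdomains=(subdomains == "TRUE"),
        path=path,
        secure_only=(secure == "TRUE"),
        expires=datetime.fromtimestamp(int(expires), tz=timezone.utc),
        name=name,
        value=value,
    )


record = parse_cookie_line(".cnet.com TRUE / FALSE 946684827 u_vid_0_0 00012005")
print(record.name, "=", record.value)
print("expires:", record.expires.isoformat())
```

Read this way, the big number in the middle is nothing more sinister than an expiration date, which for this cookie falls a few seconds into January 1, 2000 (UTC).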
Logic suggests that if cookies aren’t really invisible, and if they are small, and if they are optional, and if any user can destroy them at any time, then they’re not much good for anything too nefarious or invasive. They’re certainly capable of some low-level surveillance, but then the question becomes: Surveillance of what?
The bargain one makes with the Internet has always involved giving away information. Before Netscape ever thought of the cookie, browsers were sending servers data about you: the machine you were connecting from, your browser, your operating system, and where, if anywhere, you were linking from. So if I were browsing at www.nicesite.com and clicked on a link to www.badsite.com, then my browser would tell badsite.com that my IP address was 126.96.36.199, I was browsing with Explorer 4.0.1 on a Power Macintosh, and that I’d been referred by nicesite.com.
Such has always been the case. And if anyone wanted to use this information to “track” you through their site, they could just analyze their logs. But what they’d know, at the end of this somewhat grueling process, wouldn’t be about you; it would be about 126.96.36.199. Furthermore, while a site may be able to track what you’re viewing and clicking and buying internally (information they’re arguably entitled to), they don’t and can’t know a single thing about what you’ve been doing elsewhere. At most, badsite.com knows that you arrived via nicesite.com. But, once again, why shouldn’t a site know who’s linking to it? (If you type in badsite’s URL, rather than clicking a link, no referrer is reported.)
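To make the cookie-free exchange concrete, here is a rough sketch of what a server could glean from a single request. The helper name, the page paths, and the exact User-Agent string are illustrative assumptions, and 192.0.2.1 is a reserved documentation address standing in for a real one.

```python
def visitor_facts(client_ip: str, raw_request: str) -> dict:
    """Collect what a server can learn from a single cookie-free request."""
    head = raw_request.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "ip": client_ip,  # comes from the connection, not from a header
        "requested": request_line.split()[1],
        "browser": headers.get("user-agent"),
        # "Referer" is the header's real (misspelled) name in HTTP.
        "referrer": headers.get("referer"),
    }


# A request made by clicking a link on nicesite.com (hypothetical pages).
clicked = (
    "GET /index.html HTTP/1.0\r\n"
    "Host: www.badsite.com\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Mac_PowerPC)\r\n"
    "Referer: http://www.nicesite.com/links.html\r\n"
    "\r\n"
)
# The same page requested by typing the URL: no Referer header is sent.
typed = "GET /index.html HTTP/1.0\r\nHost: www.badsite.com\r\n\r\n"

print(visitor_facts("192.0.2.1", clicked))
print(visitor_facts("192.0.2.1", typed)["referrer"])  # None: nothing reported
```

Nothing in either request says who is sitting at the keyboard; the server sees an address, a browser, and at most one prior page.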
Perhaps you can live with this level of scrutiny. But you’ve been hearing that cookies go way beyond IP numbers and referrers, and that they pass your name, address, e-mail accounts, credit card numbers, and what have you to any site that wants a peek. This makes cookies sound much smarter than they actually are. In fact, it’s not cookies that pass information to sites, it’s sites that pass information to cookies. That is, a cookie only knows what a Web site tells it, and a Web site only knows what it can get from two sources: your browser, and you.
If you don’t tell a site your e-mail address or what’s on your hard drive, it can’t know. And if you do tell it, it could just as easily store that information in a back-end database, where what they were doing would really be invisible. Cookies, each limited to a measly 4K, aren’t much help in collecting and storing real data. What cookies do help with is the problem of transience: the stateless and impermanent nature of Internet communication. Using cookies, a site can remember your browser from session to session, and this is the part that scares people, because they really don’t want Web sites to know them that well. You might not mind if sites track you through a single session, but still be bothered by the idea that you’re constantly feeding them data, and that you’re not an anonymous visitor, but a pattern.
On the other hand, memory is often a convenience. It’s annoying to type in a user name and password every time you visit a registration-based site like The New York Times (www.nytimes.com) or GIF Wizard (www.gifwizard.com). If you give them permission to track your identity with a cookie, then they don’t have to ask you who you are, and you don’t have to type in a thing. Cookies are also useful for heavily-trafficked sites that would like to store data such as simple user preferences or the date of your last visit—say to generate a customized “what’s new” sidebar. It’s just easier to “save state” in a cookie for each user than to maintain and churn through a database that big.
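Here is one way “saving state” might look, sketched with Python’s standard http.cookies module. The cookie name last_visit, the date, and both function names are made up for illustration; no particular site is known to work this way.

```python
from http.cookies import SimpleCookie


def remember_visit(visit_date: str) -> str:
    """Build the Set-Cookie header value a site might send to save state."""
    jar = SimpleCookie()
    jar["last_visit"] = visit_date
    jar["last_visit"]["path"] = "/"  # usable by every page on the site
    return jar["last_visit"].OutputString()


def recall_visit(cookie_header: str):
    """Read the date back from the Cookie header on a later request."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("last_visit")
    return morsel.value if morsel is not None else None


set_cookie = remember_visit("1998-11-01")
print("Set-Cookie:", set_cookie)

# On the next visit the browser sends back just the name=value pair.
print(recall_visit("last_visit=1998-11-01"))
print(recall_visit(""))  # a first-time visitor has nothing to return
```

The point of the sketch is the direction of travel: the site writes the value, the browser hands the same value back, and nothing the site didn’t already know ever enters the cookie.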
I have yet to hear anyone seriously argue that sites should be obliged promptly to forget everything you tell them. We seem to be living with the fact that credit card companies get to keep our names and addresses even after we’ve canceled the card. Nobody likes it, but nobody’s raising much of a fuss either. Can Web sites compare notes on what they know about you? Of course. But they do it the old-fashioned way, just like everybody else. They don’t do it with cookies.
Cookies by design can’t be shared. If nicesite.com gives me a cookie, it can specify which portions of nicesite get to use the cookie. So, in the example above, “.cnet.com” means that the cookie is visible to the entire cnet.com domain (including www.cnet.com, builder.cnet.com, and so on). The “/” means that any page on any of those sites can use the cookie. But cnet.com can’t make the cookie visible to any site that doesn’t end in “cnet.com” (not even download.com or news.com, which are its properties). That’s its utmost extent. In other words, CNET can’t get any information from any cookie but its own, and anything important it might learn from its own it already knows.
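The scoping rule can be sketched in a few lines. This is a simplified stand-in for what a browser does, not any browser’s actual code, and the function name is my own; it is slightly stricter than “ends in cnet.com” so that a hypothetical host like evilcnet.com doesn’t slip through.

```python
def cookie_visible_to(cookie_domain: str, cookie_path: str,
                      host: str, request_path: str) -> bool:
    """Return True if a request to host/request_path would carry the cookie."""
    host = host.lower()
    suffix = cookie_domain.lower()
    if suffix.startswith("."):
        # ".cnet.com" covers cnet.com itself and any host ending in ".cnet.com".
        domain_ok = host == suffix[1:] or host.endswith(suffix)
    else:
        domain_ok = host == suffix
    # A path of "/" is a prefix of every path, so it matches every page.
    return domain_ok and request_path.startswith(cookie_path)


for host in ("www.cnet.com", "builder.cnet.com", "download.com", "news.com"):
    print(host, cookie_visible_to(".cnet.com", "/", host, "/index.html"))
```

Run against the cookie quoted earlier, the first two hosts get to see it and the last two do not, even though the same company owns all four.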
DoubleClick knows what page you’re viewing, since that’s the referrer of an image. This is extra data, above and beyond click-throughs, which could theoretically be used to compile a list of every page you’ve ever seen with a DoubleClick ad. Thus they would not only know which ads you’ve seen, and how many times, but would also be able to build a profile of your “interests.” This would obviously make you a bigger target.
But this demographic feat couldn’t happen without the cookies. Most dial-up users get a different IP address every time they log in, so they’re essentially moving targets. Cookies solve the problem by giving your computer a unique identity, which DoubleClick can check before consulting its database. This is just a meaningless string; my ID is “715a00f.” But that’s who you are to DoubleClick. They don’t need your name or address, which they’ll never get anyway, and frankly, they don’t really care. A number’s as good as a name for a pattern of behavior.
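In outline, the bookkeeping might look like the sketch below. It is a guess at the general shape of such a system, not a description of DoubleClick’s actual software; the function name, the second ID, and the page URLs are all invented for illustration.

```python
from collections import defaultdict

# cookie ID -> pages on which that browser was served an ad
profiles = defaultdict(list)


def serve_ad(cookie_id: str, referrer: str) -> int:
    """Log one ad request and return how many ads this ID has now seen."""
    profiles[cookie_id].append(referrer)
    return len(profiles[cookie_id])


# Same browser across two dial-up sessions with two different IP
# addresses: the IP never enters into it, only the ID from the cookie.
serve_ad("715a00f", "http://www.nicesite.com/cars.html")
serve_ad("715a00f", "http://www.othersite.com/cars/reviews.html")
serve_ad("0000abc", "http://www.nicesite.com/recipes.html")

print(profiles["715a00f"])
```

The profile is keyed entirely by the string in the cookie. Delete the cookie and the next ad request starts a new, empty pattern under a new ID.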
And isn’t that what we should be upset about? Not that Web sites are prying into our private lives, but that, as far as they’re concerned, we don’t have any?
– 30 –
Originally published as part of my “Browser Window” column in InterActivity magazine (November, 1998)