
nology of production and reproduction had advanced – with the
advent of the fountain pen, the typewriter, and the mimeograph dupli-
cator – but the basic form of a government file would have been famil-
iar to a public servant transported forward from the Foreign Office of
the early 1850s.1 In 1850, and still in 1950, technological constraints
on the production and distribution of documents compelled officials
to be deliberate in the composition of new records, and limited the
growth of the total stock of official records. A document could more
easily be inferred to be important, to say something authoritatively,
because it was not easy to produce.


199
Blacked Out


To a large degree this conception of “the official file” still perme-
ates popular consciousness. The narrative about bureaucratic mis-
conduct that is constantly replayed in news and fiction still hinges on
the damning, but hidden, government file – a manila folder contain-
ing the “smoking gun” memo, with the words TOP SECRET heavily
inked at its head. Many FOI laws are written with the expectation
that they will reveal these official files; in fact, the laws are drafted on
the assumption that this conception of “the official file” is an accu-
rate one. The archetypal FOI request is one that seeks the disclosure
of a bounded number of tangible records that are presumed to say
something definitive about government policy.
The time when this conception of the official file was defensi-
ble in the advanced democracies is now long past. Over the last
three decades, advances in information and communication tech-
nologies have caused profound changes in the character of infor-
mation held within government agencies. In many instances, elec-
tronic media have replaced paper as the preferred method of stor-
ing information. The number of transactions that are documented
in digital form has exploded, and the number of forms in which
digitized information may be encapsulated – word processing docu-
ments, spreadsheets, presentation files, e-mails, structured databases,
audio or video recordings, and so on – has grown. The cost of revising
records has plummeted, causing a rise in the number of versions that
may exist for any one record. The stockpile of government informa-
tion has been liquified – broken down into a vast pool of elements
whose significance, taken independently, is not easily grasped.
The metamorphosis of official information is already changing the
battle over governmental openness. The struggle for access to “struc-
tured data” – the digitized information held in massive governmental
databases – has been underway for decades, while the fight over access
to the much larger pool of digitized “unstructured data” held by gov-
ernment agencies is still in its very early stages. In either case, the
digitization of government information could have the unintended
consequence of producing dramatic increases in transparency. But
this outcome is not a given; on the contrary, there are strong bureau-
cratic and political forces that may prevent it. Nor is it clear that
we should want such an outcome – particularly if the information
at stake is personal data, or if disclosure has the effect of crippling
government’s ability to act effectively.

200
Liquid Paper


Structured data
The revolution in information and communication technologies has
wrought two broad changes in the pool of information held by gov-
ernment agencies. The first is the growth of large electronic databanks
that contain details about routine government activity, and about the
businesses and individuals with whom government agencies inter-
act. Large databases are not themselves novel: Early government
projects such as the post-Civil War pension or the national census
in the United States required the mass aggregation of information
about citizens. However, this data existed in unwieldy paper form;
the process of digitization, which gained momentum in the 1960s,
dramatically reduced the cost of duplicating and manipulating such
information. The application of technology to work processes also
meant that agencies began to collect large amounts of information
about their internal operations in new digitized databases.2 Because
the information contained in these databases is highly standardized –
containing similar details for each person or company, for example –
it is sometimes known as “structured data.”
The emergence of large digitized databases in the years following
the Second World War roughly coincided (in the United States) with
the strengthening of laws that established a right to information held
by government agencies. It was inevitable that the two trends would
eventually collide. In the last two decades, many groups outside
of government have become adept at exploiting the opportunities
posed by the accumulation of digitized structured data within public
agencies.
Journalists, for example, have become increasingly skilled at using
bulk electronic data to scrutinize government operations. In fact,
this has become a well-defined field of journalistic practice, known
as computer-assisted reporting or CAR. Some major media outlets,
such as the New York Times, have CAR editors, and since 1989 the
field has had its own support organization, the National Institute for
Computer-Assisted Reporting, that acts as a clearinghouse for key
government databases.3
In 2000, the Times used data from the Fatality Analysis Reporting
System – a database maintained by the U.S. Department of Trans-
portation – to demonstrate that fatal crashes involving Ford Explorer
sport utility vehicles were three times as likely to be related to tire
failures as fatal crashes involving other brands. The Times’ stories
substantiated concerns about the reliability of Firestone tires that
were routinely installed on new Explorers. Despite growing contro-
versy over Firestone’s tires, budget-constrained federal regulators had
not detected the pattern in their own database.4
A later New York Times analysis of data collected by the fed-
eral Occupational Safety and Health Administration revealed a long-
standing failure to seek criminal prosecution of employers whose will-
ful violation of safety rules had caused worker deaths. In an echo of
the Ford case, the agency had never studied its own data on deaths
caused by deliberate noncompliance with regulations.5 In 2004, the
Times used data collected by federal railroad regulators to demon-
strate inadequacies in procedures intended to reduce the number of
deaths caused by collisions with trains at grade crossings. Its inves-
tigation led to the resignation of a top regulator and legislative pro-
posals for tougher oversight of the railroad industry.6
Other journalists have exploited the potential of computer-
assisted reporting as well. The Newark Star-Ledger, using information
collected by the federal Food and Drug Administration, found that
recalls of faulty medical implants were on the increase, a trend that it
linked to weaker procedures for reviewing new implants.7 In Mother
Jones magazine, reporter Ken Silverstein matched data from three
U.S. agencies – the General Services Administration, the Environ-
mental Protection Agency, and the Occupational Safety and Health
Administration – and found major contractors who continued to work
for government while flouting its environmental and workplace safety
rules.8
Academic research centers and public interest groups have also
tapped government databases. Since 1989, the Transactional Records
Access Clearinghouse (TRAC) at Syracuse University in New York
State has used the Freedom of Information Act to obtain internal
data on the activities of federal law enforcement agencies.9 A 2003
TRAC study suggested that the federal government’s efforts to pros-
ecute cases of alleged terrorist activity had yielded few significant
convictions,10 while another of its studies found a marked decline in
audits of corporate taxpayers and prosecutions for violation of fed-
eral tax law.11 Another organization, the Center for Public Integrity,
combined federal contracting data with data on political contribu-
tions to demonstrate that contracts for post-war reconstruction in
Iraq went to firms that gave heavily to the election campaigns of Presi-
dent George W. Bush.12 In 2004, the center used data from the Internal
Revenue Service to show that political nonprofit organizations had
abused federal rules in ways that understated their level of political
activity, quickly prompting a promise of more vigorous enforcement
of reporting rules by the IRS.13
Environmental advocacy groups seized on the possibilities posed
by the Toxics Release Inventory (TRI), a database established by
the U.S. Environmental Protection Agency under the 1986 Emer-
gency Planning and Community Right-to-Know Act (EPCRA). EPCRA
required companies to report regularly to EPA about their use of listed
toxic chemicals, and contained the unusual stipulation that these
reports should be combined in a database “accessible by computer
telecommunication” to the public. Environmental groups quickly
used early rounds of TRI data to shame heavy polluters, often with a
remarkable impact on industry behavior.14
The internet – a largely unknown technology at the time that
EPCRA was drafted – gave advocacy groups the ability to go fur-
ther, creating their own websites that allowed the public to search
TRI data for information about polluters in their own community.15
By the end of the 1990s, the Clinton administration was promot-
ing TRI as an archetype of a powerful new approach to regulation, in
which nongovernmental organizations collaborated with government
to achieve regulatory objectives without resorting to conventional and
heavy-handed enforcement measures.16
The advances that journalists and nongovernmental organiza-
tions have made in exploiting stockpiles of structured data within
government agencies have been significant, but should not be over-
estimated. One major difficulty has been the lack of resources for
pursuing this sort of work: Extracting and analyzing data can be a
time-consuming and technically demanding task. (And if this is true
with regard to the community of media and nongovernmental organi-
zations that surrounds the U.S. federal government, it is doubly true
with regard to the community that surrounds U.S. state and local
governments, or even the national governments of other advanced
democracies.) The need for a heavy investment of resources has been
aggravated by the strong and continued opposition of government
agencies, and private industry as well, to the release of structured
data.

For three decades, many federal officials resisted the idea that
there could be any right under the Freedom of Information Act to
information contained in government databases. As a technical mat-
ter, many databases were not designed with the possibility of public
access in mind; they were built for internal use and lacked features
that would allow data to be easily exported for use by nongovern-
mental organizations. In these cases, new computer programs had to
be written to make possible the extraction of data. This meant added
work for agency staff, and even more difficulties for smaller agencies
that lacked staff with the ability to do the programming.
In a 1989 survey undertaken by the U.S. Department of Justice,
over fifty federal agencies took the position that they had no obligation
under FOIA to do special programming to extract information from
their databases – and if information was extracted, they had no obli-
gation to provide the information in easily managed electronic form
rather than in less useful print formats. Any other position, depart-
ments warned, would “seriously disrupt their operations” and pos-
sibly make the entire FOIA program untenable.17 Added to this was
bureaucratic frustration with the uses to which information was put
once released from government databases. Many officials complained
that FOIA would be corrupted into a tool for businesses’ exploitation
of commercially valuable government data.18 Others protested that
nongovernmental organizations often used internal agency data to
present a misleading and unflattering view of their operations.
This bureaucratic resistance produced cases such as
Public Citizen v. OSHA, which grew out of an attempt in 1985 by
Public Citizen to gain bulk data on enforcement actions by the fed-
eral Occupational Safety and Health Administration. OSHA refused
the request for information, arguing that it had no obligation to do
the programming needed to extract the data. In a contemporaneous
case, Dismukes v. Department of the Interior, federal officials insisted
on providing oil and gas leasing data on microfiche, even though the
data was also available in electronic form. A federal court upheld
the department’s position.19 Several other courts were equally hos-
tile to FOIA requests for bulk data.20 Finally, Congress stepped in,
amending the Freedom of Information Act in 1996 to make clear
that departments had an obligation to extract bulk data from their
databases and – reversing the Dismukes decision – an obligation to
provide data in easily manipulable digital formats.21

The 1996 changes – known as the Electronic Freedom of Infor-
mation Act Amendments, or EFOIA – improved matters, but official
balking at requests for electronic data also continued. In 1998, the
Department of Housing and Urban Development refused a request
for access to a database on money it owed to mortgagees, arguing
that the request would be “extremely burdensome”; the refusal was
eventually overturned by a federal court two years later.22 In the
same year, defense officials argued that the work of extracting data
from a database on malpractice claims against military medical staff,
requested by the Dayton Daily News, would be too onerous; a federal
court disagreed, and the Daily News later won a Pulitzer Prize for its
reporting on the subject.23
Despite EFOIA, Syracuse University’s Transactional Records
Access Clearinghouse also dealt with recurrent efforts by the Depart-
ment of Justice to withhold information from its database on federal
criminal investigations and prosecutions. The department was stung
by the reports of the clearinghouse, which seemed to reveal weak-
nesses in federal efforts to enforce antiterrorism laws; it responded
by attempting to argue that release of the data jeopardized pub-
lic security.24 In 2004 TRAC complained that the Internal Revenue
Service had also stopped complying with court orders that required
the release of bulk data on its enforcement of tax laws. The Center
for Public Integrity encountered a novel claim while attempting to
extract information from the federal government’s database of for-
eign government lobbyists: The Department of Justice claimed that
the database had become so fragile that an attempt to process the
request risked a program crash and “major loss of data.”25 The case
is not unusual; it is one of several instances in which rapid techno-
logical change has made databases practically inaccessible.26
Governments have also had more material incentives to resist the
release of structured data under FOI law. Throughout the eighties
and nineties, budget-constrained government agencies attempted to
find ways of realizing the commercial value of the information locked
in their databases – either by selling the information directly, or by
enlisting businesses to refine and market their databases. (“The con-
