Disseminating and Analyzing Longitudinal Historical Data
Amsterdam, 21 March 2006
Program committee: George Alter and Kees Mandemakers

Population register 1850-1860, Moerkapelle (Zuid-Holland)
The historical community is now fortunate to have a growing number of large-scale, public databases of life histories from the past. Some of these databases have been under development for a long time, such as the Demographic Database in Umeå, the Utah genealogical database, the Scania database in Lund, and the PRDH and BALSAC in Quebec. Others are relatively recent, such as the Historical Sample of the Netherlands, and data collection activities are underway in other European countries as well as Japan and China. See also: Questionnaires longitudinal databases.

Although many of these databases are intended to be public resources and available to any qualified researcher, relatively little work has been conducted by researchers not affiliated with the organization that collected the data. In comparison to the resounding success of the IPUMS project, which made historical census data commonplace in the social sciences, historical longitudinal databases have not found their way into the mainstream of social science research. Since longitudinal databases are exceptionally rich and address a host of questions not covered by cross-sectional data, the difference is striking. This workshop is intended to examine this problem.

By organizing a workshop on the dissemination and analysis of historical longitudinal data, the HSN wants to bring together two different worlds: the world of the builders of databases, who know the pitfalls of the data and face the problem of providing data to a diverse group of users, and the world of users, who often struggle to convert the public databases into forms suitable for analysis and hypothesis testing.

The workshop is structured in four sessions. In the first session 'database-builders' will discuss some of the challenges faced by managers of longitudinal historical databases. The second session will focus on potential uses of these databases in biomedical research. The third session will examine life histories reconstructed from non-longitudinal sources, like civil status registers, tax records, or census data.

Finally, we will discuss the big question: How do we make longitudinal databases easier to use? We will concentrate on issues like:
- How do we construct descriptions of life courses, households, and kinship over time (in theoretical and practical terms)?
- What are the key organizing structures of these databases: events, individuals, households, kinship?
- How do we deal with information at multiple levels, such as the individual, household, and community?
- How do we structure records to capture kinship and intergenerational relations?
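
As a hedged illustration of the last question, wider kinship can be derived from nothing more than parent links stored on each individual record. The sketch below is an assumption made for illustration, not the record format of any database named above; the `Person` fields and all identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    """One individual record; the field names are hypothetical."""
    pid: str
    mother: Optional[str] = None
    father: Optional[str] = None


def parents(people, pid):
    """Return the set of known parent identifiers of one person."""
    p = people[pid]
    return {x for x in (p.mother, p.father) if x is not None}


def grandparents(people, pid):
    """Follow parent links twice; parents without a record are skipped."""
    result = set()
    for parent_id in parents(people, pid):
        if parent_id in people:
            result |= parents(people, parent_id)
    return result


def siblings(people, pid):
    """Full and half siblings: everyone sharing at least one known parent."""
    own = parents(people, pid)
    return {q.pid for q in people.values()
            if q.pid != pid and parents(people, q.pid) & own}


# Invented example family: "c" is a half sibling of "a" and "b".
people = {p.pid: p for p in [
    Person("gm"),
    Person("m", mother="gm"),
    Person("f"),
    Person("a", mother="m", father="f"),
    Person("b", mother="m", father="f"),
    Person("c", mother="m"),
]}

print(siblings(people, "a"))
print(grandparents(people, "a"))
```

Storing only the two parent links keeps each record small and leaves every wider relation (siblings, grandparents, cousins) to be computed on demand; whether that trade-off suits a given database is exactly the kind of question listed above.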

  • 09.00 - 09.30   Informal start (coffee)

  • 09.30 - 09.45   Welcome

  • 09.45 - 10.30   Opening Remarks
    Chair: Lex Heerma van Voss
    George Alter, Myron Gutmann, and Kees Mandemakers,
    "Problems and Possibilities for Distributing Longitudinal Historical Data"

  • 10.30 - 11.45  The Role of the Database Administrator
    Chair: Myron Gutmann
    Bertrand Desjardins, "Should database administrators restrict access to datasets?"
    Sören Edvinsson, "The Scylla and Charybdis of open research databases - between complex and well-prepared"
    Steve Ruggles, "The Case for Open Access to Data"

  • 11.45 - 12.00  Short break (coffee, cookie)

  • 12.00 - 13.00  Historical Data and Biomedical Research
    Chair: Tommy Bengtsson
    Marc Tremblay and Hélène Vézina, "Genealogical databases and their use in genetic research: the Quebec experience"
    Geri Mineau, "Opportunities and data requirements for biomedical and demographic research with longitudinal historical sources"

  • 13.00 - 14.00  Lunch

  • 14.00 - 15.00  Life Histories from Discontinuous and Heterogeneous Sources
    Chair: Michel Oris
    Gunnar Thorvaldsen, "Longitudinal and bitudinal microdata, problems of linkage and representativity with special reference to censuses"
    Jerome Bourdieu, Joseph Ferrie, Lionel Kesztenbaum, "A tale of two datasets: Generating, coding, and disseminating historical longitudinal data from France and the U.S."

  • 15.00 - 15.15  Tea break

  • 15.15 - 17.30  Discussion: How do we make longitudinal databases easier to use?
    Chair: Kate Lynch
    The longitudinal character of life history databases raises a number of important problems. Household composition changes, raising the question: what is a household and what is not? Relationships between individuals change over time. How can changing relationships be defined and reconstructed? Because of these problems, it is much more difficult to find a standard data format for longitudinal data than for unlinked cross-sectional datasets. On the other hand, common standards would have great benefits in promoting more comparative research with data from different times and places.

    The following list suggests common problems that should be resolved first conceptually and then technically. A common conceptual framework is a precondition for using standard data formats. These data formats may be implemented directly, in the process of transforming source data into output data, or indirectly, by transforming the output of a specific database into standard forms that function as a kind of umbrella covering all longitudinal databases.

    We propose the following areas in which standards across databases are needed:
    1. Identifying the principle(s) that organize inclusion in the database over time: individuals, marriages, households, ...
    2. Specifying time at risk for various kinds of events. Procedures for handling incomplete dates.
    3. Locating subjects as they move from household to household and place to place, so that household composition and neighborhood context can be reconstructed.
    4. Representing family relationships, so that kinship can be reconstructed as widely as possible.
    5. Procedures for moving from database to analysis.
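
    To make points 2 and 3 concrete, here is a minimal sketch under stated assumptions: each person's presence in a household is stored as a dated residence spell, from which household composition on a given day and time at risk within an observation window can be derived. The `Spell` record, its field names, the helper functions, and the example data are all hypothetical and do not describe any existing database or proposed standard.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Spell:
    """One residence spell; the field names are hypothetical."""
    pid: str
    household: str
    start: date
    end: Optional[date]  # None means still present (right-censored)


def members_on(spells, household, day):
    """Household composition on one day (start inclusive, end exclusive)."""
    return {s.pid for s in spells
            if s.household == household
            and s.start <= day
            and (s.end is None or day < s.end)}


def days_at_risk(spells, pid, window_start, window_end):
    """Days a person is under observation inside [window_start, window_end)."""
    total = 0
    for s in spells:
        if s.pid != pid:
            continue
        lo = max(s.start, window_start)
        hi = window_end if s.end is None else min(s.end, window_end)
        if hi > lo:
            total += (hi - lo).days
    return total


def date_bounds(year, month=None, day=None):
    """Earliest and latest dates consistent with an incomplete date."""
    if month is None:
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        last = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last)
    exact = date(year, month, day)
    return exact, exact


# Invented example: p1 moves from household h1 to h2 on 1 May 1855.
spells = [
    Spell("p1", "h1", date(1850, 1, 1), date(1855, 5, 1)),
    Spell("p1", "h2", date(1855, 5, 1), None),
    Spell("p2", "h1", date(1852, 3, 10), None),
]

print(members_on(spells, "h1", date(1856, 1, 1)))
print(days_at_risk(spells, "p1", date(1855, 1, 1), date(1856, 1, 1)))
print(date_bounds(1856, 2))
```

    Representing an incomplete date as a pair of bounds, rather than imputing a single day, is only one possible convention; it keeps the uncertainty visible to the analyst at the cost of more complex downstream code.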

    The discussion will begin with an overview of longitudinal historical databases based on a questionnaire distributed in preparation for the workshop.

  • 18.30 - 21.30  Other location: Drinks/dinner