Oh well, a new person to the field, with ideas shaped by another,
to whine some about what's available. Nothing new there. But maybe
my whining can provide targets (some things I complain about might
be solved) or, as we continue, some support for doing certain things
could develop. I could write some suitable software to implement
certain ideas, if it looked worthwhile.
I've done some back reading as I get into the subject, including
the gedcom/xml arguments, and am not really trying to go back to those.
One interesting thing to me was the mention of the GENTECH
Genealogical Data Model. The sad news there being that, apparently,
nobody actually implements it. Or anything particularly close.
I come to the computing/data from a science field (oceanography)
and one of the things which has promptly bothered me is that the
software available (paf, legacy, reunion) seems far too aimed
at conclusions rather than evidence, and even more poorly aimed
at representing source information trails.
The evidence trail is something particularly bothersome
to me. From my field, let's say our original observation is that it
was 22.2 C. Now, if that was all we had, we'd be ticked, because it
doesn't tell us when the observation was taken, where it was, or
how it was taken. All these metadata are important, and usually you can
get them (with sufficient patience and phone calls, rather like
genealogy in that, it seems).
But that is only the proverbial tip of the iceberg. Because
that 22.2 C observation (with the rest of its support) is almost certainly not
exactly the number we're going to use for analyzing the air-sea
heat flux, or sea surface temperature, or whatever it is we're doing.
The thing is, each observing method has biases. We know this, so
adjust for them as relevant to our problem at hand. The problem that
we _could_ run into is that the 22.2 we now see is not the actual
original observation. Someone could already have made the adjustment
for intake temperature bias. How we avoid this is that the data
are (supposed to be) given histories. The original observation
(and its metadata) are augmented by a new value and _its_ metadata
(22.4 C after George applied John Doe's intake temperature bias
correction, say), and this additional information then follows along.
I could decide that John Doe's correction method is not the best,
and instead apply, myself, Mary Roe's -- to the original 22.2, now
that I know the 22.4 came from somebody else applying a correction I
don't like. Not clear to me yet (I've been doing
some light reading of the data model document, but not carefully
nor completely) whether the GENTECH model supports this sort of consideration.
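To make the idea of a data history concrete, here is a minimal sketch in Python. It is only my illustration, not anything taken from the GENTECH model or from any existing package: the class name `Value` and its fields are hypothetical, and the 22.3 for Mary Roe's correction is a made-up placeholder, since I gave no figure for it above. The point is just that a correction adds a new value that points back to the one it came from, rather than overwriting it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Value:
    """One value in an observation's history (all names are hypothetical)."""
    celsius: float
    note: str                         # how this value came about
    by: str                           # who recorded or derived it
    parent: Optional["Value"] = None  # the value this one was derived from

    def derive(self, celsius: float, note: str, by: str) -> "Value":
        """Record a new value without overwriting the one it came from."""
        return Value(celsius, note, by, parent=self)

    def original(self) -> "Value":
        """Walk back through the history to the raw observation."""
        v = self
        while v.parent is not None:
            v = v.parent
        return v

    def history(self) -> List[str]:
        """Oldest-first description of every step behind this value."""
        steps = []
        v = self
        while v is not None:
            steps.append(f"{v.celsius} C: {v.note} ({v.by})")
            v = v.parent
        return list(reversed(steps))


raw = Value(22.2, "original observation", "observer")
johns = raw.derive(22.4, "John Doe's intake temperature bias correction", "George")

# I don't like John Doe's method, so I go back to the original and apply
# Mary Roe's instead. 22.3 is a placeholder, not a real corrected value.
marys = johns.original().derive(22.3, "Mary Roe's correction", "me")

for line in marys.history():
    print(line)
```

Because George's 22.4 carries its own history, I can tell it is already corrected and avoid stacking a second correction on top of it.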
A different problem is that the typical software seems
to have little or no ability to track exactly what the
evidence and sources are. For instance, it seems that if I import a
file from someone and they cite a census record, I have my choice of
ignoring that _my_ source was Jane Genealogist, not the original record,
and preserving the census citation, or I can _add_ Jane as a source.
Now this is a problem, in my mind. When I look later, it will show
two sources -- the census, and Jane. But my real state of knowledge
is only that Jane _said_ the census had some information. This isn't
two independent sources, it's one source, one step removed from the
primary document. (Please, no jumping on that usage, I realize that
there's a trade meaning to the term 'primary document', and census
isn't an example.) What I want the software to do is, when I import
a file that has citations, mark that my source is Jane, and her
sources were ... whatever she said. If I'm making a 20th generation
copy/import (of a copy of a copy ...), then the software should show
the prior 19 importers as well as the original person who looked at
a document. GENTECH seems to support this concern of mine, but
with no implementation thereof, I'm still sol.
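What I have in mind for imports can be sketched in a few lines of Python. Again this is only an illustration under my own assumptions, not the GENTECH model and not how any existing program works; `Citation` and `import_from` are hypothetical names. Each citation records what I actually consulted, plus what that source said it relied on, so an import wraps the existing citation instead of replacing it or sitting beside it as if it were independent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Citation:
    """A source, plus whatever that source said it relied on (hypothetical)."""
    source: str                         # who or what was actually consulted
    cites: Optional["Citation"] = None  # what that source claimed as its source

    def chain(self) -> List[str]:
        """Every link from my immediate source back to the claimed document."""
        names = []
        c = self
        while c is not None:
            names.append(c.source)
            c = c.cites
        return names

    def steps_removed(self) -> int:
        """How many hands the information passed through before reaching me."""
        return len(self.chain()) - 1


def import_from(importer: str, their_citation: Citation) -> Citation:
    """Importing wraps the existing citation rather than replacing it."""
    return Citation(importer, cites=their_citation)


census = Citation("census record")
mine = import_from("Jane Genealogist", census)

print(" -> ".join(mine.chain()))
print(mine.steps_removed())
```

This shows one chain, Jane and then the census she says she read, rather than two sources side by side. Each further import would add one more link to the front of the same chain.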
--
Robert Grumbine http://www.radix.net/~bobg/
Science faqs and amateur
activities notes and links.
Sagredo (Galileo Galilei) "You present these recondite matters with too much
evidence and ease; this great facility makes them less appreciated than they
would be had they been presented in a more abstruse manner." Two New Sciences

