Saturday, January 3, 2009

Full Circle or Infinite Loop?

After a full day of family history research finding data that supports my own records, I started wondering why it was so difficult to find information on my ancestors in the first place. The information was available on numerous and various sites. “Ain’t the Internet great?” I thought.

And then I started noticing identical sourcing comments, identical typos, my idioms and occasional misspellings. Obviously, there was a lot of copying going on. Everywhere.

Then I noticed some obvious errors that I remember correcting over the past few years. Upon closer inspection of the sites, I recognized that I was looking at my own data that had been posted and reposted over and over. My old errors carried forward on these sites.

I then started searching sites that I don’t use often. Some fine person had posted information on them that tied to my family. Maybe the data would provide clues to help in my own ancestral quest and sourcing of my old data. (Folks of my generation will remember that in the ‘old days’, we rarely wrote down the sources, just notes quoting pieces of the documents we’d found).

The guy posting the data was pretty good. The dates and places matched what I had in my own constantly updated records. Even the source quotes were the same.

And then the house of cards came tumbling down. ‘The Guy’ was me. I’d posted the data over the past few years during my all night research forays and had promptly forgotten all about it.

They say that driving with a lack of sleep is just like driving drunk. I think that researching from 7 pm to 7 am is just as detrimental to your cognitive abilities as drunk driving. I literally don’t remember posting the data, but there was my name as the poster and sure enough, my login allowed me to change the data. Fortunately, everything I found was ‘posted’ not ‘copied’.

The wholesale copying of data isn’t new to most of us, but my experience certainly reinforced its inevitable weakness. If you haven’t traced the data and proven it yourself, all you have is a story and as often as not, the story is wrong or full of grievous errors.

Will our posted data ever fade from the scene? Will our errors ever be stamped into submission by well sourced corrections? Are we forced to live with them forever?

The movie “Notting Hill” was playing in the corner of one of my screens while I was finding this mess. Julia Roberts said something to the effect that even though today's newspapers will soon line the bottom of canary cages, a copy of the data is always on file and would always be resurrected to embarrass us in the future.

I suppose that all but the “perfect” among us data posters will be embarrassed from now on. The bad information will float up from the Ancestral File, the Pedigree Resource File, Ancestry Family Tree postings and of course any of the way-back screen scrapes that exist.

A number of years ago, I started embedding spellings, idioms and verbiage patterns in my notes and postings that I’d easily recognize in a future day. It is ‘interesting’ watching them surface over and over. Adding copyright statements isn’t relevant in relation to governmental records of dates and places and apparently isn’t all that useful when applied to the text you include in your postings either.

A few years ago, I attended the funeral of one of my uncles only to hear his eulogy parrot the one I gave at my brother’s funeral a few months earlier. Of course the names and dates and a few sentences were changed, but it was my original text, quotes and dialog. Someone had passed on a copy of my text and apparently it fit the bill in another grieving situation.

It was an interesting experience and unfortunately caused me to smile and shake my head at the wrong time. However, I did restrain the body shake of a quiet chuckle and thus further embarrassment was avoided.

Avoid seeing someone shake their head at you. Yes indeed, we need to work together in our research but do it aboveboard.

Looking through my own data, I can see notes that I obviously didn’t write. Where and when did they enter my records? I don’t know. Now I have to excise them and follow the source hints to recreate factual records.

Wholesale theft isn’t the answer to doing research. Someone may actually believe your data to be 100% correct and stop doing research that may find the “hopefully few” grievous errors that you’ve introduced.

If we don’t do our own research or confirm the work done by others, the ‘story’ will be perpetuated as fact until it is almost impossible to fix, repeal and supersede with truth.

The New Family Search program and database opens up the opportunity to dispute each other’s ‘facts’ and statements but at present can’t possibly resolve all the errors that submitters have introduced over the years. It won’t be open to the general populace for some time if my ‘sources’ are correct. Go ahead. Quote me on this. I enjoy future embarrassment. An open chuckle is encouraged at this point.

It will be interesting watching how this issue sorts itself out, if it ever does. Until then, make sure you contact the posters of data and work with them. Ask if you can quote their work and / or team up with them in your common research efforts.

If you steal their data and work, shame on you for the theft and if you don’t prove the data yourself, double shame on you for aiding and abetting the perpetuation of any errors in it.

For now, I’m going back and finding additional ‘forgotten’ wee hour postings that I've made. If they are wrong I’ll fix them and quote them in my research notes. Maybe that will close the loop and that “smart guy” I discovered will be properly neutralized. Full Circle. The circuit should be complete and shorted out…. at least in my records.


TK said...

Lee, I swear, this is the best post ever on this topic! One of the reasons history was never a favorite subject of mine--well, okay, in the beginning it was the boring names, dates, and wars that meant absolutely nothing to a little girl, duh!--but as an adult, I am all too aware that history depends a lot on who's telling it, and that no matter how much of it you know, there may be a relevant bit overlooked that changes the story somehow. Even the official version of any bit of history might have errors or omissions or biases, and any interpretation of the official version might also have errors or omissions or biases, so perfection or absolute truth may be a lot to expect.

And of course we all want to keep a grip on what we do, so we can correct our errors when we find them, and we share what we know in hopes that someone will be able to help us find out what we don't know. And we lose our control when someone else adds our data to their own, maybe thinking to use it as a roadmap for more of their own research but not getting around to it in a timely way and forgetting where it came from, and then putting "their own" info out there hoping for the same help, and yes, there's your crazy data loop!

What I appreciate is your ability to shake your head and chuckle, and to wish for a better process without going into a pointless rage over it. Someone told me a long time ago, the only person whose behavior you can control is yourself. Obviously you get that, and I found your comments not only very interesting, but also a pleasure (and kind of a relief, if the truth be told) to read due to your calm and very thoughtful presentation. And yah, I chuckled a time or two as well, always a nice plus!

Westie99 said...

Best genealogy post I've read to date. It wasn't me, though the word verification says "usinsin". I kid you not.

It's easy to scrape the net with Google's help. Yea, just what I wanted, to boast about having the world's largest genealogy database. Un-cross-referenced and unproven.

Can the big guys get full participation?

I remember helping index a few projects including the Automated Genealogy Project and the 1870 US census. Errors abound!

Lots of census errors/info from the distributed CD-ROM volumes out there also. Certainly still has been a big help.

The oldest published book I've read with clues to my New York relatives is probably 1860.

Also saw a scanned letter from 1795.

Inquiries still seem to be the ticket as well the boards on some of the bigger sites.

I'm not missing your point about the ones who are not contacting the sourcer with his own family in the data.

I proclaim you the guardian keeper. Somewhere along those lines.


Lisa / Smallest Leaf said...

I appreciate your comments on these concerns, Lee. I have recently been reminded of the importance of truly building a case for my genealogical conclusions. It is so tempting to jump to accept what seems to be "a fact", when it is truly just one of several possible scenarios.

When we are posting data online, it is even more important to report those sources and/or explain the reasons behind our conclusions.

Thanks again for your well-written post. A good reminder for us all...

Carnival of Irish Heritage & Culture
Small-leaved Shamrock
A light that shines again
100 Years in America

Anonymous said...

It's a bit like trying to find your ancestors in English Parish records, you cannot prove anything unless you can see all the records ever written. Even then there are lots of entries which somehow passed through the process without recording a given name. A positive minefield of errors waiting to happen