I’ve been asked to create a tracking database for one of the studies my group is doing, and am running up against the limits of my knowledge as far as how to deal with the non-normalized data file that needs to be included.
About half of the information used for tracking comes from a flat-file, non-normalized Excel worksheet which our SAS programmer sends me. It has fields like MRN_INFANT_1, MRN_INFANT_2, and MRN_INFANT_3 (which will cause trouble if a set of quadruplets comes along during the study), and so on. Some of these essentially-duplicated fields I can ignore, I think. Dealing with the toxicology results, for instance, I’ll leave to the SAS programmer who set up the file in the first place.
Some of them I can’t ignore, though, and I’m wondering how to deal with them. Each infant’s contact information, for instance (we can’t assume that an infant shares contact info with its mother; some don’t). Is there a way to set this up so that a normalized side table can interact with the non-normalized Excel data, or should I grit my teeth and create a mess of a table with fields like INFANT_1_PHONE, INFANT_2_PHONE, INFANT_1_ADDRESS, INFANT_2_ADDRESS and the like?
Any advice or pointers to other references will be most appreciated!