
How do Locations work

Coordinator
Aug 12, 2013 at 11:17 PM
Can I also ask how do you read the locations? Do they need to be recorded in a particular style?
Under Countries I get the complete address in the Country column with all other columns empty.
In the Regions list I have a mixture of details in the Country and Region columns with the rest empty.
In the Parishes section Country again is mixed (country / town / county), Region and Parish are a bit of everything except country, and the last 2 columns are empty.
Addresses seems to be more consistent, with Country mainly correct: Region is mainly the county, Parish primarily the town, Address mainly the street, with Place empty.
Places is again fairly consistent with the Place column mainly being the street.
I guess that a comma is the delimiter that is recognised and that the columns are populated in reverse, with the last entry going in the first column and so on. Not sure though why the different tabs are populated so differently.

Perhaps if I were to follow a consistent format this would correct itself - any advice would be appreciated as to the best method to use.

thanks

Bob
Coordinator
Aug 12, 2013 at 11:18 PM
Locations takes a location record from any fact and decodes it using commas. So each comma separates part of the location. As it loads a file it tries to work out the country/county if those details are missing.

e.g. many people don't bother entering a country name, and some don't enter a county (or region) name. If it can, the program will "correct" this.
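
As a rough illustration of the comma decoding described above, here is a minimal Python sketch. The lookup tables and the function name `decode_location` are assumptions made for this example; the thread does not show FT Analyzer's actual code or reference data.

```python
# Hypothetical reference data, invented for this example only.
KNOWN_COUNTRIES = {"England", "Scotland", "Wales", "Ireland"}
KNOWN_COUNTIES = {
    "Lancashire": "England",
    "Hampshire": "England",
    "Aberdeenshire": "Scotland",
}


def decode_location(place):
    """Split a place string on commas and return its parts, broadest level first."""
    parts = [p.strip() for p in place.split(",")]
    parts = [p for p in parts if p]  # drop empty elements
    parts.reverse()  # the last element entered becomes the first column
    if parts and parts[0] not in KNOWN_COUNTRIES:
        country = KNOWN_COUNTIES.get(parts[0])
        if country:  # county recognised, so supply the missing country
            parts.insert(0, country)
    return parts
```

With these tables, "Preston, Lancashire" gains a country, while the misspelt "Preston, Lanccashire" is left alone and so would surface at the top level.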

The locations tab then displays the various levels. Ideally in the countries tab all you would have is a list of countries recognisable by the UN or at least as they would commonly be known!!

However, if you have missed out a country name then your countries tab might have odd things like "Lanccashire". This is because a typo in the name of the county meant the program didn't match it to Lancashire and so left it alone. Fix the typo and it will be treated as Lancashire, England. Note that you can double click on a location name to see who you have recorded at that location.

The idea is that if you want to tidy up your locations you can use the various tabs in order to see how consistent you have been. First fix up your countries to see if they all make sense as actual countries. Once you've got a lovely list of countries fixed, have a look at the region tab and see if all the regions in your file make sense. Maybe you have both England, Derby and England, Derbyshire? Or you have England, Srhopshire, etc. The idea is you can weed out errors in the counties. Then look at parishes and weed out spelling mistakes and inconsistencies there, and so on.

Does that make it clearer?
Editor
Aug 17, 2013 at 9:14 PM
I've been struggling to get my head round the sub-tabs in the Locations tab. I think that there is a huge amount of duplicated data, but I may be wrong.

I think the tab which is labelled "Countries" is the final CSV element from the address of any event with a GEDCOM "PLAC" string. If an address happens to have only one line (probably because the address is incomplete) or is truly only known to a country-level, it will appear only on this tab.

If an event has an address with two or more CSV items, that address will appear on both the Countries and Regions tabs. Likewise events with three or more CSV items appear on the Countries, Regions & Parishes tabs, but not those with only one or two lines.

As a way of pointing a spotlight at data inconsistencies it's a hugely useful tool, particularly the click-through feature.

But I'm wondering if all that is required to achieve the same end result is a single table with all the addresses, including some rows where one or more columns may be blank. Or perhaps you've tried and dismissed that idea?

On either approach, I'd suggest the following would be hugely helpful

(1) the ability to sort on any column, for example to be able to answer the question "Does the same Parish appear in multiple Regions" (it may be an error in my data or just reflect shifting county boundaries.)

(2) to have house numbers stripped from street address (if such can be easily recognised) and placed in their own column. House numbering in major cities as they expanded was extremely fluid over the years, so being able to easily see all occurrences of the same street name (and its mis-spellings) would, imho, be extremely helpful.
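
The question in (1) can also be answered by grouping rather than sorting. The sketch below is only an illustration: the row layout (country, region, parish) and the function name are assumptions, not part of FT Analyzer.

```python
from collections import defaultdict


def parishes_in_multiple_regions(rows):
    """rows: iterable of (country, region, parish) tuples.

    Returns {parish: sorted list of (country, region)} for every parish
    that is recorded under more than one region.
    """
    regions_by_parish = defaultdict(set)
    for country, region, parish in rows:
        if parish:  # ignore rows with no parish recorded
            regions_by_parish[parish].add((country, region))
    return {
        parish: sorted(regions)
        for parish, regions in regions_by_parish.items()
        if len(regions) > 1
    }
```

Whether a hit is a data error or a genuine boundary change still has to be judged by the user.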

Or have I completely misunderstood??
Coordinator
Aug 17, 2013 at 10:56 PM
Edited Aug 17, 2013 at 10:58 PM
Yes you have largely understood. I specifically went for multiple tabs however as it is dramatically easier to see in the countries tab if you have only 20 rows if there are errors. Similarly on the regions it is a lot easier to see errors in a couple of hundred rows.

I also specifically exclude things that don't have a value in the last column. Perhaps I should include them too?

I've resisted the sorting idea as that could cause confusion as to the purpose of this report. However I do now agree that your suggested use does have a lot of merit.

One of the major issues of this report is the complete lack of any useful documentation :) Unfortunately, as a programmer, end-user documentation is a very weak point of mine.

Stripping out house numbers is possible but tricky, given that depending on the level of detail the number may appear in Address or Place. e.g. I have an address that gets displayed as England, Glocestershire, Bristol, St.James, 9 Bloomsbury Court. which is all 5 fields. I also have ones like England, Yorkshire, Rotherham, Kimberworth, Grocers Shop. 23 Sarah Street. which has the number in the middle. Also ones like Scotland, Aberdeenshire, Aberdeen, 26 Gallowgate, Union Court which is one of 2 courtyards (back alleys) off the front end. Basically the old slum type, one door that led into a close with 2-3 courtyards into buildings at right angles to the main street. Common, I believe, in denser packed cities, especially London.

So those are some examples of why stripping house numbers isn't that straightforward. I can put it on the to-do list but I'm not confident of how it can be tackled.
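
To make the difficulty concrete, here is a hedged Python sketch of one possible heuristic. The pattern and the function name `split_house_number` are assumptions for illustration only. It recognises digits (with an optional letter suffix) at the start of an element or straight after a full stop and a space, and it would need to be run against both the Address and the Place element because the number can sit in either.

```python
import re

# Assumed pattern: a number such as "9", "23" or "123a" that starts the
# element, or that follows ". " inside it, and is followed by whitespace.
HOUSE_NUMBER = re.compile(r"(?:^|(?<=\. ))(\d+[A-Za-z]?)\s+")


def split_house_number(element):
    """Return (number, remainder); number is None when nothing is recognised."""
    match = HOUSE_NUMBER.search(element)
    if not match:
        return None, element
    remainder = element[:match.start()] + element[match.end():]
    return match.group(1), remainder.strip()
```

This copes with the three examples quoted above, but it is easy to think of entries it would get wrong, such as a street whose name itself begins with a number.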
Editor
Aug 19, 2013 at 11:21 AM
I specifically went for multiple tabs however as it is dramatically easier to see in the countries tab if you have only 20 rows if there are errors.
Might the use of a TreeView widget based approach give a similar effect to multiple tabs, if your data structure can easily accommodate it?

Initially only the top level (last address line) should be visible, giving the same level of detail as the current tab labelled "Countries". Clicking the "plus" adjacent to a specific country would expand only that section of the tree, with the advantage of keeping the exposed data to more manageable proportions and providing a level of detail similar to the current Regions tab, and so on.

However, a TreeView based approach would not be as amenable to sorting on a specific sub-field. So perhaps the Location tab might evolve to have just two sub-tabs: a TreeView, and a sortable GridView not dissimilar to the current "Places" tab but which includes all addresses with fewer than five CSV elements, padded as necessary (a full list of all addresses found).
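
A data structure that would feed such a tree can be sketched as nested dictionaries. This is an illustrative assumption about how the data could be arranged, not a description of FT Analyzer's internals; `build_location_tree` is a hypothetical name.

```python
def build_location_tree(locations):
    """Build a nested dict from locations given as lists, broadest level first.

    Each key is a location name and each value is the dict of its children,
    so the top-level keys correspond to the current "Countries" tab.
    """
    tree = {}
    for parts in locations:
        node = tree
        for part in parts:
            node = node.setdefault(part, {})  # descend, creating nodes as needed
    return tree
```

Expanding a node in the widget would then amount to listing the keys of the matching dictionary.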

Perhaps approaches to stripping out house numbers is a topic in its own right for later
Editor
Aug 21, 2013 at 9:41 PM
Edited Aug 21, 2013 at 10:03 PM
Levva wrote:
I specifically went for multiple tabs however as it is dramatically easier to see in the countries tab if you have only 20 rows if there are errors. Similarly on the regions it is a lot easier to see errors in a couple of hundred rows.
It is probably a little bit late to agree with your comment as you have now changed it in version 2.1.0.0
I was just getting my head around the way it worked and just started a document for the Locations tab - will have to re-write it now.

Personally I find the new layout with what appears to be every entry on every tab very confusing, and it makes it more difficult to pick out the errors - and there are many of them in my tree.

If you wish to stick with the new method I would suggest that a sort by column option would be essential.

I have also noticed that where I have used Durham as the county, FTA has now determined that to be the city and added the entry County Durham. I guess I will have to change all my entries to match. Will it affect searches on FMP etc where Durham is normally used as a county?
Coordinator
Aug 21, 2013 at 10:33 PM
Nothing fundamental has changed with v2.1 and locations. There is the cosmetic display in bold of recognised locations and items that previously disappeared when there was no data at a lower level have re-appeared.

The basic principle if you wish to tidy up location data is:
  • Open the countries tab and view the countries.
  • Are there things there that are clearly not countries?
  • If so double click on the location to see who has that location recorded and fix that in your family tree software
  • Repeat with other "invalid" countries
  • Export your tree and see the results so far
  • Repeat until your countries resemble proper countries (please report any proper countries the program fails to recognise)
By this stage you should have data that's a lot more consistent. So you can move on to viewing Regions and repeat the process by looking through your list of regions and fixing things which at this stage should be limited to typos or different ways of recording the same region. eg: Lacashire.

By adopting this approach (find what's wrong and who it is wrong for, fix it, and see how far you've got) you gradually improve the quality of the data and remove inconsistencies.
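
For anyone who wants to pre-screen an exported list for typos of the "Lacashire" kind before working through the tabs, a small Python sketch using the standard library's `difflib` is shown below. The list of known regions, the 0.8 cutoff and the function name are all assumptions for this example; the thread does not say that FT Analyzer does any fuzzy matching.

```python
import difflib

# Hypothetical reference list; a real one would be far longer.
KNOWN_REGIONS = ["Lancashire", "Hampshire", "Shropshire", "Derbyshire", "Yorkshire"]


def suggest_region(name, known=KNOWN_REGIONS):
    """Return the closest known region for a misspelt name, or None.

    None is also returned when the name is already a known region.
    """
    if name in known:
        return None
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.8)
    return matches[0] if matches else None
```

A suggestion is only a hint: the fix itself still has to be made in the family tree software and the tree re-exported.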

I can add a sort feature but I'm really struggling to see how this would help at all. Can you give me an example so I can get my head around it. I'm sure there are perfectly sound reasons I'm just having a hard time visualising them, which is the peril of having got so used to one way of doing things, sorry.
Coordinator
Aug 21, 2013 at 11:39 PM
Added location sorting for v2.1.0.1.

I'd still be interested to hear the usage to see what I'm misunderstanding.
Coordinator
Aug 21, 2013 at 11:57 PM
Edited Aug 22, 2013 at 12:00 AM
Hmm I just noted the comment "the new layout with what appears to be every entry on every tab".

Every entry most certainly does not appear on every tab. On the countries tab it is "squashed up" to only include what it recognised as countries. If you have locations without country names then the county or even town/parish will appear here; if you haven't got that then you will get the lowest level it could work out. It tries to split on commas. If there are no commas then I could understand it being confusing, as there would be no structure to your data.

Perhaps an example from my own tree?
My countries...
Image

and my Regions (well the top bit)
Image
Editor
Aug 22, 2013 at 5:03 PM
Previously all of the locations with only one element (regardless of whether country or not) appeared in the Country tab, all with 2 elements appeared in the Region tab, all with 3 elements in the Parish tab etc etc. Each tab was only populated by locations where the last element (in the table) related to that tab.
Now Country tab has not changed but Regions tab shows all entries with both 1 or 2 elements, SubRegions shows all with 1, 2 or 3 elements and so on until Place tab shows 1,2,3,4 & 5 element locations.
There may be a reason for this that I have missed but it does make the Place tab heavily populated.

Previously I could quickly look through the relatively few entries in each tab and find the ones that need correcting or more information adding.
However being able to sort a column does enable me to push all of the empty rows for that column to the bottom - thank you.

I hope this explains my comment better.
Coordinator
Aug 22, 2013 at 10:33 PM
Yes thanks. Hopefully the new treeview will make the intention of the original tabs even clearer.
Editor
Aug 23, 2013 at 6:01 AM
Thanks, the Tree View certainly helps with tracking through the different levels of the locations and also highlights where the same "place" appears in two or more levels.

However I feel that I still haven't explained my view clearly enough.
If a location is only a country why does that location appear in all of the other tabs and likewise for each of the other levels?
If the location does not have one of the lower levels surely it should not be listed in the tab for that lower level.

IMHO there are 3 accurate ways of presenting the location reports (excluding the new Tree View):
  1. Only one tab showing all of the locations which can be sorted by the column being examined.
  2. Each tab populated with the priority on the lowest level in the location eg if the lowest level is SubRegion it will not appear in a higher level tab nor in a lower level tab
  3. With the priority on the level itself - so if there is an entry for that level in the location it will be listed in that tab - therefore the Country tab would list every location and each subsequent tab would not list any location where that level is not present.
Whichever is the best one would be determined by the considered use for sorting and examining this information - my preference is for #2 but others I am sure will disagree.
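
Options 2 and 3 can be stated precisely with a short Python sketch. It assumes each location has already been split into parts, broadest level first, with no blank elements; the level names and function names are chosen for this example only.

```python
LEVELS = ["Country", "Region", "SubRegion", "Address", "Place"]


def tabs_option_2(parts):
    """Option 2: list a location only on the tab of its lowest populated level."""
    depth = min(len(parts), len(LEVELS))
    return [LEVELS[depth - 1]] if depth else []


def tabs_option_3(parts):
    """Option 3: list a location on every tab for which it has a value."""
    return LEVELS[:len(parts)]
```

For a location such as England, Hampshire, option 2 lists it on the Region tab only, while option 3 lists it on both the Country and Region tabs.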

Or have I totally misunderstood the whole idea?
Editor
Aug 23, 2013 at 7:31 AM
Edited Aug 23, 2013 at 7:58 AM
My thanks also for the Treeview report.

I think the fact that all the tabs now have column sorting enabled is a great step forward, but I wonder about the benefit of being able to sort the Places on the Countries tab, etc ;-)

@trebor2 - Just to muddy the waters a little, most of the genealogy data managers we individually use will store the component parts of each address we enter in discrete database fields. The labels used will vary from GDM to GDM, but might be, for example, Street, Village/Area, Town/City, County/Region, Country for a UK style address and Street, City, County, State, Country for a US style address - five fields being typical in my experience.

How we populate some or all of those fields will be down to personal preferences. For myself, for example, I know that when adding an event representing a GRO index record, I start with the Registration District as the second line of the address (Area), no third line (Town/City), and then Registration County and Country to complete it.

When your GDM comes to create its GEDCOM export it has choices with respect to addresses. Most will represent the address as a comma separated variable string, putting a comma between each field from your GDM. Some will ignore the blank lines in your GDM but some will respect a blank line and output " ," to represent the missing field.

Now factor in the quirks of user data entry. Most GDMs will allow an address field to contain (eg) "Flat 1, Back Court, 123a High Street" as a single line. So those five GDM address fields might appear in the GEDCOM as anything from one to an unknown number of CSV strings.

And then deciding how to parse that CSV string for reports such as this is a nightmare. I suspect at the very least we should ask for a configuration option at import time to respect or disregard blank lines within an address field in the GEDCOM file. See https://ftanalyzer.codeplex.com/workitem/12070 If the user chooses to have blank lines/missing fields respected, then expect to see blank cells in the various reports, since an address might in the extreme contain just Country and Places (to use FT Analyzer terminology).
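
The suggested import option could look something like the Python sketch below. The parameter name `respect_blanks` and the function name are assumptions; the work item linked above is the place to see what was actually requested.

```python
def split_place(place, respect_blanks):
    """Split a place string on commas.

    respect_blanks=True keeps empty elements as "" so every field holds its
    position; respect_blanks=False drops them and closes up the gaps.
    """
    parts = [p.strip() for p in place.split(",")]
    if not respect_blanks:
        parts = [p for p in parts if p]
    return parts
```

With blanks respected, a five-field address always yields five elements, which is what produces the blank cells mentioned above.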

I think your option 1 is the current "Places" tab. Since that is a full and complete list, I'd prefer not to lose that report.

If I'm understanding your option 2 correctly, if I have an entry with a typo Negland (sic England) and just a Place (with missing places respected), or a full five line address, it would not be revealed until I review the fifth Places tab. Surely most users would want to work through to correct all the Countries and might do the Regions (Counties/States etc) and perhaps some might clean up their SubRegions, but I doubt the majority would consider tackling Address and Place. --- Just my perspective.
Coordinator
Aug 23, 2013 at 9:59 AM
That might be true for TMG, EmmArrBee. However, from what I've seen at least 3 programs store locations as a single line of text, not split up into different fields, and this includes the market leader Family Tree Maker. I can now see why you have focused on this issue in our discussions.
Coordinator
Aug 23, 2013 at 10:10 AM
Trebor, your understanding isn't flawed but it only looks at one aspect of the report's existence.

The primary purpose is not to just list your locations; that is trivial and something that your family tree program should do. The primary purpose is to ANALYZE the data and show where you have errors.

Thus the countries tab shows ONLY the top level of the location you have entered. If you want your data to be structured then this should ONLY be recognisable valid countries. If you simply have a list as you suggest you lose the ability to see if you have valid countries or not at the top level. Although admittedly the treeview probably makes that clearer.

I'm open to ideas but the primary purpose has to be to analyze the data and highlight inconsistencies, not just to list things.
Editor
Aug 23, 2013 at 10:13 AM
Edited Aug 23, 2013 at 10:31 AM
I accept that not all GDM use discrete database fields, but none the less even FTM can create the following GEDCOM....

0 @I235@ INDI
1 NAME Test /Person1/
1 SEX M
1 BIRT
2 DATE BET. JAN - MAR 1881
2 PLAC , Southampton Registration District, , Hamshire, England

And from a somewhat vintage copy of Legacy, something very similar

0 @I1035@ INDI
1 NAME Test /Person2/
2 GIVN Test
2 SURN Person2
1 SEX M
1 BIRT
2 DATE Jan-Mar 1881
2 PLAC , Southampton Registration District, , Hampshire, England
1 DEAT Y
1 _UID CBB9F1072F094A5ABEA70FF850C4C95F00E5
1 CHAN
2 DATE 23 Aug 2013
3 TIME 11:25

Sorry, but it is not a TMG specific issue, but rather a generic one about the way users choose to enter data into their GDM.
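
For readers who want to check what their own program writes, the raw PLAC values can be pulled out of a GEDCOM file with a few lines of Python. This is a deliberately naive sketch with a hypothetical function name: it looks only at the tag in the second position of each line and ignores continuation lines, character encodings and everything else a full GEDCOM reader would handle.

```python
def place_values(gedcom_text):
    """Yield the raw value of every PLAC line, whatever its level number."""
    for line in gedcom_text.splitlines():
        # A GEDCOM line is "<level> <tag> <value>"; split off the first two.
        fields = line.strip().split(" ", 2)
        if len(fields) >= 2 and fields[1] == "PLAC":
            yield fields[2] if len(fields) == 3 else ""
```

Printing each value in quotes makes leading commas and blank elements, like those in the two samples above, easy to spot.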
Coordinator
Aug 23, 2013 at 3:24 PM
Oh absolutely! I'd just not considered it before. I wasn't aware that FTM could output locations with spaced addresses unless you actually type them in the location field you set up. Can you advise how this works?

My preference given the discussion is an option that users can turn on to detect/ignore extra commas/whitespace. I won't use the term whitespace though as that is probably too technical. Taking EmmArrBee's supplied example data A, B, C files, I'll aim to have options that understand and cope with all 3 formats.
Editor
Aug 23, 2013 at 6:49 PM
Edited Aug 23, 2013 at 6:49 PM
Can you advise how this works?
In both programs typed as seen..... including the typo for Hampshire into FTM ;-)

See also http://ftanalyzer.codeplex.com/workitem/12070
Coordinator
Aug 23, 2013 at 9:31 PM
Ah so in FTM it's a case of just typing extra commas giving that effect, whereas in TMG you have specific fields and a selectable option on export as to how to write it to GEDCOM?
Editor
Aug 23, 2013 at 10:25 PM
Levva wrote:
Trebor, your understanding isn't flawed but it only looks at one aspect of the report's existence.

The primary purpose is not to just list your locations; that is trivial and something that your family tree program should do. The primary purpose is to ANALYZE the data and show where you have errors.

I'm open to ideas but the primary purpose has to be to analyse the data and highlight inconsistencies, not just to list things.
I appreciate and accept your comments but must emphasise that in previous posts I have regularly mentioned the need I have of locating errors / inconsistencies in my records (there are plenty of them) and have never indicated a need to just list the records. I mentioned the option of a single tab because it could exist and would be an acceptable option for some users - I did not state and do not think that it would be an ideal or even acceptable option.

If I am trying to find errors or inconsistencies in (for example) my regions tab I do not want to find it cluttered with entries that only include a "country" - I can resolve those problems in the Country tab.

However if my interpretation of how the program could work is in conflict with yours then I am happy to accept what I get (after all it is your program and you have done all of the work and I do appreciate what I get - which is an abundance) - you have given me the option to sort by column so that does (in a fashion) give me what I ask for. If you were to try to please every user I am sure that it would become an impossible task.
Editor
Aug 23, 2013 at 10:36 PM
Just as a possible affirmation of my interpretation EmmArrBee wrote on Aug 17
I've been struggling to get my head round the sub-tabs in the Locations tab. I think that there is a huge amount of duplicated data, but I may be wrong. I think the tab which is labelled "Countries" is the final CSV element from the address of any event with a GEDCOM "PLAC" string. If an address happens to have only one line (probably because the address is incomplete) or is truly only known to a country-level, it will appear only on this tab. If an event has an address with two or more CSV items, that address will appear on both the Countries and Regions tabs. Likewise events with three or more CSV items appear on the Countries, Regions & Parishes tabs, but not those with only one or two lines.
This sounds very similar to some of my comments but possibly better explained.
Editor
Aug 23, 2013 at 10:45 PM
EmmArrBee wrote:
If I'm understanding your option 2 correctly, if I have an entry with a typo Negland (sic England) and just a Place (with missing places respected), or a full five line address, it would not be revealed until I review the fifth Places tab. Surely most users would want to work through to correct all the Countries and might do the Regions (Counties/States etc) and perhaps some might clean up their SubRegions, but I doubt the majority would consider tackling Address and Place. --- Just my perspective.
Then perhaps option 3 would enable this scenario to be found much more easily - the error would then appear on the Countries tab and also at each lower level where data exists. There are possible arguments for each option but your comment holds a strong point which I cannot deny.
Editor
Aug 23, 2013 at 10:49 PM
My apologies that my last few posts are out of time sequence.
Coordinator
Aug 23, 2013 at 11:31 PM
Interestingly your "Option 3" is exactly how the locations worked up until very recently, when someone pointed out that it was extremely confusing to have some locations "disappear" if no data appears at that level. On reflection I thought this sounded very valid, as it also prevented you from finding people with only a country or only a region etc in their location. As I understand option 3 you are looking to reverse this change?

I'm reluctant to do that as that then prevents you finding people with missing regions, sub-regions etc.

As an example at each tab we now have say England - on the countries tab double clicking on this shows everyone living in England.
On the Regions tab you have say England, blank and England, Hampshire. Here you get the behaviour that double clicking on England with a blank region gives you everyone in England who HAS NO REGION. ie: they only have a country. Double clicking on England, Hampshire gives you everyone who has England, Hampshire in a fact.

So having a Country repeated on the region tab with a blank gives you completely different data. It gives you people who have JUST a country and NO region, isn't this a useful feature? You can then see who is missing region data.

So if you appreciate that the country with a blank region actually gives you EXTRA information you can perhaps see why I'm reluctant to revert the change? It isn't cluttering up the report it gives you the option to see extra lists of people that you can't otherwise get.
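
The distinction between the two double-click results can be sketched in Python as follows. The data layout (a list of person and location-parts pairs, broadest level first) and the function names are assumptions made to illustrate the behaviour described, not FT Analyzer's own code.

```python
def people_in_country(facts, country):
    """Countries tab: everyone with a fact in the given country."""
    return sorted({person for person, parts in facts if parts and parts[0] == country})


def people_in_region(facts, country, region=""):
    """Regions tab: everyone whose fact matches the country AND the region.

    An empty region selects only people who have the country recorded
    but no region at all.
    """
    found = set()
    for person, parts in facts:
        if not parts or parts[0] != country:
            continue
        recorded_region = parts[1] if len(parts) > 1 else ""
        if recorded_region == region:
            found.add(person)
    return sorted(found)
```

The row with the blank region therefore answers a different question from the country row: it lists the people whose region data is missing.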
Editor
Aug 24, 2013 at 8:54 AM
Levva wrote:
Ah so in FTM it's a case of just typing extra commas giving that effect, whereas in TMG you have specific fields and a selectable option on export as to how to write it to GEDCOM?
Correct on both counts
Editor
Aug 26, 2013 at 8:12 AM
My apologies for the delay in replying to your response.
I had not seen the option 3 method in use in FTA - perhaps this was before I started using the program.
I am not suggesting that option 3 should be used, only that it appeared to fulfil EmmArrBee's comment. Personally I prefer option 2, which is how it appeared to work prior to the recent change.

I can now see the reasoning behind your method - it lists all of the entries with missing data whereas my options listed only entries which include specific data.

It appears that the majority of users are happy with the method you have selected as there are no other requests for change so I will happily accept this. I do not expect change just because I would prefer something different.

Thanks for allowing me to voice my opinion and for discussing it with me. And thanks again for the program.
Coordinator
Aug 26, 2013 at 11:17 AM
Thanks for the input trebor. It has clarified some of the reasons for user confusion. I do think there is merit in doing some re-organisation of the locations tab to make it clearer to see why things are in the places they are.

One of the suggestions was to rename the columns Level 1-5 or something so as to be more neutral and thus avoid confusion that there are parishes in the country field. Also a good suggestion is to include the original text the user entered.
Editor
Aug 31, 2013 at 10:19 PM
Having "seen the light" I have been able to create documentation for this tab - when you are able to, please check it out in case I have any errors or omissions.

I do agree that with the many various options available for the creation of a location there could be merit in changing the description of the headings or, if possible, allowing the user to create his own. I also agree that where FTA adds information derived from its understanding of City/County/Country, this should be identified as not being part of the original GEDCOM.
Editor
Aug 31, 2013 at 10:42 PM
Levva wrote:
Ah so in FTM it's a case of just typing extra commas giving that effect, whereas in TMG you have specific fields and a selectable option on export as to how to write it to GEDCOM?
I guess that this refers to adding a comma to create an empty element.
My program (My Heritage Family Tree Builder) removes duplicate commas and also those separated by a space.
I have not tested it but I assume that it will allow me to put a character between commas which would be meaningless in FTA - is it possible that such a meaningless character be stripped out leaving the intended blank entry?
Coordinator
Sep 1, 2013 at 8:08 AM
You could try a - or a .
Editor
Sep 1, 2013 at 8:35 AM
Many thanks
. works a treat
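
For anyone scripting their own clean-up, the same placeholder idea can be sketched in Python. The set of placeholder characters and the function name are assumptions; the thread reports only that a "." works in practice and does not describe how FT Analyzer treats it internally.

```python
# Characters typed between commas purely to hold a field's position.
PLACEHOLDERS = {"-", "."}


def normalise_elements(place):
    """Split on commas and turn placeholder-only elements into blanks."""
    parts = [p.strip() for p in place.split(",")]
    return ["" if p in PLACEHOLDERS else p for p in parts]
```

This keeps every element in its intended position even when the exporting program strips out empty or space-only fields.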