About 2500 localities were eventually selected (sampled) and assigned coordinates. This post describes the problems encountered, the sources of the coordinates and some other issues that came up in the preparation of the final “clean” list of places used in the statistical analyses.
2.1 Finding coordinates
Place names were located using a number of web tools, each with its own specificities, starting with the desktop version of Google Earth Pro for Linux and the web-based version of OpenStreetMap (OSM). OSM has the advantage that it displays a list of places with identical names in such a way that it is usually easy to pick the ones that are of interest. Next, there is Wikipedia.de, whose disambiguation pages often provide a way to identify which of several places is the relevant one. Interestingly, two rather generic tools, Stadtplan.de and Mapcarta, often have better information than Google Earth and OSM for micro-localities, hamlets or mills.
When a locality cannot be found immediately, it is usually necessary to go through generic national or international lists of places like the Gemeindeverzeichnis, geonames, Längengrad und Breitengrad, geoportal.de, Bundesamt für Kartographie und Geodäsie, as well as the general internet. A locality near Schleiden was located thanks to a newspaper article reporting the discovery of an uncontrolled rubbish dump! Several were discovered in scholarly historical records of libraries or scanned historical publications.
Then there are, of course, the lists of localities by district (such as this for the Vulkaneifel), church records, national and international databases of mills (this one, or this one) and lists of German names of localities outside Germany (e.g. in Czechia, in Poland or in Bohemia).
The Meyers Orts- und Verkehrs-Lexikon des Deutschen Reichs (5th edition, 1912) deserves a special mention as it is available online as Meyers Gazetteer and shows places on actual historical maps. An example is given on a separate page for Korneshütte. The author has also assembled some lists, for instance for Alsace and Lorraine (downloadable). Other historical place names can be obtained from Dr Radermachers deutsch-österreichisches Ortsbuch (1871-1990) and Neumanns Orts-Lexikon des Deutschen Reichs 1894/A.
In addition, a large number of emails were written to institutions, individuals and municipalities. Many are mentioned in the Acknowledgements.
2.2 Sampling the database
As indicated above, there are 4570 “raw” place names. The original intention was to identify them all, i.e. find their actual and current name and provide them with coordinates. It turned out that many of them can be neither identified nor located.
The range from Aach (47.842400, 8.853800) to Freilingen (there are several localities with that name) includes 1096 localities, of which 114 (i.e. just over 10%) could not be identified. Of the remaining localities, 108 were reassigned to names (from Gdansk to Wülfrath) outside the range. This leaves 3376 localities in the range from Freisen to Zwoll. I randomly selected 1410 localities between Freisen and Zwoll so that the total becomes 2500. Not all of the 1410 localities could be identified. For those that could be located, all the “related localities” were included (provided their coordinates could be found). For instance, we have these two marriages in Sarmersbach. If Hünerbach was randomly selected as part of the Freisen to Zwoll batch, then Kradenbach was added to the selected localities. In the second case, if Gefell was selected, then we already have Beinhausen in the Aach to Freilingen batch. As a result of the sampling, about 11000 spouses out of 120000 have not been taken into account in analyses that involve coordinates.
|Sarmersbach||1871-06-24||Wagner||Carl Josef||Hünerbach||Schlosser||Maria Catharina||Kradenbach|
Eventually, the final list of “clean names” with coordinates that this study is based on contains 2473 entries. The total number of “stripped” names is 3720, each of them assigned to the corresponding “clean name”. The file with the “stripped names” and the “clean names” they have been assigned to can be downloaded from the Project in various formats.
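The sampling rule described in this section (a random draw, followed by the addition of the “related localities”) can be sketched roughly as follows. This is only an illustration, not the procedure actually used: the function name, the representation of marriages as pairs of spouse origins, the `located` argument and the fixed seed are all assumptions made for the example.

```python
import random

def sample_with_related(pool, marriages, k, already_selected=(), located=None, seed=0):
    """Draw k names at random from pool, then add every locality that is
    linked to a drawn name through a marriage record.

    marriages: iterable of (origin_a, origin_b) pairs (assumed layout).
    located:   optional set of names whose coordinates could be found;
               related localities outside this set are skipped.
    """
    rng = random.Random(seed)                 # fixed seed: reproducible draw
    drawn = set(rng.sample(sorted(pool), k))  # the random part of the batch
    selected = set(already_selected) | drawn
    for a, b in marriages:
        for picked, partner in ((a, b), (b, a)):
            if picked in drawn and (located is None or partner in located):
                selected.add(partner)         # the "related locality"
    return selected

# The two Sarmersbach cases mentioned in the text, reduced to origin pairs.
marriages = [("Hünerbach", "Kradenbach"), ("Gefell", "Beinhausen")]
result = sample_with_related(
    pool={"Hünerbach"}, marriages=marriages, k=1,
    already_selected={"Beinhausen"},          # already in the A to Freilingen batch
)
```

With Hünerbach drawn, Kradenbach is pulled in as a related locality, while Beinhausen is present only because it was already part of the first batch.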
2.3 “Raw”, “stripped” and “clean” place names
The term “raw place names” was used before. These are the place names as they appear in the KJT files, which sometimes include a large number of variants for the same locality. The variants may be due to spelling mistakes, but many other factors play a part, some of them related to history, administrative changes etc. There is also the fact that whoever recorded the place of origin of spouses sometimes wanted to add information, especially for locations far from the place of wedding, or in the case of homonyms (see 2.4 below). While this is mostly useful information, it also prevents the automation of the attribution of coordinates.
A prime objective of this Project is to replace all names that stand for the same location with a unique final “clean” name, with no spaces and as homogeneous as possible as far as format is concerned [1]. The first step was to strip the “raw” names of parasitic characters such as spaces, commas, slashes, underscores and hyphens, as this significantly reduces the number of variants.
The table below has some examples. Many more are given in file S2C.
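The stripping step can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the character set is limited to the characters listed in the text (which says “such as”, so the real list may be longer), and the raw variants of Taben-Rodt used here are hypothetical.

```python
import re
from collections import defaultdict

# Characters treated as "parasitic": whitespace, commas, slashes,
# underscores and hyphens (assumed to be the full set for this sketch).
_PARASITIC = re.compile(r"[\s,/_\-]+")

def strip_name(raw):
    """Return the "stripped" form of a raw place name."""
    return _PARASITIC.sub("", raw)

def group_variants(raw_names):
    """Group raw names that collapse to the same stripped name."""
    groups = defaultdict(list)
    for raw in raw_names:
        groups[strip_name(raw)].append(raw)
    return dict(groups)

# Hypothetical raw variants of one locality
variants = ["Taben-Rodt", "Taben Rodt", "Taben/Rodt", "Taben_Rodt"]
groups = group_variants(variants)
```

All four variants collapse to a single stripped name, which is why this step reduces the number of variants before each stripped name is assigned to a “clean” name.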
2.3 Homonymous places, i.e. places with the same name
There are about 200 homonymous localities, i.e. places that share their name with at least one other place. This includes for instance Albach near Siegburg and Albach near Mötsch. They were assigned a conventional name, in this case Albach_1-Siegburg (50.829794, 7.264152) and Albach_2-Mötsch (49.962610, 6.553380). Other examples include Altrich_2-Wittlich (49.958163, 6.909900) vs. Altrich_1-Daun (50.192164, 6.810181), Steinborn_1_Daun (50.210455, 6.789322) vs. Steinborn_2_Seinsfeld (50.068736, 6.628421), and Weiler_1-Ulmen (50.141763, 7.077021), Weiler_2-Vordereifel (50.310477, 7.120245) and Weiler_3-Bingen (49.956332, 7.866523).
The largest numbers of homonyms occur for Auel [2], Born [3], Neunkirchen [4], Rodt [5] and, finally, Bruchhausen [6].
In some cases, the actual locality a person hails from can be identified when genealogical information or other additional data are available, e.g. a birth certificate of Joseph Mackenbach from “Born” identifies the locality as Rodt_1-BEL rather than one of the other options. In most cases, a homonym can be identified from the distance: Homonym_1 is just a couple of km away from the wedding place while Homonym_2 and Homonym_3 are hundreds of km away. A later section has additional details about the way homonyms were handled in the generic case when computing distances and bearings. Karl Josef Tonner was kind enough to attempt to solve the case of …
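The distance argument can be sketched as follows. This is not the procedure used in the Project, only an illustration: the great-circle (haversine) formula on a spherical Earth, the function names and the `max_ratio` threshold are assumptions. Steinborn_1_Daun serves as a stand-in wedding place purely because its coordinates are quoted above.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_homonym(wedding_place, candidates, max_ratio=0.25):
    """Return the name of the closest candidate, or None when the closest
    one is not clearly nearer than the runner-up (hypothetical rule:
    nearest distance must be at most max_ratio times the second nearest)."""
    ranked = sorted(candidates, key=lambda name: haversine_km(wedding_place, candidates[name]))
    if len(ranked) == 1:
        return ranked[0]
    d1 = haversine_km(wedding_place, candidates[ranked[0]])
    d2 = haversine_km(wedding_place, candidates[ranked[1]])
    return ranked[0] if d1 <= max_ratio * d2 else None

wedding = (50.210455, 6.789322)  # Steinborn_1_Daun, used as a stand-in
altrich = {"Altrich_1-Daun": (50.192164, 6.810181),
           "Altrich_2-Wittlich": (49.958163, 6.909900)}
weiler = {"Weiler_1-Ulmen": (50.141763, 7.077021),
          "Weiler_2-Vordereifel": (50.310477, 7.120245),
          "Weiler_3-Bingen": (49.956332, 7.866523)}

clear_case = nearest_homonym(wedding, altrich)     # one Altrich is only a few km away
ambiguous_case = nearest_homonym(wedding, weiler)  # two Weiler are at similar distances
```

In the Altrich case one candidate is only a few km away and is picked; in the Weiler case the two closest candidates lie at comparable distances, so the sketch returns `None` and the decision is left to additional evidence.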
2.4 “Average” homonyms
Several “clean” names include the word “average”. They were created when the homonyms could not be sorted out and the locations under consideration are far from the centre of the Eifel locality cluster. We have, for instance, the wedding in Gerolstein on 1813-10-08 of Léonard Chabante from Forgeas, in the French Département of Haute-Vienne (as specified in the wedding record), with Anna Margaretha Castert from Gerolstein.
According to OSM, there are two places called Forgeas in the Département Haute-Vienne:
Forgeas_1: Municipality of Saint-Bazile, Rochechouart, coordinates: 45.7341744, 0.8002645
Forgeas_2: Municipality of Compreignac, Bellac, coordinates: 45.9638059, 1.2871934
The distance between the two is 46 km, and the distance between Forgeas-average and Gerolstein is 641 km. The heading from Gerolstein is 222 degrees. It is obvious that we could have selected either of the two Forgeas with only a very small change in the distance and the heading.
This is why, when the two homonymous localities are close to each other, or when they are far away from the centre of the Eifel “cluster”, the creation of conventional “average” localities affects the distance and heading statistics only insignificantly.
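The Forgeas figures can be checked with a short calculation. The sketch below rests on several assumptions that are not taken from the Project: the “average” locality is computed as the plain arithmetic mean of the two coordinate pairs, distances and headings use the spherical haversine and initial-bearing formulas, and the coordinates of Gerolstein are approximate values supplied for the example.

```python
from math import radians, degrees, sin, cos, asin, atan2, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def initial_bearing_deg(p, q):
    """Initial compass bearing (0-360 degrees) when travelling from p to q."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    dlon = lon2 - lon1
    y = sin(dlon) * cos(lat2)
    x = cos(lat1) * sin(lat2) - sin(lat1) * cos(lat2) * cos(dlon)
    return (degrees(atan2(y, x)) + 360) % 360

def average_point(points):
    """Arithmetic mean of latitudes and longitudes (assumed definition of
    an "average" locality; adequate for points a few tens of km apart)."""
    lats, lons = zip(*points)
    return (sum(lats) / len(lats), sum(lons) / len(lons))

forgeas_1 = (45.7341744, 0.8002645)
forgeas_2 = (45.9638059, 1.2871934)
gerolstein = (50.224, 6.661)  # approximate coordinates, an assumption

forgeas_avg = average_point([forgeas_1, forgeas_2])
d_between = haversine_km(forgeas_1, forgeas_2)
d_avg = haversine_km(gerolstein, forgeas_avg)
heading = initial_bearing_deg(gerolstein, forgeas_avg)
# Effect of picking one of the two real places instead of the average
d_1 = haversine_km(gerolstein, forgeas_1)
d_2 = haversine_km(gerolstein, forgeas_2)
```

Under these assumptions the results should come out close to the 46 km, 641 km and 222 degrees quoted above, with small differences expected because the Gerolstein coordinates are approximate. Comparing `d_1` and `d_2` with `d_avg` shows that choosing either Forgeas shifts the distance by only a few percent.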
17 artificial localities were created. In addition to Forgeas-average, they include Baar_2-average, Höhscheid-average, Klausen-average (Austria), Lutzerather-mühle-average, Niersbacher-Mühle-average, Pölerter-Mühle-average, Rußhütte-average, Schwarzbach-average, Schwetz-average, Sevenich-average, Steinheim-average, Streithagen-average, Unterbruch-average, Wey-average, Wickrath-average and Wöhlsdorf-average.
Notes
1. Spaces are replaced by hyphens or underscores, “Umlaut” (the typical German letters like ä, ü, Ö) are used systematically, and Sankt is spelled out instead of using St. and other variants.
2. 6 localities: Auel_1-Sankt-Goar, Auel_2-Blankenberg, Auel_3-Gerolstein, Auel_4-BEL, Auel_5-FRA and Auel_6-Ohlerath
3. 6 localities: Born_1-BEL, Born_2-Eggenscheid, Born_3-Ratingen, Born_4-Radevormwald, Born_5-Brüggen and Born_6-Steinheim
4. 6 localities: Neunkirchen_1-Westerwald, Neunkirchen_2-Saar, Neunkirchen_3-Wittlich, Neunkirchen_4-Seelscheid, Neunkirchen_5-Unterfranken and Neunkirchen_6-Daun
5. 7 localities: Rodt_1-BEL, Rodt_2-Marienheide, Rodt_3-Gummersbach, Rodt_4-Schweich, Rodt_5-Dresel, Rodt_6-Lennestadt and Rodt_7-Taben-Rodt
6. 11 localities: Bruchhausen_01-Remagen, Bruchhausen_02-Arnsberg, Bruchhausen_03-Olsberg, Bruchhausen_04-Pohlhausen, Bruchhausen_05-Ottbergen, Bruchhausen_06-Much, Bruchhausen_07-Sundern, Bruchhausen_08-Kelberg, Bruchhausen_09-Schnörringen, Bruchhausen_10-Unkel and Bruchhausen_11-Erkrath