Which Western European languages are closest to each other?

The major Western European languages fall broadly into two family groups – those descended from Latin, and those descended from Proto-West-Germanic (PWG, a language contemporary with Golden Age Latin but much less attested than it; to the north, Nordic languages are descended from Proto-North-Germanic, better known as Old Norse, which then shares a common ancestor farther back with PWG – of course, ultimately they all share a common ancestor when you go back far enough).

Even this is, of course, a simplification: English is derived from PWG (which gives it most of its core vocabulary – ‘here’, ‘three’, ‘give’) but took on board a lot of vocabulary first from Nordic languages (‘take’, ‘they’, ‘egg’) and then from Norman French (‘animal’, ‘chief’, ‘mayor’) and other Latin-derived languages (‘concert’, ‘cuisine’, ‘chocolate’). Words are shifting between those groups all the time in any case: German has borrowed words from Latin-derived languages (‘tschau’, ‘studieren’, ‘Nation’) and Latin-derived languages such as Italian have borrowed words from German (‘fon’, ‘norte’, ‘guerra’). Like a good song, a good word travels (pizza, anyone?).

Statistics

Sometimes attempts are made to show how closely languages are related. One very common way is to compare “lexicon” (vocabulary): a number of words is taken from a text (or even a dictionary) and then the researcher looks for cognates in another language. For example, Italian costa is very obviously cognate with Spanish costa, fairly obviously with English coast, and perhaps marginally less obviously with French côte (the circumflex gives it away there, as it often marks what was once a following -s). You will then see figures bandied about such as “85% lexical similarity” (a fairly typical score for Western Romance languages). There are problems with this, however: how far back do you go? Ultimately German zehn is cognate with Portuguese dez ‘ten’, but that is not at all obvious either in modern pronunciation or in modern spelling and the link is many millennia ago. Also, what happens if you have a word with a different meaning (for example Italian controllare does not typically mean the same as English “control”, usually translating more as ‘check’; and the French anniversaire typically has a much more specific meaning of ‘birthday’ than English “anniversary”)?
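To make the counting problem concrete, here is a minimal sketch in Python of how such a percentage might be produced. The four English and Italian word pairs and their labels are hand-made for illustration only (a real study would use a far longer list and a linguist's judgements), and the function name lexical_similarity is purely hypothetical.

```python
# Hypothetical hand-made judgements for English vs Italian word pairs.
# The labels are assumptions made for illustration, not research data.
PAIRS = [
    ("coast", "costa", "obvious"),          # similar form, same meaning
    ("control", "controllare", "shifted"),  # similar form, meaning differs
    ("ten", "dieci", "deep"),               # related only via a remote ancestor
    ("give", "dare", "none"),               # not related
]


def lexical_similarity(pairs, counted):
    """Return the percentage of pairs whose label is in the set `counted`."""
    if not pairs:
        raise ValueError("need at least one word pair")
    hits = sum(1 for _, _, label in pairs if label in counted)
    return 100.0 * hits / len(pairs)


# The same word list gives very different figures depending on what is counted.
strict = lexical_similarity(PAIRS, {"obvious"})
loose = lexical_similarity(PAIRS, {"obvious", "shifted", "deep"})
print(f"strict: {strict:.0f}%  loose: {loose:.0f}%")
```

With this toy list the score moves from 25% to 75% purely through the decision about which kinds of cognate to count, which is exactly why a bare “lexical similarity” figure is hard to interpret.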

Another attempt was made through a process known as lexicostatistics, which happened to be the topic of my own dissertation. This lined up a particular corpus of words (say, the most common one hundred, or a particular series of verbs) and then identified how many were cognate, with the figure then used to determine when the languages divided from their common ancestor. My own work demonstrated this to be, well, somewhere between problematic and utterly pointless…
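For readers who have not seen the dating step, the sketch below shows the formula most often quoted for it, t = ln(c) / (2 × ln(r)), where c is the share of cognates in the word list and r is an assumed share of words retained per thousand years. This is not necessarily the calculation used in the dissertation mentioned above; the function name is hypothetical, and the default r of 0.86 is the constant usually cited for a 100-word list, so treat it as an assumption rather than an established fact.

```python
import math


def divergence_millennia(shared, retention=0.86):
    """Estimate millennia since two languages split.

    shared:    fraction of the word list judged cognate (0 < shared <= 1)
    retention: assumed fraction of words kept per millennium (an assumption;
               0.86 is the figure usually quoted for a 100-word list)
    """
    if not 0 < shared <= 1:
        raise ValueError("shared must be in (0, 1]")
    if not 0 < retention < 1:
        raise ValueError("retention must be in (0, 1)")
    # Both languages are assumed to lose words independently, hence the 2.
    return math.log(shared) / (2 * math.log(retention))


# 74% shared cognates works out at roughly one millennium of separation.
print(round(divergence_millennia(0.74), 2))
```

The whole estimate rests on the retention rate being constant across languages and centuries, which is a large part of why the method is open to the criticism above.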

Of course, all of this statistical work also focuses solely on vocabulary – not on phonological or grammatical similarities, for example. Ultimately, I do not think there is a purely objective way of determining how closely related any two languages are (nor indeed if they are even to be regarded as different languages at all – that is, really, a socio-political question).

I can then give only an opinion, and even that will be specifically with regard to the standard languages that are actually taught to outsiders.

West Germanic

From northwest to southeast, the major West Germanic languages are English, Dutch and German; in practice, Afrikaans also needs to be inserted between Dutch and German.

There are, as I have written before, also other considerations here. Luxembourgish now exists with its own standard separate from German, but is spoken by only a few hundred thousand people (and is intelligible, after a bit of tuning in, to German speakers). Swiss German is an umbrella term for the dialects of “German” spoken in Switzerland, Liechtenstein and (arguably) the western Austrian province of Vorarlberg; ultimately, however, its speakers still regard Standard German as their written language (and switch to it happily in the presence of outsiders). Low German, spoken in northern Germany and perhaps parts of the Netherlands (depending on definition) was once a language of considerable prestige in medieval trade and administration, but has largely dropped out of use leaving only a substrate; Scots (and its variant in Northern Ireland and Donegal known as Ulster Scots) has a similar linguistic status to Low German – it was once in a stronger position but retains a considerable historical prestige and modern revival movement. Frisian, spoken in different varieties in the northern Netherlands and small parts of Germany, is regarded as the closest language to English (largely because it too has made /k/ affricate, as in English ‘cheese’ versus Dutch kaas), but it is a discernibly regional rather than national language. Some may also argue that “Flemish” should be regarded as separate from “Dutch”, but most accept it is just a different name for the language as used by a narrow majority of Belgians overall and as the official language across the north of the country. Yiddish is also a language in religious use (though much less so than was once the case), which is a High German variety and thus very close to Standard German (although written in a different script and with markedly different vocabulary in particular areas). 
We should arguably not be discounting English-based creole languages, such as Jamaican Patois or Tok Pisin, which are sometimes in widespread use in Africa, the Far East and the Caribbean.

Leaving those considerations to one side and focusing on just the four major languages, there is little real doubt that:

  • the closest to English is Dutch;
  • the closest to German is also Dutch; but
  • by far the closest to Dutch is Afrikaans (and vice-versa).

My impression is that English is the outlier given its notably simplified grammar and considerable Latin and Norse lexical influence; that said, Afrikaans is also an outlier grammatically (in many ways, grammatically, the closest to Dutch is German).

One quirk here is that I would say Swiss German is slightly closer to Dutch both phonologically and particularly grammatically than Standard German is, despite being geographically farther removed from it (though obviously Swiss German is closer to Standard German than it is to Dutch).

Western Romance

From west to east, the major Western Romance languages (i.e. those derived from the language of “Rome”, thus Latin) are Portuguese, Spanish (aka Castellano), Catalan, French and Italian.

Again here we are omitting Romanian (most obviously), the national language not only of Romania but also of Moldova, because it is isolated to the east (and also, frankly, because I don’t speak it at all…); we are also omitting some consequential regional languages in Spain, most obviously Galician (which is particularly interesting as it shares its origins more directly with Portuguese than with Spanish); we are also including Valencian under Catalan (which would be controversial to some) and we are discounting Occitan spoken across southern France (again interesting because its origins are more closely bound to Catalan than to French); and we are removing from consideration the various dialetti of Italy (translating these as “dialects” does not do them justice; they are really separate languages derived from Latin separately from the version which became Standard Italian, but they have all – broadly to a greater extent in the north than the south and on the mainland than on the islands but to a considerable extent everywhere – been heavily influenced now by it). There is also a national language of Switzerland, Romansh, alongside related regional languages of Italy, Ladin and Friulian, which are long attested and hold some role in education even today, but are not considered here.

With the five remaining languages, I would say personally (but this is very open to debate) that:

  • French is the clear outlier (phonologically, grammatically and lexically), but of the other four Italian is closest to it followed by Catalan (and then actually by Portuguese);
  • Italian is a secondary outlier (definitely grammatically), not least because it is the farthest east, but is still on balance closer to the Iberian languages than to French (except perhaps lexically), although which Iberian language it is closest to is debatable (I err just towards Spanish but there is a good case, particularly lexically, for Catalan);
  • Catalan is on balance just about closer to Spanish than it is to the others, but it is much less clearly the case than you would perhaps expect – French would probably be the nearest challenger but it is also quite close, particularly lexically, to Italian;
  • Spanish, however, is closer to Portuguese than it is to Catalan (and then closer to Italian than it is to French);
  • Portuguese is closest to Spanish as you would expect, but then it is debatable – phonologically there is a case for French being next but, taking grammar and vocabulary into account, it is probably actually Italian followed by Catalan.

The geography does make some sense, but only if you recognise that it was once easier to cross sea than land: so Portuguese does naturally become Spanish which then kind of branches upwards into Catalan and sidewards into Italian, with the latter two then kind of meeting French together.

Esperanto

For the record, Esperanto has a distinct structure and quite a Slavic phonology, but lexically it is roughly two thirds Romance, a quarter Germanic, with most of the rest Slavic; on balance, lexically, it is probably closest to French then Italian (there was no specific attempt to take words into it from the Iberian languages, though common Romance words such as granda ‘big’ do appear). There is also a slight grammatical bias towards Romance generally in that the language contains a grammatically marked future tense, unlike Germanic languages.

However, as noted, this is really quite subjective – others will have legitimate alternative views on all of this!
