How to learn languages – story so far

We have established so far that all major national languages in Western Europe are derived from Indo-European, a language which was itself of extraordinary complexity by modern standards. Its phonology was marked by aspiration, strong and various <h> sounds, and probably tonal distinction – making it in many ways quite unlike even its daughter languages such as Classical Latin and Old Norse. Grammatically it was also quite distinct, exhibiting distinction by case, use of postpositions as often as prepositions, distinction primarily by aspect rather than tense, and a wide range of declensions and conjugations. Nevertheless, core vocabulary and basic aspects of grammar are already in some ways familiar.

We took a look at the oldest script in any Germanic language, the 4th century Bible translation into Gothic, to see how Germanic had developed in the centuries after Christ; and notably we also looked at Vulgar or Late Latin, which itself already demonstrated half the changes from Classical Latin to modern Latinate languages such as Spanish, Portuguese, French and Italian. These languages are more markedly modern phonologically, as they have generally lost tonal distinctions and the range of <h> sounds. They are also grammatically a little closer, distinguishing more definitively by tense rather than aspect and beginning to shift decisively towards using prepositions (rather than postpositions or case). However, they remain strange; in spoken form they would be utterly unrecognisable, and even in written form they look familiar but are still distant.

We also saw, through Middle English, how modern written standards are often based on Medieval pronunciation (we will see how remarkably often this is the case as we go on). Here, as one correspondent noted, we also see how inadequate the so-called “Latin” alphabet really is to represent the complexity and combination of sounds actually used in modern speech. This is so complex that even the invented language Esperanto, with 28 letters, failed to deliver on its own avowed objective of one sound to one letter. We have also seen how social disruption (such as the Black Death) or technological disruption (such as the invention of the Printing Press) can have dramatic effects on language change – either encouraging it or stalling it (although, as one correspondent noted, these effects generally speed up or stall processes already ongoing, rather than causing new ones).

I am always grateful for correspondence on this series – next up, we are moving to the modern day with a look at contemporary Standard Italian.

How to learn languages – Esperanto



Just to test this idea with reference to modern languages, I thought I would start with (supposedly) the most simple widely spoken language in the world – albeit a constructed one.

So, what do we need to know about Esperanto?


Esperanto adheres to the strict rule that each letter has the same pronunciation, regardless of position. It is seriously dubious whether this can strictly be achieved, but nevertheless it does make Esperanto easier to read (and write) than most natural languages.

Esperanto’s rhythm varies depending on the native language of the speaker; some suggest that it should sound something like Italian.

For most learners, Esperanto’s accented letters (the most common of which are usually in fact written <cx>, <gx>, <jx> and <sx> and pronounced respectively as ‘church’, ‘geography’, ‘pleasure’, ‘shop’) are the trickiest to distinguish and use. Also, <c> can catch out most learners, pronounced as if <ts> (in violation of the supposed ‘one letter, one sound’ rule). English speakers also need to note that, from their point of view, <j> is pronounced as if <y>.
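As a rough illustration of the <cx>, <gx>, <jx>, <sx> spelling convention, here is a minimal Python sketch that turns those digraphs back into accented letters. The function name is made up for this example, and the entries for <hx> and <ux> are an addition from the wider "x-system" convention rather than something shown in this post.

```python
# Map the ASCII "x-system" digraphs to the accented Esperanto letters.
# "hx" and "ux" are assumptions taken from the general convention;
# this post itself only shows cx, gx, jx and sx.
X_SYSTEM = {
    "cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ",
}

def from_x_system(text: str) -> str:
    """Replace x-system digraphs with accented letters, keeping capitals."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        letter = X_SYSTEM.get(pair.lower())
        if letter is not None:
            # Capitalise the accented letter if the base letter was a capital.
            out.append(letter.upper() if pair[0].isupper() else letter)
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(from_x_system("Cxu vi vidis tion?"))
print(from_x_system("li/sxi/gxi"))
```

Because <x> is not otherwise a letter of Esperanto, a simple left-to-right scan like this should not hit false matches in ordinary Esperanto text.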


The language has a Standard form based on the work of its founder, L. L. Zamenhof: it was first set out in his Unua Libro, published in 1887, and later fixed in the work known as the Fundamento (1905).

An Academy in effect protects this Standard and applies it to new words (and technology) as required. In practice, some grammatical variation within the ‘Standard’ is permitted.

There is a tendency in Esperanto to reinforce positive responses to “yes/no questions”:

  • Cxu vi vidis tion? – Jes, gxuste!
  • ‘Did you see that?’ – ‘Yes, exactly!’


Esperanto’s vocabulary is mainly Romance (usually directly from Latin, e.g. pluvi ‘to rain’, vidi ‘to see’; or French, e.g. grava ‘important’, preskau ‘almost’; but occasionally also from other languages such as Spanish almenau ‘at least’, Italian ankau ‘also’ or just general granda ‘big’), with a significant minority from Germanic (from English, e.g. jes ‘yes’, birdo ‘bird’; or German, e.g. tago ‘day’, lau ‘according to’) and some also from Slavic (po ‘at a rate of’, prava ‘true, right’). There is even the odd extra (e.g. kaj ‘and’ from Ancient Greek).

Key numbers:

  • 1 unu; 2 du; 3 tri; 4 kvar; 5 kvin; 6 ses; 7 sep; 8 ok; 9 nau; 10 dek;
  • 11 dek unu; 12 dek du; 16 dek ses; 17 dek sep; 20 dudek; 21 dudek unu;
  • 100 cent; 1000 mil; 456789 kvarcent kvindek ses mil sepcent okdek nau.
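The number system is regular enough to write down as a few lines of code. The following is only a sketch in Python: the function names are invented for this example, it covers 1 to 999999 only, and it keeps this post's spelling nau for 9.

```python
# Units as listed above (index 0 is unused).
UNITS = ["", "unu", "du", "tri", "kvar", "kvin", "ses", "sep", "ok", "nau"]

def below_thousand(n: int) -> list[str]:
    """Words for 1..999: hundreds and tens are unit + 'cent' / 'dek'."""
    words = []
    hundreds, rest = divmod(n, 100)
    tens, units = divmod(rest, 10)
    if hundreds:
        # 100 is plain 'cent', 400 is 'kvarcent'.
        words.append(("" if hundreds == 1 else UNITS[hundreds]) + "cent")
    if tens:
        # 10 is plain 'dek', 20 is 'dudek'.
        words.append(("" if tens == 1 else UNITS[tens]) + "dek")
    if units:
        words.append(UNITS[units])
    return words

def esperanto_number(n: int) -> str:
    """Spell out n in Esperanto (this sketch handles 1..999999 only)."""
    if not 1 <= n <= 999_999:
        raise ValueError("this sketch covers 1 to 999999 only")
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands:
        if thousands > 1:
            words.extend(below_thousand(thousands))
        words.append("mil")
    words.extend(below_thousand(rest))
    return " ".join(words)

print(esperanto_number(21))
print(esperanto_number(456789))
```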

Esperanto has an innovative (but at first sight unfamiliar) list of ‘correlatives’ which serve most pronoun uses; it also has personal pronouns in a specific class of their own.

Key personal pronouns in Esperanto:

  • singular mi, vi, li/sxi/gxi; plural ni, vi, ili; indefinite oni

This indefinite is widely used to avoid the passive:

  • Oni diras, ke sxi estos tie – ‘It is said that she will be there’

Vocabulary is often built up through a series of meaningful affixes – for example arbo ‘tree’ plus -ar- ‘collection’ gives arbaro ‘forest’.


Nouns are marked by the ending -o; this is amended to -oj for the plural. They can also be in the “accusative” case (when used as objects or to mark motion towards), marked -n.

Verbs are marked for one of three tenses or two moods but not both (“conditional” is generally regarded as a mood rather than a tense in Esperanto, although it does not matter). All verbs in the present are marked -as, past -is, future -os, conditional -us and subjunctive -u. Unlike modern Romance and Germanic languages, tense is relative (i.e. if referring to a future event in the past, use the future).
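Since every verb takes the same five endings, the whole system fits in a small lookup table. A minimal Python sketch follows, with the labels taken from the paragraph above and a hypothetical function name; it assumes the dictionary form ends in -i, as in vidi and pluvi.

```python
# The five finite endings described above.
ENDINGS = {
    "present": "as",
    "past": "is",
    "future": "os",
    "conditional": "us",
    "subjunctive": "u",
}

def conjugate(infinitive: str, form: str) -> str:
    """Swap the infinitive ending -i for the ending of the requested form."""
    if not infinitive.endswith("i"):
        raise ValueError("expected an infinitive ending in -i")
    return infinitive[:-1] + ENDINGS[form]

for form in ENDINGS:
    print(form, conjugate("vidi", form))
```

There is no agreement for person or number, so the same form serves for mi, vi, li and the rest.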

Esperanto also allows zero subject in certain circumstances (where English typically requires a “dummy subject” such as ‘it’ or ‘there’):

  • pluvas multe ‘it is raining a lot’
  • estas tri arboj tie ‘there are three trees there’

Adjectives are marked by the ending -a and agree with their noun, typically appearing after it, although this is stylistic (arbaro granda ‘big forest’; en arbarojn grandajn ‘into the big forests’). However, words which must appear before the noun, notably the article la ‘the’ and numbers, do not agree (en la tri arbarojn grandajn ‘into the three big forests’). Adverbs are marked by the ending -e; notably, they tend to be used with the verb ‘to be’ (similarly to Slavic languages, but not Romance or Germanic): Estas klare ke mi vidis arbaron grandan ‘It is clear that I saw a big forest’.
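The agreement rule for nouns and adjectives can be sketched in a few lines of Python. The function names and the `before` parameter are made up for this illustration; the sketch simply places the adjective after the noun, as in the examples above, and leaves any words passed in `before` (the article, numbers) unchanged.

```python
def decline(word: str, plural: bool = False, accusative: bool = False) -> str:
    """Add -j for the plural and -n for the accusative, in that order."""
    return word + ("j" if plural else "") + ("n" if accusative else "")

def noun_phrase(noun: str, adjective: str, plural: bool = False,
                accusative: bool = False, before: tuple[str, ...] = ()) -> str:
    """Build 'before-words + noun + adjective' with noun/adjective agreement.

    Words in `before` (e.g. the article la, numbers) are left unmarked.
    """
    words = list(before)
    words.append(decline(noun, plural, accusative))
    words.append(decline(adjective, plural, accusative))
    return " ".join(words)

print(noun_phrase("arbaro", "granda"))
print(noun_phrase("arbaro", "granda", plural=True, accusative=True,
                  before=("la", "tri")))
```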

In modern Esperanto, adjectives and adverbs can be turned directly into verbs in preference to using the “copula” (esti ‘to be’):

  • Estas grave ke vi ne vidis tion / Gravas ke vi ne vidis tion ‘It matters that you did not see that’
  • Vi laudire estas prava / Vi laudire pravas ‘Apparently you are right’

Exactly when this is deemed “allowable” varies according to usage and style.

The only article is la. The article may be omitted, and must be if it has an indefinite meaning (similar to English ‘a/an’).

Prepositions have very strict meanings which (in theory at least) must not be breached. There is a spare preposition je for when the meaning is unclear.

Key prepositions: case prepositions are de, al, kun; other prepositions include por, en.

Note the accusative is used with motion towards, except with case prepositions:

  • Mi estas en la arbaro ‘I am in the forest’
  • Mi iras al la arbaro ‘I go to the forest’
  • Mi iras en la arbaron ‘I go into the forest’

In modern usage, je is often abandoned and prepositions are increasingly used in line with English:

  • je 1887 / en 1887 ‘in 1887’
  • je la mila fojo / por la mila fojo ‘for the thousandth time’

Word order is generally SVO; but the passive is generally avoided, which can give different word orders (Mi vidis arbaron grandan ‘I saw a big forest’; Arbaron grandan mi vidis ‘A big forest was seen by me’).


Esperanto is deceptively Romance-looking. In fact, its phonology and some of its characteristics (notably the question particle cxu required for “yes/no questions”) are markedly Slavic, a product of its geographical origin.

Adverbs and word-building are a key feature of the language, particularly when combined: mia ‘my’ + opinio ‘opinion’ = miaopinie ‘in my opinion’; plena ‘full, complete’ + Esperanto = plenesperante ‘completely in Esperanto’; kontrau ‘against, opposing’ + flanko ‘side’ = kontrauflanke ‘on the other side’.

What next?

Let us now move on to the natural modern national languages (at last!)

We will go through the Romance ones to start with, starting with Italian (for reasons to be discussed).

Patro nia, kiu estas en la cxielo, Via nomo estu sanktigita. Venu Via regno, plenumigxu Via volo, kiel en la cxielo, tiel ankau sur la tero. Nian panon cxiutagan donu al ni hodiau. Kaj pardonu al ni niajn sxuldojn, kiel ankau ni pardonas al niaj sxuldantoj. Kaj ne konduku nin en tenton, sed liberigu nin de la malbono.



How to learn languages – Middle English

Let us just stop on the way through linguistic history to take a quick glance at Middle English.

It seems astonishing now, but English just before the Black Death in the mid-14th century was a colloquial language of low status. The administrative and high language of England was Norman French (and the ecclesiastical language was Medieval Latin, based on Classical). Furthermore, English was spoken only in England and parts of Wales; the language descending from Anglo-Saxon in use in Scotland was recognised as a separate language, Scots.

The Black Death changed that somewhat, as it was indiscriminate, killing the French-speaking aristocracy in big numbers. As survivors rose up the social scale consequently, so did English; the King’s Speech was presented to Parliament in English for the first time in 1362. Soon, English also had Chaucer, a major literary figure. This, all combined with ongoing wars with France, saw English become the language of late medieval English nationalism. The rest of the rise from there to global status is history.


So what was the language of Chaucer like?


There was no standard English at the time, but of course the bizarre linguistic truth is that modern Standard English spelling reflects it well, being based on the pronunciation of Middle English, not Modern English. This means a word like name ‘name’ was pronounced exactly as it looks (and as it still is in modern German); <e> was never silent. Words such as night were just losing the middle consonant sound (close to IPA /x/, cf. modern German Nacht) at the time of Chaucer. In words such as write, knife or gnat, the initial /w/, /k/ and /g/ were sounded; as was the /l/ in talk.

Anglo-Saxon treated /f/ and /v/ as variants of the same sound, written with a single letter (the distinction was only brought in by the influence of Norman French), and these were still variously pronounced around the country and thus used in writing almost interchangeably in some areas. Scribes also used <v> and <u> interchangeably, treating them as absolutely the same letter.

Early Middle English also retained the letter “yogh” <ȝ>, which is usually (but not always) now /g/; it was pronounced somewhere between /g/ and /y/ before /e/ and /i/ (and similar vowels), but more like a hard /x/ (as in Scottish ‘loch’) otherwise. It also retained “thorn” <þ>, now a <th>.


In the Middle English period, variations in spelling and usage were widespread, depending on geographical origin, exact time, and even on simply fitting on to the page or the line. People even wrote their own name variously! This mattered less, as proportionately fewer people were literate.

A “chancery standard”, forms to be used by the Civil Service in effect, did develop from the fifteenth century, but widespread standardisation only occurred well after the invention of the printing press into what is regarded as the (Early) Modern English period.


Vocabulary was similarly mixed between Latinate (French and Latin) and Germanic origin as now, although there was a greater awareness of the distinction (the oldest known song in the English language, Sumer is icumen in ‘Summer has arrived’ dates from early Middle English, but its vocabulary is entirely Germanic).

Key numbers:

  • 1 one, 2 tuo/twei, 3 thri, 4 fower, 5 five, 6 six, 7 sevene, 8 eight, 9 nine, 10 ten;
  • 11 eleven, 12 twelve, 16 sixteen, 17 seventeen; 20 twenty, 24 fower-and-twenty;
  • 100 hundred; 1000 thowsand;
  • 456789 fower hundred six-and-fifty thowsand sevene hundred nine-and-eighty

NB: one was pronounced to rhyme with alone.

Ordinal numbers generally added -the, but from thri this was thridde.


Nouns had largely lost the Anglo-Saxon “case” system, although the possessive remained, written -(e)s (no apostrophe). More irregular plurals remained in regular use (e.g. namen ‘names’).

Verbs “agreed” with their subject and had a wider range of endings with which to do so. There were more “strong” verbs (marking past forms by vowel change rather than an ending) than in Modern English; thus irregular help-halp-(i)holpen stood alongside sing-sang-(i)sungen. The i- or y- prefix (cf. German ge-) on participles was generally lost during this period.

Present of liken ‘to like’ (1st, 2nd, 3rd person):

  • Singular I like, thou likest, he/sche/hit liketh;
  • plural we liken, you liken, they liken.

Past participle liked; present participle likand; gerund liking.

Imperative like (singular); liketh (plural).

Past of liken ‘to like’ (1st, 2nd, 3rd person):

  • Singular I likede, thou likedst, he/sche/hit likede;
  • plural we likeden, you likeden, they likeden.
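The two tables for liken can be generated from the stem plus a set of endings. This is only a sketch in Python, generalised from the single verb shown here; the function name is invented, and given the spelling variation of the period it should not be read as a rule that every weak verb followed in writing.

```python
# Endings read off the present and past tables for liken above.
PRESENT = {"I": "e", "thou": "est", "he/sche/hit": "eth",
           "we": "en", "you": "en", "they": "en"}
PAST = {"I": "ede", "thou": "edst", "he/sche/hit": "ede",
        "we": "eden", "you": "eden", "they": "eden"}

def weak_paradigm(infinitive: str, tense: str) -> dict[str, str]:
    """Return subject -> verb form for a weak verb given as an -en infinitive."""
    if not infinitive.endswith("en"):
        raise ValueError("expected an infinitive ending in -en")
    if tense not in ("present", "past"):
        raise ValueError("tense must be 'present' or 'past'")
    stem = infinitive[:-2]
    endings = PRESENT if tense == "present" else PAST
    return {subject: stem + ending for subject, ending in endings.items()}

for subject, form in weak_paradigm("liken", "past").items():
    print(subject, form)
```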

Past of singen ‘to sing’ (1st, 2nd, 3rd person):

  • Singular I sang, thou songe, he/sche/hit sang;
  • plural we songen, you songen, they songen.

Adjectives only “agreed” with nouns by adding -e after the definite article, a possessive or in the plural (but not otherwise): his longe name ‘his long name’, longe namen ‘long names’; a long name ‘a long name’. Adverbs were beginning to be distinguished (usually by the ending -liche, often reduced to -lie).

Pronouns maintained a distinction between the singular þu (later thou; object þe/thee) and plural ye (object you). Singular possessive forms came to be distinguished between mine/thine (the original forms, used latterly only before vowels) and my/thy (used before consonants) – cf. usage even in Modern English of the indefinite article an/a.

Key personal pronouns (1st, 2nd, 3rd person):

  • Direct: I, þu/thou, he-sche-hit; we, ye, heo/they
  • Oblique: me, þe/thee, him-hir-hit; us, you, hem/them
  • Possessive: mine (my), þin/thine (thy), his-hir-his; oure, your, hire

Prepositions were similar to today, but there were also combined forms with locational pronouns in much wider use than in Modern English: hence ‘from here’; whither ‘where to’. One noteworthy preposition since lost was umbe ‘around’ (cf. modern German um).

Word order was predominantly SVO, and VSO in questions (Likest thou me? ‘Do you like me?’), although there were notable exceptions (the main part of the verb phrase often went to the end in subordinate clauses: whan he hath hire name sungen ‘when he has sung their name’). Negation was predominantly by addition of the particle nat (or similar) after the verb: he singeth nat ‘he does not sing’. This could be supported by a pre-verbal particle ne (effectively meaning doubled negation reinforced the negative): he ne singeth nouȝt ‘he sings nothing’.


Middle English was more quintessentially Germanic in character than the modern language, but much less so than Anglo-Saxon.

Dialects varied but, unlike modern “BBC English”, Middle English was almost certainly spoken with a rising intonation; and it would have been more vocalic than the modern language (notably because /e/ was always pronounced).

What next?

Let us get to the modern day now… but with a twist…

Fader oure that art in heuene, halewed be þi name: come þi kyngdom: fulfild be þi wil in heuene as in erþe: oure ech day bred ȝef us to day, and forȝeue us oure dettes as we forȝeueþ to oure detoures: and ne led us nouȝ in temptacion, bote deliuere us of euel.

How to learn languages – Gothic

Gothic? I mean, come on…


Gothic is important because it is the earliest attestation of a Germanic language – the family which includes German, Dutch (and Afrikaans), the Scandinavian and Insular Nordic languages, and of course English. It offers the best comparison, therefore, between Germanic of the time that Classical Latin became Late Latin (and thus of the ancestor of languages like German and English at the same time as the ancestor of languages like French and Spanish).

The parallel is, unfortunately, not exact. Gothic was an East Germanic language, and in fact has no surviving daughter languages; nevertheless, it would have been largely mutually intelligible with Anglo-Saxon, Old German dialects and Norse and therefore it shows many of their distinct Germanic features.

It is also useful because it is attested in a Bible translation (which makes understanding far easier). This dates from the fourth century and thus, as noted above, from the time of Constantine (when even written Latin began to display some of the features of Late rather than Classical Latin).

What was Gothic like?


Gothic was, fundamentally, not unlike Vulgar Latin phonologically but with a lot more fricatives (/f/, /v/) rather than plosives (/p/, /b/, etc).

The biggest distinction was that Gothic displayed stress generally on the first syllable of the word; Classical Latin had moved this, typically to the penultimate. Thus, in terms of intonation, the two languages would have sounded significantly different. Another marked difference was that Gothic almost certainly maintained a glottal stop before words beginning with a vowel (partially a consequence of its stress system, perhaps), whereas Latin did not.

Otherwise, it had similar sets of consonants and vowels, and numerous diphthongs (although these differed in some ways). The consonants <b> and <d> had much softer sounds in certain contexts, almost like modern English <th>.

Consonants were devoiced at the end of a word (as is still the case in Modern German), but there was no sign yet of rhotacism (switching from /s/ to /r/, which occurred in all other Germanic languages – cf. English ‘lost’ versus ‘forlorn’).


Gothic had no ‘standard form’ as such, and most of its speakers were illiterate. However, written forms are taken from Wulfilas’ Bible translation of the fourth century (to some degree his writing therefore constitutes a ‘standard’ version in retrospect).


Key numbers:

  • 1 a’ins, 2 twa’i, 3 þrija, 4 fidwor, 5 fimf, 6 sai’hs, 7 sibun, 8 ahta’u, 9 niun, 10 tai’hun;
  • 11 ainlif, 12 twalif, 16 sai’hstai’hun, 17 sibuntai’hun; 20 twa’i tigjus, 60 sai’hs tigjus;
  • 70 sibuntehund; 100 taihuntehund; 200 twa’i hunda, 1000 þusundi;
  • 456789 fidwor hunda sai’hsuhfimftai’hun þusundjos sibun hunda niunuhahta’uhund

Vocabulary was almost entirely Germanic in origin, but this meant not always Indo-European – some linguists suggest as much as a third of Germanic vocabulary is of different origin (it is thought that Germanic tribes were the fastest to move west from the Proto-Indo-European homeland).

Gothic contained the verb þulan ‘to tolerate’, which remains in (Ulster) Scots thole.


Gothic maintained three genders and the Indo-European declension system (where noun endings were different according to groupings determined by the final vowel), which was also retained to an extent even in Late Latin, but interestingly was probably already largely lost by this time in other Germanic languages (which retained merely a “strong” and a “weak” declension). It also retained three numbers (including dual; known in Ancient Greek but not even in early Latin).

Similarly to Latin in all ages, Gothic verbs “agreed” with their subject in person (I, you, he/she/it etc) and number, although there were no distinct 3rd person dual forms. Endings or changes to root vowel could mark one of two voices (active or middle, effectively now passive) or three moods (indicative; optative, effectively now subjunctive; or imperative). Infinitives, present participles or past passive forms could be turned into nouns. Where Gothic verbs were markedly different from Latin was that they could only be marked for two tenses, past and present (or “not past”) – a marked comparative simplification. Gothic verbs were either “strong” (forming their past by way of a vowel change: e.g. bindan ‘to bind’, band ‘bound’) or “weak” (forming their past essentially by adding -d or -t); this division is maintained in all Germanic languages to the modern day, although the number of strong verbs has declined considerably (from probably approaching 1000 in Gothic to under 200 in most Germanic languages and dialects today).

The Gothic verb sōkjan ‘to seek’, in the present active indicative (1st, 2nd, 3rd person):

  • singular sōkja, sōkeis, sōkeiþ; dual sōkjōs, sōkjats; plural sōkjam, sōkeiþ, sōkjand

Adjectives “agreed” with nouns for case, gender and number. The subsequent division between “weak” and “strong” endings was not yet relevant.

As with Latin, Gothic made use of clitics to mark whether a question was being asked – Gothic -u was equivalent to Latin -ne. These were lost in all other Germanic languages.

There is some dispute over Gothic word order, which was relatively free but seemingly essentially still SOV.

Key personal pronouns in Gothic (1st, 2nd person, nominative/accusative/genitive/dative):

  • Singular: ik/mik/meina/mis; thu/thuk/theina/thus. 
  • Dual: wit/ugkis/igkara/ugkis; jut/igqis/igqara/igqis.
  • Plural: weis/uns/unsara/uns; jus/izwis/izwara/izwis.

3rd person also existed with singular and plural in all genders (but no dual).


It is hard to assess the character of the language as almost all we have of it is a religious translation.

Although clearly Germanic (displaying many of the sound shifts which typify it), Gothic is remarkably conservative, probably more so than unattested contemporary Germanic languages to the north and west.

Atta unsar þu in himinam, weihnai namo þein, qimai þiudinassus þeins, wairþai wilja þeins, swe in himina jah ana airþai. Hlaif unsarana þana sinteinan gif uns himma daga, jah aflet uns þatei skulans sijaima, swaswe jah weis afletam þaim skulam unsaraim, jah ni briggais uns in fraistubnjai, ak lausei uns af þamma ubilin, unte þeina ist þiudangardi jah mahts, jah wulþus in aiwins.

How to learn languages – Vulgar Latin

What are referred to as “Romance” languages are all derived from Latin. That much most people know.

There is a tendency, therefore, to compare them (the relevant national Western European languages are Portuguese, Spanish, French and Italian) to Latin when it was at its most prestigious – which, literary academics ancient and modern would generally agree, was the Classical Latin of Cicero and Caesar in the century before Christ.

However, Latin remained a coherent, single spoken language for many centuries afterwards. For hundreds of years even after the fall of Rome, people could travel from modern Portugal to modern Romania and still be understood in their native tongue. However, the Latin language in the centuries after the fall of Rome was as distant from Cicero as Modern English is from Chaucer. Not only was there the time difference during which the language changed, but also even in their own day the formal language of Cicero and Caesar was already markedly different from the colloquial language actually spoken in the streets (leaving aside that the prestige language of government and high culture in contemporary Ancient Rome was not Latin at all, but Greek).

Therefore, it is not hugely helpful to compare modern Romance languages with Classical Latin, when there is a later version of Latin which was still in use many centuries later and which is of more practical use for comparison. Around half the changes which took place between Classical Latin and modern Portuguese, Spanish, French and Italian had already happened before those languages split. Therefore, this later version, referred to as “Late Latin” or “Vulgar Latin” (linguists dispute the exact distinction between these terms), is the one to focus on.


So what was “Late Latin” like?


Phonologically, final post-vocalic <m> (and also often <s>) was already lost in all but the most careful speech in Classical times, and the distinction between long and short vowels was soon lost too. This meant that the distinction between, for example, mensa (subject), mensam (object) and mensā (ablative, ‘by’) was already not generally maintained in the speech of citizens of the Roman Republic.

There was significant “palatalisation” of consonants (in effect, the subtle pronunciation of a sound written in English as <y> after the consonant) in some positions, particularly before high vowels (usually written <i> or <e>). The most notable instances were /k/ (usually written <c>) and /g/; it also affected /t/, giving it a sound more like /ts/ before high vowels (cf. Classical Latin gratiae, modern Italian grazie ‘thanks’). The exact outcome of this palatalisation in different dialects varied (and some insular dialects of Late Latin avoided the change altogether).

The letter <v> moved from Classical /w/ to more modern-sounding /v/; the sound /h/ was dropped altogether.

Stress became more marked than in Classical Latin, which may have been more pitch-based. Along with the distinction between long and short vowels ceasing to be contrastive, numerous unstressed syllables were lost and various consonant clusters simplified. This meant Late Latin had a considerably more vocalic sound than Classical Latin (although still not as markedly as modern Italian).


Most Late Latin speakers remained illiterate, although a sizeable minority could read and write. What they read and wrote, however, was Classical Latin (at least until around the seventh and eighth century). Speakers would have been aware that there was a marked distinction between the way they spoke and the way they wrote, but that the agreed (Classical) written form was essential to understanding in education and the church. From the eighth century on (although the exact time varied from location to location) there was an understanding that Classical Latin was a long way from the spoken language, and that was when the ‘daughter languages’ (French, Spanish, Portuguese, Italian and others) began to develop as recognisably distinct tongues.


Vocabulary remained overwhelmingly from Classical Latin. However, over time, some words were lost as others expanded their meaning. For example, fabulo ‘I tell stories’ came to be expanded to mean simply ‘I speak’, meaning loquor ‘I speak’ was lost; Classical Latin caballus was specifically ‘nag’, but Late Latin caballu meant ‘horse’, meaning equus was lost (or narrowed in meaning to merely ‘mare’).

Key numbers:

  • I unu, II duu, III tres, IV quattor; V cinque; VI ses; VII septe; VIII octu; IX nove; X dece;
  • XI undeci; XVI sedeci; XVII septedeci; XX veinti; XXI veinti unu; C centu; M mil.


In theory, nouns retained their “declension” system (the five groupings of Latin nouns, determined primarily by their stem vowel at the end of the word before the ending). However, because of the aforementioned phonological changes (plus, perhaps, some Germanic influence), distinctions between the five core noun cases of Classical Latin were lost, regardless of declension. Initially these were reduced and then, in some dialects, extinguished altogether; for example (using ‘table’) mensa-mensam-mensae-mensae-mensā became simply mensa-mensa-mense-mense-mensa – thus distinguished only between a “general” case mensa on one hand and a combined “possession/indirect object” mense on the other; similarly (though initially not quite identically) Classical fundus-fundum-fundi-fundo-fundo became just fundu-fundi. Ultimately this was reduced to one in most (though not all) dialects, based usually on the accusative (the singular object form which, in Classical Latin, had generally ended in -m). Plural forms varied along a broad West/East split – typically Western dialects adopted the accusative (object) plural form for all cases (mensas, fundos); Eastern dialects effectively maintained the nominative (subject) plural form for all cases (mense, fundi); and there were some exceptions (some northern dialects maintained -s endings in the singular for masculine nouns; some eastern dialects maintained a separate genitive/possessive plural form).

Verbs remained marked primarily for tense; also for voice and mood:

  • the present tense remained a single tense marked almost exactly as in Classical Latin;
  • the past tense retained a distinction between “imperfect” and “perfect” action (repeated action or single action), but endings were shortened;
  • the pluperfect tense (past in the past) was generally lost and came to be expressed in other ways (usually using the past “perfect”, see next point);
  • the present “perfect”, consisting of the verb abere ‘to have’ or essere ‘to be’ followed by a participle form (e.g. cantatu ‘sung’, amatu ‘loved’) originally marked a past action affecting the present, but came to be used in general in some dialects to refer to a single action in the past (and, with the past form of abere or essere, it took over entirely as the pluperfect);
  • the future tense was retained but the Classical form was replaced entirely by a form using the “infinitive” form of the verb (ending -re; e.g. cantare ‘to sing’) with the verb ‘to have’ (cf. modern Italian cantare ‘to sing’ plus ho ‘I have’ gives canterò ‘I will sing’; Spanish cantar plus he gives cantaré the exact same way);
  • an additional near future tense was formed from ire ‘to go’ with the “infinitive” (vas cantare ‘you are going to sing’);
  • the conditional tense was retained by all dialects in varying forms (usually again involving ‘to have’);
  • the imperative (ordering) and subjunctive (counter-factual) mood were retained, and in all tenses (although the past subjunctive became unstable and was replaced in some cases by old pluperfect forms); and
  • passive verb forms were lost, replaced by a construction with essere and the past participle (es cantatu ‘it is sung’) or even a simple reflexive (se cantat).

Verbs did not require subject pronouns – canto on its own meant ‘I sing’, cantas ‘you sing’, and so on, as in Classical.

Verb endings in present tense (-a- stem; 1st, 2nd and 3rd person):

  • canto, cantas, cantat; cantamus, cantatis, cantant
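The -a- stem present above can be produced mechanically from the infinitive. Here is a small Python sketch, with an invented function name, which assumes an infinitive in -are and says nothing about the other stems; like all the Late Latin forms in this post, its output is reconstructed rather than attested.

```python
# Present endings for -a- stem verbs, in the order given above:
# 1st, 2nd, 3rd person singular, then 1st, 2nd, 3rd person plural.
ENDINGS = ["o", "as", "at", "amus", "atis", "ant"]

def present_a_stem(infinitive: str) -> list[str]:
    """Strip -are from the infinitive and add the six present endings."""
    if not infinitive.endswith("are"):
        raise ValueError("this sketch only handles infinitives in -are")
    root = infinitive[:-3]
    return [root + ending for ending in ENDINGS]

print(", ".join(present_a_stem("cantare")))
```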

Note also “infinitive” cantare; “past participle” cantatu; “present participle” cantante; “gerund” cantandu.

Adjectives continued to agree with nouns in all ways and all cases, tending to be placed after the noun (but this was not compulsory). Contrary to Classical Latin, however, adverbs were formed by the feminine singular form of the adjective plus the word mente ‘of mind’; thus lentu ‘slow, tedious’, feminine lenta, adverb lentamente ‘slowly, tediously’. The irregular adverbs bene ‘well’ and meliore ‘better’ were retained.
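The adverb rule is simple enough to express as a two-step transformation. A minimal Python sketch, assuming a regular adjective cited in the masculine form ending in -u (as lentu is here); the function name is made up, and irregular adverbs are outside its scope.

```python
def adverb_from_adjective(masculine: str) -> str:
    """Form the adverb as feminine singular adjective + 'mente'."""
    if not masculine.endswith("u"):
        raise ValueError("expected a masculine adjective ending in -u")
    feminine = masculine[:-1] + "a"  # lentu -> lenta
    return feminine + "mente"        # lenta -> lentamente

print(adverb_from_adjective("lentu"))
```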

However, the most obvious difference with Classical was perhaps the explosion in prepositions, and the introduction of articles. Because nouns were no longer so clearly marked for case, prepositions were required to establish meanings – so words such as de, ad and cum came into much wider use (although not always as prepositions; with pronouns, for example, cum was often a postposition – tecu(m) ‘with you’ [lit. ‘you with’]). For the same reason, the determiner ille/illa/illu ‘that’ expanded its meaning to appear in front of nouns widely, thus generally translated as the definite article ‘the’ (these were also adopted as third person pronouns in most dialects); and the numeral un(us)/una/unu ‘one’ expanded its meaning to become the indefinite article ‘a/an’.

Word order shifted in Late Latin from the SOV of Classical Latin to SVO, but only where the object was a noun (SOV was retained where the object was a pronoun). This generally remained the case for questions, although VSO was also possible. Negation was formed simply, as in Classical Latin, by way of the particle non.

Classical Latin subordinating conjunction quod became que (eventually pronounced without the /w/) during the Late Latin period.


Late Latin was of Latin-Faliscan origin, but unlike Older Latin was spoken at a time when all other Italic languages had been lost.

Late Latin was markedly more vocalic and verbal than Classical Latin. Many of Classical Latin’s complex constructions around nouns were replaced by clauses centring on verbs.

Late Latin remained a solely spoken language (all the forms given here are reconstructed rather than actually attested). Literate people still wrote and preached Classical Latin, albeit with some influences (e.g. more prepositions than in ancient times). Every speaker would have been aware of the different registers. Historical records suggest it was not until into the eighth century that this became a real problem, with Late Latin speakers only then having genuine difficulty understanding sermons (precipitating a growth by the year 800 of the use of the vernacular even in formal contexts).

As noted above, it was at this stage that the commonality of Latin broke down into local dialects, which were then in subsequent centuries rebuilt into the national languages of modern-day Portugal, Spain, France, Italy and Romania (with official use in neighbouring countries also).

What now?

Let us have a look at where Germanic languages came from on the same basis next week; then on to the modern day!

Patre nostru, qui es in illi caeli, santificetu es tuu nome. Adveniat tuu regnu. Es tua volunta, sic quomo in ille caelu et in illa terra. Nostru pane quotidianu danos hoie, et nos dimitte nostra debita sic quomo nos dimittimus illi debitori nostri. Et non nos induce in illa tentatione, mae nos libere de ille malu.

How to learn languages – Indo-European

So, following on from last Friday’s general introduction, let us start at the beginning.

[Image: Indo-European family tree]

This is the “family tree” of Indo-European languages. It is slightly simplistic, as it does not take account of languages which have been heavily influenced by other languages (not least English!).

This means that over 400 languages, including all national languages in Europe bar Finnish, Estonian and Hungarian, are derived from a single tongue spoken around 5000 years ago, probably in or near modern Ukraine, which we now call “Proto-Indo-European” (PIE). Half the world’s population speak a daughter language of PIE natively. PIE then broke up over the centuries into different dialects as tribes moved geographically and language changed (for a range of reasons from basic language change to coming across new things to describe and, of course, coming into contact with other languages).

So a good start is to have some idea what PIE was like.


Clearly, we do not know precisely what PIE sounded like.

However, we can, through reconstruction, work out that it had a lot of various sounds similar to those typically represented by modern English <h> and <l>. Most of these have been lost, but we can tell they existed from the way words developed subsequently.

We can reliably guess more about consonants than vowels, although we do know the most commonly occurring vowels were /e/ and /o/. Consonants were distinguished not just by “voiced” (e.g. /b/) and “voiceless” (e.g. /p/), but also “aspirated” (as Classical Latin <ph>). There would also have been considerably more of these (i.e. individual consonant sounds) than in most modern languages.

Most noteworthy of all, perhaps, is the clear indication that PIE relied on pitch rather than stress; and that this was applied at the start of words (perhaps with the exception of words with prefixes, which were exempt). This would have given it a markedly different sound from any Western European language now.


Proto-Indo-European speakers had not, of course, developed the technology of writing. Written forms of the language are, therefore, the reconstructions of academic linguists.


Most of our vocabulary originates from PIE (though in fact the proportion is lower for Germanic languages such as English than it is for Romance languages derived from Latin).

Key numbers:

  • 1 hoi-no-; 2 dwo-; 3 trei; 4 kwetwor-; 5 penkwe; 6 sweks; 7 septm; 8 oktou; 9 newn.; 10 dekm.

Note also k’m.tóm ‘a large number, a hundred’

PIE did have nouns, verbs and adjectives (this is not the case for all languages worldwide). However, other classes were less clear – what are now prepositions in most daughter languages were often postpositions or simply affixes, for example.


Nouns in PIE had eight, perhaps nine, cases – marked by endings to distinguish whether they were being used as subject, direct object, indirect object, possessor, recipient and so on. They had three numbers (singular, dual, plural) and three genders (masculine, feminine, neuter), and fell into a number of classifications. Some were further grouped – those ending -r, for example, often marked family relationship (and generally still do).

Verbs were marked, either by changes to the root vowel or by an ending (or both), primarily for aspect (rather than tense, as such) – whether something is relevant to the present or not. There were also complex moods – essentially marking whether something was certain, optional, counter-factual, and so on. Verbs could also be marked directly for mediopassive – the passive (effectively switching the subject and object around) or reflexive (making the subject also the object). They came in four classes – marked by the stem vowels (i.e. those generally appearing before the ending) /a/, /e/, /i/ or none – and were themselves classified by aspect (as being stative, reflecting a state; imperfective, reflecting something ongoing; or perfective, reflecting something complete – thus, where in English it is correct to say both ‘I boil the water’ and ‘The water boils’, PIE would not have allowed the same form for both).

Common (thematic) verb endings (1st, 2nd and 3rd person):

  • Singular: -oh₂, -esi, -eti
  • Plural: -omos, -ete, -onti

Dual also existed, but is not relevant to modern Western European languages.

All adjectives agreed with nouns; it is unclear how much distinction there was between adjectives and adverbs.

Pronouns were markedly different from how we currently understand them. For example, there were first and second person pronouns (‘I’, ‘you’, ‘we’) but not third person (no ‘he’, ‘she’, ‘it’, ‘they’).

Key personal pronouns (in nominative/accusative) were:

  • singular h₁eg’oH/h₁me’, tuH/twe’; plural wei/nsme’, yuH/usme’

Word order was generally SOV, although the range of cases (and other marker particles) would have allowed significant variation for emphasis and there was a shift in some dialects late on to SVO. The key negative particle was ne.


Clearly, it is hard to assess the character of a language spoken thousands of years ago.

We do not know exactly what its own origins were, and whether they were shared with any other language tree (this is keenly debated by linguists, but seems unlikely to me).

We know something about the culture. We can tell from the language that society was clearly patriarchal, for example. Much of this too, however, remains keenly debated.

What now?

Let us move forward then to the earliest “Romance” and “Germanic” languages.


How to learn languages – General

This list of vocabulary items proved popular among a number of correspondents.


So I intend to run a trial series on Fridays on how best to learn other Western European languages – please participate (and correct me where appropriate)!

The idea is to give an absolutely basic grounding, from which you can develop knowledge in the ways I have suggested in the past. Remember, motivation is essential!

Here is one absolute essential: the trick to speaking a language is not to know everything, but to get around what you do not know. That is what this is about!


To speak any language, you will of course need to know how it is pronounced. You need only the basics to start with – most consonants are pronounced the same way in any language, so you will need to know the vowels, probably the diphthongs (two vowels pronounced together), and perhaps some awkward consonant clusters (consonants appearing together).

Over time, it pays to mimic the rhythm and intonation of the target language. To speak Italian like a Cockney or French with an Ulster accent is like trying to learn the words of a song without the tune. You will never get it absolutely perfect, but you want to get to the stage where you are not immediately identifiable as an English speaker (not least because that makes it hard to practise if the other person knows, or thinks they know, English).

As a quick tip: not all sounds are entirely individual. Many are actually closely related to each other, and this can have an impact on how they change from language to language. For example, pairs such as /b/ and /p/, /v/ and /f/, /g/ and /k/ or /z/ and /s/ are in each case voiced and voiceless versions of what is otherwise the same sound; some languages may distinguish them, others may not.


Standardisation is an essential part of this – each modern Western European national language has a written standard. Such standards have developed in different ways – some gradually over time through constant updating, some based on deliberately conservative usage of a particular geographical dialect, some as deliberate mergers of dialects. Exactly how deliberately standards were developed and how widely accepted they are varies from case to case – but knowing something about how a standard developed will always help guide a learner to a general understanding of the interconnection between the spoken and written language. (Of course, some learners may wish to focus specifically on the spoken or specifically on the written language – a decision worth making at the outset.)


Firstly, you will want to have a basic idea where most of the vocabulary comes from. This is often quite easy – most Italian words come from Latin. However, it can be tricky – English is a Germanic language, yet much of its vocabulary is directly or indirectly from Latin. Knowing this means you can take a reasonable guess even at vocabulary you do not know (remember – the trick is to get around what you do not know).

Secondly, you will want a reasonable list of pronouns/determiners (including articles) and prepositions – in English, such words include ‘that’, ‘the’ and ‘to’. Such words do not directly translate from language to language (remember, no vocabulary does!!), but it is absolutely necessary at least to recognise many of them at the outset, and then begin to use them by mimicking the patterns you hear.

Thirdly, you will want the above list. What is it? It is a list of what I have found to be “core vocabulary”; words which are essential to saying things. Remember, again, the key is to “work around” what we do not know – for example, we do not need to know the word ‘often’, provided we can say ‘nearly always’ or ‘sometimes’. The above list is the ultimate “work around”!


Unfortunately, you cannot get anywhere without grammar. This is often dreaded because it tends to be taught in too much detail. To start with, you need only the basics (and to know which quirks to watch out for); the detail can come freely once you are using the language.

Firstly, you will want to know how nouns work – they or the words around them may or may not be marked for number (in English, singular or plural), gender (masculine, feminine, neuter) or case (in English, direct ‘they’ versus oblique ‘them’; many languages have far more than this).

Secondly, you will want to know how verbs work – they may or may not be marked for (or supported by other words to mark) tense (in English, past or present), aspect (whether ongoing and/or relevant, e.g. ‘I have been’, ‘I am being’) or to “agree” with the subject (‘I like’, ‘she likes’). They may also be marked directly for mood (in English, indicative or subjunctive) or voice (active or passive).

It is worth noting that tense is a peculiarly Indo-European thing; languages around the world often have verbal systems which indicate the evidential basis of the action (whether I felt it; saw it; heard about it first-hand; heard about it from other sources; etc), and some have no concept of time within their structure or vocabulary whatsoever. The comparative obsession with tense is itself a relatively recent innovation within Indo-European – originally, the focus was more on mood and aspect (essentially on relevance rather than particularly time).

Thirdly, you may want to know how adjectives work – they may or may not “agree” with nouns; and they may or may not take the same form as adverbs.

Fourthly, you will want to know at least basically how clauses are structured, including the main word order (English generally is “SVO” – subject, verb, object), negation, and connecting words (‘He came but you stayed’, ‘I like that she was here’, etc).

There will also be other particles to deal with – how to link things together, express questions or exclamations, and so on.

This seems like a lot, but you can do it in stages – work out how nouns work, then verbs (and put those together), then adjectives (and add those in), and then structure, picking up the particles as you go along.


Character? I think knowing a language’s character before you begin is as relevant as anything.

Firstly, you want to know the background to the language. Where does it come from? What influences are contained within it? For example, English is a Germanic language heavily influenced by Norman French, marked also by a significant sound shift from around 1350-1600. Knowing this means you can make sense of why the vocabulary is the way it is (with basic words generally Germanic, and high culture words French or Latin), why the spelling appears so odd, and even to some extent why the grammatical structure is relatively simple.

Secondly, you may want to know generally whether the language is predominantly nominal or verbal – in other words, does it build clauses predominantly around nouns (facts) or verbs (actions)? This is general, and no language is absolutely one way or the other, but knowing this gives you a real feeling for how the language is used.

Thirdly, languages are not standalone things – they are products of a culture. You will need to learn something about the character of those who speak them too. (However, steer clear of stereotypes, which are often unfair and unhelpful!)

What now?

So, let us try this with a few languages over the next few Fridays – we will go back in time to start with, to touch on this final “character” point (and also make us realise how lucky we are that languages simplify over time). Then we will try some modern Western European languages.

2017 – Twenty seventeen

A notable feature of confusion in the English language over the past generation or more has been the pronunciation of years in the 21st century. Did the London Olympics occur in twenty-twelve or two-thousand-and-twelve?!

Interestingly, there is less doubt as to the long-term answer to this than the short-term. In the short term, the habit carried over from the first decade, when people referred to, for example, the Beijing Olympics of “two thousand and eight”. Thus, even the Rio Games were, for some, in “two thousand and sixteen”. There is a certain logic to this – the number 2016 is pronounced that way in any other context, and most other languages of which we are aware (admittedly those more distant linguistic cousins of Latin rather than Germanic origin) make no distinction between the pronunciation of years and the pronunciation of numbers in general.

However, the long-term trend in English is towards the first two digits followed by the final two digits, pronounced as separate numbers. This year will, a generation from now, always be pronounced “twenty sixteen”. This tendency will work backwards too; 2012 will almost certainly come universally to be “twenty twelve” within the next few years. In fact it is not impossible that, in the second half of this century when it is out of living memory for most, even 2008 will come to be “twenty oh eight”, although this is less predictable.

The same did not quite apply in the 20th century, but there is a slight parallel. At the time, the years of the Edwardian era were lengthened by many speakers so that, for example, my maternal grandfather was born in what was often pronounced at the time as “nineteen hundred and six”, even occasionally “nineteen and six”. It was only later that “nineteen oh six” became universal.

Perhaps because of the extra syllable, next year will come even in its own time to be almost universally “twenty seventeen”, and it is this which will see previous years gradually re-pronounced by analogy (a significant aspect of linguistic change which is still very much apparent).
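For the curious, the pairing pattern described above (first two digits, then final two digits, with “oh” before a single final digit) can be sketched in Python. This is only an illustration of that one pattern: the function names are hypothetical, and the sketch deliberately refuses years ending in 00 (such as 2000), for which the pairing does not apply in the same way.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def two_digits(n: int) -> str:
    """Spell out a number from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def year_in_pairs(year: int) -> str:
    """Read a four-digit year as two separate two-digit numbers.

    Hypothetical helper: years ending in 00 are rejected because the
    pairing pattern does not cover them.
    """
    if not 1000 <= year <= 9999 or year % 100 == 0:
        raise ValueError("expected a four-digit year not ending in 00")
    first, last = divmod(year, 100)
    if last < 10:
        # A single final digit is introduced by "oh", as in "nineteen oh six".
        return f"{two_digits(first)} oh {ONES[last]}"
    return f"{two_digits(first)} {two_digits(last)}"


for y in (1906, 2008, 2012, 2016, 2017):
    print(y, year_in_pairs(y))
```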

So, Happy New Year and wishing readers a very prosperous twenty seventeen!


German – great for bossing people about with clarity!

Hier geparkte Fahrzeuge werden kostenpflichtig abgeschleppt. 

So goes one of my favourite sentences in any language. It, or slight variations of it, can regularly be seen all over urban Germany.

Loosely, it means “Vehicles parked here will be towed away at the owner’s cost”. But directly it is far better, more or less “Here parked vehicles will be costs-dutybound towed away”. I love the clinical nature of it – I mean, actually you can park here, but the specific penalty will be (will be, not may be) paying to retrieve your car from the tow company before you have access to it again.

I came across a similar principle on a sign on the fencing around a non-league football pitch in Hamburg last weekend.


Simply brilliant. Loosely “Anyone cursing or offending the referee must count on a dismissal from the sports ground”. Fantastic – I mean in theory you can curse the ref, but if you do you must (must!) count on dismissal.

This is so much better than “Do not block access” or “Swearing at the ref will not be tolerated”. These lack clarity. What if I do block the access – will you merely be slightly miffed or are we talking prison? Not tolerated by whom and what will they do about it – it could be anything from a stern glare to a visit to the local police station! Such a spectrum of potential consequences means I am probably more likely to risk it. But if I know I am getting towed away or removed from the ground, well, at least I am absolutely clear about the odds and not likely to test them! (Mind, when the home team was 3-0 down at halftime in that evening’s game, some may have been tempted to let off a bit of steam in return for not having to watch any more!)

This is all a bit of fun of course – but maybe another example of how culture reflects language and vice-versa?!

“Tense” is a very Western thing…

French spelling is notoriously conservative, but it had at one stage moved from Latin tempus ‘time’ to the spelling tens. This was subsequently re-Latinized to the modern spelling temps (although words such as tentation remain), but not before English had borrowed the word ‘tense’.

Any language learner will be familiar with ‘tense’. Indeed, we are very familiar with the notion that there is a ‘past’, a ‘present’, and a ‘future’. These assumptions are widespread, and even make their way into artificial languages such as Esperanto.

In fact, they are profoundly wrong, in two main ways.

Firstly, Germanic languages such as, well, English, do not in fact have three tenses. English has a present (‘like’, ‘break’) and a past (‘liked’, ‘broke’) – and that’s it. The modal verb ‘will’ (or, archaically, ‘shall’) can be used to mark that something is due to occur in the future, but it is far from necessary – ‘Tomorrow I am going to Germany’ is present grammatically, marking future; but ‘He will go on and on about it’ is marked as if future while in fact present. The notion that English has three tenses derives from Latin, but English is not a Latin language.

Secondly, the very concept of tense itself is rare. It has become widespread in Indo-European languages, spoken by half the world’s population natively, so we assume ‘tense’ and ‘language’ go together like ‘fish’ and ‘chips’. In fact, very few languages beyond the Indo-European family routinely mark for ‘tense’.

Indeed, Indo-European languages themselves marked originally for ‘aspect’ – not when something happened in relation to the present, but rather whether it was relevant to the present (the difference fundamentally between ‘I liked’ and ‘I have liked’). This notion of relevance and indeed general evidence as to whether something has occurred is much more common in languages such as Chinese and Indonesian; these routinely mark for closeness to the action in various ways, but not for tense unless for some reason time is very relevant. Indo-European languages hint at this too in their use of ‘mood’ – German for example distinguishes between whether something is definitely the case marked by the usual indicative form (sie hat es getan ‘she has done it’) or allegedly the case marked by the rarer conjunctive (sie habe es getan ‘she is said to have done it’).

It is beyond my expertise to explain how relevant this is socially and culturally, but inevitably it means that non-Indo-European-speaking societies are (and were) generally less focused on time than Indo-European-speaking ones. The very concept that time has a beginning and an end and is a single spectrum from past to future is an Indo-European one, not backed by other linguistic frameworks (and not, actually, by science – though Einstein’s theory of relativity is beyond the scope of this blog). Other societies globally see time as much less relevant, and may view it as circular or simply marginal.

The fundamental question here of how language shapes society and vice-versa is subject to much debate. However, we can at once see why that debate is so keenly participated in!