How do we know what Latin sounded like in Caesar’s mouth? Did Caesar pronounce /c/ or /k/? The traditional pronunciation was developed and handed down over many centuries of teaching Latin. It is most similar to the mediaeval pronunciation, based on the Frankish variant of Latin from the 9th century. But wasn’t Cicero actually pronounced “Kikero”? – asks Prof. Kinga Paraskiewicz, director of the Institute of Oriental Studies at the Jagiellonian University.

TVP WEEKLY: Half of humanity, ourselves included, speaks Indo-European languages. New theories keep emerging about their origin, spread, and kinship – most recently, the so-called hybrid theory. How do you study languages to arrive at conclusions about their “cradle” and age?

As a traditionally educated philologist, I deal with languages that are or were natural, spoken, documented in writing, or can be reconstructed using the historical-comparative method. I am interested in languages that have already been deciphered and those still hidden from our understanding, like the Linear A script from Crete. On the other hand, artificial languages are a human creation...

And aren’t natural languages also a human creation? It’s just that not one person sat down and invented a language system, like Ludwik Zamenhof did with Esperanto.

An artificial language is a conscious human creation, while a natural language evolved in a population over a longer, natural process. That’s why languages are so diverse – both genetically (in terms of origin) and typologically (in terms of structure). Linguists study languages from various angles. They describe them at a particular stage of development (e.g., how spoken or written English looks today) or focus on their history and the evolution of their structure. There are theoretical linguists who build models of individual languages or theories regarding their universal features. And then there are researchers interested in the practical side of language, so-called applied linguistics, which is put to use in foreign-language teaching or in translation. In any case, linguistics is about working with text. Nothing certain can be said without a text.

Since half of humanity speaks Indo-European languages, the rest speak others. What are these groups and where do they come from?

There are two main classifications: historical, where languages are grouped according to their degree of common ancestry, and typological, which divides languages based on their structural features. The first (also called genetic or genealogical, but this is only a technical term – no DNA research is used in its determination) studies the kinship of languages on the model of family relationships. Languages are arranged in a sort of genealogical tree, from contemporary ones back to the oldest ancestor, which usually has to be reconstructed because it is extinct.

Apart from the largest group, the Indo-European one, people communicate in Afroasiatic languages, which until recently were known as Semito-Hamitic – from the Biblical names of Noah’s sons, Shem and Ham. There was a third son – Japheth. And the existence of a family of Japhetic languages was postulated in Soviet Russia in the 1920s by the Georgian linguist Nikolai Marr, who included in it, among others, the Caucasian languages. Incidentally, Marr was also the creator of a theory of language evolution in which the main factor was supposed to be the change in social formations (so-called Marrism). Joseph Stalin, also a Georgian, initially favoured this theory, but fortunately, the Marrists fell from grace in 1950, which ended the life of this theory. Returning to language families, we also distinguish the Finno-Ugric family, now more often called Uralic, as well as the Caucasian, Altaic, and Sino-Tibetan families. So there are many such families. It is assumed that in each family all languages should derive from a common ancestor.

Was there one proto-language for all of us, like one primaeval father Adam?

Of course, there are such theories, e.g., the Nostratic theory (from Latin “noster”, meaning “our”), which tried to connect languages under a common proto-proto-language. In 1903, the Danish linguist Holger Pedersen postulated the existence of such a hypothetical language macro-family. Supporters of this theory assign at least the Indo-European, Altaic, and Uralic families to this group. But these are merely interesting intellectual constructs. With the greatest probability, we can only say that language families form a natural system. That is, there certainly were common ancestors, for example, for the Indo-European or Afroasiatic group. The problem is the so-called language isolates, which cannot be included in any group. In Europe, one example is Basque, whose genetic affiliation has not yet been determined.

The Basques biologically seem unique, possibly largely descendants of the first Paleo- and Mesolithic hunter-gatherers who appeared on our continent and survived in a mountainous enclave. And for unknown reasons, they have a lot in common with the native population of Scotland. Perhaps they simply did not abandon their “Paleolithic” language.

Exactly. These are the “orphans” that cannot be attached to any tree, yet their language lives. There are assumptions that Basque is related to the Kartvelian languages of the South Caucasus, mainly Georgia. The surrounding Indo-European languages, by contrast, have a documented history of development that reaches far back. The oldest available texts are from 4,000 years ago and belong to the Hittite language – now extinct. We also have ancient Sanskrit texts – the earliest in the collection, the Rigveda, was probably created between 1500 and 1200 BC, although some researchers believe it was earlier. We know the most ancient and later versions of Greek and Latin, so we can compare. This allows us to find certain phonetic regularities, known as sound laws, as well as similarities in grammar or lexicon. That, in turn, allows us to divide the entire group of Indo-European languages into subgroups, such as Romance (derived from Latin), Germanic, Slavic, Iranian, and Indic.

And does geography matter in all this, since we name these groups after the places where populations speaking these languages live today? After all, people migrate, form “intertribal” unions, from which children are born…

Geography is not important in naming language families, precisely for the reasons you mentioned. The Indo-European peoples migrated to Europe from Asia, although exactly where from is, in linguistics, more a matter of conjecture and dispute than of certain statements. This “half of the world” that now speaks Indo-European languages is the result of massive migrations and colonisation by the Spanish, Portuguese, English, French, or Germans in the lands of both Americas, Oceania, etc. Historically, these are very recent events, spanning 500 years at most. Meanwhile, the naming and the division into linguistic subgroups are a convention. The name “Indo-European” suggests that this is how people speak in India and Europe. The Germans, of course, have their own name for this group, traditionally calling it “Indo-Germanic.” This has a lot to do with politics, both past and present…

Indeed, Professor Mariusz Ziółkowski from the University of Warsaw recently told us about how Heinrich Himmler’s organisation for researching ancestral heritage – Ahnenerbe – was mainly built by German linguists, not archaeologists or anthropologists...

This specifically revolved around the Aryans, who are – linguistically – the ancestors of today’s residents of India, who use, among others, the Hindi language. They are also the ancestors of the Iranians, which is less commonly mentioned. In Sanskrit and Avestan texts, “aryana” means “ours, native”, as opposed to “foreign”. The name Iran is nothing but the old “aryanam”. In the 1930s, the Germans exploited the concept of Aryanism for their madness. However, it is worth noting that Iran is the changed name of a country that all of us in Europe knew, thanks to the Greeks, as Persia (though Pars/Fars is only one of its provinces). It was Reza Shah Pahlavi who, in the 1930s, asked the European powers to use in international nomenclature, diplomacy, etc., not the name Persia but Iran – a name that had been used for centuries in Iran itself.

Today, a similar situation is unfolding in India – PM Narendra Modi and the ruling party believe that the name of the state used worldwide is associated with the colonial period. In invitations sent out for the G20 summit, Draupadi Murmu was styled “president of Bharat.”

Yes. In India, India is called “Bharat”. Poland in Iran, in Persian, is not called Poland, but “Lachestan”. But we do not protest against this – and rightly so, because it is a beautiful and historical name, recorded in old books. Italians also do not protest that we do not call their country Italia.

Let’s return to the appearance and changes of Indo-European languages. How do we know how people spoke long-extinct languages when we find it difficult to read and understand the Polish “Bogurodzica” [“Mother of God”; a mediaeval Roman Catholic hymn composed sometime between the 10th and 13th centuries in Poland – ed.] without footnotes?

Fortunately, most Indo-European languages are known to us, as I mentioned, from ancient texts. And we can go back to their time without any problem. For earlier events, we rely on the reconstruction of their ancestors based on sound laws, etymology, etc. In these languages, basic vocabulary, such as kinship terms or numerals, remains stable and is not influenced by foreign elements. Take the Sanskrit or Avestan, Greek, and Latin forms – everywhere we have: matar, meter, mater. Similarly with father: pitar, patar, pater. The same goes for brother, sister, daughter, son, etc. Let me now count to 10 in Persian: yek, do, se, chahar, panj, shesh, haft, hasht, noh, dah – how many numerals did you recognise? This is modern Persian, and the numerals have survived from the earliest records of Old Persian, i.e., from the 6th century BC. Of course, the pronunciation has changed somewhat. In all these languages – with the interesting exception of the Anatolian languages – a common vocabulary associated with wheels and wheeled vehicles has also survived.

Most important for reconstruction are the sound laws and the phonetics deduced from such analogies. In Łódź, there was Professor Ignacy Ryszard Danka, who not only dealt with the reconstruction of the Proto-Indo-European language but even wrote poetry and hymns in it. In phonetic notation, of course, because we do not know the script of this language, if one even existed. When the Czech orientalist Bedřich Hrozný deciphered the oldest Indo-European texts, the Hittite texts from the 2nd millennium BC, which belong to the extinct Anatolian group, he was studying how this language, dead for millennia, was articulated. It was recorded in Old Assyrian cuneiform, in which he was a specialist. Knowing roughly how the cuneiform signs were pronounced, he could read the text without yet knowing its meaning. And so the dead language of the powerful, flourishing Hittite state in Anatolia, which lasted for half a millennium, was gradually reconstructed.

So, it would be like a Russian who doesn’t speak German reading a German text written in Cyrillic?

Exactly. Assyrian cuneiform was widely used and “borrowed” by neighbouring peoples to record their own texts, with numerous modifications. Similarly, those who spoke Old Persian, for instance in the time of the Achaemenid dynasty (kings like Darius and Xerxes, who warred with the Greeks in the 5th century BC, which is why some memory of them exists to this day), also used a script that was not their own: a slightly modified version borrowed from their neighbours, the Assyrians. Since the inscriptions were trilingual, they could be deciphered, just as Jean-François Champollion did with Egyptian hieroglyphs. Alphabets are adapted to an unwritten language; they don’t need to be invented each time. Today’s Indo-European languages use several major scripts: Latin, Cyrillic, Devanagari, and in Iran – Arabic. This does not stem from the nature of the given language, but rather from historical and political circumstances.

Here, it’s also important to mention how significant Hrozný’s decipherment of the Hittite texts was, because in them he found three consonants, the so-called laryngeals. Their existence had earlier been postulated by the eminent Swiss linguist Ferdinand de Saussure. He wrote that the vowel system in Indo-European languages is clear, but it is as if something were missing – sometimes the same vowels are long, sometimes short, they differ in colour, etc. Therefore, there probably had to be some laryngeal element that disappeared, affecting their colour and quantity. This gave rise to the laryngeal theory, which not all linguists accept, but it allows, among other things, the reconstruction of the probable original sounds of extinct Indo-European languages.

So do we know what Hittite sounded like to the same extent as we know the Latin of Julius Caesar’s time?

How do we know what Latin sounded like in Caesar’s mouth? Did Caesar pronounce /c/ or /k/? The traditional pronunciation was developed and handed down over many centuries of teaching Latin. It is most similar to the mediaeval pronunciation, based on the Frankish variant of Latin from the 9th century. But wasn’t Cicero actually pronounced “Kikero”? We talk about centum languages (Latin “centum”, but Greek “he-katon”, as in “hecatomb” – “sacrifice of a hundred oxen”) and satem languages (Sanskrit “satam”), named after the numeral “hundred.”

Changes in pronunciation could have been influenced by a change in diet – from the “hard” diet characteristic of hunter-gatherers or nomads, to the “soft” one typical of farmers. This could have affected dental development, and hence pronunciation. There was a study on this topic a few years ago in Science.

I’m not an expert in anthropology or the diet of our ancestors. For me, the most valuable languages are those that show the longest continuity in time, confirmed by texts. In Persian, this runs from the time of the Achaemenids until the conquest of Alexander the Great (4th century BC). In the 3rd century AD, a continuation of Persian appears in writing (the so-called Pahlavi language), but this language has already changed. This allows us to see the trends of these changes. Old Persian was an Indo-European, inflectional language, in which grammar operated with changes in word endings; it had cases and genders. Today’s New Persian is a language without declensions and without genders, and in its typological structure it thus resembles English (which is also simplified compared to German, a member of the same Germanic group).

Today, another theory has emerged regarding the place and time of origin of Indo-European languages, their so-called cradle. It was devised by specialists in linguistics, as well as archaeology and human archaeogenetics, from the Max Planck Institute for Evolutionary Anthropology in Leipzig. It attempts to reconcile the two theories that have dominated this matter so far, the so-called Anatolian and steppe theories.

I’m not at all concerned with that. I have no expertise in archaeology or archaeogenetics. The language we speak is not written in our genes. Language cannot be extracted from human remains, nor from the geographical location of their burial, nor even from their grave goods or pottery – unless there are inscriptions on them. Again – if there is no text, we know nothing for certain. The more text there is – the more we know. The aforementioned lack of terms for the wheel and wheeled vehicles in Anatolian languages may be because they did not yet know about wheels. However, it cannot be ruled out that this vocabulary simply did not survive in the Hittite texts, because the texts we have – deciphered so far – are very few. Based on them, we can develop a grammar of the language. But in terms of lexical issues, we are looking at Hittite through a keyhole into a library.

The history of culture is the history of language. It bothers me that we do not even know “somewhat for sure” when and where Indo-European languages originated. And what does it even mean that a “language originated”? What were the processes behind the emergence, separation, and evolution of new languages?

Linguistics cannot travel back in time without strong evidence. The only chance is interdisciplinary research, but this will always lead to a multiplication of theories and of those uncertain, broken lines with which the spread of languages is marked on maps. If we consider my beloved Persian, it has maintained continuity since the 6th century BC. Today, it is a completely different language, but this continuity of texts allows us to describe these changes very accurately and with a high degree of certainty. We do not have such comfort regarding many branches growing from the Indo-European trunk, and least of all regarding the trunk itself. Thus, one might claim, for political reasons, for instance, that the Kurdish language is a remnant of Median. The Medes created a solid state, were definitely an Indo-European people, and it would be good to descend from them. But information about their language is scarce, coming from Old Persian and Greek records, so I say: show me one Median text, and then we can talk about it realistically, not hypothetically or politically.

