http://indo-european-migrations.scienceontheweb.net/
http://turkic-languages.scienceontheweb.net/index.html
Migration of Turkic languages Version 8.1 v.1 (04/2009) (first online, phonological studies) > v.4.3 (12/2009) (major update, lexicostatistics added) > v.5.0 (11/2010) (major changes, the discussion of grammar added) > v.6.0 (11-12/2011) (major corrections to the text; maps, illustrations, references added) > v.7.0 (02-04/2012) (corrections to Yakutic, Kimak, the lexicostatistical part; the chapter on Turkic Urheimat was transferred into a separate article; grammatical and logical corrections) > v.8 (01/2013) (grammatical corrections to increase logical consistency and readability, additions to the chapter on Uzbek-Uyghur, Yugur)
Abstract
The internal classification of the Turkic languages has been rebuilt from scratch based upon the phonological, grammatical, lexical, geographical and historical evidence. The resulting linguistic phylogeny is largely consistent with the most prevalent taxonomic systems but contains many novel points.
Contents 1. Introduction 2. Collecting factual material
2.2 Dissimilar basic lexemes in the Turkic languages 2.3 The comparison of phonological and grammatical features
Some of the exclusive Bulgaric features
Yakutic: Where does Sakha actually belong?
On the origins of Turkic ethnonymy
Altay-Sayan: Tofa and Soyot closely related to Tuva
Great-Steppe: Kimak-Kypchak-Tatar, Kyrgyz-Kazakh, and Chagatai-Uzbek-Uyghur seem to form a genetic unity
Kyrgyz-Chagatai: Kazakh is closely related to Kyrgyz
Kimak-Kypchak-Tatar: The Kimak subtaxon
Oghuz-Seljuk: Oghuz is still a valid subtaxon
Orkhon-Karakhanid: Orkhon-Karakhanid as a valid subtaxon
Yugur-Salar: Yugur seems to be ancient
4.2 The taxonomic Classification of Bulgaro-Turkic languages 4.3 The Geographical Tree of Bulgaro-Turkic languages
1. Introduction
The present study of the Turkic languages (2009-2012) started as brief online notes that gradually grew into a series of online publications. The study is mostly original research with relatively few references to previous theories. Most of the analysis was based upon factual evidence collected from dictionaries, grammars, language textbooks, native speakers on the web, sound and video fragments, and books and articles containing detailed descriptions of specific languages. The resulting conclusions rarely draw on historically accepted opinions or assumptions produced by other researchers; rather, they attempt to build a logically consistent view of the spread of the Turkic languages and their internal classification, grounded in a nearly independent and relatively comprehensive step-by-step analysis. Nevertheless, the author deeply appreciates the extensive input from people who worked on the vast amount of Turkological literature dedicated to the numerous Turkic languages, as well as those who helped directly or indirectly by providing corrections and valuable notes by email or through web forums, without whose interest and collaboration this work would never have come to life.
The present article provides all the linguistic argumentation concerning the internal classification of the Bulgaro-Turkic languages. There are three other separate articles which can be regarded as part of the same work.
The Lexicostatistics and Glottochronology of the Turkic languages (2009-2012) is a detailed study of Swadesh-210 wordlists, which dates the Turkic Proper split to about 300-400 BC, and the Bulgaro-Turkic split to about 1000 BC.
The Proto-Turkic Urheimat & The Early Migrations of the Turkic Peoples (2012-13) is a detailed analysis of the early Bulgaro-Turkic migrations, largely based upon the results of the glottochronological analysis above and the present classification. It places the Proto-Turkic Proper Urheimat northwest of the Altai Mountains, and the earlier Proto-Bulgaro-Turkic Urheimat in northern Kazakhstan. The work explores the associations with the major archaeological cultures of the Bronze and Iron Ages in West Siberia.
The Turkic languages in a Nutshell (2009-2012) presents the final classification, focusing on the most well-established conclusions from various works, including the present investigation. It also contains multiple illustrations and notes on history, ethnography, geography and the most typical linguistic features, which essentially makes it a basic introduction to Turkology for beginners.
1.1 Preliminary notes on the reconstruction of Proto-Turkic
Before we proceed with the main analysis, let us consider the reconstruction of the Proto-Bulgaro-Turkic word-initial *j/*y, which has become a long-standing issue in Turkological studies, and which may affect certain conclusions in the main part of this publication. Proto-language reconstructions in various branches of historical linguistics are often based entirely on the supposed readings of ancient texts from the oldest representatives of the family. For instance, in Indo-European studies we can avail ourselves of the wonderful attestations of Ancient Greek, Latin and Avestan. However, when the oldest representatives are poorly read and interpreted, such an approach can result in errors.
Generally speaking, an ancient extinct language can be considered suitable for reconstruction purposes only if it meets several conditions, namely: (1) it is a uniquely preserved language closely related to the proto-state, with no alternative sibling branches; (2) it is so well attested that its data are completely reliable and no significant misinterpretations can arise from occasional mistakes in ancient writing, reading (e.g., from abraded petroglyphs), copying of the material, translation, interpretation, etc.; (3) the script closely and adequately reflects the original pronunciation, and we know full well how to reconstruct that pronunciation correctly from that script; (4) the linguistic material is dialectally uniform, in other words it constitutes just one language, not a mixture of various dialects or languages gathered by numerous contributors during generally unknown periods or from unknown areas [referred to herein as the Sanskrit dictionary syndrome]. Obviously, the situation in Turkology does not meet these criteria. Orkhon Old Turkic, the oldest Turkic language attested in the inscriptions from Mongolia, fails to meet the first condition (see details below), barely satisfies the second, and raises many objections regarding the third. In other words, Orkhon Old Turkic may simply be insufficiently old or much too geographically off-centered to be considered close enough to the proto-state. Moreover, there may simply not be enough correctly interpreted material for the solid attestation and interpretation of the ancient phonology. Orkhon Old Turkic is not as well reconstructed as, say, Latin and Greek in Indo-European studies, so many readings are quite ambiguous. And finally, it often gets mixed in the literature with Old Karakhanid, Old Uyghur and the generally unknown Old Yenisei Kyrgyz dialects (given that not all of the Old Turkic inscriptions were made in Mongolia).
Therefore one should not confuse the methodological basis established for the Indo-European reconstruction with the methods appropriate for other language families, such as Turkic. An old language is not always good enough. As a result, the reconstruction of Proto-Turkic should be conducted by means of a completely different approach, namely using materials from the well-attested modern representatives of the Turkic languages. In that case, we should build a reconstruction using a linear formula with separately determined linear coefficients representing the contribution of each particular language branch. This method is drastically different from the old-fashioned old-language-for-all model. As an example, when reconstructing Bulgaro-Turkic, we could roughly assign about 50% to Chuvash and about 50% to Proto-Turkic Proper, and then divide the second half more or less equally among the most archaic representatives of the main branches, e.g. (1) Proto-Sakha, (2) Proto-Altay-Sayan + Proto-Great-Steppe, and (3) Proto-Oghuz-Orkhon-Karakhanid, so that each of the main Turkic branches would receive only about 50% / 3 ≈ 17% (see the classification dendrogram at the end of this article). This example has been provided as a first-approximation approach to address the potential Old-Turkic-centric attitude, which supposedly claims that "nothing that's not in Old Turkic could exist in Proto-Turkic" or that "Old Turkic is an ancient language, therefore it is more suitable for historical reconstruction". By contrast, the current revised method requires that Göktürk Old Turkic be considered just one of several early Turkic branches, with a weight of about 17% or less for reconstruction purposes. However, the figures for the linear coefficients depend on the genealogical topology of the most basic shoots in the internal classification dendrogram.
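The equal-split weighting described above can be illustrated with a short Python sketch. It is only a first-approximation illustration, assuming that every fork of the dendrogram divides its weight equally among its daughter branches; the nested TREE structure and the function name branch_weights are hypothetical and are introduced here purely for illustration.

```python
from fractions import Fraction

# Hypothetical encoding of the top of the dendrogram described in the text:
# a node is either a branch name (str) or a (name, [children]) tuple.
TREE = ("Proto-Bulgaro-Turkic", [
    "Chuvash",
    ("Proto-Turkic Proper", [
        "Proto-Sakha",
        "Proto-Altay-Sayan + Proto-Great-Steppe",
        "Proto-Oghuz-Orkhon-Karakhanid",
    ]),
])


def branch_weights(node, weight=Fraction(1)):
    """Split `weight` equally at every fork and return {leaf name: weight}."""
    if isinstance(node, str):
        return {node: weight}
    _name, children = node
    share = weight / len(children)  # assumption: equal shares per daughter branch
    result = {}
    for child in children:
        result.update(branch_weights(child, share))
    return result


weights = branch_weights(TREE)
for leaf, w in weights.items():
    print(f"{leaf}: {w} (about {float(w):.0%})")
```

Under these assumptions Chuvash receives 1/2 and each of the three Turkic Proper branches receives 1/6, which corresponds to the approximate 50% and 17% figures given above. A different tree topology would change the coefficients, which is why the classification has to come first.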
Therefore, using the Turkic languages as an example, we come to the general conclusion that a consistent internal tree-like classification of a language group must be built before proceeding with the reconstruction of its proto-language. In other words, an internal classification should be constructed prior to further linguistic or geomigrational analysis.
An example from the Revised Model: the reconstruction of the Proto-Bulgaro-Turkic *S-
The above reasoning can be exemplified by the following reconstruction of the Proto-Bulgaro-Turkic *S- (the S-symbol should be seen herein as just an arbitrary way to designate the *y-/*j- phoneme as in Turkic yer / jer "place, earth", yol / jol "way", etc.). A very common error resulting from the Turkish-for-all or Karakhanid-for-all model is the conclusion that the words with y- were pronounced exactly the same way in Proto-Bulgaro-Turkic. This idea is very common even among Turkologists outside Turkey, and seems to go as far back as Mahmud al-Kashgari's classical Compendium of the Turkic languages (1073).
Note: Before proceeding with further argumentation, we should confine ourselves to the material internal to the Turkic languages, the Altaic and Nostratic languages being a completely separate issue that cannot be considered herein at any length. This method can generally be called an internally based reconstruction, as opposed to a full reconstruction.
Note: We try to use the Anglophone-based transcription consistently throughout all the articles, as opposed to the German-based transcription that goes back to the 19th-century tradition; therefore /y-/ denotes a semivowel as in "year" and /j-/ or /J-/ an affricate as in "Jack". To avoid occasional confusion, the capital letter /J-/ has been used in some places for additional emphasis. The digraph /zh/ and the single letter /ž/ are approximately similar to the voiced sibilant in French "je" or English "pleasure", "treasure".
The use of complex Unicode characters was avoided for reasons of readability and technical compatibility. For further details on transcription see The Turkic languages in a Nutshell. The following table summarizes the pronunciation of the Turkic *S- in the most important branches:
This table shows that the pure /y-/ pronunciation is attested only within the following subtaxa: (1) in the languages historically connected with the Orkhon-Karakhanid and Oghuz-Seljuk subgroups, even though there seems to be some /y-/-to-/j-/ allophonic variation in Uyghur, some Uzbek dialects and some Oghuz dialects; (2) partly, in Yugur and Salar, which also belong to the southern Orkhon-Karakhanid habitat and may have been influenced by it, considering that they are located along the Silk Road outposts, where migrations were very common; (3) partly, in the /ya-/, /yu-/, /yo-/ syllables, in the languages descending from the late expansion of the Golden Horde, such as Kazan Tatar (but not in the Kimak languages with an early separation, such as Karachay-Balkar). Nevertheless, even in Kazan Tatar, many speakers still report an allophonic distribution of this phoneme; therefore a clear-cut /y-/ exists mostly in the written standard, produced more or less artificially after the 1920s, as well as in recently Russified speech, rather than in older dialects or geographically marginal languages, such as North Crimean Tatar, Eastern Bashkir, etc. Moreover, we still have /jil/, not /yil/, "wind" before a high vowel even in standard Kazan Tatar.
Consequently, we may conclude:
(1) Only the languages related or adjacent to the Oghuz-Orkhon-Karakhanid branch seem to have a clear-cut historical attestation of the /y-/ semivowel, whereas the majority of other branches with an early separation and long isolation either show mixed data or clearly seem to go back to something like a strongly palatalized sibilant /s'-/, /j-/, /d'-/, /ch-/ or a similar consonant sound. This provides a purely statistical argument for our conclusion: there are more separate language branches that originally had an /s'-/- or /j-/-type phoneme than those that finally developed the /y/-phoneme.
In other words, it is statistically implausible that the supposed /y-/ > /j-/ mutation would have occurred simultaneously and independently in so many separately existing archaic branches.
(2) As we can see in the figure below, the y-type phoneme seems to be distributed outside of the main historical diversification area of the Turkic languages; therefore it appears to be a recent phonological mutation, apparently linked to the migration of the Orkhon-Karakhanid and Oghuz languages, which again implies that the development of /y/ might have been a rather unique phonological innovation in Orkhon-Karakhanid Old Turkic. This provides us with a second, phono-geographical argument: only the J-type phoneme, not the y- semivowel, seems to be distributed near the putative homeland area of the Turkic languages. As to the allophonic /y-/-to-/j-/ variation in the Kimak-Kypchak-Tatar languages of the Golden Horde, such as Kazan Tatar, the existence of /y-/ may be explained as an early Oghuz influence. As we will show below, the Golden Horde languages and Oghuz share many linguistic features at several levels, so this type of borrowing is well corroborated by other evidence of mutual interaction.
(3) Moreover, if /y-/ were present in the proto-form, we would rather observe phonological variations of the semivowel /y-/ (not /J-/): e.g. we would find something like /y-/, /i-/, /0-/, /ê-/, /l'-/, /J-/, /zh-/ in the most archaic and diversified Siberian branches in the east (near the historical homeland of the Turkic languages), but what we do see in that area are the phonological variations of the palatalized consonant /s'-/: /s'-/, /s-/, /h-/, /ch'-/, /J-/, /zh-/, /d'-/, /ni-/, /y-/. On the other hand, the expected zero phoneme resulting from the loss of /y-/ is only present in the westernmost languages, such as Azeri (e.g. ulduz < yulduz "star", il < yil "year"), and, partly, in Turkish (cf. ïlïk, but Turkmen yïlï "warm"), which marks the /y-/ phoneme as a relatively recent and rather westernmost phenomenon connected with the spread of the Oghuz-Seljuk languages. This provides us with a third, phonological diversification argument: if the /y-/ semivowel were original, there would be a range of predictable sound changes in the earliest diversified branches, but nothing of the kind is found there.
Therefore, from the evidence internal to the Turkic languages alone, we may conclude that the *S- proto-phoneme in question can be placed somewhere within the range of sibilants {/s'-/, /s-/, /h-/, /ch'-/, /J-/, /zh-/, /d'-/}, and it could not have been similar to the /y-/ semivowel of the modern Oghuz-Seljuk languages. Actually, this conclusion concerning the reconstruction of the Proto-Turkic *S- is hardly novel and has been expounded several times by different authors, such as A.N. Bernshtam (1938), S.E. Malov (1952), N.A. Baskakov (1955), A.M. Scherbak (1970), as well as by the authors of the authoritative Russian publication sometimes abbreviated as SIGTY, namely in its volume [Pratyurkskiy yazyk-osnova. Kartina mira pratyurkskogo etnosa po dannym yazyka. (The Proto-Turkic language. The Worldview of the Proto-Turkic ethnicity based on the linguistic data.), Moscow (2006)].
Note: Generally speaking, SIGTY [Sravnitelno-istoricheskaya grammatika tyurkskikh yazykov ("The Comparative Historical Grammar of the Turkic languages")] is a large and verbose multi-volume comprehensive Moscow publication with a detailed cross-comparative analysis of morphology, syntax, vocabulary, semiotics and other aspects of the Turkic languages, produced between the 1970s and the 2000s.
As an additional and quite interesting argument, the authors of SIGTY suggest that, since other sonants, such as *r- and *l-, were absent or atypical in the word-initial position, there is no reason to believe that the /*y-/ semivowel, phonetically similar to a sonant, could occur there either.
The opposite view, which mostly goes back to Radlov's work at the end of the 19th century, is usually based on the following incorrect presumptions: (1) that the Karakhanid Old Turkic of Mahmud al-Kashgari is equal to all of the Turkic languages (in other words, that Middle Turkic = late Proto-Turkic); (2) that Orkhon Old Turkic has been correctly and uncontroversially reconstructed from the script and that it reflects /y-/, even though we hardly know the actual pronunciation in the Orkhon inscriptions; (3) that the high level of differentiation among different Turkic subgroups can be ignored, including the evidence for the maximum differences in the Siberian languages and Chuvash. In this approach the evidence from the Kimak-Kypchak-Tatar languages, for instance, may play the same role as the evidence from Sakha, and indeed this was the situation in Russian and European Turkology until the beginning of the 20th century, when most Turkic languages were officially viewed as merely dialects of each other. Even in SIGTY, Chuvash is still unreasonably included among the mainstream Turkic languages, at least as far as the phonological reconstructions are concerned.
As a final touch, we can describe a phonological calculation based on the above-postulated formula used in the reconstruction of the S-phoneme:
1/2 Proto-Chuvash /s'-/ + 1/2 [1/3 Proto-Yakutic /s-/ + 1/3 (1/2 Proto-Altay-Sayan /ch'-/ + 1/2 (1/2 Proto-Kimak-Kypchak /j'-/ + 1/2 Proto-Kyrgyz-Kazakh-Chagatai /j-/)) + 1/3 Proto-Oghuz-Orkhon-Karakhanid /y-/]
= 1/2 Proto-Chuvash /s'-/ + 1/2 [1/3 Proto-Yakutic /s-/ + 1/3 (1/2 Proto-Altay-Sayan /ch'-/ + 1/2 Proto-Great-Steppe /j'-/) + 1/3 Proto-Oghuz-Orkhon-Karakhanid /y-/]
= 1/2 Proto-Chuvash /s'-/ + 1/2 [1/3 Proto-Yakutic /s-/ + 1/3 Proto-Central /ch'-/ + 1/3 Proto-Oghuz-Orkhon-Karakhanid /y-/]
It follows from this expression that the original Proto-Bulgaro-Turkic *S-phoneme was most likely similar to a soft palatalized /s'-/ as in modern Chuvash /s'/, Russian /sh'/ or Japanese [...]. At a later stage, after the separation of Proto-Yakutic, the phoneme began to change into a soft palatalized unvoiced /ch'/ or voiced /j'/, whereas the mutation to /y-/ was a relatively recent innovation typical only of the southern branch of the Turkic languages.
2. Collecting factual material
Comprehensive research in Turkology has often been hindered by the large number of languages and dialects (somewhere over 50 when all the major dialects are counted) and the lack of detailed grammars and dictionaries for some of them. In many cases, the language descriptions were composed only after the 1920s or even after World War II. As a result, most of the 19th-century Turkological classifications were originally built upon phonological criteria alone. The grammatical features were slowly added in the course of the 20th century, whereas detailed lexicostatistical and glottochronological analysis is a recent development that appeared mostly in the 1990s.
In the present chapter, we will briefly summarize the essential lexical, grammatical and phonological evidence collected as the basis for further examination in the next chapters.
2.1 An overview of the lexicostatistical research in Turkic languages
At the beginning of the 21st century, several authors attempted to conduct purely statistical studies of the Turkic languages, in most cases without any manual analysis of grammar or vocabulary.
Starostin (1991)
Sergey Starostin [STAH-res-tin] included some very detailed 110-word Swadesh-Yakhontov wordlists for 21 Turkic languages in his book [Altajskaja problema i proiskhozhdenije japonskogo jazyka (The Altaic Problem and the Origins of the Japanese language), Moscow (1991)]. These lists were apparently later reintegrated into the Starling database.
Dyachok (2001)
A work conducted by M. Dyachok [pronounced: d-yah-CHOK] was published online as brief preliminary notes. In the introduction to his concise article, the author reminds the reader of the old geography-based classification by Samoylovich [sah-moy-LAW-vich] (1922), which had similar results, and then performs a lexicostatistical and glottochronological analysis of the 13 major Turkic languages. As a result, the Turkic languages were roughly subdivided into just four basic subgroups: (1) Bulgaric, (2) Yakut, (3) Tuvan, (4) Western (= any other), which conforms to the idea that their area of maximum diversification was located somewhere in the east.
Dybo (2002, 2007)
The study by Anna Dybo [AHN-nah deh-BAW] was first published in 2001 as part of the articles collected in SIGTY [Sravnitelnaja grammatika tyurkskikh jazykov (The Comparative Grammar of the Turkic languages)]. It was then republished in 2007 in a separate book [Anna Dybo, Lingvisticheskije kontakty rannikh tyurkov. Leksicheskij fond. (The Linguistic Contacts of the Early Turks: the Lexical Fund), Moscow (2007)].
The study cites Dyachok as a recent lexicostatistical publication and then briefly describes its own methodology: "All the languages, for which the 100-Swadesh wordlists could be collected from written sources, were included into our investigation. The 100-word Yakhontov-Starostin wordlists were employed, given that they allow better accuracy [= than the classical Swadesh-100]; they were processed according to Starostin's methodology by excluding the recognizable borrowings and employing the STARLING program [...]" As a result, the following dendrogram was obtained:
Dybo, Anna, The Chronology of the Turkic languages and the Linguistic Contacts of the Early Turks (2006)
There also exists a second version of this dendrogram that differs drastically from the first one because of an unexplained procedure that was applied to synonyms. This is slightly confusing and may lead to an underestimation of the dendrogram's significance; however, the first tree above (with the synonyms included) partly matches the outcome obtained in other investigations. Apart from such unconventional points as (1) the splitting of Turkmen and Turkish between two different taxa, (2) the positions of Yugur and Salar, (3) the slightly misplaced Kazakh (which cannot be directly related to Uzbek) and the position of Uzbek (which is known historically to be related to Uyghur), it is in fact in relatively good correspondence with other studies. However, the glottochronological part based on Starostin's formulas should be taken with a grain of salt. It should also be noted that the use of shorter 110-word lists results in lower statistical robustness than in the current series of publications, which uses larger 215-word lists. Nevertheless, this work has the advantage of representing a greater set of languages, especially those of the Altay-Sayan area, which are normally underrepresented or omitted in other studies.
ASJP (2009)
Another example of phonostatistical research that merits mentioning is the automated dendrogram built by the Automated Similarity Judgment Program for most languages of the world. Here's a preliminary and simplified first-approximation phonostatistical dendrogram of Turkic languages (gif) from 04/2009. The study was based on a simple 40-word list. Many branches seem to be mispositioned, apparently due to certain limitations of the ASJP's initial approach; however, one can see the early separation of Proto-Chuvash, then Proto-Oghuz, and then the rest of the languages, which is partly consistent with the conclusions obtained in the present work and other studies.
Herein (2009, 2012)
To prepare the lexicostatistical research for this publication, it was decided to use the readily available 200-word Swadesh lists from Wiktionary.org. The work involved verifying and correcting the available materials, building new lists for absent languages (such as Khakas, Tuvan, Altai) (2009), writing a PHP program to do all the routine calculations, performing some additional meticulous examinations, and adding new lexical material, thus expanding the lists to 215 entries (2012). The result was another lexicostatistical study named The Lexicostatistics and Glottochronology of the Turkic languages. It should be noted that the lexicostatistical figures obtained in 2009 and 2012 sometimes differed significantly from each other because of the different approaches used to account for the unavoidable synonymy. The 2009 approach had been much too basic and was consequently significantly enhanced in 2011-12, which included both reexamining the original lists and introducing changes into the program, so the present version is to be considered more correct. Most borrowings (Persian, Arabic, Mongolian, Russian, etc.) were excluded wherever possible, so only the verified cognates were counted in the final glottochronological section of the study.
In doubtful cases, cognacy was determined according to the [Etimologicheskij slovar chuvashskogo jazyka (The Etymological Dictionary of Chuvash), by M. Fedotov; volumes 1-2, Cheboksary (1996)] and sometimes using the [Etimologicheskij slovar tyurkskikh jazykov (The Etymological Dictionary of the Turkic languages), E. V. Sevortyan, Vol. 1-7, Moscow (1974-2003)]. The lexical lists presently differ from the Wiktionary.org materials and are available online as a Word document. As the final outcome of the study, several lexicostatistical matrices of Turkic languages were built.
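The kind of routine calculation behind such matrices can be sketched as follows. This is not the PHP program used in the study, only a minimal Python illustration of the general idea: for every pair of wordlists, count the meanings whose words belong to the same cognate class, optionally skipping meanings in which either word is a borrowing. The three languages, the four meanings and their cognate classes are invented toy data, and the names shared_percentage and similarity_matrix are hypothetical.

```python
from itertools import combinations

# Invented toy data: language -> meaning -> (cognate class id, is_borrowing).
WORDLISTS = {
    "Lang A": {"water": (1, False), "stone": (1, False), "book": (9, True), "hand": (1, False)},
    "Lang B": {"water": (1, False), "stone": (2, False), "book": (9, True), "hand": (1, False)},
    "Lang C": {"water": (1, False), "stone": (2, False), "book": (3, False), "hand": (2, False)},
}


def shared_percentage(list_a, list_b, exclude_borrowings=True):
    """Percentage of compared meanings whose words fall in the same cognate class."""
    compared = shared = 0
    for meaning in list_a.keys() & list_b.keys():
        class_a, borrowed_a = list_a[meaning]
        class_b, borrowed_b = list_b[meaning]
        if exclude_borrowings and (borrowed_a or borrowed_b):
            continue  # skip the meaning if either word is a known borrowing
        compared += 1
        shared += class_a == class_b
    return 100.0 * shared / compared if compared else float("nan")


def similarity_matrix(wordlists, exclude_borrowings=True):
    """Pairwise percentages for every unordered pair of languages."""
    return {
        (a, b): shared_percentage(wordlists[a], wordlists[b], exclude_borrowings)
        for a, b in combinations(sorted(wordlists), 2)
    }


for pair, value in similarity_matrix(WORDLISTS).items():
    print(pair, round(value, 1))
```

Calling similarity_matrix(WORDLISTS, exclude_borrowings=False) gives the borrowings-included variant, in which a shared loanword counts as a match; the two variants can differ noticeably even on such a small list.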
However, we can use the values in the table to build a wave model of the Turkic languages that reflects mutual intelligibility through the calculated relationships in the basic vocabulary. The wave model should be based on the borrowings-included matrix, because it is supposed to represent mutual intelligibility as it is, without any exclusions. For this reason you may notice some small discrepancies in percentages with the table above.
The wave model of the Turkic languages with borrowings included, from [The Lexicostatistics and Glottochronology of the Turkic languages (2009-2012)]
2.2 Dissimilar basic lexemes in the Turkic languages
Another brief lexical table prepared in 2009 included a visual overview of certain lexemes that are known to be dissimilar within the core Turkic languages. These lexical data help to pick up dissimilarities between otherwise closely related groups and assist in identifying large supertaxa.
2.3 The comparison of phonological and grammatical features
Mudrak (2002, 2009)
The multivolume Moscow edition SIGTY. Regionalnyiye rekonstruktsii ("The Comparative Grammar of Turkic languages. Regional Reconstructions.") (2002) included an abbreviated article by the Russian Turkologist Oleg Mudrak [aw-LEG moo-DRAHK; the name is etymologically akin to mudryj "wise, sagacious"], Ob utochnenii klassifikatsii tyurkskikh yazykov s pomosch'yu morfologicheskoy lingvostatistiki (On the clarification of the Turkic languages classification by means of morphological linguostatistics). It was subsequently republished in full as a separate book in 2009, and then briefly reviewed in a public lecture on the history of the Turkic languages (available at youtube.com and as a magazine article). The study uses a unique statistical analysis of 96 morphological and phonological features counted for as many as 42 Turkic languages and major dialects, and builds trees with glottochronological dates (though based again on Starostin's apparently incorrect glottochronological formulas), checking them for consistency with the major historical events. This purely morphostatistical analysis is an extremely interesting and apparently completely novel approach in historical linguistics. The obtained dendrograms coincided with those of the present study by about 80%, though they differed in certain aspects. The purely grammatical approach by Mudrak prompted us to take a closer look at the morphological features, which are well known to be more resistant to borrowing than common words and thus provide more robust results. Finally, a similar study of phono-morphological differences within the Turkic languages was conducted (2009). The following table contains a list of certain phonological and grammatical features known to differ across the Turkic languages, so studying them helps to establish the exact order of their taxonomic diversification.
It should be acknowledged that the earlier analysis of phono-morphological features by Mudrak (2009) seems to be more detailed, particularly as far as the number of included languages is concerned. However, even though many additional grammatical and phonological characteristics are not explicitly mentioned in the table of phonological and morphological differences, they are often described below in the paragraphs on specific Turkic languages. Much of the morphological and phonological data in the table has been collected from the encyclopedic edition [Jazyki mira: Tyurkskije jazyki (The Languages of the World: The Turkic Languages); editorial board: E. Tenishev, E. Potselujevskij, I. Kormushin, A. Kibrik, et al; The Russian Academy of Sciences (1996)], which is a detailed, comprehensive and authoritative publication consisting of articles by specific authors and brief phonetic and grammatical descriptions of each Turkic language. Other data were collected directly from grammar books on specific languages.
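As a rough illustration of how such feature lists can be compared, the Python sketch below counts the share of identical values between pairs of languages. It is a deliberately naive matching count, not Mudrak's morphostatistical procedure; the four features and their values are taken from the table of differences in this section, while the names FEATURES and shared_share are hypothetical.

```python
from itertools import combinations

# Four feature values per language, copied from the table of differences.
FEATURES = {
    "Chuvash": {"y-/J-": "s'-", "-G-/-w-": "-v-", "-d-/-y-": "-r-", "you (plural)": "esir"},
    "Sakha": {"y-/J-": "s-", "-G-/-w-": "-0:-", "-d-/-y-": "-t-", "you (plural)": "ehigi"},
    "Tuvan": {"y-/J-": "ch-", "-G-/-w-": "-0:-", "-d-/-y-": "-d-", "you (plural)": "siler"},
    "Tofalar": {"y-/J-": "ch-", "-G-/-w-": "-0:-", "-d-/-y-": "-d-", "you (plural)": "siler"},
}


def shared_share(lang_a, lang_b, features=FEATURES):
    """Fraction of the commonly listed features that have identical values."""
    common = features[lang_a].keys() & features[lang_b].keys()
    same = sum(features[lang_a][f] == features[lang_b][f] for f in common)
    return same / len(common)


for a, b in combinations(FEATURES, 2):
    print(f"{a} - {b}: {shared_share(a, b):.2f}")
```

With only four features the figures are far too coarse for any conclusion; the point is merely that identical values accumulate between closely related languages such as Tuvan and Tofalar, whereas Chuvash shares none of these four values with the others.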
Some of the phonological and morphological differences within the Turkic languages
The table may contain simplifications in transcribing vowel harmony.
Language | y-/J- | -G-/-w- | -d-/-y- | b-/p-, t-/d-, g-/k-, G/q- | Instrumental case | Other cases | Plural | Dative | "Perfect" Participle | Negation of adjectives, nouns | "We did" ending | "We do" Aorist ending | "I do" Aorist ending | Use of tur- or any other copula | Future Tense | someone, somewhere, no one, nowhere | you (plural) |
Chuvash | s'- | -v- | -r- | p-, t-, k-, x- | -pa, -pe | Goal-directed -shan, -shen | -sem | -a, -e | – | mar | -r-âmâr, -r-êmêr | -âpâr -êpêr | -âm -êm | – | -at-, -et- -0- | ta-kam; tashta; nikam ta; nishta ta | esir |
Sakha | s- | -0:- | -t- | b-, t-, k-, k- | -nan | Partial -ta; Compar. -ta:Gar | -lar, -ler, -lor, -lör, -nar, -ner, -dar, -der, -tar, etc | -ga | -bit, -bït | suox; buol-batax | -ti-bït/bit, -li-bït/bit | -bït/bit, -pït/pit | -bïn/bin, -pïn/pin | verb-an + tur + pronoun = past tense | -ïah-; -a:ya- / -eye-i = optative (apprehensive) | kim ere, xanna ere, kim da + negative, xanna da + negat. | ehigi |
Tuvan | ch- | -0:- | -d- | weak semivoiced : strong unvoiced: *q > x | – | Directive -dïva, -dive, -duva, -düve, -tïva, etc | -lar, -ler, -nar, -ner, -tar, -ter, -dar, -der | -ga/ge, -ka/ke | -gan, etc | eves; chok | -dï-vïs | -vïs, -vis -vüs, -vus | men | verb + p + tur (chïdïr, olur) + pronoun = Present | -ïr-; Gai/gei, qai/kei = optative | bir-(le) kizhi; bir-(le) cherde; kïm-da: + negativ; kaida-da: + negative | siler |
Tofalar | ch- | -0:- | -d- | weak semivoiced : strong unvoiced | – | Partial -da, -de, -ta, -te | -lar, -ler, -nar, -ner, -tar, -ter | -Ga/Ge, -qa/qe | -Gan/Gen, -qan/qen | emes | -dï-vïs | -bis | men | verb + p + turu (chïêtïrï, oluru) + pronoun = Present tense | -ar/er/ïr/ir-; Gai/gei, qai/kei = optative | -- qum-ta: + negat. -- | siler |
Khakas | ch-, n'- | -0:- | -z- | p-, t-, k-, x- | -naN, -neN | Directive -za, -zer, -sar, -ser, -nzar, -nzer | -lar, -ler, -nar, -ner, -tar, -ter | -ga/ge -xa/ke, -na/ne, -a/e | -Gan/gen, -xan/ken | nimes; chox | -dï-bïs | -bïs/bis -pïs/pis -mïs/mis | -bïn/bin -pïn/pin -mïn/min; -ïm, -am | verb + (p) + tur + pronoun = Audative or Archaic past | -ar/er/r-; Gai/gei, qai/kei = optative | kem-de, xayda-da; kem-de + negat. xayda-da + negat. | sirer |
Kumandy | ch-, n'- | -0:- | -y- | b/p-, t-, k-, k(q)- | – | Directive -za, -ze, -sa, -se | -lar, -ler, -nar, -ner, -dar, -der, -tar, -ter | -ga, -ge, -ka, -ke -a, -e, etc | -gan, -gen, -kan, -ken | eves, emes; chok, chox | -dï-bïs, -di-bis, -dï-vïs | -bïs, -bis, -pïs, -pis | -ïm, -am | verb + ïp + tur + pronoun = Audative past; verb + a/e + tur + -ar + pers ending = Present Future | -ar/er/r-; -ad, -ed Gai/gei, qai/kei = Optative | kem-de, kayda-da; --- | sner, snir |
Standard Altai | d'- | -0:- | -y- | b-, t-, k-, q- | – | – | -lar, -ler, -lor, -lör, -dar, -der, -dor, -dör, -tar, -ter, -tor, -tör | -ga, -ge, -go, -gö, etc | -gan/gên, -kan/kên | emes; d'ok | -(ï)bïs/(i)bis, ïs/is, -ïk/ik | -bïs, -bis | -bïn/bin -pïn/pin -mïn/min | verb + dïr + pers ending = audative past; verb + a/e + dïr + pers ending = Present Continuous; verb + ïp/ip + tur + d + pers ending = Past Continuous | -ar/er/r-; -at/et-; Gai/gei, qai/kei = Optative | kem-de, *kayda-da; --- | slerler |
Kyrgyz | J- | -0:- | -y- | b-,
t-, k-, q- | – | – | ———————— -lar, -ler, -lor, -lör, -dar, der, -dor, dör, -tar, -ter, -tor, -tör | —————— -ga, -ge, -go, -gö, -ka, etc | -gan- | emes | -dik, etc | -(ï)bïz | -mïn | verb
+ ïptïr = audative past;
verb + ïp + tur (otur, Jat, Jur) + pronoun = Present Continuos; | -ar; Gai/gei, qai/kei = Optative | (kimdir)
birö:, kayda-dïr (bir Jerde); ech kim; ech kaida, ech Jerde | siler, sizder siz (polite) |
Kazakh | J-, zh- | -w- | -y- | b-,
t-, k-, q- | -men, -pen | – | -lar,
-ler, -dar, der, -tar, -ter, | -Ga,
-ge, -qa, -qe | -Gan,
-Gen -qan, -qen | emes | -dïq, -dik | -mïz, -miz | -bïn/bin -pïn/pin -mïn/min | verb + ïp + tûr (otur, Jatïrt, Jür) + pronoun = Present Continuos; | -ar/er/r; -baq/bek-, -paq/pek-, -maq/mek- | êlde-bireu,
êldekim bir Jerde esh kim; esh kaida, esh Jerde | sender; siz, sizder (polite) |
Uzbek | y- | -G- | -y- | b-,
t-, k-, q- | – | – | -lar | -ga | -gan,
-qan, -mïsh- | emas | -dik;
-dimiz (dialectical variation) | -(i)miz | -man | verb + ïp + tûr (ûtir, yot, yür) + pronoun = Present Continuos; | -a-,
-y-; -ar/r; | allakim,
kimdir -- hech kim; hech qayerda; | siz |
Uyghur | y- | -G- | -y- | b-,
t-, k-, q- | – | – | -lar, -lêr | -gê, -qa, -ka,-kê,-qê | -Gan | êmês | -duk, -tuq | -(i)miz | -mên | verb + ïp + tur (oltur, yat, yür) + pronoun = Present Continuos; | -i--; -ar/r; | kimdu,
biri -- hech qaysi, hech kim; hech yerde; | silêr, siz (polite) |
Chagatai | y- | -G- | -y- | b-,
t-, k-, q- | – | – | -lAr | -Ga,
-gä, -qa, -kä | -Gan,
-Gän -mïsh- (rare) | e(r)mäs, yoq | -dïq (or similar) | -(i)bïz | -men (-Am) | noun
+ dur(ur); verb + -A + dur-pronoun; verb +Yp + -dur; | -Gu- | kishi, | siz, sizlär |
Baraba | y- | -y- | b-,
t-, k-, q- | – | – | -lar, -nar, -tar | -qa | -Gan | tügil | -dïq, etc | -bïs, | -mïn, -Am | verb + ïp + tur (otïr, yat) + pronoun = Present Continuos (rare); | -ïr; - | silär; siz (polite) | ||
Karachay | J-, ch- | -w- | -y- | b-,
t-, k-, q- | – | – | -la, -lê | -ga/-xa/ -ge, -na/ -ne, -a/e | -Gan/gen | tüyül | -diq, -duk, -dük, etc | -bïz, -biz, etc | -ma, -me | verb + a/e + tur + pronouns = Present Continuous; |
-ïr; -rïq/nïq/lïq; | kim
ese da, qaida ese da, -- | siz |
Tatar | y-, Ji-, Je- | -w- | -y- | b-,
t-, k-, q- | – | Comparat. -day, -tay, -dêy, -dïy, etc. Locat-Temp. -dagï, -tagï, -dêge | -lar, -lêr, -nar, -nêr | -ga, -gê, -ka, -kê; -na/nê, -a/ê | -gan, -kên | tügel; participle + pers. ending + yuk | -dïk, etc | -bïz, etc | -m(ïn) | noun (3rd pers) + -dYr, -tYr | -ïr; -achak; | kemder;
kaidadïr; berkaida; ber kem (dê), hichkem; (ber) kaida da hich ber Jirdê; | sez |
Cuman-Polovtsian | -y- | b-,
t-, k-, q- | – | – | -lar, -ler | -Ga, -ge, -qa, -ke; -a, -ê | -mYsh- | -bïz | -man, -men | noun (3rd pers) + -dYr, -tYr | -Gai/-gei,
-kai/-kei | siz | |||||
Turkmen | y- | -G- | -y- | b-,
d-, g-, G- | – | – | -lar, -ler | -a,
-ä,
-e; -na, -ne | -mYsh Used only as audative particle | dêl,
participle + pers. ending + -ok | -dYk | -Ys | -ïn,
-in, -un, -ün | verb
+ ïp + dur (otïr, yat) + pronoun = Present Continuos; verb + ïp + tïr + pronoun = Past Audative; verb, noun (3rd pers) + -dYr, -tYr | -ar,
-ïr; -Jak, -Jek (no endings) | siz | |
Azeri | y- | -G- | -y- | b-,
d-, g-, G- | – | – | -lar, -ler | -a, -ê | -mYsh- Used as audative particle and perfect tense | deyil | -dYg | -Yg | -êm; -am | verb, noun (3rd pers) + -dYr, -tYr | -(y)acak(G-, -(y)ecek(G-) | hech kim | siz |
Turkish | y- | -G- | -y- | b-,
d-, g-, G- | – | – | -lar, -ler | -(y)a, -(y)e | -mYsh- Used as audative particle and perfect tense | deil, de(G)il | -dYk | -Yz | -ïm,
-im, -um, -üm | verb, noun (3rd pers) + -dYr, -tYr | -ar,
-ïr; -acak(G-), -ecek(G-) | kimse,
bir shey; hich kimse, hich bir shey | siz |
Khalaj | y- | -G- | -d- | b-,
t-, k-, q- | -la | Locative -cha | -lar | -ka, -qa, -yä | -mYsh- | daG | -dimiz, -dYk < Azeri | -(ï)mïz, -uq < Azeri | -Vm |
är
(conjugated copula) | -(ï)Ga | siz | |
Karakhanid | y- | -G- | -ð- | b-,
t-, k-, q- | -ïn,
-in, -un, -ün, -nïn,-nin | – | -lar, -lär | -qa,
-kê, -Ga, -gê, -a, -ê, -Garu, -gerü | -mïsh-,
-mish; -Gan-, -gen-, -qan, -ken- | ärmês; yok | -dimiz, -duk | -biz, -miz | ol (3rd pers. copula) | -Gay, -gey, -qay, -kêy | siz | ||
Khorezmian | y- | b-,
t-, k-, q- | -n, -ïn, -in, -un, -ün, -an, -än | -lar |
-qa, -kä, -a, -ä | -mïsh-, -mish- | ärmäz,
ärmäs; däGül, dügül (rare); yok | -duq, -dïq | -biz | -män | er-; -b turur = perfect past; -a turur = repetetive present | -Gay, -gäy, -qay, -käy, -Ga, -gä, -qa, -kä | (siz) | ||||
Old Uyghur (Kojo) | y- | - | -ð-, -d-, -z-, | b-,
t-, k-, q- | -ïn,
-in, -un, -ün, -nïn,-nin | Equative -cha | -lar, -lär | -qa,
-kä, -Ga, -gê, -Na, Nä; -Garu, -gärü | -mïsh-, -mish- | täGül; ärmäz | -tïmïz, -dimiz | -biz,
-miz,
-bïz -mïz | -män | ärür (copula) | -Gay,
-gäy -Galïr; -tachï, -dachï | siz | |
Orkhon Old Turkic | y-? | -G-, -G | -ð- | b-,
t-, k-, q- | -ïn, -in | Equative -cha | -lar,
-lär | -qa,
-gä, -ya, -yä; -Garu, -gärü | -mïsh-,
-mish; -Gan- | –; jok | -timiz, -dïmïz | -biz | -män | er- | -tachï, -dachï | siz | |
Salar | y- | -G- | -t-, -y- | weak
semivoiced : strong unvoiced | – | – | -lar, -lär, -ner | -Ga,
-ge, -qa, -ke, -a, -e | -Gan,
-gen; -mïsh- | emes,
emes-tïr, emes-ar, yox-tïr | – | – | – | noun
+ dïr (idïr-, oN; irar); adj + dïr
(idïr + oN; irar); verb + p + o(r) + (tur) = Present I; verb + qu(r) + ( tur) = Future I; verb + q/Gan + dïr = Past II; | -ar/er/ïr/ir; -qur/Gur |
k'em-ter -- niNgi -- | seler |
Yugur |
y- tsh-, | -G- | -d- | weak
semivoiced : strong unvoiced | – | Compar. -daG, -deg, -taG, -teg | -lar,
-ler, -nar, -ner, -dar, -der, -tar, -ter | -Ga,
-ge, -qa, -qe | -Gan | emes-tro; yoqer, yok-tro, yoq-pe-tro | – | – | – |
i:re = copula; verb + Gan + tïr = Present Tense; verb + qïsh + tro = Future; verb + Gan + tro = Past II; verb + ïp/ip + tro = Past III; | -ar; -qïsh-tro, -Gïsh-tro -qïsh-ere; -Gu, -gu, -Go, -go; -Gï, -ge, -kï, -ke -qïr/Gïr | qïm-er,
nier -- qïm-ma, nima | siller seler |
3. Making Taxonomic Conclusions

With all the lexical and grammatical material collected in the previous chapter, we can finally turn to the analysis of each Turkic branch. We will then be able to attempt taxonomic conclusions concerning the position of each language in the phylogenetic dendrogram.

Note: Taxon is a general concept of classification science borrowed from biology which encompasses other subdivisions, such as group, family, macrofamily, etc. However, for all practical purposes, we do not usually distinguish between (sub)group and (sub)taxon in this article. The expression "the (Name) taxon" is taken to be equivalent to "the (Name) languages". The term "family" cannot be used except for language taxa of high order with a temporal separation of more than 5000 years, e.g. "the Indo-European family", but hardly "the Turkic family", except perhaps in contexts where it is necessary to underline the early separation of Proto-Bulgaro-Turkic from Proto-Altaic.

The Bulgaric subgroup

Chuvash, the only modern-day representative of Volga Bulgaric within the Bulgaric taxon, was definitively shown to be related to Turkic by Nicholas Poppe [Chuvashskij jazyk i jego otnoshenije k mongolskomu i tyurkskim jazykam (Chuvash and its relatedness to Mongolian and the Turkic languages), Nicholas Poppe (1924)]. Poppe established regular phonological correspondences between Chuvash and other Turkic languages. In his work, he listed several influential Turkologists (Adelung (1820), Rask (1834), Ramstedt (1922-23)) who had understood and accepted the Turkic origins of Chuvash long before his publication. Moreover, according to Alexander Samoylovich, Poppe had shown that "the Chuvash and Bulgaric languages do not stem from "Proto-Turkish" (z-group), but rather from the common progenitor of both of these groups", thus setting Chuvash aside from the rest of the Turkic languages.
[Alexander Samoylovich, K voprosu o klassifikatsiji turetskikh jazykov (Towards the question of the classification of Turkish languages)]

This positioning of Chuvash within the Turkic tree has changed little ever since. For this reason, Chuvash is not considered herein in much detail, mostly because its evidently early separation does not cause much controversy among scholars.

Some of the exclusive Bulgaric features

Bulgaric phonology

(1) The famous Bulgaric rhotacism vs. the Turkic Proper zetacism, or the persistent use of /-r/ where other Turkic languages normally have /-z/ (though in some cases -r- can also be found in certain positions in Turkic Proper, for instance apparently in the Aorist Tense). An intermediate pronunciation between /r/ and /z/ is found in Czech.

(2) Chuvash /-l/ vs. Turkic Proper /-sh/. We have noted several times that the corresponding Proto-Bulgaro-Turkic l/s- liquid seems to survive in modern Khalkha Mongolian, cf. the pronunciation of ula:n "red" as /ush'a:n, uLa:n/, where /L/ denotes this unique liquid affricate.

Practically speaking, the huge phonological difference between Chuvash and any other Turkic language can be easily observed by comparing almost any Chuvash word, such as the numbers 1-10, to its Turkic Proper equivalent.

Bulgaric grammar

(1) The peculiar plural marker -sem in Chuvash (of seemingly unknown origin), absent not only in Turkic but apparently in other Altaic languages as well. It has been conjectured by a Soviet scholar in a separate article that the Chuvash -sem, which rather regularly goes back to *-sen, may only be similar to Kamassian (South Samoyedic) -saN. [Kamassian, located in the East Sayan Mountains, could have been in contact with the early Turkic languages; however, there is no clear explanation for this phenomenon.]
(2) A peculiar goal-directed case expressed by -shan, -shen;

(3) Many contracted grammatical forms and a rather simplified grammar in Chuvash (generally typical of contact or "creolized" languages).

Bulgaric lexis

The lexical difference between Chuvash and any other Turkic language amounts to an average of 54.5% (Swadesh-215, borrowings excluded). That is roughly equivalent to, or a little lower than, the lexicostatistical difference between English and any other Germanic language. A similar conclusion was reached by Talat Tekin in [Türk Dilleri Ailesi (The Turkic Language Family) // Genel Dilbilim Dergisi, Vol. 2, pp. 7-8, Ankara (1979)], who compared the actual difference between Chuvash and Turkish to the difference between English and German; the latter two, of course, although they formally belong to the same Germanic group and share a number of common basic words, are far from being closely related or mutually intelligible.

There is a considerable number of Kazan Tatar lexemes in the Chuvash basic vocabulary. These lexemes are normally recognizable by their typical non-Bulgaric phonological shape similar to Kazan Tatar and/or by the existence of a parallel native word, e.g. yapâx "bad", yeshêl "green (about grass)", tinês "sea", chechek "flower", vârlâx "seed", kashkâr "wolf", kuyan "hare", utrav "island", yêbe "wet" (cf. Tatar jeben-, Bashkir yeben- "to get wet"), têrês "right, correct", etc. Such common words as kus' "eye" and pus' "head" may in fact also be Tatar borrowings, given that they lack the r-ending that is expected in the Proto-Volga-Bulgaric reconstructions *xêl and *pul.

The abbreviated grammar and the considerable number of Kazan Tatar loanwords should be taken into consideration when drawing conclusions about the origins of Chuvash. Could early Chuvash have been strongly influenced by the Golden Horde language? However, the number of borrowings in Chuvash is hardly much greater than in many other Turkic languages.
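To make the 54.5% figure in the lexis paragraphs above more concrete, here is a minimal sketch of how a lexicostatistical difference with borrowings excluded might be computed. The scoring scheme (the share of compared Swadesh-list slots whose words are not cognate, skipping slots marked as borrowings), the function name, and the toy data are all assumptions made for illustration; they are not taken from the author's actual wordlists or procedure.

```python
from typing import Iterable, Tuple

# One Swadesh-list slot: (is_cognate, is_borrowing).
# This layout is hypothetical and serves only to illustrate the idea.
Judgment = Tuple[bool, bool]


def lexical_difference(judgments: Iterable[Judgment]) -> float:
    """Return the percentage of non-cognate slots, skipping borrowings."""
    compared = 0
    different = 0
    for is_cognate, is_borrowing in judgments:
        if is_borrowing:
            continue  # borrowings are excluded from the comparison
        compared += 1
        if not is_cognate:
            different += 1
    if compared == 0:
        raise ValueError("no comparable slots left after excluding borrowings")
    return 100.0 * different / compared


# Toy data: 10 slots, 2 of them borrowings; 4 of the remaining 8 are cognate.
toy = [(True, False)] * 4 + [(False, False)] * 4 + [(False, True)] * 2
print(lexical_difference(toy))  # 50.0
```

Excluding borrowings before counting matters here because loanwords (such as the Kazan Tatar items listed above) would otherwise be scored as if they were inherited matches or mismatches.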
Bulgaric glottochronology

Glottochronologically, the separation of a language with 55% lexicostatistical differentiation should roughly correspond to a date between 900 and 1100 BC. Note that this number has been calculated according to the local temporal calibration, which is neither the standard textbook figure nor Starostin's method; see again The Glottochronology of the Turkic languages. However, there is some uncertainty concerning this value because of the logarithmic and statistical nature of the glottochronological principles, which makes them prone to errors, particularly in the case of standalone languages. Indeed, the lack of any present-day siblings of Chuvash that would allow statistical averaging to cancel out fluctuations raises doubts about the robustness of this figure. As a result, a relatively small error, which may be due, for instance, to the infiltration of Tatar borrowings, may result in an even greater discrepancy when extrapolated beyond the calibration interval, logarithmically modified and projected onto the temporal axis.

At any rate, despite these doubts, the figure of about 54-55% is relatively stable, and nearly all the previous estimates performed between 2009 and 2012 (with borrowings excluded or included, with different ways of treating synonymy, etc.) have pointed to an early separation of Chuvash, at least as early as 500 BC, but with 1000-1100 BC being a more likely period.

Archaeologically, this era of 800-300 BC coincides with the onset of the early Iron Age in West Siberia, so we may further attempt to support this date by making tentative assumptions about the active use of iron weapons and horse harness during that period, which might somehow have contributed to the Proto-Bulgaric and Proto-Turkic separation.
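For contrast with the local calibration mentioned above, the following sketch shows the standard textbook (Swadesh-style) glottochronological formula, t = ln(c) / (2 ln r), where c is the shared fraction of the wordlist and r is the retention rate per millennium. This is explicitly not the calibration used in the present study: the function name is hypothetical, and the retention rate of 0.81 is only an assumed value of the kind often quoted for 200-item lists. The result is therefore not expected to match the dates given above and, under these assumptions, comes out considerably shallower.

```python
import math


def textbook_separation_depth(shared_fraction: float, retention: float = 0.81) -> float:
    """Time since separation in millennia, using t = ln(c) / (2 * ln(r)).

    shared_fraction: fraction of the list still shared by the two languages.
    retention: assumed retention rate per millennium (illustrative default).
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention < 1.0:
        raise ValueError("retention must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention))


# A 54.5% lexical difference corresponds to 45.5% shared vocabulary.
depth = textbook_separation_depth(1.0 - 0.545)
print(f"{depth:.2f} millennia before present")
```

Because the time depth depends on the logarithm of the shared fraction, a small change in the percentage (for instance, from undetected borrowings) shifts the date more strongly at low similarity values, which is the sensitivity the paragraph above warns about.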
As has been mentioned several times, the relatively late dates for the Chuvash separation in other parallel works [Dyachok (2001), Dybo (2006), Mudrak (2009)] are most likely rooted in the application of Starostin's non-logarithmic formulas.

Bulgaric history and geography

Geographically, the rather unique European position of Chuvash west of the Urals, a long way from the supposed Turkic homeland near the Altai Mountains (let alone Mongolia, as assumed in certain alternative Urheimat theories), is evident at first glance, which again indirectly corroborates the hypothesis of its early separation, given that longer distances presumably correlate with longer migration times.

By the 13th century, Volga Bulgaria must have extended approximately within a 200-km (120-mile) radius of the confluence of the Volga and Kama Rivers. It was probably almost entirely destroyed during the Mongol invasion, forcing the Volga Bulgarians to take refuge in the forested areas of the Volga's right (western) bank, situated within the same 120-mile circle. There, near the forests of Chuvashia, the impact of Mongolian and Tatar raids must have been less pronounced. These refugium-type Chuvash settlements in a small area along the Sura (a tributary of the Volga) are very similar to those of the Mari in the forests and hills of the Volga's left and right banks in the nearby area north of Chuvashia. Unsurprisingly, both ethnicities seem to share certain common ethnological and lexical features (usually seen as Proto-Mari borrowings from Volga Bulgarian). Consequently, the Chuvash people seem to be those Volga Bulgarians who survived the 13th-century invasion and any later military and cultural interventions by confining themselves to the woodland of Chuvashia and ceding their former territory to the ancestors of the Kazan Tatars.
The latter were clearly first attested in the proximity of the Volga-Kama confluence by Ibn-Fadlan as "al-Bashkird" as early as 922, so their settlement ran almost parallel to that of the Volga Bulgarians. The role of the Kazan Tatar people in the migrational seclusion of the Chuvash is obscure. The Kazan Tatars did not necessarily occupy the Volga Bulgarian region by force as part of the Mongolian army in the 1230-40s; rather, their settlement in the area of present-day Tatarstan, though inevitably catalyzed by the disastrous Mongolian invasion, could have resulted from a long and slow migration and linguistic assimilation of Volga Bulgaria extending over many centuries.

It should also be noted that the Chuvash people were first attested in historical sources only in 1508, and then in 1551, during the rule of Ivan the Terrible and the siege of Kazan by his army. The association of the Chuvash with the Volga Bulgarians has mostly been the outcome of the historical and linguistic analysis of 19th-century Turkologists (Kunik, Radlov, Ashmarin, etc.) [see the Brockhaus and Efron Encyclopedic Dictionary (1906)]; however, this conjecture is now considered to be well demonstrated.

Note: The ethnonym Chuvash is evidently a Tataricized pronunciation of S'uval, since the sounds in the former variant may not even have existed in Proto-Bulgaric. The city named Suva:r is attested near the Etil River (= the Volga), for instance, on the map by Mahmud al-Kashgari (1072-74). He also noted, "As for the language of Bulgar, Suvar and Bajanak [= Pecheneg], approaching Rum [= that is, from north to south], it is Turkic of a peculiar type with clipped ends" [= apparently meaning the rather simplified Bulgaric morphology].
Conclusion: The discrepancy between Chuvash and the other Turkic languages is so pronounced, and its geographical position so detached from the area of maximum diversification of the other Turkic languages, that it would be appropriate to separate Chuvash as part of a special Bulgaric taxon within the larger Bulgaro-Turkic supertaxon or family. For most practical purposes, we may assume the date of about 800-1100 BC to be a plausible period for the separation of Proto-Bulgaric from the rest of the Turkic languages.

An important terminological innovation suggested in the present study is the use of the term Bulgaro-Turkic instead of just Turkic for the two major groupings. This terminological modification seems reasonable and arises from the practical need to avoid the continual use of periphrastic expressions like "Turkic Proper", "the Turkic languages outside Chuvash", "the Proto-Turkic homeland excluding Proto-Bulgaric", etc.

The Yakutic subgroup

Where does Sakha actually belong?

It has been widely accepted since the research of the 19th century that Sakha, the language of the Yakuts, is almost as distant from other Turkic languages as Chuvash. Nevertheless, the matter is not that simple. It has also occurred to several researchers that the Yakuts may actually be directly related to other Turkic ethnic groups of Siberia, such as Tuvan, Khakas or Altay. So instead of positioning Sakha and Dolgan in a stand-alone subgroup, the alternative hypothesis suggests the existence of a "Siberian" taxon which would include most of the Turkic languages east of the Irtysh River line.

Trying to prove the existence of this "Siberian" taxon turns into a complicated Turkological problem. At first glance, Sakha differs drastically not only from any other Turkic language, but also from its closest potential Siberian neighbors. But in other respects, it seems to share with them certain linguistic features that are hard to tell apart from common archaisms.
Below we will study some of these shared "Siberian" features in detail.

Yakutic phonology

In phonology, the Yakutic subgroup is characterized by the following local innovations not shared by any other branches:

(1) the loss of the Proto-Turkic, perhaps aspirated, *sH as in Old Turkic sekiz "eight" > Sakha aGïs; Old Turkic sen > Sakha en "you"; Old Turkic suNok [N=ng] > Sakha uNuok "bone";
(2) the stabilization of the strongly palatalized Proto-Turkic *S into an "ordinary" s-, cf. Chuvash s'altar but Sakha sulus "star";
(3a) the transition of the intervocalic -s-, -z- into -h- as in Old Turkic qïzïl > Sakha kïhïl "red";
(3b) the transition of -ch- into -X- as in bïXax "knife", as opposed to bïchaq in many other Turkic languages [Baskakov, 1969]. This aspiration is even more pronounced in Dolgan, the northernmost offshoot of Sakha, where s- is converted into h- even at the beginning of the word;
(4) the late development of several diphthongs, as in uon < *on "ten". It is "late" since vocalism is normally much less historically stable than consonantism, and the change should thus belong to a relatively recent period;
(5) various assimilations and dissimilations, which mark the existence of a Proto-Yakutic substrate with strong lenition, which made many original sounds unpronounceable and created the hot-potato effect, as in the borrowing pahï:ba from the Russian /spasiba/ "thanks".

Among notable archaisms, the following features can be listed:

(1) the full retention of the archaic intervocalic -t- as in atax "foot", xatïN "birch", probably with some fortition, which is similar only to Tuvan -d/t- (where this phoneme is semivoiced), but which is quite unlike the more lenited Khakas -z-;
(2) the probable retention of the so-called "primary" long vowels, as in sa:s "springtime", xa:r "snow", ti:s "tooth", which, in other branches, are mostly found in Turkmen and Khalaj, and are often believed to be possible remnants from the Proto-Turkic period.
Yakutic grammar

In grammar, Sakha exhibits more differences from than similarities to most other Turkic languages, with the exception of Tuvan, Khakas and Altay, where certain local Siberian similarities have been found.

The following grammatical features in Sakha seem to be unique:

(1) Sakha does not seem to use a negative form similar to e(r)mes or deGil, which is common in other Turkic languages; rather, suox (after verbs in the future tense and after adjectives) and buol-batax (after nouns) are used instead. The latter seems to be unique among Turkic languages. Cf. men uchuta:l buol-batax-pïn "I teacher being-not-am." Note: The Bulgaro-Turkic *bol- > Sakha buol- is an obvious Nostratic parallel to the English "be", which is present in all of the Bulgaro-Turkic languages.
(2) The loss of the genitive marker;
(3) The usage of kini "he, she" and kini-ler "they" (along with the common Turkic ol "that (one)"). The former finds parallels probably only in the Bulgaric ku "this, that" and Yugur ku "he, she". There exists a hypothesis of its relatedness to Turkish kendi, Karakhanid kendü "self" (probably going back at least to Ubryatova (1960-80's), a researcher of Dolgan and Sakha (?)), which, though apparently plausible, runs into certain semantic difficulties;
(4) The plural pronoun ehigi "you" with its unique phonological shape, so different both from the conventional siz and seler;
(5) The unusual comparative case with -ta:Gar, -da:Gar, -la:Gar, -na:Gar. A similar ending for the comparative case is also known in Kimak and Yugur.
On the other hand, the following grammatical features in nouns and pronouns seem to be shared with the Altay-Sayan subgroup:

(1) The typical and persistent usage of expressions like kim-da, kaida-da + a positive verbal construction denoting indefinite pronouns, as in "something does", "somewhere is", and kim-da, kaida-da + a negative verbal construction denoting negative pronouns, as in "no one did", "nowhere is", etc. Cf. Sakha kim-da, hanna-da; Tuvan kïm-da, kaida-da; Tofa qum-ta; Khakas kem-de, xayda-da; Kumandy kem-de, kaida-da; Standard Altay kem-de, *kaida-da. However, this syntactic model is by no means unique to "Siberian", since similar models also exist in Karachay kim ese da "someone", qaida ese da "somewhere", Tatar ber-kem (de), (ber) kaida da, and probably elsewhere. In other western Turkic languages, these constructions have mostly been displaced by phrases of Persian origin; therefore this feature is most likely a Proto-Turkic archaism, not a Siberian innovation;
(2) The peculiar instrumental case ending in -nan, shared at least with the Khakas instrumental case ending in -naN, -neN. Nevertheless, this feature is evidently a retention, given that Karakhanid, Old Uyghur, Orkhon Old Turkic and Khorezmian all had a very similar instrumental case with the (n)ïn, (n)un, (n)an, (n)ün marker.

Furthermore, we will provide a brief summary of the Sakha verbal morphology:
Alternatively, among the features shared with Orkhon-Oghuz-Karakhanid, and even going back to Proto-Turkic, the following could be mentioned:

(1) The use of -myt- / -byt- tenses, which are akin to the Old Turkic and Oghuz -mïsh- tenses. These are used only in Oghuz, Salar, Old Turkic, Karakhanid, Khalaj, Cuman-Polovtsian and Uzbek, but not in any Altay-Sayan or most Great-Steppe languages. Based on the phonetic similarity of this suffix to Sakha buol-, which comes from Proto-Turkic *bol "to be" (and the lack of any other specific Yakutic-[Oghuz-Orkhon-Karakhanid] innovations), we can infer that this suffix is most likely an archaism going back to the Proto-Turkic state. Semantically, the -bït- and -Gan- suffixes are in complementary distribution across the Turkic languages, which basically means that if one is present, the other is absent or has a different meaning; so apparently -Gan- replaced -bït- in Altay-Sayan and most Great-Steppe languages because of the semantic similarity of the two tenses.
(2) The use of -dax- / -tax- / -daG- / -tax- tenses, which are apparently akin to the Old Turkic and Oghuz-Seljuk -dïG- / -tïG- masdar suffixes.
(3) The usage of -er- instead of e-, i- as an auxiliary verb "is; to be", cf. Sakha oGo utuyan erer "the child is falling asleep" (also similar at least to Khalaj, Old Uyghur and Yugur-Salar), albeit also Sakha barar etim "I used to go", where the root of the auxiliary verb e-tim is similar to Modern Turkish-Azeri i-dim and other Turkic languages.

Most of these features can easily be assumed to be Proto-Turkic archaisms that survived independently in Yakutic and Orkhon-Oghuz-Karakhanid, because presently nothing suggests that they could be a recent innovative development.
On the other hand, there also exist a few unstable Siberian-specific tenses, which can be regarded as suspected Siberian innovations, namely:

(1) The tense with -dïr- + personal ending, as in *bar-dïr-men "maybe I go, if I go", which is actually very typical of the Altay-Sayan languages. However, similar forms have also been found in Turkmen dialects, and are said to be "understandable" to Standard Turkmen speakers, which may be indicative of their existence in Proto-Oghuz.
(2) The tense with the -a ilik- construction exists in Altay-Sayan and Kyrgyz (where it is likely to be a borrowing from Altay). However, it seems to have become extinct in most Altay-Sayan languages, so presently it seems to be just a shadow of what it might originally have been, and there are doubts concerning its usage. See [Shirokobokova, N.N. Otnoshenije jakutskogo jazyka k tyurkskim jazykam Yuzhnoj Sibiri (The relatedness of the Yakut language to the Turkic languages of South Siberia), Novosibirsk (2005)].
(3) The use of the -Gay participle to show the optative mood, as in bar-a:ya-mïn in Sakha and *bar-Gay-mïn "I'd better go" in Altay-Sayan, whereas in Orkhon-Karakhanid this tense normally expressed the direct future. Nevertheless, such a purely semantic feature is too unstable and could be a naturally occurring independent mutation in meaning both in Proto-Yakutic and Proto-Altay-Sayan.

Most other verbal constructions in Yakutic cannot be found in other Turkic languages, making Sakha verbal morphology rather unique.

Borrowings and odd words in the Sakha vocabulary

Sakha contains many words which make one wonder where they could possibly have come from. In fact, Sakha was described as a mixed tongue at least as early as Radlov (1908), who calculated that out of 1750 words in a glossary, about 33% were Turkic, 26% were Mongolic, and the rest were of unknown origin.
Presently, we believe that all these borrowings come from at least four main sources:

(1) Middle Mongolian or the Middle Buryat dialect (pronunciation: /boo-RAHT/);
(2) Evenk (Tungusic);
(3) Russian; as in most "Siberian" languages, the number of Russian loanwords in the abstract and cultural vocabulary is exceedingly high;
(4) an unknown early substrate, most likely of Yeniseian type.

(1) Among potential Mongolic borrowings in the basic vocabulary, one could easily name the following words:

(1) Khakas sïray, Altay chïray, Tuvan shïray, Sakha sirey "face" probably from Mongolic, cf. Middle Mongolian chiray, Buryat sharay. Also, meaning "beauty" in Kyrgyz and Kazakh;
(2) Altay mechirtke, Tuvan merzhergen, Sakha mekchirge "owl" from Mongolic *begchergen, Buryat begserge "barred owl";
(3) Sakha kharba: "to swim", cf. perhaps Khalkha Mongolian khayiba, khaiva of the same meaning;
(4) Sakha moGoy "snake", cf. Middle Mongolian moqai, Khalkha mogoi;
(5) Sakha ergilin "to turn", cf. Khalkha ergeG "turn around";
(6) Sakha suruy "to write", suruk "letter, mail", cf. Written Mongolian zhiru-, Buryat zura- "to draw".

The Mongolic origin of some other words is uncertain, though presumable:

(1) Sakha khallan "sky", cf. Middle Mongolian e'ülen "cloud(s)";
(2) Tuvan iye, Sakha iye "mother", cf. Khalkha Mongolian ex "mother", Evenk eni:n;
(3) Sakha mas "tree", cf. Khalkha mod, Middle Mongolian mod-un, Daur mo:d, etc., as well as Evenk mo:, Nanai mo:, Written Manchu mo:;
(4) Sakha bey-em, Tuvan bod-um, Khakas poz-ïm, Altay boy-ïm "self", which is probably akin to the Mongolian bod and biye "body", though this is not necessarily a loanword and could be a retained Altaism.

(2) Some borrowings from Evenk were also found, although in some cases the borrowing could have gone the other way, that is, into Evenk, cf.:

Sakha öydö: "understand", cf. Evenk uyde-mi:;
Sakha oNocho "boat", cf. Evenk oNkocho "wood-board boat", umurechun "birch-bark boat";
Sakha d'i:e "house", cf.
Evenk d'u:;
Sakha tïl "word", cf. Evenk tïl "meaning";
Sakha tarbax "finger", cf. Evenk dial. sarbas;
Sakha taba "correct", cf. Evenk d'abul;
Sakha bulta: "hunting", cf. Evenk bulta;
Sakha seri: "war", cf. Evenk kusi:n, buleme:chik, cherig, serI: (probably, from Sakha into Evenk);
Sakha örüs "river", cf. Evenk birag, ene, olus (dialectal), orus (dialectal) (apparently, from Sakha into Evenk).

We might conclude that Evenk played a notable role in the formation of Sakha. This is not so surprising considering that Sakha probably acted as a cultural superstratum to Evenk, whereas Evenk, being scattered over the enormous territory of East Siberia, was apparently slowly losing ground to Sakha in the course of the 15th to 20th centuries.

(3) Russian words are often hard to recognize because they are modified in accordance with Sakha phonology, cf. the following examples from Swadesh-215: Sakha chierbe, Russian cherv' "worm"; Sakha sieme, Russian semya "seed"; Sakha ba:lkï, Russian palka "a stick"; Sakha bï:l, Russian pïl' "dust"; Sakha muora, Russian mor'e "sea".

This phonological discrepancy implies that other borrowings and archaisms may also have become phonetically unrecognizable. For instance, the following Sakha words of Turkic origin are rather hard to spot at first glance: Sakha tïmnï "cold", akin to Karakhanid tum, tumlïG "cold"; Sakha xaya "mountain", akin to kaya "rock" in most other Turkic languages; Sakha ürüN "white", akin to Orkhon, Old Uyghur, Karakhanid ürüN, Khalaj hirin "white" (apparently a rare archaism); Sakha buruo "smoke", akin to Old Turkic bur- "to boil, evaporate".

(4) The presumable Yeniseian borrowings are particularly interesting.

Sakha kö "to fly", cf. Ket kï of the same meaning;
Sakha kötör "bird", cf. Ket keNassel;
Sakha kini "he, she, it", cf.
Ket ki, kide [Note that kini is normally (probably following Ubryatova (1960-80's)) explained as being akin to the Karakhanid-Oghuz-Seljuk kendi "self"; herein, however, we consider a different possibility.]; Sakha kuttan "to fear", cf. Ket koran, qoren', qoranai; Sakha söp, söptö:x "right, correct", cf. Ket sotdas'; Sakha sü:r "to flow", cf. Ket sennei. It should be noted that Proto-Sakha could not have borrowed directly from Ket, the only living and well-attested representative of the Yeniseian family, but rather from an unknown extinct Yeniseian language. In any case, these presumable cognates are uncertain and are provided herein only as a matter of tentative conjecture. The presence of an unknown substratum in Sakha, probably of Yeniseian origin, implies that Proto-Sakha at some point inhabited the Yenisei basin, which is quite reasonable. There seem to be no noticeable borrowings from Yukaghir among the unidentified words.

The few lexical similarities between Sakha and Altay-Sayan
With only 57% to Tuvan, 61% to Khakas, and 56% to Altay in Swadesh-215 (borrowings excluded), Sakha is clearly a deep-going branch. It is strikingly different from any other Turkic language. This is because Sakha has many lexical innovations whose etymology is often hard to explain and which may in fact turn out to be borrowings from an unknown substrate. However, there seem to exist a number of words common only to the "Siberian" languages (= Sakha, Khakas, Tuvan, Altay). Consequently, we should study these suspected examples, attempting to distinguish between archaisms and innovations. (1) Khakas ïzïr-, Tuvan ïzïr-, Sakha ïtïr- "bite"; however, ïsïr- is also found in Turkish, Tatar, Karakhanid and possibly elsewhere, therefore it is an archaism; (2) Khakas chïz-, Tuvan chod-, Sakha sot- "to wipe"; however, it is akin to Chuvash sâtâr-, therefore it is an archaism; (3) Khakas köni, Tuvan xönü, Sakha könö "straight (as a road)", also cf. Turkmen göni.
The lexeme is found in many TL's, but this particular meaning only in Siberian Turkic, Altay dialects and Turkmen [see Sevortyan's dictionary, the V-G-D letters (1980)]. In any case, apparently, an archaism; (4) Khakas xarax, Tuvan karak, Sakha xarax "eye". However, *qaraq is also found in Kyrgyz, Old Uyghur and Karakhanid, which makes it a notable but hardly unique Siberian isolexeme. In the meaning "pupil", it is also found in Turkmen and Kyrgyz; the original etymology of this word is evidently "the black part of the eyeball, the pupil". Therefore, apparently, an archaism; (5) Altay sogon, Tofa, Tuvan, Chulym sogun, Khakas sogan, Sakha onoGos "arrow" is usually explained as a cultural borrowing from Samoyedic [Dybo (2007)].

Note: an isolexeme or isophonolexeme (introduced herein) is an endemic lexeme, that is, a set of phonological forms and meanings used only within a particular set of languages or dialects in a particular, sometimes rather isolated, territory. For instance, the English lexeme "bad" with its phonological variants /ba:d/, /bæ:d/, etc. and the various typical meanings "not good", "unhealthy", "angry", etc. was originally confined to the dialects of the British Isles and is rather unknown in other Germanic languages. Even if a similar cognate were found in other languages, it would probably have a different meaning or phonological shape. On the contrary, the word "good" is found in many Germanic languages and is hardly a local isolexeme.

On the other hand, the following isolexemes seem to be innovative formations not found outside the supposed "Siberian" subtaxon: (1) Sakha sïrït, Khakas churt-, Altay d'ür- (jurtaar), Tuvan churtt- "to live"; obviously, from *jurt "home", "place of pasture", probably innovative, or at least an independent simultaneous semantic formation; note that Sakha included an additional (prothetic?)
vowel into the root; (2) Sakha sïtïy-bït, Khakas chïzïG, Tuvan chïdïg "rotten" as opposed to *chiriq in most other TL's, including Chuvash; apparently, from *J'it- "to get lost, die, fade"; (3) Sakha erge, Khakas irgi, Tuvan ergi "old" as opposed to *eski in most other TL's; (4) Sakha tü:, Altay tük, Tuvan tük "wool" instead of the usual *Jün. The original meaning of this word was probably "fluff, fur". Could be coincidental as an independent development; (5) Sakha bes, Altay mösh, Tuvan pösh, Tofa bösh "pine" [Rassadin (1981)].

Another typical "Siberian" feature is preserved in the numerals. The "Siberian" 40, 50, 60, 70 are all formed regularly as *trt-on, *pesh-on, *alt-on, *s'edi-on, whereas in all other Bulgaro-Turkic languages, including Chuvash, they retain an irregular structure *qrq, *elliG (evidently from *elig "hand"), *alt-msh / *ult-ml, *j'eti-msh / *s'eti-ml. The regular forms may have arisen in Proto-Sakha due to its stronger isolation from the rest of the Proto-Turkic tribes and may then have been borrowed into Altay-Khakas through trade between Proto-Sakha and Proto-Altay-Khakas; at least, this seems to be the most plausible explanation.

In any case, you can see that the number of the purported shared phono-semantic and lexical "Siberian" innovations seems to be exceedingly small: we have found only 4-5 words which are difficult to discard outright. It is highly questionable whether this amount could be sufficient to demonstrate the hypothetical Sakha-Altay-Sayan ("Siberian Turkic") common descent.

On the other hand, there exist certain words or semantic formations shared not just by Altay-Sayan but also by the languages of the Great Steppe, that is, by all languages other than Orkhon-Oghuz-Karakhanid and Chuvash, e.g.
(1) *but "leg" as opposed to Oghuz-Seljuk *but "thigh"; probably an archaism judging by its presence in other Altaic languages; (2) tün "night" as opposed to Oghuz-Seljuk *dün "yesterday", but also Chuvash s'er "night", ener "yesterday"; probably an archaism judging from its presence in Chuvash; (3) Sakha aha:, Khakas azraan, Tatar asharga, Bashkir ashau, Karachay asharGa "to eat", whereas in most other TL's the word ash is used only to mean "food" (noun); probably a natural semantic development; (4) Sakha xatïr-ïq, Khakas xastïr-ïx, Yugur qazdïq, Tatar qayrï, Bashkir qayïr "(tree) bark", also Tuvan qazïr-ïq "scales, a layer of dirt". Chuvash xuyâr "bark" seems to be a borrowing from Tatar. Apparently, an archaism.

These findings could make one wonder whether Yakutic, Altay-Sayan and Great-Steppe may have once constituted a single unity, as opposed to Orkhon-Oghuz-Karakhanid. However, most of these words seem to be archaisms or independent coincidental semantic formations.

Unexpected similarities between Sakha and Tofa
The similarities with Tofa were first described by Rassadin in Morfologiya tofalarskogo yazyka v sravnitelnom osveschenii (The comparative morphology of the Tofa language) (1978). Sakha and Tofa share at least the following features: (1) a unique partitive case in -ta/-da; (2) the -ïn ending in the accusative case; (3) the adjective ending in -sïN /gï, cf. Sakha -sïN / ï; (4) a similar system of onomatopoetic verbs. However, Tofa is undoubtedly much more similar to the Tuvan subtaxon than to Yakutic, so no direct genetic unity of Sakha and Tofa is assumed. This makes us suspect that most of the similarities found between Sakha and Altay-Sayan result from secondary interaction and convergence. We suspect that Proto-Sakha may rather have acted as a substrate for Proto-Tofa, so Tofa may have formed when the early Proto-Yakutic speakers switched to Tuvan.
For the geographical explanation of how this might have happened, see the map below.

Conclusions: There are drastic lexical differences separating Yakutic from Altay-Sayan (hardly 58% of common words in Swadesh-215), and the majority of Altay-Sayan isolexemes cannot be found in Sakha and vice versa. Similar considerations apply to the few grammatical and lexical features that Sakha shares with Altay-Sayan and the Great-Steppe taxon. The number of these isolexemes and isogrammemes is insufficient to draw any conclusions concerning their possible unity. It seems that Sakha does not fit into the Altay-Sayan subtaxon and is largely independent. Proto-Sakha was the first to separate from the Proto-Turkic stem at a very early stage, leaving enough time for the Altay-Sayan shared innovations to develop. Despite the strong Mongolic influence on its vocabulary, Sakha must still retain many archaic features important for the reconstruction of Proto-Turkic. Moreover, the analysis of borrowings in the basic vocabulary may indicate that Sakha could have initially developed upon an unknown Yeniseian substratum acquired in an unknown area, most likely when the Sakha were still near the Yenisei basin.

On the other hand, even though the number of possible grammatical and lexical elements shared with Altay-Sayan is rather small, and in many cases there are only tiny traces of innovations, they cannot be discarded outright. It is plausible that Proto-Sakha could have affected the grammar and lexis of Proto-Altay-Sayan, leaving a few unexpected common features here and there. That is particularly true of Tofa, which has several elements in common with Sakha, as found by Rassadin (1978-81). We may conclude that these features shared between Yakutic and Altay-Sayan do not come from their initial genetic relatedness but rather emerge from secondary contact and convergence.
Therefore we may infer that Proto-Yakutic could have served as a substrate for Proto-Altay-Sayan, which later moved along the same route (presumably along the Yenisei) in a secondary migration wave, thus interacting with Proto-Yakutic and acquiring some of its features. We may still use the term "Siberian" in quotes as a suitable name for the Sakha plus Altay-Sayan Sprachbund, including any features that they may share either accidentally, or due to shared archaisms, or as a result of the presumable mutual interaction.

How did Sakha actually get there?
It should be noted that the physical distance from the Altai and West Sayan Mountains to Yakutsk City [or the historical Tuymaada Valley where Yakutsk is located] is enormous and exceeds 3500 km (2200 miles) in a straight line, being approximately equal to the distance from the Altai Mountains to Chuvashia and Volga Bulgaria along the Volga. That marks a noticeable curve on the globe and provides an interesting geographical perspective on the matter, making Sakha and Chuvash look like mirror images of each other. That also poses questions about how and why the Sakha people could have covered that immense distance when they migrated to the middle Lena. To answer them, we should turn to the following points.

The lack of dialectal differentiation within Sakha
Notably, despite the drastic linguistic differences from other Turkic languages and the gigantic geographic territory it covers, Sakha is surprisingly uniform as far as its dialectal differentiation is concerned. It has only one closely related sibling language (Dolgan) and only a few mutually intelligible internal dialects which, for the most part, are reported to differ only in phonology. This absence of siblings makes us infer that the expansion of the Yakuts along the Lena has been a relatively recent event.
Otherwise, how can we explain a linguistically uniform expansion over an enormous geographic area extending for three thousand miles? Indeed, in a similar case with the Khanty language (pronunciation: /HUN-tee, HAHN-tee/) (Finno-Ugric family), in which the Khanty people must have expanded in a similar way over the lower Ob basin in the course of one or two thousand years, we find much stronger linguistic diversification. The dendrogram produced by the group of Georgiy Starostin (2010) confirms the complexities of the Khanty-Mansi internal phylogeny, which consists of multiple language-dialects, so, for all practical purposes, Khanty can presently be viewed as a taxon, not a single language.

[Figure: The diversification of Khanty-Mansi (Starling database (2010))]

The absence of a similar glottochronological diversification in Sakha, the existence of multiple, highly diversified dialects and lesser-known sub-languages in Khakas, Tuvan, Altai and other "Siberian" Turkic languages of presumably comparable age, and the abundance of Mongolian borrowings in Sakha's basic vocabulary all make us wonder about the peculiarities of Yakutic prehistory.

Naturally, a similar scenario is well known for Middle English, which has become completely unrecognizable since Anglo-Saxon times, absorbing many Scandinavian, French and Latin borrowings, but developing very few natural siblings (though its dialectal differentiation is far stronger, and it also has many creole relatives). It could be surmised that a similar kind of process may have affected Sakha as well.

It seems there could have been a dramatic turning point in Sakha's prehistory that resulted in an ethnological crisis, the inflow of Mongolian loanwords and the extinction of any possible siblings that had existed before that period.
Judging by the lack of dialectal diversification, and the fact that other in-group sibling languages (besides Dolgan) did not have enough time to develop, that crisis must have occurred in the recent historical past, probably less than 600-900 years ago.

The lack of genetic differentiation in Sakha
According to Brigitte Pakendorf [Brigitte Pakendorf, Contact in the Prehistory of the Sakha, Linguistic and Genetic Perspective, (2007)], "the genetic results provide clear evidence for the strong founder effect in the Sakha paternal lineage; thus, it is clear that the group of Sakha ancestors who migrated to the north must have been very small". The expansion of the Sakha haplotypes (N1c1), found in 90-94% of the Yakut population, falls with 95% confidence within the temporal interval between 700 and 1500 CE (idem). Similar considerations can be found in a different source [Eric Crubezy et al., Human evolution in Siberia: from frozen bodies to ancient DNA, BMC Evol Biol. (2010)], which states that the origins of the Yakut male lineages can be traced to a small group of horse-riders from the Cis-Baikal area (that is, the area west of Baikal), which began to spread before the 15th century AD. This information about the strong bottleneck effect and the existence of just one male progenitor who must have founded all the present-day Sakha clans confirms our hypothesis about the sudden extinction of Sakha siblings in the past.

Corroboration from Sakha legends
According to Sakha legends, the progenitor of all Yakuts was Elley Bootur, who was of "Tatar" origin and who fled to the middle course of the Lena, running from "a great war or persecution". The word *ba:tur < *baGatur is either a Turkic or Mongolic word for "warrior; strongman; hero" that passed into many languages, hence for instance Ula:n Ba:tar "Red Warrior", the capital of Mongolia, or Yesügei Baatur, Genghis Khan's father.
Elley Bootur married the daughter of Omogoy (or omoGoy, oNohoy, oNoGoy) Bay, who had originally lived in the land of the Mongols [even though the name's phonology suggests Evenk origin, cf. Evenk omakta "new", emugde "belly", oNokto "nose"], but who had also fled to the north when the wars during the Genghis Khan rule (?) broke out. Omogoy Bay had settled down in the delta of the Chara River (a tributary of the Olyokma) near the confluence with the Lena, about 300 miles from present-day Yakutsk. Alternatively, according to an early version of this legend recorded in the 1740's by Lindenau, Omogoy Bay lived somewhere along the upper Lena, having fled to that region from Lake Baikal. [Enciklopedia Yakutii (Encyclopedia of Yakutia), Chief Editor: Safronov F. G., Moscow, 2000]

Consequently, our initial hypothesis of mass extinction during the 13th century and a fleeing migration to the north along the Lena continues to find additional support.

The idea that Proto-Sakha tribes could have been persecuted by the Mongols is also partly corroborated by passages in the Secret History of the Mongols (1240) [which seems to be Genghis Khan's personal memoirs written down by a literate scribe in the 3rd person]. The History mentions the genocide of "Tatars" during the early 1200's. The "Tatars" are said to have been the old enemies of the Mongols, and Genghis Khan's father died three days after paying a visit to a "Tatar" clan feasting in the steppe. These Tatars are said to have lived somewhere near the Onon and the confluence of the Orkhon and the Selenga, in other words, not too far from the southeastern shores of Lake Baikal, which leads to a conjecture that those "Tatars" could have originally been just an easternmost offshoot of Proto-Sakha.
However, it should also be explained that "Tatar" was apparently just an ancient clan name that could become part of many different ethnicities and could even be used by the Mongols as a misnomer, so we cannot draw conclusions about its ethnic or linguistic affiliation from the name alone. The History does not mention which language they spoke or whether they spoke a language different from Mongolic. Yet, in the Secret History of the Mongols we also find that Genghis Khan's original name, Temujin, was given because a certain "Tatar" named Temujin-Uge had been captured the day before his birth. This name seems to mean Temir-ji aGa "Blacksmith the Elder-Brother", a phrase recognizable in many Turkic languages. Moreover, Genghis Khan's subsequent name may originate from Tengis Kagan, where Tengis (Turkic "The Sea") is mentioned in the very first lines of the History and presumably refers to Lake Baikal, since there are not too many large lakes in the area. So we may assume that the "Tatars" that lived near the Onon River east of Baikal could indeed have something to do with Turkic tribes. Even though these inferences are not completely conclusive, they make the "Tatar"-Kurykan-Sakha connection look rather plausible.

Positioning Proto-Sakha near Lake Baikal
Before the time of the great crisis, the Proto-Yakuts were probably identifiable with the Kurykans, mentioned in one of the Orkhon inscriptions c. 730 as "üch qurïqan", seemingly forming the Kurumchin archaeological culture situated near the western shores of Lake Baikal and dated to the 6th-9th centuries AD. The identification of Proto-Sakha with this culture is an old and well-known hypothesis, based on temporal and geographical considerations and the medieval Chinese records, see [A. P. Okladnikov, Origins of the Yakut people (1951)].
The Kurumchin culture, which includes such trades and artifacts as stone walls, sacrificial stones, petroglyphs, agriculture (wheat, rye, millet), iron-making forges, and cattle, camel and horse breeding, was centered near present-day Irkutsk City and around the area of the Murin River (the name itself is probably akin to Mongolian or Buryat müren "river"). The Kurumchin culture could also be found on Olkhon Island in Lake Baikal, which is just miles away from the many sources of the Lena basin, including its large upper tributary, the Kirenga. This proximity of the Lena sources smoothly explains the geographic connection between the northern Yakuts of the middle Lena and their possible southern ancestors, the Kurykans of Lake Baikal.

Note: This may also explain why the word Baikal seems to be a Turkic hydronym (from bay "rich" and köl "lake").

The distribution of the Buryat and Merkit people
The present-day distribution of the Buryat people along the western shore of Lake Baikal and the close proximity of modern Buryat to Middle and Khalkha Mongolian suggest that the Buryat began to arrive in the area of Lake Baikal from Transbaikalia during the early period of the Genghis Khan expansion. As a result, they must have displaced the Kurykan tribes, pushing them to the northwest.

The Secret History of the Mongols tells about the dispersal of the Merkits, a Mongolic clan who, along with the "Tatars" and the Naimans, were persecuted by the troops of Genghis Khan and his allies in the late 1190's and who tried to escape north by "entering [the Land] of Bargujin along the Selenga [River]". In other words, they were fleeing towards the eastern shore of Lake Baikal, the area situated between the deltas of the Selenga and the Bargujin, the rivers that flow into Lake Baikal on its eastern shore. As a result, the new lands of the Merkits must have been located just 30-50 miles away from the supposed lands of the Kurykans living across Baikal.
It is easy to assume that, having been deprived of their cattle and other possessions, and following the domino effect, the desperate Merkits could have attempted an assault on the Kurykans, though these events were naturally outside the scope of the History, which mostly tells about Genghis Khan's personal experiences.

Consequently, even though this is entirely hypothetical, we may assume that the Merkits or other neighbouring tribes could have crossed Lake Baikal on ice in winter (only a 20-mile horseback ride) and attacked the Kurykans. They did not even have to apply the extermination policy that Genghis Khan used with the "Tatars", since just destroying winter shelters or taking the cattle away would have led to mass starvation in the Kurykan settlements. Only a few survived by running to the mountains. This assumption does not explain, however, when and how the Sakha language acquired its Mongolic vocabulary.

The Buryat clan is also briefly mentioned in the History as being subject to persecutions, and it is quite plausible that the Buryat, the Merkit and other clans of northern Mongolic tribes finally contributed to the ethnogenesis of the present-day Buryat people in the vicinity of the southern shores of Lake Baikal and the Trans-Baikalian region, and to the presumable exile of the Kurykans.

Geography predicts a raft migration from Baikal to Yakutsk
How did the Proto-Sakha migrate from Lake Baikal to the present-day area of Yakutsk? There seems to be a simple solution to this seemingly complex problem: the Sakha could have used a raft or boat migration downstream along the Lena, so a good portion of this gigantic journey from Baikal to Yakutsk could be accomplished in a relatively short time. This is partly corroborated by one of the versions of the legend, which mentions traveling by raft. Getting to the Lena River from Baikal is quite easy.
The Lena does not have a single source; rather, it starts from many small rivers flowing down the western side of the mountain ranges surrounding Lake Baikal, so just a 10-mile walk from the shore across the range will nearly automatically land anyone in the upper Lena River basin; one cannot miss it. The Tuymaada Valley along the Middle Lena, where Yakutsk City was founded in the 17th century, has been known for human settlements since the Bronze Age and even the Paleolithic, so evidently the Sakha were not the first to reach this northern territory, and many other ethnic groups could have migrated north using the same route along the Lena.

But how did Proto-Sakha even get to Lake Baikal?
We have established that Sakha demonstrates convergent features shared with the Altay-Sayan and probably some of the Great Steppe languages, all of which are located either along the Yenisei River or further west. So how could Proto-Sakha move from the Yenisei area to the Kurykan settlements at Lake Baikal? And even if they moved to Baikal from an area other than the Yenisei, that migration must still have proceeded from the west, which brings us back to the same question. Note that a raft migration towards Baikal along the Angara from the west is much less likely, because the Angara flows from Lake Baikal, so one has to go upstream in that case.

[Map: The early migration of Proto-Yakutic, herein (2011)]

Essentially, there exist three plausible routes from the Yenisei to the Cis-Baikal area [= the area west of Baikal].

(1) Across the taiga?
The Proto-Yakuts may have moved along the East Sayan Mountains and right across the taiga (which includes some of the land belonging to South Samoyedic tribes), that is, roughly along the line of the Trans-Siberian railroad, built by the beginning of the 20th century. In a straight line, this potential track would cover a huge distance of over 900 km (550 mi) (from present-day Krasnoyarsk to Irkutsk).
It would mostly cut across rivers flowing down from the foothills of the East Sayan Ridge, so one would have to know precisely which direction to take to reach the destination, given that there is no natural orientation system when traveling across a river basin. Therefore such migrations would most likely have had to proceed in a rather random and unsystematic way before the migrants could reach their goal. If this route had actually been taken, we would presently find many post-Proto-Sakha groups scattered all over the forests between the East Sayan Mountains and the Angara River, which are in fact entirely absent. We should also take into consideration the perils of taiga travel, such as deep snow in winter, gnats in summer and the evident lack of water as soon as one turns away from the river course. These are obvious reasons why much of this area is still uninhabited to this day, except for regions with modern roads, railroad tracks and city areas. The attestation of South Samoyedic (Kamassian, Karagas) in the western part of this track, whose speakers had supposedly arrived in the area before the Turkic inhabitants and could probably have offered some military opposition to them, equally implies that this territory had most likely been undisturbed until the beginning of the 17th century. Therefore, we may conclude that the route across the taiga was probably never taken by the Proto-Sakha migrants.

(2) Along the Angara?
Another passable route goes up the Angara River, starting from its confluence with the Yenisei to the Angara's source near the southwestern edge of Lake Baikal.
That route is even longer (its length is impossible to calculate precisely because of the many twists and turns of the river's meandering course, but it probably extends for a couple of thousand kilometers), making the potential migrants row hard upstream all the way, with dense forests along the riverbanks, so neither natural water transport nor easy horseback travel along the shoreline could be used for that endeavor. Winter travel on the ice is more plausible but would probably be hindered by extremely low January temperatures. As in the previous case, no remnants of Turkic tribes were ever found along the Angara or its tributaries. Also note that the many tributaries would tend to divert the migrants away from the initially undetermined destination into even more remote corners of the Siberian taiga. We should also keep in mind the possible opposition from the Yeniseian hunting tribes supposedly inhabiting at least some parts of this region. The earliest records of the Russian Cossacks (1620-1630) in the area of Bratsk fortress mention clashes with the "Buryats" and "Tunguses" [= the Evenks], but apparently no Turks / Kyrgyzes / Tatars were spotted in the area, even though the Cossacks had already been familiar with them and should have been able to recognize them. It is theoretically possible, however, that this type of migration could have begun to take place at some point in the past, but it probably could not progress very far.

(3) The Mongolian track?
The third possibility is traveling all the way along the upper course of the Yenisei, which would finally land any potential migrants either (1) in the East Sayan Mountains, where the Tofa people presently live (if the potential migrants followed the Greater Yenisei), or (2) in the Darkhat Depression with a relatively small lake called Dood-Tsaagan in its center, where the Tsaatan and Soyot people from the Tuvan subgroup presently live and still wander along with their reindeer herds (if the potential migrants followed the Lesser Yenisei). The Darkhat Depression, the habitat of the Tsaatans, is located across the watershed from Lake Hövsgöl (Khövsgöl), the largest lake of Mongolia, sometimes known as the sister lake of Baikal. Even though the entire area there is mountainous, traveling along the course of the Lesser Yenisei among relatively sparse Mongolian forests makes it a more viable option. For centuries, this route must have been extensively explored by many reindeer and horse breeding herdsmen from Tuva and Mongolia who live in the vicinity, and it is evidently passable. At the northern edge of Lake Hövsgöl, there is another watershed, beyond which there is the habitat of the Soyots and the source of the Irkut River. As soon as the potential migrants reach the Irkut, it can carry them downstream to the upper Angara in a matter of weeks and land them where present-day Irkutsk City is located, that is, near the area where the Kurykan settlements were attested. The overall track length from the Yenisei to Baikal is roughly the same as in the two preceding options, about 1000 km (600 mi), but the route requires much less effort, especially in the second half of the journey. Moreover, Tofa shares several unique grammatical features with Sakha, which provides some confirmation for this hypothesis.
Even more curiously, the self-appellation of the Tsaatans is in fact "Tu'kha" (with an aspirated [t] and a glottal stop in the middle of the word), which is immediately reminiscent of "Sakha". However, this may be a pure coincidence. If it is not, it could be a clan name borrowing or a clan acquisition, when a part of a clan stays to live with another ethnic group. Therefore, we may conclude that Proto-Sakha could be a substrate both for Tofa and Tu'kha, both of which later switched to Tuvan, and this is probably how the Tofa and Tsaatan (Tu'kha) languages appeared and evolved. Moreover, the travel through Mongolia could help to explain the Mongolian borrowings in Sakha, though these could also have been acquired later from the Proto-Buryats, when the Kurykan people were already near Lake Baikal.

The presence of the reindeer economy in the Darkhat Depression, so typical of the Sakha and other North Siberian peoples, is also surprising and may even shed some light on how the Sakha and other North Siberians became reindeer herders. The spread of the reindeer economy from the Sayan Mountains had long been conjectured, but there was no specific mechanism for this process, and the present hypothesis about the movement of Proto-Sakha through the Sayans could help to explain it, though this complicated matter cannot be discussed here at any length.

In any case, the Mongolian track seems far more plausible than any other option, and is well supported by the lack of geographical obstacles and the presence of corroborating ethnographic and linguistic evidence.

Conclusions: The analysis of the Sakha dialectal differentiation, genetic makeup and oral history all imply that the Sakha language could have become what it presently is only after a bottleneck event that resulted in a dramatic extinction of any sibling clans and their languages.
Before that period, according to the theory of Okladnikov (1951), as well as judging from the local geography, archaeology and the Chinese and Old Turkic historical records, the Proto-Sakha people may be identified with the Kurykan people near Lake Baikal.

The analysis of the Secret History of the Mongols (1240) suggests that after the late 1190's the Kurykan Turkic tribes may have been attacked, in a domino effect, by the Mongolic clans, presumably the Merkits and Buryats, which in turn had been pushed from their original settlements by the expanding Mongols of Genghis Khan. The Kurykans may have tried to escape from the Mongolic invasion by moving north along the Lena River and its southern tributaries in a downstream migration, most likely using simple water transport, such as rafts. This migration down the Lena could have occurred rather swiftly on a historical scale.

Before that period, Proto-Sakha had existed in a remote southeastern area, such as the forested ridges adjacent to the western shores of Lake Baikal near the multiple sources of the Lena, possibly even expanding eastwards into Trans-Baikalia and producing some linguistic and genetic offspring east of Baikal. These hypothetical Proto-Sakha groups later became extinct during the Mongol expansion of the early 1200's.

The geomigrational analysis and certain linguistic elements shared with the Altay-Sayan subtaxon, particularly with Tofa (discovered by Rassadin (1981)), suggest that the Proto-Sakha had migrated into the Lake Baikal area by moving along the upper reaches of the Yenisei River in present-day Tuva. Proto-Sakha in Tuva must have been displaced after the arrival of Proto-Tuvan circa 200-300 CE (glottochronological dates) and had to move into the area of the Darkhat Depression and Lake Khövsgöl in northern Mongolia and then migrate down the Irkut River towards Lake Baikal by about 600-800 CE.
On the origins of Turkic ethnonymy
The present article suggests that nearly all of the Turkic ethnonyms must have had their origins in the names of their clan progenitors. The earliest recorded oral Turkic histories, as exemplified by the Oghuz-Khan Narratives, written down by Rashid-al-Din (c. 1300), or the Shajare-i Türk (The Genealogy of Turks) by Abu al-Ghazi Bahadur (c. 1659), were essentially descriptions of series of legendary events occurring to Turkic clans and their original male progenitors. Therefore we have a very clear and unmistakable identification of most Turkic ethnonyms as nothing but patronymic surnames adopted by all the members of that clan. For instance, in Abu al-Ghazi Bahadur's work, such names as Turk, Oghuz, Uyghur, Kypchak were clearly and unambiguously associated with male clan founders, including many presumably fictional or real details from their personal lives, which leaves little room for other etymological speculations, e.g.:
He [Japheth] had eight sons [...] Their names were as follows: Turk, Hazar, Saklab, Rus, Ming, Chin, Kemeri, Tarykh.
By the same token, Mahmud al-Kashgari (1071-74) says, "The Turks are in origin twenty tribes. They all trace back to Turk, son of Japhet, son of Noah, God's blessing be upon them." Similarly, according to the legend recorded by Ye. S. Filimonov in 1890 [cited in L.V. Dmitriyeva, Yazyk barabinskikh tatar (materialy i issledovanija) (The language of Baraba Tatars (materials and studies)), Leningrad (1981)], the progenitor of all the Baraba Tatars was an old man named Baram who migrated from a southern land to the north, between the Irtysh and Ob Rivers, where he found plenty of fur animals, birds and fish; there, he had eleven sons (Kelem, Uguy, Uzun, Tukus, Lyubar, Kargal, Kirkach, Choy, Turas, Teren, Baram), who after Baram's death divided his land into eleven parts (the aymaks). According to Dmitriyeva, these names still mostly correspond to the names of local auls (villages).
This legend renders unfounded all the frequent alternative folk-etymological interpretations of the Baraba name as barma "don't go", baraman "I'm going", etc. The existence of a specific Baraba clan among other Baraba Tatar clans with different names was confirmed by the demographic data collected and cited by Radlov in 1865 [Aus Sibirien. Lose Blätter aus meinem Tagebuche (From Siberia: Torn pages from my diary), Wilhelm Radloff, Leipzig, 1893]. By the same token, the Khakas legends attribute the origins of the Khobyy seok (where "seok" means "bone", that is "clan" among the Altay and Khakas people; it is actually one of the largest clans in the Sagai and Shor ethnicities) to the legendary progenitor named Kobïy Adas. This evidence has usually been omitted, probably because at some point the scientifically-oriented researchers began to doubt the correctness of the mythical factoids described in such legends. However, even if we doubt specific points, there is hardly any reason to doubt the semantic worldview in general as adopted by the early Turks and the recorders of these legends. The early Turkic oral history was documented in a society that reflected the typical male clan social structure, similar to the one described in the Torah and the Quran, where all historical events were likewise often seen as actions of strong and powerful clan forefathers. However, in the course of the 20th century, the original clan structure and the associated ethnographic tradition were almost entirely destroyed and forgotten; consequently, a number of folk etymologies and semantically unfounded interpretations concerning the origin of Turkic and Mongolic ethnonyms appeared. On the other hand, we know full well from historical records that such modern names as Nogai, Uzbek, Seljuk had originally been nothing but personal names, later spreading to the title of a respective dynasty, and then finally to the whole ethnic group or nation.
The expansion from a clan name to an ethnicity or a national name seems to be a common phenomenon occurring with ruling clans that were seen as encompassing the whole large ethnic group. For instance, it was noted as early as Gerhard Miller (1733-1743):
"...because the Barabas are, of course, Tatars, as their language shows. Whereas 'Baraba' or 'Barama' is not the name of the whole people, but rather the title of a certain special generation, since other [groups from the Baraba Tatars] also title their generations in a similar way, e.g. Luba, Terenya, Tunus, etc." [Gerhard Miller, Istorija Sibirskaja (The History of Siberia), Saint Petersburg (1750)]
By a "special generation", Miller meant a clan, showing that the Tatars living near Lake Chany originally had many different clans in their social structure, whereas the name Baraba for all of these Tatar clans must therefore have been a recent extension. In the same fashion, the European surnames also go back to the personal names or aliases of single male individuals, such as Johnson to John, etc. In both cases, we witness the remnants of the patriarchal clan structure and the associated patrilineal worldview. In the instance of the Nogai, we can see that, even though the name originally meant "dog" in Mongolian, there is just as little association with dogs as in Bush, Green, Taylor, etc. with the respective concepts they represent. Therefore, we may conclude that nearly all the ethnonymic hypotheses or folk etymologies that attempt to refer the name of a Eurasian ethnic group directly to some kind of real-world phenomenon are usually unfounded, since nearly all such names originally referred to a personal name or alias of the clan's genetic progenitor or male leader. In the Indo-European languages, the original word for "clan" seems to be reflected in the Latin genus, Greek genos, Irish Gaelic clann, Modern English kin from Old English cynn, Gothic kunni, Old Russian koleno.
It seems that only after this can we truly understand the significance of the male haplogroup research conducted in the 1990s-2010s. The male DNA markers, just like male surnames, were inherited along the paternal lineage, so they represent the ancient clan markers. And the male clans were pretty much everything to ancient peoples. In fact, the very usage of the word adam for man (from Semitic *adam) in most western Turkic languages (e.g. Azeri, Turkish, Tatar, Bashkir, Uzbek, Uyghur, Kazakh, Kyrgyz, etc.), as well as in Persian, Hindi, Fulani, Indonesian, etc., reflects the same tradition of ascribing the descent of the whole ethnic group, even the whole of humanity, to one single individual. In this worldview, the history of the whole ethnicity is often seen as an outcome of some action of a legendary ancestor, whose life is poorly understood, with just a few reminiscences surviving in legends, but who presumably passed on his blood to the whole clan, then a confederacy of clans, and finally to the whole ethnic group and even the whole modern nation. (In some cases, however, the name does not go back to the semi-legendary figure himself but rather to that of his father or grandfather, cf. the difference between Seljuk and Togrul Beg.) Herein, we suggest naming this historiographic conception the Adamic ethnonymic paradigm. It should be stressed that this historiographic worldview is not based on or borrowed from the Abrahamic religions, rather being part of a much older naturally-occurring human tradition. By the same token, we should infer that the names of other oldest Turkic clans, whose ethnonymic origins have been lost, such as Kyrgyz, Bashkir, Kimak, Tatar, Sakha and so on, also go back to personal names, rather than any abstract or natural concepts, just because there seems to be hardly any other way of naming clans and ethnic groups in the old Turkic tradition.
For instance, Kyrgyz was a surname originally belonging to a male progenitor who received the name or a subsequent alias Kyrgyz, probably because of his force, since the Turkic verbs kyr- "to break" and kork- "fear" imply vigor or some fearful action. Radlov reports (1860s) that the newborn Altayans often received their names from completely accidental events, such as someone entering a yurt with a particular object or something happening shortly before their birth, so we must conclude that trying to find much meaning in clan names will not get us very far. However, leaders like Temujin, who got his first name from a Tatar named Temujin-Uge captured the previous day, may have subsequently chosen a more articulate name, e.g. Tengis Kagan, from "The Sea" where his ancestors beyond 12 generations had once lived, apparently Lake Baikal.
The Altay-Sayan subgroup
The Altay-Sayan subgroup supposedly includes at least the following languages that belong respectively to the Tuvan, Khakas, and Altay subgroups: (1) Tuvan, Todzhin, Tofa(lar), Tsaatan, Soyot; (2) Sagai Khakas (whence Standard Khakas), Kacha Khakas, Kyzyl Khakas, Fuyu Kyrgyz, Mras-Su Shor, Kondoma Shor, Middle Chulym; (3) Altay-kizhi (whence Standard Altay), Telengit, Teleut, Tuba, Kumandy, Kuu, etc. Below, we will try to show why this approach to the classification of the local languages seems to be correct.
Tofa and Soyot are related to Tuvan
The fact that Tofa and Soyot are closely related to Tuvan follows at least from the following evidence.
Tuvan, Tofa, Soyot vocabulary
(1) Dybo's lexicostatistical research (see above); (2) The fact that most words which are unique to Tuvan (among other Turkic languages) are usually likewise present in Tofa and Soyot, for instance: Tuvan chu:(l), Tofa chü, Soyot chü "what?", from Mongolian; Tuvan bichi:, Tofa biche, Soyot biche "few, little"; "small", also cf.
Chuvash pêchêk, akin to Mongolian *bici-qan "small"; Tuvan ïndï:, Tofa ïndï: "the other one", apparently, from the Turkic *onda "over there, that one"; Tuvan uruG, Tofa uruG, Soyot urïG "child", of Turkic origin, with the initial meaning "seed"; Tuvan ashaq, Tofa ashïNaq, Soyot ashshyaq "husband", from Turkic; Tuvan iye, Tofa iGe, Soyot i'hê "mother", probably from Mongolian ekh, Buryat ehe; Tuvan but, Tofa but, Soyot but "foot", from Turkic, instead of *azaq; Tuvan xat, Tofa qat "wind"; Tuvan xadï:r, Tofa qadï:r "blow (as of wind)"; Tuvan kesh, Tofa ke'sh, Soyot ke'sh "skin", cf. Karakhanid qas(uq); Tuvan dïNna:r, Tofa dïNna:r, Soyot dïNna:(r) "to hear", from Turkic; Tuvan mana:r, Tofa mana:r, Soyot mana:(r) "to wait", akin to Khalkha Mongolian mana-x "to guard"; Tuvan eshti:r, Tofa e'sht:r "to swim", also cf. Chuvash ish-; Tuvan da:ra:r, Tofa da:ra:r, Soyot da:ra:(r) "to sew", apparently, a cognate of the normal *tik root as in Khakas tigerge but with some specific phonological modifications; Tuvan xem, Tofa xöm "river"; Tuvan oruq, Tofa oruq, Soyot orïq "road", of Turkic origin, from *or- "to dig" [see SIGTY, Lexis (2002)]; Tuvan eqi, Tofa e'qqi, Soyot eqqi "good", apparently an archaism, also exists in the Old Turkic eDgü, Turkish iyi, Karachay-Balkar igi, and probably Sakha üchügey; Tuvan baq, baGay, Tofa ba'q, ba'xay "bad". Even though some of these words share parallels with Mongolian, many of them seem to be original Turkic words found mostly only in Tuvan and Tofa, which suggests their close relationship.
Tuvan geography
The geographical relationship between Tuvan and Tofa can be explained in the following way. Initially, the Tuvan people were those Turkic tribes that followed the upper reaches of the Yenisei River into the East Sayan Mountains. There exist two main sources of the Yenisei, the Greater Yenisei (Biy-Xöm) and the Lesser Yenisei (Ka-Xöm). Tuva's capital Kyzyl is located at their confluence.
The many tributaries and sources of the Greater Yenisei lead northeast towards the East Sayan Ridge. This bordering area between Tuva and Irkutsk Oblast near the East Sayan Ridge is known historically as Tofalaria, because the Tofa mostly inhabit the East Sayan Mountains, which separate the basins of the Greater Yenisei and the Angara River. On the other hand, the Lesser Yenisei goes east towards Lake Khövsgöl in Mongolia, an area originally inhabited by the Tsaatans (in Mongolia) and Soyots (in Russia), whose languages, according to Rassadin, the main field researcher of these languages, are closely related to Tofa and Tuvan [see V.I. Rassadin, O problemakh vozrozhdeniya i sokhraneniya nekotorykh tyurkskikh narodov Yuzhnoy Sibiri (na primere tofalarskogo i soyotskogo) (2006)]. The Soyots are said to have moved north into Russia from Lake Khövsgöl only 300-400 years ago, though this is mostly based on hearsay evidence from their legends. Consequently, Todzhin and Tofa must have formed when a part of the Proto-Tuvan tribes moved along the Greater Yenisei (the Biy-Khem), until they reached the forests of the Eastern Sayan Mountains. Tsaatan and Soyot, in turn, must have formed when the Proto-Tuvan tribes moved along the Lesser Yenisei (the Ka-Xöm) towards Lake Khövsgöl in northern Mongolia.
Tuvan hydronymy
Curiously, the hydronyms of Tyva (Tuva) are clearly and specifically Tuvan, considering they often involve isolexemes or phonetic elements present only in the Tuvan-Tofa subgroup. Cf. Biche Bash "small-head (river)", Ulugan Khöl "large lake", Choygan Khöl "pine lake", Many Khöl "Marble Lake", Chazag "summer camp (river)", Kargy (river) (apparently from kargaar "to damn"), Balyktyg Khem "fishy river", Ulug Orug "big way (river)", Tashty Khem "stony river", Ak Sug "white water (river)", Chadan (apparently from chada "step" > river rapid), Uyuk "dumbfounding (because of the noise) (a river)", Chas-Adyr "springtime fork (spur) (a river)", Kara Khöl "black lake", Khadyn "birch (lake)", etc.
However, the hydronyms quickly change into Mongolian as soon as one crosses the border of Mongolia or Buryatia. This phenomenon of local hydronymic continuity is not as common as it may seem, and it is probably indicative of the lack of a stable pre-Tuvan substrate in Tuva and of a relatively early occupation of this territory by Proto-Tuvan tribes (about 1500-2000 years ago, which is supported glottochronologically).
The Khakas languages
On the origins and usage of the ethnonym Khakas
The term Khakas was introduced only in 1918 during the turmoil of the Russian Revolution, and seems to be nothing but the then-accepted reading of the supposed word "Kyrgyz" in Chinese chronicles, which presumably referred to the Yenisei Kyrgyz people [see the discussion by S. Yakhontov, V. Butanajev, S. Klyashtornyj in the Etnograficheskoje obozrenije (1992)]. Even today the ethnonym Khakas is rarely used by native speakers, except maybe in formal situations. In fact, Altay and Khakas people have traditionally referred to themselves as just Tadar(lar) "Tatars", either because this was the usual name given by Russian Cossacks to nearly all the Turkic peoples in the course of the 17th-19th centuries, or because this name could indeed have existed even earlier. The latter point is, however, uncertain. In any case, the Khakas taxon is subdivided de facto into a number of major dialect-languages, such as Sagai (first mentioned in 1311 in Persian records, and then in 1620 in Russian sources), Kacha (first attested in 1608), Kyzyl (nearly extinct), Koybal, Beltir (extinct), etc. The Sagai Khakas people are mostly scattered in rural areas along the foothills of western Khakassia, so pure Sagai is now rarely spoken in cities and seems to be confined to the mountains of the Abakan Range as well as to the area south of the Kuznetsk Alatau Range.
Just like Standard Altay and Standard Crimean Tatar, written Khakas is more or less a 20th-century artificial creation based on Sagai, so most features that are mentioned as typically Khakas in fact refer to the Sagai dialect-language. Since the beginning of the 20th century, when Kacha, located on the plains, gradually became marginal, Sagai, located in the mountainous areas, can be considered a good sample of native vernacular Khakas. Outside the Khakas dialect-languages, the Khakas subgroup includes two other subtaxa, Shor and Chulym, which have long been formally recognized as separate languages, but which also turn out to be small subgroupings: (1) The Shors (including the Mras-Su Shor and Kondoma Shor dialect-languages) mostly live west of the Minusinsk Depression; (2) The Chulym (including Middle Chulym and Lower Chulym) live in a completely different region north of the Minusinsk Depression along the Chulym River. Lower Chulym is presently extinct, while Middle Chulym is on the verge of extinction.
Standard Khakas phonology
Some of the most striking and easily observable phonological features in Standard Khakas, as recorded in a common textbook, in fact come from the Sagai dialect and are not reflected in other Khakas group members (Kacha and Kyzyl). Consequently, these features may result from a recent substratum effect, such as Samoyedic influence.
The following mutations can be regarded as the most typical of the Sagai dialect as compared to other Khakas dialects: (1) the -sh > -s mutation as in Sagai Khakas tas "stone", pas "head" (as in Sakha ta:s); but Kachin Khakas tash, Shor tash "stone", pash "head", Tuvan, Tofa t/dash "stone", p/ba'sh "head"; (2) the -ch > -s mutation as in Sagai Khakas as- "open", sas "hair", but Kachin Khakas ach-, chach, Shor ash-, shash, Tuvan ash-, chash, Tofa ash-, chesh; Khakas aGïs "tree", but Shor aGash, Tuvan ïyash, Tofa n'esh; (3) the q- > x- mutation in Sagai Khakas as in xara "black", but Kachin Khakas qara, Tuvan qara, Tofa qara. It seems that the phonological changes in Standard Khakas and Sagai are relatively recent, whereas Proto-Khakas sounded much the same as Proto-Tuvan or Proto-Altay or many other languages in the region, that is, without these peculiar local phonological mutations.
Khakas and Tuvan share few or no exclusive innovations
Below, we study the degree of relatedness between Khakas and Tuvan and the plausibility of a separate Khakas-Tuvan proto-state.
Khakas and Tuvan phonology
In phonology, Khakas and Tuvan share the following innovative features: (1) *S > ch-, as in Chuvash s'ichê, Sakha sette, but Tofa chedi, Tuvan chedi, Khakas cheti "seven", and Standard Altay d'eti (which is basically pronounced almost the same way as /jeti/). Note, however, that the *S- > n- transition is mostly confined to the Khakas subgroup: (1a) chi-, che- > ni, ne as in Khakas nïmïrxa, Shor nïbïrtqa "egg" as opposed to Tuvan chuurGa, but Tofa n'umurxa; Khakas na:x, Shor na:q, but Tuvan cha:k "cheek", which sets Tuvan apart from Khakas. (2) Apparently, a secondary -w > -G innovative transition in the final syllable, cf. Tofa suG, Tuvan suG, Khakas suG, Shor suG, also Kumandy (a North Altay language-dialect) su:G / su:, but Standard Altay su: "water". That this is an innovation follows from the presumption that *suw must have been the original proto-form.
Note: One may be familiar with the Khakas-Tuvan pronunciation of *suw from the name of the Karasuk archaeological culture, named after the Karasuk river.
Note: The Proto-Turkic *suw and Proto-Bulgaric *shuw (Chuvash shïv) "water" are akin to Proto-Mongolic usun of the same meaning, evidently from *us-sun < *wus-sun, where -sun is a Mongolic nominative suffix, whereas the root *wus- is most likely Nostratic just like in the English word "water". The same root is also widely distributed in the Uralic languages. Proto-Bulgaro-Turkic seems to show metathesis in a number of cases, hence *wus > *suw.
Therefore, the -w > -G innovative mutation seems to be the only phonological feature so far shared by the Khakas-Shor and Tuvan-Tofa subgroupings. Generally speaking, we have more phonological differences than similarities between Tuvan-Tofa and Khakas-Shor-Chulym. For instance, there are different transitions for the intervocalic -d-, cf. Khakas, Shor azaq "foot", but Tuvan adaq "down"; Khakas xazïN, Shor qazïN "birch", but Tuvan xadïN, Tofa qadïN. Moreover, Tuvan-Tofa uses the typical local "Mandarin" system of weak semi-voiced vs. strong unvoiced plosives in the consonantism, which is probably derived from the Mongolic languages, and which is also present in many other languages in the region, but not in Khakas.
Khakas and Tuvan grammar
There are very few or basically no innovative features in grammar shared exclusively by the Tuvan and Khakas subgroups, which can be demonstrated in the table below.
So far, we have been unable to identify any grammatical features shared exclusively at the level of Khakas-Shor-Chulym and Tuvan-Tofa only. Any similar features are hardly exclusive to these two subtaxa and just seem to point to a different phylogenetic level.
Khakas and Tuvan vocabulary
With about 72% for the Tuvan-Khakas pair in Swadesh-215 (as contrasted with the 73% for Turkish-Turkmen and 78% for Azeri-Turkmen), the Tuvan and Khakas languages must be a little further apart than the typical members of the Oghuz subtaxon. There is hardly any lexicostatistical evidence for Tuvan being any closer to Khakas than to Altay, since we have 72% for Tuvan-Khakas and 69% for Tuvan-Altay. Most lexical differences between Khakas and Tuvan are due to the large number of "odd" words in Tuvan and, to a lesser extent, in Tofa. Many of these words turn out to be Mongolic borrowings. Cf. Tuvan, Tofa chu: "what" (Khalkha chu:); Tuvan xöy "many" (Khalkha xu "all"); Tuvan, Tofa urug "child" (Khalkha ür); Tuvan, Tofa t.ük "hair" (Khalkha da:x "(entangled) hair"); Tuvan noGa:n "green", also in Khakas (Khalkha nogo:n "green"); Tuvan mugur "dull (of a knife)" (Khalkha molgor); Tuvan dayïn "war" (Khalkha dayin). However, some of the other Tuvan-Tofa etymologies are much harder to figure out.
Khakas and Tuvan geography
Judging from the geographic perspective, Tuvan is essentially a branch of Proto-Yenisei-Kyrgyz that migrated further south along the upper reaches of the Yenisei. Proto-Khakas-Shor-Chulym originally seemed to inhabit the Minusinsk Depression, whereas Proto-Tuvan-Tofa-Tsaatan-Soyot moved further into the Western Sayan mountains, following the course of the Yenisei. In other words, from the geographic perspective, Khakas-Shor and Tuvan-Tofa (and the closely related language-dialects) are related in the same way as any two ethnicities living in the same river basin.
Their mutual contacts, or even the separation from the same stem, should be easily predictable from their geographic position alone. However, one should also take into consideration that the two subgroups inhabit different mountain valleys. The Khakas subgroup inhabits the Minusinsk Depression, whereas the Tuvan subgroup inhabits the Tuvan Depression, both being well separated from each other by the Western Sayan Ridge.
Conclusion: After exploring phonological, grammatical and lexicostatistical evidence, we have found no specific innovations shared exclusively by Proto-Tuvan and Proto-Khakas. Furthermore, from the geographic perspective, the two subgroups are separated by the Western Sayan Mountain Ridge. For this reason, the Khakas-Tuvan subgrouping alone, without the inclusion of the Altay subgroup and other related members, seems to be poorly supported.
Altay, Khakas and Tuvan form the Altay-Sayan subgroup
Below, we will study the relatedness of Altay (Turkic) to Tuvan and Khakas, trying to demonstrate that, when considered together, these languages form a separate genetically related subtaxon, roughly in the same way as Turkmen, Azeri and Turkish form the Oghuz subgroup.
Altay (Turkic) is not a single language, it is a subtaxon
First of all, as is well known today, Altay (Turkic) is not a single language, but rather a complex network of independent languages and dialects. According to Baskakov (1969), the Altay subtaxon should include the following clusters of "dialects": (1) Southern: (1a) Altay-kizhi, (1b) Telengit, (1c) Teleut; (2) Northern: (2a) Tuba, (2b) Kumandy, (2c) Kuu (lit. "swan" after the river name) (or Chelkan), all of which are probably separate languages. However, the appellation of the Altay language is still widely employed, apparently due to traditionalism. This term was accepted even in the works (1952-88) of Baskakov, who had done field studies after WWII and written separate books on Kuu (Chelkan) and Kumandy in the 1960s-70s.
The strong diversification within Altay (and its relatedness to Khakas) is corroborated by the lexicostatistical study by Anna Dybo (2006). [Dybo, Anna, The Chronology of the Turkic Languages and the Linguistic Contacts of the Early Turks (2006)] Similar results have been obtained in a phono-morphostatistical study by Oleg Mudrak (2007).
Note: the term Oirot in the works of Starostin's group members apparently means Standard Altay or Altay-kizhi (Proper), which was its official name until 1947.
Moreover, some of the Altay "dialects", such as Kumandy and Kuu (Chelkan), have recently obtained the de jure status of separate ethnicities. Curiously, there has even been a sort of small scandal in the press (2011) when two different book authors writing in Kuu argued with each other over which language version should be more correct, so we may surmise there may be some dialectal differentiation even among the speakers of nearby Kuu villages. The strong diversification within the Altay dialect-languages suggests that Altay (Turkic) peoples have inhabited the Altai Mountains for a long time, presumably at least about a thousand years. In any case, the Altay Turkic languages are much too peculiar, much too diverse, and were much too poorly studied in the 20th century. Both the Khakas-Shor-Chulym and North-South-Altay subtaxa constitute a rather complex superposition of dialect-languages that could not be explored herein with sufficient elaboration. However, we will attempt to provide a brief argumentation for the Altay-Sayan relatedness below.
Altay, Khakas and Tuvan phonology
It is hard to identify specific phonological features shared exclusively by Altay and Khakas-Tuvan. Instead, however, we have at least one series of typical contractions shared by Khakas (and partly, Tuvan), Altay, and Kyrgyz. These contractions might have been either archaic or innovative. Cf. the following examples: (a) as in "liver", cf.
Khakas pa:r, Tuvan pa:r, Standard Altay bu:r / pu:r , Kyrgyz bo:r "liver", as opposed to Sakha bïar, Proto-Kimak-Kypchak *bawur, Chuvash pôver <*poör (?) [the Chuvash intervocalic -v- seems to result from the late labialization of narrow vowels], as opposed to Old Turkic baGïr, probably from Proto-Bulgaro-Turkic *Bawïr or *Baïr. (b) as in "bone", cf. Khakas sö:k, Tuvan sö:k, Standard Altay sö:k, Kyrgyz sö:k "bone", as opposed to Sakha unuoh, Chuvash s'ômô, Old Turkic süNök [note that N denotes a nasal as the Engl. /ng/], Proto-Kimak-Kypchak *süyek, probably from Proto-Bulgaro-Turkic *süNök. (c) as in "horn", cf. Tuvan mïyïs, Tofa mi:s, Khakas mü:s, Standard Altay mü:s "horn", as opposed to Chuvash mây, Sakha muos, Old Turkic müNüz, Proto-Kimak-Kypchak and Kazakh-Kyrgyz *müyüz, probably from Proto-Bulgaro-Turkic *maNüR or *maiR. The details and the direction of these contractions are ambiguous. They seem to be innovative at first, since most contractions are innovative. However, judging by their partial presence in Sakha, and the partial absence from Tuvan, some of them might just as well be quasi-independent mutations or even retentions, so the matter is not entirely clear. Also note that Kumandy (a North Altay language) exhibits more Khakas features than Standard Altay (Altay-kizhi, "Oirot") [Baskakov (1972)], cf. for instance: (1) Kumandy n'- as in nimirtka, cf. Khakas nimirxa "egg", but Jïmïrtka (d'ïmïrtka) in Standard Altay; (2) Kumandy sug / su "water, river" as in Khakas suG, Shor suG, and Tuvan suG, but suu in Standard Altay and southern Altay dialects; Kumandy tag / tu "mountain" as in Khakas tag, Shor taG, Tuvan taG, Tofa taG, but tuu in Standard Altay and southern Altay dialects; (3) The Khakas ch- instead of the Altay-style d'- pronunciation in northern vs. 
southern Altay dialects, as in chïl : d'ïl "year". This affinity has been noted by Baskakov (1969, 1988), who clearly maintained that Northern Altay is rather related to Khakas, whereas Southern Altay to Kyrgyz, which is actually quite illogical, considering the fact that he wrote of Altay as a single language. In any case, it is reasonable to focus on the Southern Altay dialect-languages (Standard Altay, Altay-kizhi, Teleut, Telengit) below, because their relatedness to Khakas seems less obvious.
Altay, Khakas and Tuvan grammatical features
The shared morphological features in Altay-Sayan seem to include at least the following instances: (1) The use of choq after nouns or adjectives (as in "A is not B", or "A is not good") to express negatives instead of or parallel to the standard Turkic emes. This feature is typical of many Turkic languages in Siberia. It may also be found in Kyrgyz. (2) The use of a special contracted form for "you" (plural). Cf. Tuvan siler, Tofa siler, Khakas sirer, Kumandy sner, snir, Standard Altay slerler, Kyrgyz siler. Also found in Baraba as silär. (3) The use of a grammeme similar to bara-dïr-mïn "I'm going", which also exists in Sakha. (4) The retention of archaic forms for the past tense 1st person plural (as in "we did"): -dï-bïs, -di-bis in Standard Altay and -di-bis, -di-vis in Kumandy, cf. the innovative -d'ik, -d-uk in Turkic languages located west of the Irtysh line; this suffix is also reported (rather confusingly) in Standard Altay. (5) The retention of the apparently archaic Optative mood with the -Gai-/-gei- suffix shared by Sakha, Tuvan, Tofa, Khakas, Standard Altay, Kumandy, Kyrgyz. Even though similar grammemes also exist in other languages, particularly in the Southern supertaxon (see below), they may have a different phonological shape and meaning there (usually the meaning of the future tense). (6) A special directive case in Kumandy (but not Standard Altay) expressed by -za, -ze, -sa, -se, cf.
Khakas -za, -zer, -sar, -ser, -nzar, -nzer. Apparently, this feature is quite unique.
Altay, Khakas and Tuvan vocabulary
Proficient Kyrgyz speakers sometimes report good mutual intelligibility with Standard Altay. Indeed, we have 76% for the Khakas-Altay pair as opposed to the similar figure of 75% for the Kyrgyz-Standard Altay pair in Swadesh-215 (borrowings excluded). The distance to any other language from Altay is even greater, with an average of about 70%, or just 69% in the case of Tuvan. An attempt to find common Altay-Khakas-Tuvan innovative isoglosses produces a number of potential lexical innovations:
As can be seen from the table above, Altay, Khakas and Tuvan share a rather large number of apparently innovative lexemes, some of which are shared only between one pair of languages, while others are shared across the board. These isolexemes provide substantial support for the existence of the Altay-Sayan genetic unity. As to the reported Altay-Kyrgyz partial mutual intelligibility, it should be noted that most of the lexemes found above are not shared with Kyrgyz, setting it apart from the Altay-Sayan languages. Moreover, a certain proximity between Altay and Kyrgyz can also be explained by the considerable linguistic archaism of these two languages and their posterior interaction in the 17th-18th centuries (see Kyrgyz-Altay isoglosses below).
Altay, Khakas and Tuvan history and geography
The Altai and the Western Sayan Mountains belong to the same mountain system, whereas the Tian Shan is a different matter, separated from the Altai Mountains by the basin of the upper Irtysh river. The distance from Lake Issyk-Kul, where Kyrgyz people are presently located, to the Altai Mountains is over 800 km (500 miles). In other words, Altay and Kyrgyz are not geographically connected. On the other hand, the habitat of the Altay (Turkic) people is very close to the traditional habitat of Khakas, and especially Shor. For instance, the map from The Atlas of the World Population (1964), which supposedly reflects the distribution of ethnic groups during the first half of the 20th century, clearly shows the position of Northern Altay peoples in the direct vicinity of Shor and Khakas.
Old Soviet ethnographic maps of the Altay-Sayan area (1940-60's)

Note: The presence of the many unexpected ethnic groups that you can find on the first map, such as Chuvash, Tatar, Mordvins, (Volga) Germans, etc., scattered all over the Altai Krai and Khakassia, is mostly connected with the famine of the 1920's, when there was a mass railroad migration from the Middle Volga to West Siberia, Uzbekistan and other unaffected areas. Presently, most of these ethnic groups have probably been assimilated, at least for the most part, and have presumably lost their original languages, though some of them may still exist in the same location.

In any case, we have come to the conclusion that the geographical considerations generally speak for a high probability of Altay-Khakas relatedness and against a readily available physical connection between the Altay and Kyrgyz languages.

Little is known about the local Altay and Shor history. Curiously, as Radlov mentions about the Shor people in 1861 [Aus Sibirien. Lose Blätter aus meinem Tagebuche (From Siberia: Loose pages from my diary), Wilhelm Radloff, Leipzig, 1893]:

"In vain did I try to exact any historical legends from them [the Mrassu Shors], they could not even name the five ancestors, which any Altayan knows. The 102-year old man could only say that, as he had heard from his father, they had always lived peacefully in this land, and nothing had changed about their way of life except for their faith [=the Orthodox Christianity]; they had always been fishermen, and as far as he could remember, everything stayed the same."

We may hypothesize that the migration from the Altai to Khakassia or vice versa might actually have proceeded along the Abakan river, which has its source in the Altai Mountains, near the approximate separation area of the Northern and Southern Altay dialects, and which flows through the lands of the Sagai Khakas and Beltir Khakas into the Yenisei River.
The Abakan seems to provide an easily available geographic link between the Proto-Khakas and Proto-Altay areas.

Note: The interpretation of the Abakan river's name as "bear's blood" is unlikely and may represent a folk etymology, given that there exists a separate tributary of the Yenisei named Kan, as well as a number of other rivers in Siberia exhibiting the same root -kan, presumably meaning "river". Moreover, many other hydronyms in the area do not seem to point towards a Turkic origin, therefore the hydronym Aba-Kan may in fact be non-Turkic. More curiously, there exists the Ubagan River in the Turgay Valley east of the Urals, but its connection to the Abakan of Khakassia is a mystery.

The ethno-geographical distribution of the Altay Turkic, Khakas and Tuvan subgroups can be summarized in the map below. As in the other similar cases, this distribution mostly reflects the early 20th century situation, when most ethnographic data were collected. By the early 21st century, these areas have shrunk significantly and some dialects (such as Lower Chulym) have even become extinct.

The approximate distribution of the Altay, Khakas and Tuvan peoples by the beginning of the 20th century (2012)

Additionally, the complexity of this geographic distribution leads to the conclusion that the amount of dialectal and linguistic diversification among the members of the Altay, Khakas and Tuvan subtaxa is rather profound and implies at least 1000 years of internal differentiation. By no means do Altay, Khakas and Tuvan presently constitute single, standalone languages.
Conclusions: Based upon (1) several probable phonological innovations; (2) many shared archaisms in grammar; (3) the large amount of mostly innovative shared isolexemes exclusive to the Altay-Sayan subgrouping, including a well-established lexicostatistical relatedness between Altay, Khakas and Tuvan in Swadesh-215; (4) the geographic proximity and the evident geographic connection between Altay, Khakas and, to a lesser extent, Tuvan languages and dialects; we may conclude that the existence of the Altay-Sayan proto-state becomes a rather plausible hypothesis. Moreover, as lexicostatistical calculations show, there's more proximity between Standard Altay and Standard Khakas on one hand, than between Standard Khakas and Tuvan on the other. We have also shown above that Tuvan and Khakas share no exclusive innovations. These considerations imply that Proto-Tuvan must have been the first to separate from the Proto-Altay-Sayan stem, whereas Proto-Khakas and Proto-Altay either followed much later or strongly interacted with each other for several centuries, exchanging lexis and phonological features. At least, the particular relatedness of Kumandy (and reportedly other Northern Altay languages) to Khakas, first noted by Baskakov (1969), can probably be attributed to this later secondary interaction. During the 2nd millennium CE, a further diversification of Proto-Tuvan, Proto-Khakas and Proto-Altay into smaller languages produced considerable linguistic and dialectal variation in the Altay-Sayan area. 
The Languages of the Great Steppe

Kimak-Kypchak-Tatar, Kyrgyz-Kazakh, and Chagatai-Uzbek-Uyghur seem to form a genetic unity

According to the present classification, the Turkic languages of the Great Steppe include the following languages and language clusters, among the most typical representatives: (1) Kyrgyz, Kazakh, Karakalpak, and possibly the extinct dialect of the Karluks; (2) the spoken medieval Chagatai, medieval Sart, modern Uzbek, Uyghur and their multiple dialects; (3) Bashkir, Kazan Tatar, Sibir Tatar, Nogai, Kumyk, North Crimean Tatar, Karachay-Balkar, the unattested Kimak dialect, etc.

Note: The geographic term Great Steppe is used herein to refer to the western and largest part of the Eurasian Steppe, which stretches from the Altay Mountains to the Black Sea. For more geographical details see also the introduction to The Proto-Turkic Urheimat & The Early Migrations of Turkic Peoples.

The Great-Steppe languages seem to share many common elements and are reported to retain good mutual intelligibility (subjectively up to 80% in actual speech). Their speakers often get the impression that all of the Turkic languages are very close to each other, even though this impression is in fact connected with the intelligibility of these neighboring languages, mostly scattered across the Eurasian steppeland areas and the Tian Shan Mountains in the countries of the former Soviet Union. In any case, we should suppose that these languages are particularly closely related, and we will try to demonstrate this below.

The history and geography of the early Great-Steppe languages

Apparently, until about 700 AD, all of the proto-members of this presumable supertaxon had occupied the area somewhere near the Irtysh River in the Altay Krai region.
During the rise and fall of the Göktürk-Uyghur Kaganate between the 720s and the 840s, these tribes were affected by the strife with the Göktürks (described in the Orkhon inscriptions), and were probably compelled to migrate (or allowed to move after the dissipation of the Göktürks) from the Irtysh River towards present-day Kazakhstan, the northern Tian Shan, and then deeper into the Great Steppe, though the connection of this migration with the Göktürks and Uyghurs and other details are rather hypothetical and poorly supported. To establish the earliest known factual migrations, we should first take a look at the earliest attestations of the potential members of this taxon:

(1) The Karluks are reported to have migrated from the Altay Mountains to Suyab and established their confederacy in the Jeti-Su (Zhetisu) by about 760-766 AD. However, virtually nothing is known of this Karluk dialect, and its relatedness to other languages under consideration is purely conjectural. The relatedness of the Karluks to the Kyrgyz is only suggested by their migration to modern-day Kyrgyzstan and by the name's phonology, which implies a superficial similarity with other languages of Kyrgyz and Kimak origin.

(2) The Tatar clan, presumably forming an important part of the Great-Steppe clans, was first clearly attested, among other Turkic tribes, in the Kul Tegin Orkhon inscription c. 732 in reference to the burial of Bumin Kagan in 552. Judging from the later distribution of the Tatars in the Great Steppe, the Proto-Kimak-Kypchak-Tatar tribes must have been situated along the upper course of the Irtysh River. And indeed, we know they formed their own Kimak Kaganate along the Irtysh after 840 AD.
(3) The Kyrgyz tribes of Kyrgyzstan could have migrated from the Irtysh towards the Jeti-Su region probably after the 840's, that is, after the fall of the Uyghur Kaganate (which was essentially the continuation of the Göktürk Empire), when the Yenisei Kyrgyz tribes allegedly sacked the Uyghur capital in Mongolia's Orkhon valley and drove the Uyghurs out of there, establishing their own Kyrgyz Kaganate afterwards. However, the exact details of these events are very confusing, and there are more interpretations in the Russian and Kyrgyz historiography about the origins of the Kyrgyz of Kyrgyzstan than solid facts. An alternative hypothesis suggests that the Kyrgyz had been present in the area between the Tian-Shan and the Altai Mountains since about 200 BCE, when Proto-Turkic tribes and the early "Proto-Central" dialect first appeared in the region [See The hypothesis of linguistic interaction near Zaisan below].

Despite the vagueness of the earliest records, the historical evidence for the Great-Steppe members seems to point to the existence of certain early tribal unities located (1) in the Kulunda Steppe, (2) near the middle-to-upper course of the Irtysh, (3) along the thin strip of land near the upper course of the Irtysh River as it passes through the Altay Mountains flowing from Lake Zaysan. From 200-300 BCE until about 600-800 AD, the early Karluk, Kyrgyz, Tatar and Kimak tribal clans were apparently all situated near this area in the close vicinity of the Kulunda Steppe, Altai Mountains and Lake Zaysan, possibly forming the Proto-Great-Steppe language unity.

The phonology of the Great-Steppe languages

Most phonological similarities of the three language clusters described above, namely Kimak-Kypchak-Tatar, Kyrgyz-Kazakh and Chagatai-Uzbek-Uyghur, are not exclusive to them; they can also be found in Southern Altay and Oghuz (especially Turkmen), which can probably be attributed to the formation of a local linguistic area.
In other words, besides the Great-Steppe languages being a genetic unity in the strict sense of the word, we may also speak of the Great-Steppe languages as a Sprachbund in a broader sense, with some additional ethnicities included in this linguistic area. Some features of this Sprachbund may be present in some of these languages but absent in others. The idea is that most of these Great-Steppe features first arose within the genetic unity, but then spread to other members of the Great-Steppe Sprachbund. In any case, most languages of the Great Steppe can be characterized by the following phonological characteristics:

(1) A further lenition of the intervocalic -z- > -y-: cf. Khakas azaq, but Standard Altay and Kumandy ayak, Kyrgyz ayaq, Kazakh ayaq, Chagatai ayaq, Kimak-Kypchak-Tatar *ayaq, Oghuz *ayaq. Note that this feature was originally absent from the descendants of Proto-Orkhon-Karakhanid, which preserved a fortified -d- or -ð-, cf. Orkhon Old Turkic aDaq, adaq, Karakhanid aðak (=the exact pronunciation is uncertain, possibly as a slight interdental /ð/ or an alveolar), Khalaj hadaq.

(2) The absence of the final -G/-g, as in Standard Altay tu:, Kyrgyz to:, Kazakh to:, Karachay taw, Bashkir taw, Kazan Tatar taw "mountain", but Tuvan taG/daG, Khakas taG, Kumandy (a Northern Altay language-dialect) taG, Oghuz-Seljuk *dag.

(3) Apparently, the i > e innovative mutation, as in Standard Altay eki, Kumandy eki, ekki, iki (depends on the dialect), Kyrgyz eki, Kazakh eki, Karachay eki, Nogai eki, Kumyk eki "two", but Tuvan ihi, Khakas iki, yet Oghuz *iki. Note again that transitions in vowels are often unreliable, lack sufficient historical stability, may emerge independently, or be an areal feature.

(4) A special voicing pattern as in Kazan Tatar sigez "eight", tugïz "nine", Karachay-Balkar segiz, toGuz, Kyrgyz segiz, toGuz.
Here, the second and third consonants are voiced as opposed to Altay, Kumandy segis, togus, Khakas segis, toGis, Yugur saGïs, doGïs, Orkhon Old Turkic sekiz, toquz, Uzbek sakkiz, to'kkiz. The grammar of the Great-Steppe languages (1) The languages of the Great Steppe are characterized by a unique and a very typical shared innovation: the -d-ik / -d-ïk / -d-ük / -d-uk, etc. Past Tense suffix (1st person, plural) as in "we did" or the -se-k in the Subjunctive Mood as in "if we would". It can be found in some of the Southern Altay language-dialects, Kyrgyz, Kazakh, most Chagatai languages, all of the Kimak-Kypchak-Tatar and Oghuz languages. On the other hand, the suffix is almost entirely absent from the Orkhon-Karakhanid branch [though occasionally present in late Karakhanid and Khalaj (where it was probably borrowed from Azeri)], "Siberian" Turkic, Yugur, Salar and Chuvash, where the historical archaic *-d-imiz or a synharmonically similar form is used instead in the Simple Past Tense. Note: As a matter of fact, the *-d-imiz suffix is recognizably Nostratic — actually, -miz is one of the earliest Nostratic morphemes mentioned by H. Pedersen in his article on Turkish phonology in 1903 — therefore, we may conclude that -ik / -ïk / -ük / uk, etc is a later innovation. (2) At least such languages as Kyrgyz, Kazakh, Chagatai-Uzbek-Uyghur, Karachay-Balkar, Nogai, Karaim exhibit a very odd 3rd person singular -tï ending in verbs: cf. Kyrgyz bara-t "s/he will go", Kazakh bara-dï "s/he is going", Nogai bara-dï "s/he goes", Sibir Tatar (Tyumen) para-tï "he goes", Uzbek borap-ti "s/he is going now", bara-di "s/he will go", Uyghur yazi-du "s/he, they (will) write". This pretty striking 3rd person verbal marker, so similar to that of Latin, may make one wonder whether the above-mentioned Turkic languages retained a Nostratic feature. 
However, it seems that this ending is a mere contraction of the common Turkic -dïr, -dir, -dur, -dür, -tïr, -tir, -tur, -tür, used in different connotations in nearly all Turkic grammars and mostly expressing certainty or the auditive mood. The key to understanding how this contraction could have come to life is to realize that the ending -r in Turkic Proper is generally unstable and must either transform into a -z (according to the law of zetacism) or simply disappear, as happens in modern Turkish dialects, Uyghur and possibly elsewhere. Hence, apparently, the -tïr > -tï > -t transition in Kyrgyz.

The vocabulary of the Great-Steppe languages

The lexicostatistical proximity of most Great Steppe languages (except for certain members on the geographic periphery) is quite undeniable and can easily be observed. See, for instance, the diagram for The Wave Model of the Turkic Languages above. However, many of these similarities turn out to be archaisms shared with Standard Altay, and sometimes even Khakas, Turkmen and other neighboring languages on the fringe of the Great Steppe, whereas true innovations are harder to detect. In any case, consider the following lexical and phono-semantical instances, mostly from Swadesh-215, that seem to be innovative because of the absence of these isolexemes in other branches:

(1) Kimak-Kypchak *üy, Kyrgyz üy, Kazakh üy, Uzbek öy, Uyghur uy, also St. Altay öy, Turkmen öy "home" as opposed to Khakas ib, Turkish ev and a different phonological shape in Tuvan ög, Kumandy ük. The *eb form is probably more archaic judging from the Korean chip and Old Japanese ipe "home, house". The *öy word may in fact be more innovative and akin to the Great-Steppe *uya, Seljuk *yuwa, Chuvash yâwa "nest", though this latter etymological conjecture does not seem to have been noted anywhere else.
[Verified with Sevortyan's Etymological Dictionary]; (2) Kimak-Kypchak *tüye, Kyrgyz tö, Kazakh tüye, Uzbek tuya, Uyghur töga, also Standard Altay tö, tebe, Turkmen tüye as opposed to Khakas tibe, Tuvan teve, Sakha taba, Karakhanid teve, Old Uyghur teve, Azeri devä, Turkish deve "camel", Chuvash teve. Apparently, this word has undergone innovative phonological modification in Great-Steppe; (3) Kimak-Kypchak *may, Kyrgyz, Kazakh may, Uzbek moy, Uyghur may, also Standard Altay and Altay dialects may, Turkmen may "fat" (noun), apparently innovative, absent elsewhere. [Verified with Sevortyan's Etymological Dictionary]; (4) St. Altay bet, Kimak-Kypchak *bet, Kyrgyz, Kazakh bet, Uzbek bet, Uyghur bet "face"; apparently innovative. [Verified with Sevortyan's Etymological Dictionary]; (5) Kyrgyz sürt-, Kazakh sürt-, Uzbek sürt-, Uyghur sürt-, Tatar sürt-, Bashkir hört-, Karachay-Balkar sürt- "to wipe" as opposed to Altay arla:r, archïnar, Khakas chïzrga, Turkmen süpür- "to wipe". Apparently, innovative; (6) Kyrgyz oylo:, Kazakh oylau, Uzbek oyla-, Uyghur oyli-, Tatar uyla-, Bashkir utla-, Karachay-Balkar –, Turkmen üyt-, pikir et-, say-, as opposed to St. Altay sanan, Khakas saGïn-, "to think, ponder". Apparently, innovative; (8) Kyrgyz jïrlau, Kazakh zhïrlau, Tatar jïrla-, Bashkir yïrla-, Karachay-Balkar jïrla-, as opposed to St. Altay qozhoNdor, Khakas ïrl-, Turkmen sayra- "to sing". Apparently, innovative; (9) Kyrgyz qursaq, Kazakh qursaq, Uyghur qorsaq, Tatar qorsaq, Bashkir qorhaq "belly", as opposed to Oghuz-Seljuk *qarïn, St. Altay ich, Khakas xarïn, isti, cf. Standard Altay qursak "pregnant". Apparently, innovative in this meaning. [Verified with Sevortyan's Etymological Dictionary]; (10) Kyrgyz ïshku:, Kazakh ïskïlau, Uzbek ishqala-, Tatar ïshqïrga, Bashkir ïshqïu, Karachay-Balkar ïshïrGa "to rub", as opposed to Oghuz-Seljuk *sürt(en), St. Altay jïzhar, Khakas chïzarGa. 
Apparently, innovative; (11) Kyrgyz sürtu, Kazakh sürtü:, Uyghur sürt, Tatar sörtörgê, Bashkir hörtöü, Karachay-Balkar sürterge "to wipe", as opposed to Turkmen süpür-, Seljuk *sil-, St. Altay arla:r, archanïr. Apparently, innovative; (12) Kyrgyz ïrGïtu:, Kazakh ïrGïtu, Tatar ïrgïtu, Bashkir ïrGïtïu "to throw", as opposed to Uzbek, Uyghur at-, Oghuz-Seljuk *at-, St. Altay chachar, Khakas tastirGa, silerge. Apparently, innovative; (13) Kazakh dala, Kyrgyz tala:, Tatar dala, Bashkir dala, Uyghur dala "steppe, desert". Apparently, innovative but could be a borrowing (?); (14) Kazakh dawïs, Tatar tawïsh, Bashkir tawïsh, Karachay tawush, Uzbek towush, Uyghur tawush "voice". Apparently, it is not found elsewhere, therefore probably innovative; (15) Kazan Tatar yanGïr, Bashkir yamGïr, Sibir Tatar yaNGïr, Nogai yamGïr, Karachay janGur, Kyrgyz jamGïr, Uzbek yomgir, Uyghur yamGur "rain" is definitely an innovative metathesis from a more archaic *jaG-mïr, which originally seems to have meant "falling water", judging from the fact that the latter word is widely distributed in East Altaic languages as Tungusic *mu "water" and Mongolic mören "river", as well as Korean mul "water" and even Japanese mizu "water". The original variant is attested in all the other Bulgaro-Turkic branches, cf. Chuvash s'â-mâr, Sakha sa-mï:r, Khakas naN-mïr, Altay jan-mïr, Turkish ya:-mur "rain". An abundance of archaisms can also contribute to the demonstration, if they come in sufficiently large numbers. Below, there are a few words from Swadesh-215 that seem to be shared archaisms, because of their occasional presence in other Bulgaro-Turkic branches: (1) Kyrgyz ötkür, Kazakh ötkir, Uzbek o'tkir, Uyghur ökür, Tatar ütken, Bashkir ütker, Turkmen ötgür "sharp" as opposed to Karachay-Balkar jiti, St.
Altay kurch, Khakas chitig "sharp"; also found in Tuvan, therefore probably a retention; (2) Kyrgyz tishte, Kazakh tisteu, Uzbek tishla-, Uyghur chishli-, Tatar teshle-, Bashkir teshle-, Standard Altay tishte, as opposed to Karachay-Balkar qab-, Khakas ïzïr- "bite"; a retention; (3) Kyrgyz keN, Kazakh keN, Uzbek keN, Uyghur keN, Tatar kiN, Bashkir kiN, Karachay-Balkar keN "wide", as opposed to Oghuz-Seljuk genish, St. Altay d'albaq, Khakas chalbaq, a retention; (4) Kyrgyz qatïn, Kazakh qatïn, Uzbek xotun, Uyghur xotin, Tatar xadïn, Bashkir qatïn, Karachay-Balkar qatïn "wife", as opposed to Oghuz-Seljuk kadïn "woman", St. Altay üy, Khakas ipchizi "wife", probably a retention; (5) Kyrgyz tayaq, Kazakh tayaq, Uzbek tayoq, Uyghur tayaq, Tatar tayaq, Bashkir tayaq, Karachay-Balkar tayaq "stick", as opposed to Oghuz-Seljuk chöp, chubuk, St. Altay agash, Khakas agas, tayax, a retention since it is known even in Chuvash tuya; (6) Kyrgyz soGush, Kazakh sogïs, Tatar suGïsh, Bashkir huGïsh "war", as opposed to Uzbek, Uyghur, Turkmen *urush, St. Altay d'u:, Khakas cha:, Turkish savash. Either archaic or innovative; (7) Kyrgyz burulu:, Kazakh bu^ru, Uzbek bur-, Uyghur buri-, Tatar borïrga, Bashkir borolou, Karachay-Balkar bururGa, St. Altay burïlar "to turn (right, left)", as opposed to Oghuz-Seljuk *dön-, Khakas aylanarGa; a retention; (8) Kimak-Kypchak *ayt, Kyrgyz, Kazakh ayt-, Uzbek ayt-, Uyghur eyt-, also St. Altay ayt-, Sagay Khakas ayt-, Turkmen ayt- "to say", though cf. Turkish ayït- "to concern". Apparently an archaism, since it is also found in Sagai Khakas and Sakha as et "to tell, to say" and Tuvan aytïr- "to explain" and others. However, it is particularly stable as the main verb for telling or saying in the languages of the Great Steppe. 
[Verified with Sevortyan's Etymological Dictionary]; Conclusions: A group of tribes inhabiting the Kulunda Steppe and the upper course of the Irtysh River near Lake Zaysan and the Altai Mountains before 600-700 AD finally led to the formation of the Kimak-Kypchak-Tatar, Kyrgyz-Kazakh, and Chagatai-Uzbek-Uyghur subtaxa. The descendants of these subtaxa are hereinafter referred to as the languages of Great Steppe, or the Great-Steppe (super)taxon. Most languages of the Great-Steppe share relatively good mutual intelligibility and many common archaic and innovative isolexemes because of their close linguistic relatedness. Moreover, some of the languages of the Great Steppe may have additionally affected the development of Turkmen, South Altay, Baraba Tatar and perhaps other geographically related subgroups, in which case we may additionally speak of the Great Steppe Sprachbund that includes some languages on the Great Steppe periphery because of the posterior interaction with them. Great-Steppe and Altay-Sayan seem to be closer to each other than to Oghuz-Seljuk We have seen in the discussion above that in some cases the Great-Steppe languages find some similarities with South Altay presumably because of secondary interaction. Below, we will briefly study the features that may genetically relate the Great-Steppe languages to the languages of the Altay-Sayan subgroup at a deeper level. There are basically two options. If the hypothesis about the Great-Steppe-Altay-Sayan relationship were correct, it would mean that the Orkhon-Oghuz-Karakhanid and Proto-Yakutic branches had been the first to separate from Proto-Turkic Proper, whereas Proto-Great-Steppe-Altay-Sayan split up only several centuries after that. Were it wrong, it would mean that Great-Steppe and Orkhon-Oghuz-Karakhanid should share many common features, whereas Altay-Sayan must have separated early on. 
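The two options just described can be written down as rough branching orders, which makes it easier to see what each one predicts. The sketch below is only an illustration under stated assumptions: the nested tuples are a simplified reading of the paragraph above, the position of Yakutic under the second option is not stated in the text and is therefore left out, and the helper names (`leaves`, `clades`, `is_clade`) are hypothetical.

```python
# Nested tuples stand for branching order; leaves are branch names from the text.
# ASSUMPTION: both shapes are a simplified reading of the two options,
# not a tree given by the article itself.
HYPOTHESIS_CENTRAL = (
    "Yakutic",
    "Orkhon-Oghuz-Karakhanid",
    ("Great-Steppe", "Altay-Sayan"),  # these two split only later
)
HYPOTHESIS_ALTERNATIVE = (
    "Altay-Sayan",  # separates early
    ("Great-Steppe", "Orkhon-Oghuz-Karakhanid"),
    # Yakutic is omitted: its position under this option is not stated.
)


def leaves(tree):
    """Return the set of branch names found under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf sets of every internal node, including the root."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def is_clade(tree, *names):
    """True if exactly these branches form one subgroup in the tree."""
    return frozenset(names) in clades(tree)


for label, tree in [("central", HYPOTHESIS_CENTRAL),
                    ("alternative", HYPOTHESIS_ALTERNATIVE)]:
    print(label, is_clade(tree, "Great-Steppe", "Altay-Sayan"))
```

Shared innovations between Great-Steppe and Altay-Sayan would favor the first shape, while shared innovations between Great-Steppe and Orkhon-Oghuz-Karakhanid would favor the second.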
The grammar of Great-Steppe and Altay-Sayan

(1) The extensive usage of -Gan- / -ken- in the Perfect Tense instead of the Oghuz-Seljuk -mïsh-/-mush- or Sakha -bït-/-mït- is rather typical of the Great-Steppe and Altay-Sayan languages. Nevertheless, the -Gan suffix is also sporadically present in various direct and indirect functions in Orkhon Old Turkic, Karakhanid, Salar, Yugur, whereas -mïsh- is also known in Cuman-Polovtsian, Uzbek, Tuvan and some other languages. The -Gan in Karakhanid and Oghuz-Seljuk is used only in participles and adjectives, not in the Perfect Tense [see for instance SIGTY. Morphology. (1988)]. The -mïsh- in Uzbek is evidently inherited from Karakhanid. In Tuvan and Tofa, it has a slightly different meaning of "still doing something", whereas the Perfect Tense is still expressed there with the -Gan- / -ken- suffix. Consequently, despite some intermingling, the distinction between the mïsh-languages and Gan-languages, which separates Great-Steppe and Altay-Sayan from Yakutic and Orkhon-Oghuz-Karakhanid, altogether seems to be rather sharp and clearly defined. Since the Oghuz-Seljuk -mïsh-/-mush- or Sakha -bït-/-mït- seem to be an archaism possibly related to the verb bol- "to be" and found in the Yakutic branch that must have been the earliest to separate, the usage of -Gan- / -ken- in the Perfect Tense may turn out to be rather innovative. Consequently, grammatical considerations seem to point to the Great-Steppe and Altay-Sayan relationship.

The vocabulary of Great-Steppe and Altay-Sayan

A few examples of the presumable lexical innovations shared by the Great-Steppe and Altay-Sayan are listed below.
(1) Khakas omas, Altay ötpös, Tatar ütmês, Kazakh ötpês, Kyrgyz ötpögön, Uzbek ûtmas, Uyghur ötmes "dull (of a knife)";

(2) Tuvan kïlïr, Bashkir kïlïu, Kyrgyz kïlu:, Uzbek qilmoq, Uyghur qilmak "to do", whereas in Oghuz-Seljuk this word has been mostly displaced by etmek, and in Chuvash by tu;

(3) Khakas kiche:, Altay keche, Tatar kichê, Bashkir kisê(ge), Kazakh keshe, Kyrgyz keche, Uzbek kecha "yesterday", as opposed to probably more archaic Tuvan dün, Uzbek tünügün, Karachay tünene, Oghuz-Seljuk *dün;

(4) Altay ölöN, Tatar ülên, Bashkir ülên "grass". Moreover, according to Sevortyan's dictionary, cf. Khakas, Kumyk ölöN (or similar) meaning "feather grass (=Stipa, one of the most typical kinds of grass in the steppe)"; "Elytrigia (type of grass)" in Sakha; "Carex (sedge)" in Kyrgyz, Kazakh; "grass" in Uyghur, Uzbek, though modern dictionaries of these languages do not confirm some of the data listed by Sevortyan;

(5) Khakas köberge, Altay köbör, Karachay köberge, Kyrgyz köbü:, Uyghur qaparmak "to swell (as of a finger, foot)";

(6) Khakas sörtirge, Altay sü:rte:r, Tatar söyrêu, Bashkir höyrêu, Kazakh süyrêu, Kyrgyz süyrö: "to pull (behind oneself)";

(7) Khakas, Tuvan, Tatar, Bashkir, Karachay, Kyrgyz, Kazakh, Altay *qol as opposed to Oghuz-Seljuk *el, *elig, Sakha il:i, Chuvash alâ; probably an archaism;

(8) Tuvan t.ö:, Khakas tigi, Tatar tege, Bashkir tege, Kyrgyz tigi "that (furthest) (adj)", e.g. "that book"; probably a retained archaism, perhaps even of Altaic and Nostratic type.

The lexicostatistical considerations for Altay-Sayan and Great-Steppe relationship

At first glance, lexicostatistically, there is an average distance of about 69% from Oghuz to Great-Steppe and about 64% from Great-Steppe to Altay-Sayan (with Tuvan) or 68% (without Tuvan). However, we should take into consideration the mutual lexical exchange among the members of these taxa.
This applies to the Great Steppe languages that interacted with the Southern taxon, such as Kimak and particularly Uzbek-Uyghur, on the one hand, and to those that interacted with Altay-Sayan, namely Kyrgyz, on the other (see the details in the corresponding chapters). So we are left with Kazakh as the only supposedly "pure" representative of the Great Steppe in our lexicostatistical study. We can also try Bashkir, which was confined to the Urals and probably had minimum interaction with Oghuz. Similarly, we should omit Tuvan from the Altay-Sayan because of the great number of Mongolian borrowings that are hard to detect and that may have infiltrated into the Tuvan list. We should also omit Altay because of its potential interaction with Kazakh, given that the Altai Mountains form part of eastern Kazakhstan and there are Kazakh settlements in the Altai. By the same token, within the Oghuz-Seljuk taxon, we should omit Turkmen because of its potential interaction with Kazakh, Karakalpak and Uzbek, and so we are left only with Azeri-Turkish. Consequently, the average lexicostatistical distance (1) for Kazakh and Azeri-Turkish is 66%; (2) for Kazakh and Khakas is 68%; (3) for Bashkir and Azeri-Turkish is 64%; (4) for Bashkir and Khakas is 67%. The resulting difference of 2-3% is very small, but the balance now seems to be tipped in favor of the Great-Steppe-Altay-Sayan relationship. In any case, from the lexicostatistical perspective Altay-Sayan, Great-Steppe and Oghuz-Seljuk seem to have separated from each other almost at the same time.

Conclusion: It seems that Great-Steppe and Altay-Sayan may be a little more closely related to each other than either of them is related to Oghuz-Seljuk, Sakha or any other remaining Turkic subgroups. However, the similarities are few and doubts still remain.
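The 2-3% margin mentioned in the lexicostatistical paragraph above can be retraced with simple arithmetic. The following is a minimal sketch that only uses the four percentages quoted in the text; the dictionary layout and the function name `margin` are illustrative assumptions, not part of the original study.

```python
# Shared-vocabulary percentages quoted in the text for the restricted pairs.
PAIRS = {
    ("Kazakh", "Azeri-Turkish"): 66,
    ("Kazakh", "Khakas"): 68,
    ("Bashkir", "Azeri-Turkish"): 64,
    ("Bashkir", "Khakas"): 67,
}


def margin(steppe_language: str) -> int:
    """Percentage points by which Khakas scores higher than Azeri-Turkish."""
    return (PAIRS[(steppe_language, "Khakas")]
            - PAIRS[(steppe_language, "Azeri-Turkish")])


# Hypothetical helper output: one margin per Great-Steppe representative.
margins = {lang: margin(lang) for lang in ("Kazakh", "Bashkir")}
for lang, diff in margins.items():
    print(f"{lang}: Khakas closer by {diff} percentage points")
```

Both margins are positive but small, which is why the result is read as only a slight tilt towards the Great-Steppe-Altay-Sayan relationship.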
We will hereinafter refer to this supposed Great-Steppe-Altay-Sayan unity as the Central supertaxon for short, because it was geographically located somewhere in the middle between Proto-Sakha and Proto-Orkhon-Oghuz-Karakhanid.

The Kyrgyz-Chagatai subtaxon

As mentioned above, the languages that supposedly belong to this subtaxon are: (1) Kyrgyz, Kazakh and Karakalpak; (2) medieval Chagatai, modern Uzbek and Uyghur.

The history of the Karluks and their bearing on Proto-Kyrgyz-Chagatai

According to scanty historical records, the Karluks left the Altai mountains circa 665 AD, and migrated towards the Jeti-Su (the Seven Waters region between Lake Balkhash and the Tian Shan Mountains), reaching the Amu-Darya River by about 700. This implies that they may be related to Proto-Kyrgyz-Chagatai, originally distributed near the same region (though not at all necessarily). After the famous Battle of Talas in 751, when the Chinese were defeated by the Arabs and Arab supremacy in the region was established, the Karluks were able to form the Karluk Kaganate (in 766) by occupying Suyab, the capital of the Western Turkic Kaganate. It was perhaps the political turmoil in the Western Turkic Kaganate that allowed the Karluks to seize power in the Jeti-Su. The final fall of the Eastern Göktürk Kaganate in 840 left the Karluks in full possession of the Jeti-Su region (the area between the northern Tian Shan and Lake Balkhash). These events must have led to the formation of the Proto-Kyrgyz of Kyrgyzstan (and ultimately, after the 1450's, the Kazakh and Karakalpak languages), though neither the exact details nor the historical relatedness between Karluk and Kyrgyz were clearly documented. After 840, there could have been a second wave of Kyrgyz migration to the Jeti-Su from the Kulunda Steppe (sources?) that ended the political domination of the Karluks and finally brought the name of "Kyrgyz" to present-day Kyrgyzstan (sources?), though the details of this process are still very unclear.
The Chagatai subtaxon, which includes Uzbek, Uyghur and their dialects, is named "Karluk" in Baskakov's classification (see a separate paragraph below). Baskakov's name "Karluk" for this subtaxon is unacceptable on the same grounds as above: the ethnic affiliation and the exact Turkic dialect spoken by the Karluks are rather obscure. By contrast, the Chagatai origins of Uzbek-Uyghur are well-established.

Kazakh is closely related to Kyrgyz

Before we proceed with the discussion of larger taxa, we will attempt to show the close linguistic relatedness between Kazakh and Kyrgyz, which is an important question for the historiography of Kazakhstan and Kyrgyzstan.

The Kyrgyz and Kazakh ethnonymic confusion

Before the 1920s the Kazakh people were traditionally known as Kirgizy "the Kyrgyzes" among Russians. As the often cited anecdote goes [apparently, first mentioned by Kurbangali Khalid (1843-1913)], when asked about their ethnic affiliation, a Kazakh would normally answer something like, "Men Qazaq-pïn", only to be corrected by a 19th-century Russian officer, "What kind of Kazak are you? You're a Kirgiz!". The discrepancy is probably due to the frequent application of the ethnonym Kazak to the Cossacks of the Polovtsian Steppe and the members of the Cossack army. Both are pronounced in Russian as /kazAk/, nearly in the same way as /kazAkh/ "Kazakh", which inevitably resulted in conflation. As Max Vasmer's Russisches Etymologisches Woerterbuch (1950-58) suggests, based on Radlov, who lived among the Kazakh nomads in the 1860's, the original meaning of Kazak was "free-lancer, an independent adventurer, soldier of fortune", thus it could have been applied in the medieval period to many different groups of Turkic, Slavic or any other origin. Whether true or not, this interpretation has become generally accepted.

Note: However, this famous Radloff-Vasmer etymology seems to be rather a folk etymology, hardly corroborated by factual vocabulary. The suffix -q seems Turkic indeed.
Among roots of similar phonetic shape, there are Turkic *qaz- "to dig", *qazïq "pole", *qazan- "to gain", and Arabic qazza:b "liar", gazawat "sacred war", etc. Apparently, none of them refers to a "free-lancer". It is more reasonable to assume that *qazaq was originally the name of the leader of a small clan, subsequently lost in history. The Cossacks of the Ponto-Caspian region must have received their name from the Kazakhs of Kazakhstan via interaction with the Nogai clans, though there seems to be little specific evidence.
Consequently, to avoid confusion, the Kazakhs were officially called Kazakh Kirgizes, whereas the Kyrgyz of Kyrgyzstan were called Kara Kirgizes. And indeed, in many 19th-century publications, such as Radloff's Versuch eines Woerterbuches der Tuerk-Dialekte (1893), printed in German and Russian, Kazakh was formally named Kirgiz (Kirgizischer Dialekt), whereas Kyrgyz was formally named Kara-Kirgiz (Kara-Kirgisischer Dialekt). The Kara-Kirgizskaya Autonomous Oblast was actually the earliest official title of Kyrgyzstan, given in 1924.
As to the origins of the ethnonym Qyrqyz, there are more wild guesses than well-argued explanations. The name is obviously at least 1300 years old, as it was mentioned in the Orkhon inscriptions (720's), though it had probably existed even earlier. It seems to be the original name applied not only to the Yenisei Kyrgyz tribes, but also to the members of the Kyrgyz Kaganate and, in a broader sense, to most Turkic tribes of the eastern part of the Great Steppe, at least until the Mongol invasion. Moreover, a lake in the Great Lakes Depression in western Mongolia (south of Tuva) was for some reason named Lake Kyrgyz or Khyargas, presumably because of the association with the Yenisei Kyrgyz. As a result, it is actually very difficult to differentiate between the Yenisei Kyrgyz, the Kyrgyz of the Kyrgyz Kaganate, and the early Kyrgyz of Kyrgyzstan, though all of them seem to be ethnologically different entities.
Phonetically, the word Qyrqyz can be associated with qyr- "break, smash" or qorq- "fear". It seems to be a reduplication, typical of Turkic languages, in which the root was repeated for emphasis as *qyr-qyr, with the final -r of the second element changing to -z according to the law of zetacism in Turkic Proper. The original meaning could therefore be "breaker" (strong warrior). Most likely, as explained above, the word Qyrqyz was originally a name or a war alias of a clan progenitor or chief, which later spread to the name of his clan (as in the case of the Seljuks, Noghai, Uzbeks, etc). The event could probably be dated to as early as the beginning of the Common Era, judging by the action of the zetacism law, which would place it among the oldest known self-appellations used by the Turkic peoples.
Specific phonological features in Kazakh-Karakalpak
The similarities between Kyrgyz and Kazakh are so many that it is easier to discuss their differences first. The table below lists some of the phonological differences which seem to have emerged in Kazakh and Karakalpak because of their secondary contact with the Kimak-Kypchak-Tatar languages, particularly Nogai, as well as possibly with some unknown Southern Uralic substratum. By contrast, Kyrgyz seems to be more archaic, exhibiting more retentions.
Consequently, we can see that the features distinguishing Kazakh-Karakalpak from Kyrgyz are also shared by some of the Kimak languages that were part of the Golden Horde, particularly the nearby Nogai. Such phonetic evidence probably led Baskakov to believe that Kyrgyz and Kazakh are not even closely related, and that Kazakh should be regrouped with Nogai. However, judging from the good lexical matches between Kyrgyz and Kazakh, which were not measured by Baskakov, this is clearly not the case. Rather, the purported relatedness between the Kimak languages and Kazakh must result from the many shared archaisms and from a few secondary changes in Proto-Kazakh-Karakalpak that arose from a later interaction of the early Kazakh with the languages of the Golden Horde, most likely the early Nogai.
The grammar of Kyrgyz and Kazakh
Both Kyrgyz and Kazakh retain a great number of archaic features, many of which are also known to exist in the Altay-Sayan Turkic languages. As far as the innovative elements are concerned, Kyrgyz and Kazakh seem to exhibit the following grammatical elements:
(1) Both Kyrgyz and Kazakh use a characteristic 2nd person plural pronoun, apparently absent from other branches, cf. Kyrgyz sizder, siler; Kazakh sizder, sender.
(2) A rather unique type of instrumental case, cf. Kyrgyz menen, e.g. qol menen "with the hand", Kazakh -men, -pen, -ben; also menen. This feature is probably archaic, however, given that *menen is also known in certain dialectal variations of standard languages, such as Eastern Bashkir or Sagai Khakas.
An even greater number of grammatical traits is simultaneously shared with the Chagatai-Uzbek-Uyghur languages (see below). However, besides the similarity, there is also some notable discrepancy in grammatical usage and morphology:
The lexis of Kyrgyz and Kazakh
Kyrgyz seems to be a rather archaic language with a minimum number of lexical borrowings, which clearly sets it apart from Kimak, which includes a number of Oghuz innovations and Perso-Arabic loanwords (see below). Speakers of both Kazakh and Kyrgyz usually report good mutual intelligibility and sometimes state that they are bir tuGan "of one kin". The differences in Swadesh-215 seem to be very small, no more than 8%, and in some cases these are just minor inconsistencies in dictionaries. Only the following clear-cut mismatches were found in the original Swadesh-200:
Kyrgyz küyö, Kazakh küyeu "husband"; Kyrgyz chöp, Kazakh shöp, Uyghur chöp "grass"; Kyrgyz sogu:, Kazakh soGu "blow (of wind) (originally: strike)"; Kyrgyz qachïq, Kazakh qashïq "far away" (from kach- "to run away"); Kyrgyz soru:, Kazakh soru "suck" also exist in Altay-Khakas and/or Uzbek-Uyghur but seem to be absent or not typical in Tatar-Bashkir.
Kyrgyz özön, Kazakh özön "river" is typical in this meaning only of Kyrgyz-Kazakh, though the word is also known in Kumyk, Tatar, Salar, Altay, etc as "brook", "stream" and in Crimean Tatar as "river" (which may be an independent semantic mutation).
Also, cf. the phonological similarities in Kyrgyz jumurtqa, Kazakh zhûmurtqa "egg"; Kyrgyz jalbïraq, Kazakh zhapïrak "leaf", which are rather unique among the Turkic languages (and presumably archaic).
The history and geography of Kazakh
The Kazakh Khanate was founded in 1456-1465 by Janybek (Zhany-bek) Khan and Kerey Khan in the Jeti-Su area (in the southeastern part of present-day Kazakhstan), following a successful rebellion against the Uzbek Ulus and its ruler Abu'l-Khayr Khan. [These events were described by Mukhammed Khaydar in Tarih-i-Rashidi]. The early years of the Kazakh Khanate were marked by the struggle against the Uzbek leader Muhammad Shaybani, who was defeated in 1470. Consequently, the Jeti-Su (Zhetysu, "The Seven Waters") area north of Almaty, and especially the area of the Chu River, can be regarded as the Kazakh Urheimat, where the Kazakh Khanate was first founded and where the Kazakhs began their expansion into the Great Steppe in the north.
On the other hand, the Chu River, which rises in the present-day territory of Kyrgyzstan and now runs along the Kazakh-Kyrgyz border, is often seen as a traditional Kyrgyz habitat as well. In fact, this is where Bishkek, the capital of Kyrgyzstan, is located.
Almaty, the largest city of Kazakhstan, is only 200 km (120 miles) away from Bishkek across the Zaili (= from Russian Za-Ili-yskiy "Trans-Ilian, behind the Ili River") Alatau Ridge, so both settlements are situated at the foot of the Tian Shan Mountains in nearly the same area. Consequently, the geographic and historical connection between the Kyrgyz and Kazakh ethnicities becomes quite evident.
The dialectal differentiation in Kazakh
There are at least two major dialectal groups within the Kyrgyz language: the Northern and Southern dialects. This dialectal differentiation in Kyrgyz marks it as a slightly "older language" than Kazakh, which is much more dialectally uniform. Indeed, despite the large territory it occupies, Kazakh is often reported to have no dialects at all, especially in popular, nonscientific sources. However, this is not entirely true. The Western Kazakh dialect may differ (or may have differed in the past, before the mass Russification and the TV standardization began) from the Eastern one in several ways, including such features as the Western /zh/ : Eastern /j/ pronunciation, the usage of -zhaq / -zhek for the future tense, etc. Moreover, certain minority dialect-languages can presently be viewed as nothing but westernmost dialects of Kazakh, since they share 98% of mutual intelligibility with it, e.g. the so-called Karagash Nogai language in Astrakhan along the Volga (not to be confused with Nogai Proper on the Caspian Sea) and Karakalpak.
In any case, the weaker dialectal differentiation in Kazakh as compared to Kyrgyz marks it as a somewhat "younger" language that must have spread north from an area of stronger dialectal differentiation, such as the foot of the Tian Shan Mountains near Kyrgyzstan, but was affected by the dialect of the Nogai clans in the Great Steppe south of the Urals.
Alternative taxonomic hypotheses
The placement of Kyrgyz within the same subgroup as the Altay Turkic languages was popularized by Baskakov's famous classification, which became a generally accepted standard in Soviet and Russian Turkology [see Baskakov, N.A. Klassifikatsiya tyurkskikh yazykov v svyazi s istoricheskoy periodizatsiyey ikh razvitiya i formirovaniya (The classification of Turkic languages as connected to the historical periodization of their formation and development), Moscow (1952)]. However, judging by his later works from the 1960's to 1988, it appears that there was little or no specific argumentation for this taxonomic decision. Generally speaking, Baskakov's classification was based on phonological and grammatical features, and some personal intuition, without any vocabulary comparison.
Conclusions:
The close relatedness between Kazakh and Kyrgyz is hardly deniable. In fact, they are so lexically close (92%, Swadesh-215) that under certain simplifying circumstances they could even be viewed as very distant dialects or variants of each other. However, the notable discrepancy in phonology and grammar marks them as distinct languages.
We can now draw several conclusions concerning early Kazakh history. Based on (1) the weaker dialectal differentiation in Kazakh as compared to Kyrgyz; (2) the presence of notable Nogai phonological features; (3) the geographical proximity of Kazakh to the languages of the Golden Horde, particularly Nogai; (4) its original location along the Chu River, near the present-day Kyrgyzstan border, Kazakh can be viewed as a historically recent (14th-16th century) expansion of Kyrgyz-related tribes from the Tian Shan Mountains into the northern steppeland.
Because of the expansion over the large territory of the Kazakhstan steppe, the early Kazakh tribes must have made contact with various languages and dialects of the Golden Horde, specifically the early Nogai and other Kimak-related dialects along the Volga and the Ural (Yaik / Jaik) River. This contact may have resulted in the formation of a "Nogaicized" form of medieval Kyrgyz, which finally led to the emergence of the present-day Kazakh and Karakalpak languages.
Altay-Kyrgyz isolexemes
Besides the close proximity between Kazakh and Kyrgyz, there also exist several Altay-Kyrgyz isolexemes, which make the Kyrgyz relationship with Kazakh less apparent.
Altay and Kyrgyz lexis and phonology
In basic vocabulary, Altay and Kyrgyz share a number of isolexemes:
(1) Altay jaan, Kyrgyz choN, and Uyghur chong "big";
(2) Altay kurch, Kyrgyz kurch "sharp (as of a knife)";
(3) Altay moko, Kyrgyz mokok "dull (as of a knife)", also cf. Tuvan mugur, probably from Mongolian;
(4) Altay d'ün, Kyrgyz jün, Khakas chüg "feather" as opposed to Kazakh qawïrsïn;
(5) Altay sok, sogor, Kyrgyz sogu:, Kazakh soGu "to blow (as of wind)" (literally "to strike");
(6) Altay uk, Kyrgyz ugu: "to hear"; also found in Khakas, Uyghur, Kazakh as "to understand", though this word is more typical of the Altay dialects than of any other languages. The word may be related to the Mongolian uqa-/uxa- "to understand" [see Sevortyan's dictionary (1974)];
(7) Altay küyer, Kyrgyz küyü: "to burn (intr.)", also attested in Khakas, Tuvan;
Among examples of lesser importance, one can also note:
(8) Altay sler, Kyrgyz siler "you (plural)", not to be confused with sizder; cf. the similar but not identical Kazakh secondary formations sen-der, siz-der.
The siler isolexeme is obviously not exclusive to Kyrgyz-Altay, but is widely used in Altay-Sayan and Uyghur, as well as probably in some other Turkic languages east of the Tian Shan;
(9) Altay bul, Kyrgyz bul, Kazakh bûl, and also Bashkir bïl "this", instead of the apparently more archaic *bu (despite the external etymologies alleged in Starling, where the Altaic words for "body" seem to be used). However, this particular phonological shape was picked up much earlier, before the separation of Kazakh, and is rather archaic;
Moreover, note the following phonological similarities:
(1) Altay üren, Khakas üren, Kyrgyz ürön "seed", as opposed to Kazakh ûrïq, Uzbek uruG, Uyghur uruq;
(2) Altay sö:q, Khakas sö:q, Tuvan sö:q, Kyrgyz sö:q "bone", as opposed to Kazakh süyeq, Uzbek suyoq, Tatar söyaq;
(3) Altay o:s, Khakas a:s, Tuvan a:s, Kyrgyz o:z "mouth", as opposed to Kazakh awïz, Tatar avïz;
In other words, the typical Altay-Sayan phonological contraction that we discussed earlier in the chapter dedicated to Altay-Sayan is also present in Kyrgyz, at least to some extent.
Kyrgyz history
One of the most dramatic periods in the history of the Kazakh nation was marked by the long-lasting struggle (1723-1758) against the Dzungarian Khanate, which ruled over East Turkistan and West Mongolia in the 18th century. This severe and brutal conflict finally forced the Kazakhs to seek an alliance with the Russian Empire in 1731.
It is assumed herein that this period could also have been marked by Altay-Kyrgyz migrations, which might have brought Altay Turkic to the Tian Shan Mountains, where it intermingled with the local Kyrgyz language. This tentative hypothesis is corroborated by the fact that some similar migrations from the Altai to the Tian Shan are mentioned in the Manas, the Kyrgyz epic.
Some corroboration may also be found in the ethnonymic conflation between the Altay-kizhi people (= Standard Altay speakers living in the Altai) and the Oirots (= Dzungarians of Mongolic origin near the Mongolian Altai), since the Altay-kizhi retained the name of Oirots or Oirats well into the Soviet era. This conflation suggests that some of the Altay-kizhi could have become part of the Oirat army and participated in the invasion of the Tian Shan. It is also known from historical records that the Kyrgyz people had been pushed by the Oirat invasion into the Ferghana valley [The Great Russian Encyclopedia (2005)]. Moreover, some of the Mongolic Oirats, known as Sart-Kalmaks, survived the downfall of the Dzungarian Khanate (1755-58) and became part of the Kyrgyz tribes staying near Lake Issyk-Kul.
If this conjecture is true, all the changes in Kyrgyz that differentiate it from Kazakh and make it similar to Altay must be relatively recent, acquired just a few centuries ago.
Kyrgyz geography
The present-day mountain habitat of the Kyrgyz people in the Tian Shan appears to be a typical isolated refugium formed after several military invasions from the Kazakhstan steppe and the Taklamakan desert, such as the Mongolian invasion (c. 1220-1450) and the Dzungarian invasion (c. 1720-1750's). This implies an early Kyrgyz presence along the northern part of the Silk Road in the Jeti-Su (Zhetisu) area and the Ili Valley during the early Middle Ages. This earlier and more eastern habitat at the foothills of the Tian Shan was later overrun by the Kara-Khitans, Mongols, Dzungarians, and other invaders, making the Kyrgyz migrate closer to Lake Issyk-Kul in the Tian Shan.
Conclusion:
Many or most of the Altay-Kyrgyz isolexemes are equally found in Khakas and sometimes even Tuvan. Given that (1a) Altay has been shown above to belong to the Altay-Sayan taxon, (1b) Kyrgyz has been shown above to be closely related to Kazakh, and (2) few of these words are found in the closely related Kazakh language, we may conclude that most of these unexpected Altay-Kyrgyz isoglosses are late borrowings brought into Kyrgyz from Altay Turkic sometime between the 1500's and the 1900's, that is, after the separation of Proto-Kazakh. The most likely historical event that occurred in this geographic region during that period was the Dzungarian invasion of the 18th century. Therefore, we may assume that there existed an 18th-century military migration from the Altai to the Tian Shan Mountains, which brought these originally Altay lexemes into Kyrgyz, making the Kyrgyz language presently look more similar to Altay Turkic than it actually may be.
In any case, we must infer from the lexical evidence above that Kyrgyz is still more closely related to Kazakh than to any other Turkic language, whereas the features shared by Altay and Kyrgyz must result from a secondary interaction between the two.
Chagatai looks like Karakhanid affected by Kyrgyz
The Chagatai subtaxon includes medieval Chagatai, modern Uzbek, Uyghur and their dialectal variations.
The Chagatai subtaxon
First of all, note that with 86% of lexical proximity in Swadesh-215 (obvious borrowings excluded), the Uyghur and Uzbek languages (and their internal dialects) must be about as close to each other as Turkish and Azeri, which are the common example of closely related languages within the Turkic group and outside of it. Both languages received their respective names only in the 1920's, having been known as Chagatai, Sart or Türki for most of the time before that.
The Chagatai subtaxon is often known as Karluk in Baskakov's classification and those of his followers. However, as we have explained above, the exact origins and linguistic affiliation of the Karluks are very obscure, and it is far from clear what relation the early Chagatai people bore to the Karluk tribes. Moreover, this kind of misplaced ethnonymic emphasis leaves the Chagatai language and its well-known relatedness to Uzbek and Uyghur unjustly forgotten, which may make one wonder what kind of Turkic language Chagatai actually was. For these reasons, the name "Karluk" for this taxon seems to be out of place and should probably be replaced with Chagatai.
Chagatai-Uzbek-Uyghur geography
Just like the neighboring Kyrgyz, the Chagatai-Uzbek-Uyghur languages originally occupied mountain territories along the Tian Shan range as well as some of the suitable oases along the edges of the nearby deserts.
Note: The Tian Shan is one of the longest mountain ranges in Central Asia, forming part of the natural barrier between the Great Steppe in the north and the Taklamakan desert in the south. It merges with the Pamirs in the west and is separated from the Altai by the Dzungarian Plain in the east.
A topographic map of the Tian Shan Mountains [topomapper.com (2011)]
Chagatai-Uzbek-Uyghur history
The Chagatai Ulus was a Turko-Mongol Khanate inherited by Chagatai Khan (1183-1241), the second son of Genghis Khan (1162-1227), but ruled by his successors. The true founder of the Chagatai Ulus was Alghu, the grandson of Chagatai, who in 1261 established control over most of its territory but died in 1266.
Chagatai Khanate [en.wikipedia.org (2011)]
Giovanni da Pian del Carpine, who passed through the Chagatai Ulus north of the Tian Shan Mountains in 1245, described scenes of great devastation in the nearby western areas left after the war with the Mongols:
Moreouer, out of the land of the Kangittæ [= probably, the land of Kangly located near the Ustyurt Plateau or nearby area], we entered into the countrey of the Bisermini [= apparently, a vague alias for Turkic-speaking Muslims, cf. dialectal Russian basurmany from musulmany "Muslims"], who speake the language of Comania [= by Cumania the author meant the vast land between the Kievan Rus in the west and the Volga River in the east, where Cuman-Polovtsian, or (Old) Kypchak, was spoken], but obserue the law of the Saracens [= Islam, Sharia]. In this countrey we found innumerable cities with castles ruined, and many towns left desolate. The lord of this country was called Soldan Alti, who with al his progenie, was destroyed by the Tartars [= the Mongols, Tataro-Mongols, Turko-Mongols, the Tatar tribes directed by the Mongols]. This countrey hath most huge mountains [= apparently, the Tian Shan]. On the South side it hath Ierusalem and Baldach [= Baghdad], and all the whole countrey of the Saracens [= Arabs, Muslims]. In the next territories adioyning doe inhabite two carnall brothers dukes of the Tartars [= Mongols], namely, Burin and Cadan, the sonnes of Thyaday [= Chagatai], who was the sonne of Chingis Can.
[Frier Iohn de Plano Carpini, The long and wonderful voyage of Frier Iohn de Plano Carpini, (1245-46)]
Political strife in the Chagatai Ulus never ceased from the days of its formation. In 1346, a tribal chief named Qazaghan, from the Mongolic tribe of Qaraunas in Afghanistan and eastern Persia [Babur noted that they still spoke Mongolian in the late 15th century], killed the Chagatai khan Qazan during a revolt. Qazan's death marked the end of effective Chagatayid rule over Transoxiana. As a result, the administration of the region fell into the hands of local chieftains of Turkic and Mongolic origin. Taking advantage of the disintegration, Janibeg Khan, the ruler of the Golden Horde from 1342 to 1357, asserted Jochid dominance over the Chagatai Khanate.
Note: It is believed that Janibeg's army catapulted infected corpses into the Crimean port city of Kaffa (1343) in an attempt to use the plague to weaken the defenders. Infected Genoese sailors subsequently sailed from Kaffa to Genoa, introducing the Black Death into Europe.
However, the Chagatayids expelled Janibeg Khan's administrators after his assassination in 1357. By 1363, the control of Transoxiana was contested by two tribal leaders, Amir Husayn (the grandson of Qazaghan) and the famous Timur, or Tamerlane. Timur [from Turkic temir "iron"] eventually defeated Amir Husayn and took control of the state.
As a legacy of the severe devastation caused by the Mongol invasion and the ensuing feudal turmoil, the Karakhanid language of the Tarim Basin lost its political dominance and cultural significance in the region. It is conjectured herein that the desolation of towns, the spread of deadly disease, the subsequent intervention of the Golden Horde and the resulting continual movement of large armies, as well as the later conquest of the Golden Horde territories by the powerful Chagatai leader Timur (Tamerlane), resulted in the supplanting of the Karakhanid language by an unknown Great-Steppe dialect situated along the northern ridges of the Tian Shan Mountains, such as an early Kyrgyz or Karluk. Consequently, the early Chagatai language that emerged during that period was essentially a mixed dialect, mostly based on Kyrgyz grammar but with Karakhanid phonology.
Chagatai-Uzbek-Uyghur phonology
By taking a closer look at the actual lexical and phonological differences (see the table below), we may conclude that Uzbek and Uyghur phonology bears certain similarities to Karakhanid, e.g.:
(1) an innovative /*S-/ > /y-/ mutation, just like in Orkhon-Karakhanid, e.g.
Uzbek, Uyghur, Karakhanid yol "way" as opposed to Kyrgyz jol, Kazakh zhol; Uzbek yurak, Uyghur, Karakhanid yürek "heart" as opposed to Kyrgyz jürek, Kazakh zhürek;
(2) the retention of the nasal /-N-/ as in Karakhanid, cf. Karakhanid müNüz, Uzbek mugiz, Uyghur müNgüz "horn"; Karakhanid süNük, Uyghur söNäk (but Uzbek suyak), as opposed to Kyrgyz sö:k, Kazakh süyek "bone";
(3) the intervocalic or final uvular or velar /-G-/, /-G/, cf. Karakhanid taG, Uzbek tôG, Uyghur taG "mountain"; Karakhanid baGïr, Uyghur beGir "liver". By contrast, the languages of the Great Steppe all have /-w-/ and /-w/ in this case;
(4) the initial /b-/ instead of /m-/ just as in Karakhanid, cf. Karakhanid boyun, boyïn, Uzbek bûyin, Uyghur boyin "neck", as opposed to Kyrgyz moyun, Kazakh moyïn;
(5) the retention of the final /-vq-/ in certain words, such as in Karakhanid yuvqa, Uzbek yupka, Uyghur yupqa "thin", as opposed to Kyrgyz juka;
(6) the lenition of the "heavy" /-d-/, /-t-/ into the "lighter" /-l-/, which gives Uzbek-Uyghur a more lenited, more simplified and more western pronunciation, as in Uzbek -lar, Uyghur -lar, -lêr, as opposed to Kyrgyz -lar, -ler, -lor, -lör, -dar, -der, -dor, -dör, -tar, -ter, -tor, -tör with its heavy, fortified consonants, and some similar fortition in other languages of the eastern part of the Great Steppe.
On the other hand, the Great-Steppe phonological influence in general and the Kyrgyz influence in particular are also quite evident, cf.
(1) the innovative metathesis in Uzbek yamGir, Uyghur yamGur as in Tatar yaNgïr, Bashkir yamgïr, Nogai yamGïr, Kyrgyz jamgïr and other languages of the Great Steppe, instead of the Old Uyghur yaG-mur from *jaG- "to fall, to rain" and *mur, the typical Proto-Altaic word for "water";
(3) Uzbek mûgiz, muguz, Uyghur müNgüz, which is similar to the Kazan Tatar mögez, Bashkir mögöð, instead of Karakhanid müNüz, Old Uyghur müyüz;
(4) Uzbek sovuk, which is similar to the Kazan Tatar sïwïq, Bashkir hïwïq, Nogai suwïq instead of the Karakhanid suGïq, though the latter is also partly retained in Uyghur soGaq;
(5) Uzbek yaproq from Proto-Kimak *yapraq instead of the longer yapurgak in Karakhanid, though the Old-Uyghur-Karakhanid pronunciation is also partly retained in modern Uyghur yapurmaq;
The table below lists some of the phonologically dissimilar words in the Turkic languages of Central Asia. Note that Uzbek, Uyghur and Karakhanid are mostly colored dark red, marking the apparent lexical and phonological relatedness of Uzbek-Uyghur to Karakhanid, with just a few Kimak-Kypchak-Tatar borrowings in Uzbek.
Chagatai-Uzbek-Uyghur grammar
However, the Uzbek-Uyghur grammar usually lacks the most essential Orkhon-Karakhanid features (which may be only occasionally present in Chagatai). Note in particular:
(1) the lack of the archaic copula er-/är- (see below) and its mutation to e- in Uzbek e-mes, e-dim just like in other languages of the Great Steppe; neither is there any notable usage of tägül, which was known in Old Uyghur;
(2) the lack of the typical Karakhanid usage of the 3rd pers. singular pronoun ol as a copula (see below), e.g. ul mêniN oGlïm ol, literally "he (is) my son-he". The ol-copula mutated to zero in the modern Uzbek-Uyghur languages;
(3) the absence in Uzbek-Uyghur of the Future Tense with -Gay, -gey (see below), known in Karakhanid, Old Uyghur and other representatives of the Southern branch, though it may sometimes be retained in written Chagatai as -Ge;
(4) the absence of the archaic instrumental case ending -(n)ïn, which was originally present in Karakhanid, Old Uyghur and other early branches of Turkic Proper;
(5) the lack of the archaic directional case ending -Garu known from Old Uyghur and other representatives of the Southern branch;
(6) no persistent usage of -mïsh- (replaced by -Gan- as in other languages of the Great Steppe), though -mïsh- is still sporadically present in Chagatai and Uzbek dialects.
The situation with -mïsh- seems to be more complex than it may initially appear, since -mïsh- can be used quite actively in modern Uzbek (consider, for instance, the song provided as an example in The Turkic languages in a Nutshell), but seems to be absent from the published grammars of "literary" Uzbek. That may imply that the grammar of Standard Literary Uzbek is the same kind of fiction as those of Standard Khakas, Altay, Evenk, Nenets, etc.
Note: The creation of "literary" local languages (sometimes renamed herein as "standard" in English) was part of the general paradigm in the postwar Soviet Union.
Since it was quite difficult or even impossible to conduct specific research for each and every local dialect and to separate all the dialects from all the languages, certain simplifications had to be made, with several major dialects being clustered into a single category and the local particularities being ignored and forgotten. In some cases, this procedure could even lead to the loss of intelligibility with the proclaimed literary standard or a virtual loss of the vernacular.
As a matter of fact, the most typical grammatical features of modern Uzbek and Uyghur clearly point to the languages of the Great Steppe, particularly Kyrgyz and Kazakh. Consider the following Uzbek-Uyghur morphemes:
(1) The typically Great-Steppe verbal ending -di / -dï / -ti / -tï in the 3rd person singular in the present and future tense, e.g. Uzbek bor-ap-ti "he is going", bar-a-di "he will go", Uyghur bar-i-du "he'll go", yaz-i-du "s/he, they (will) write", cf. Kyrgyz bar-a-t "he will go", Kazakh bar-a-dï "he is going".
(2) The usual Great-Steppe verbal ending -d-ik in the 1st pers. plural Past Tense, cf. Uzbek bor-d-ik "we went", kel-d-ik "we came", Uyghur yaz-d-uq "we wrote" as in Kyrgyz bar-d-ïk, kel-d-ik, even though it seems to be used interchangeably with the Karakhanid -dimiz > -divuz in the Tashkent dialect of Uzbek, cf. bar-d-uvuz "we went", kel-d-ivuz "we came". The -d-ik type of suffix also seems to be occasionally attested in Karakhanid sources in relation to Oghuz, but it was never original to the Orkhon-Karakhanid subtaxon.
(3) The typically Kyrgyz-Kazakh -ïb-man, -ïp-tïr Unexpected Past Tense as in Uzbek unut-ib-man "so it turns out I forgot", Uzbek kel-ip-ti "so he really came", Uyghur yez-ïp-tu "he (really) wrote", cf. Kyrgyz al-ïp-tïr "so it turns out he took it, he really took it", Kazakh söyle-p-ti "he seems to have said", bar-ïp-pïn "I might have gone".
(4) The -yat-ïr-man Present Continuous Tense as in Uzbek yaz-a-yat-ïr-man "I am writing", Tashkent Uzbek bor-wot-tï "he is going" (a contracted form), Uyghur kir-i-wati-men "I'm coming in" (a contracted form), cf. similar forms in Kazakh bar-a-zhat-ïr-mïn, Kyrgyz bar-a-jat-a-mïn "I walk, I'm walking", Kyrgyz oku-p-jat-a-mïn "I'm reading". The original grammatical meaning was actually "I am lying doing something", which perhaps initially implied a leisurely, slow passage of time as if resting in a yurt. The -a- suffix here seems to be just a spoken contraction of the -ïp- gerundial suffix, given that the latter is much more widely used in Kyrgyz and Kazakh in similar expressions.
(5) The typically Central-taxon -Gan Perfect Tense, normally absent from the Southern taxon where Karakhanid and Old Uyghur belong, e.g. Uzbek ishla-Gan-man "I have worked", Modern Uyghur yaz-Gan-män "I have written", cf. Kazakh ol kel-gen "he has come", Kazakh men kel-gen-min "I have come", etc.
(6) The widely used -a-man, -y-man, -e-men Habitual Present / Future Tense instead of the -r- Aorist in Old Uyghur and Karakhanid, e.g. Uzbek ishla-y-man "I work; I will work", Uzbek men bil-ma-y-man "I don't know", Uyghur kir-i-men "I enter", cf. Kyrgyz bar-a-mïn "I will go", Kyrgyz bil-be-y-min "I don't know", Kazakh bar-a-mïn "I will go", Kazakh bol-a-mïn "I will be". The Aorist in Uzbek-Uyghur is now used only in the meaning of a potential or uncertain future, e.g. Uzbek bar-ar-man "I think I will go", Uyghur kir-ir-men "I might enter", Uyghur tut-mas "he might not catch (hold) it".
(7) The -mak-chi-men Tense expressing wish or intention, e.g. Uzbek qil-moq-chi-man "I'm going to do it", Uyghur yaz-maq-chï-men "I'd like to write", cf. Kyrgyz jaz-mak-chï-mïn "I want to write". The construction originally meant "I am the doer (for this)" > "I'm eager to do this", and it does not seem to be attested in Karakhanid.
(8) The -Gin / -Gïn imperative as opposed to the -Gil / -Gïl / -qil / -qïl imperative in Karakhanid and Old Uyghur, e.g. Uzbek oqi-gin "You read!", Uyghur yaz-Gin, yez-iN "You write!", cf. Kyrgyz bar-gïn "You go!", but Karakhanid tur-gïl "Stand up!".

Chagatai-Uzbek-Uyghur lexis

But to which subgroup within the Great Steppe taxon is Chagatai-Uzbek-Uyghur most closely related? According to the lexicostatistical research (2012), the average lexical similarity of Uzbek-Uyghur is about 83% with Kyrgyz-Kazakh, about 78% with Tatar-Bashkir, and about 74% with Turkmen (with borrowings excluded), which marks Kyrgyz-Kazakh as the most closely related subtaxon (outside Orkhon-Karakhanid, which could not be counted lexicostatistically). Uzbek-Uyghur and Kyrgyz-Kazakh seem to share a few presumably innovative isolexemes in Swadesh-215 that are apparently missing or rare in other subgroups, cf.
(1) Uzbek yiqilmoq, Uyghur yiqilmaq, Kazakh zhïGïlu, Kyrgyz zhïGïlu "to fall";
(2) Uzbek dumaloq, Uyghur domlaq, Kazakh domalaq "round (such as wheel, lake, table)";
(3) Uyghur chöp, Kazakh shöp, Kyrgyz chöp "grass";
(4) Uzbek uqalamoq, Uyghur ugulumaq, Kazakh uqalau, Kyrgyz ukalo: "to rub".
Moreover, note certain Great-Steppe words with a somewhat wider distribution in the nearby languages:
(5) Uzbek bu yerda, Uyghur bu yerde, Kazakh bûl zherde, Kyrgyz bul zherde "here", also at least in Altay bu d'erde and Turkmen bu yerde "here". This phrase, of course, is not necessarily originally Kyrgyz-Chagatai or even Great-Steppe; it may have formed at an earlier level or even independently in several Turkic subgroups, with some later spread through contact (for instance, probably into Turkmen, which often borrowed from Great-Steppe). Nevertheless, its usage in the Kyrgyz-Chagatai subgroup in the sense of "here" is quite typical.
(6) The verb kïl- in its direct meaning of "to do" seems to be particularly common in Kyrgyz, Uzbek, Uyghur and Bashkir; however, it is not limited to these languages and is widely distributed in various meanings from Tuva to Turkey.
(7) Uzbek tüshün-, Uyghur chüshen-, Kazakh tüsin-, Kyrgyz tüshün-, Tatar töshen-, Karachay-Balkar tüshün-, Kumyk tüshün-, Turkmen düshün- has the meaning "to understand", for the most part, only in the above-listed languages, even though it may also be distributed in other branches in similar meanings, e.g. Turkish, Gagauz and Azeri düshün- "to think", Nogai "to look into something, to study" and Kumyk "to guess", etc. [Verified with Sevortyan's Dictionary]. It seems that the meaning "to understand" was formed in the Great-Steppe subtaxon, whence it spread into Oghuz-Seljuk (or vice versa). The original literal meaning of this verb was "to fall oneself; to be fallen" from *tüsh-ün-, as if "I fall myself; I'm being fallen (into this)", as in the English idiom "it sinks in".
(8) Uzbek tovush, Uyghur tawush, Kazakh dawïs, Tatar tawïsh, Bashkir tawïsh, Karachay-Balkar tawush "voice", a Great-Steppe innovation.
(9) Uzbek uy, Uyghur öy "home, house", most Great-Steppe languages *üy.

The four presumably exclusive isolexemes (1)-(4) constitute merely 2% of Swadesh-215, so it is hard to make any claims concerning a particular relatedness of Uzbek-Uyghur to Kyrgyz-Kazakh. However, the general trend in the analysis of the vocabulary described above is to exclude the Kimak subgroup from the direct predecessors of Chagatai. That becomes even more evident if we take into consideration the closer geographic proximity between Kyrgyz-Kazakh and Chagatai-Uzbek-Uyghur, as opposed to the Kimak tribes scattered somewhere near the Urals.
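The arithmetic behind the two lexical observations above can be restated in a few lines. The following is only a minimal sketch: the percentages and the count of four isolexemes are taken from the text, while the variable and function names are illustrative assumptions and are not part of the original lexicostatistical study.

```python
# Average similarity of Uzbek-Uyghur to other subgroups, as quoted in the
# text (lexicostatistical research of 2012, borrowings excluded).
similarity_to_uzbek_uyghur = {
    "Kyrgyz-Kazakh": 0.83,
    "Tatar-Bashkir": 0.78,
    "Turkmen": 0.74,
}

SWADESH_LIST_SIZE = 215      # size of the wordlist named in the text
EXCLUSIVE_ISOLEXEMES = 4     # items (1)-(4): "to fall", "round", "grass", "to rub"


def rank_by_similarity(scores):
    """Return subgroup names ordered from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)


def share_of_list(n_items, list_size):
    """Return the fraction of the wordlist covered by n_items."""
    return n_items / list_size


ranking = rank_by_similarity(similarity_to_uzbek_uyghur)
share = share_of_list(EXCLUSIVE_ISOLEXEMES, SWADESH_LIST_SIZE)

print(ranking)            # closest subgroup first
print(f"{share:.1%}")     # share of the list covered by the four isolexemes
```

The ranking only restates the order of the three quoted figures, and the share shows why four items out of 215 are too few to carry a claim of special relatedness on their own.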
By the same token, there are no grounds to suggest that Proto-Kazakh could have affected Proto-Chagatai in a direct way, since we know from history that the formation of Chagatai must have occurred before the separation of Kazakh from Kyrgyz, which is corroborated by the lack of any Kazakh-exclusive isolexemes. Quite on the contrary, we have:
(1) Kyrgyz-Chagatai *yamGur "rain", but Kazakh zhaNbïr;
(2) Kyrgyz-Chagatai *qïl- "to do", but usually Kazakh isteu, zhasau.
Consequently, we should infer that the Great-Steppe tribe that came into contact with Karakhanid in the 13th-14th century belonged to the Kyrgyz-Kazakh subgroup, thus resulting in the formation of early Chagatai, whereas the Kimaks or the early Kazakh tribes could not have played any significant role in this exchange. The tribal union in question could be Karluk, but there is no direct linguistic evidence.

Conclusion: It all looks as if Proto-Chagatai were a language of newly arrived Kyrgyz-related speakers who continued to build sentences in a way similar to modern Kyrgyz or Kazakh but adopted the Karakhanid-style pronunciation, e.g.
Proto-Uzbek-Uyghur *müNüz cf. Karakhanid müNüz, instead of Kyrgyz müyüz "horn";
Proto-Uzbek-Uyghur *taG cf. Karakhanid taG, instead of Proto-Kyrgyz-Kazakh *taw "mountain";
Proto-Uzbek-Uyghur *aGïz cf. Karakhanid aGïz, instead of Proto-Kyrgyz-Kazakh *awïz "mouth";
Proto-Uzbek-Uyghur *boyun cf. Karakhanid boyun or boyïn, instead of Proto-Kyrgyz-Kazakh *moyun "neck";
Proto-Uzbek-Uyghur *quruq cf. Karakhanid quruq, instead of Proto-Kyrgyz-Kazakh *qurGaq "dry";
Proto-Uzbek-Uyghur *ye- cf. Karakhanid ye-, instead of Proto-Kyrgyz-Kazakh *je- "to eat";
Proto-Uzbek-Uyghur *yupqa cf.
Karakhanid yuvqa, instead of Proto-Kyrgyz-Kazakh *juqa "thin".

However, many Karakhanid words were replaced by their Great-Steppe and Proto-Kyrgyz-Kazakh equivalents, such as *üy "home, house" instead of Karakhanid äv; often *qorsaq "belly" instead of Karakhanid *qarïn; *yamGur "rain" with a metathesis instead of Karakhanid yaGmur, etc.

Consequently, we can see that the Chagatai-Uzbek-Uyghur languages seem to have inherited the original Kyrgyz grammar and some of the vocabulary, but acquired a superficial phonological similarity to Karakhanid. The retention of grammar and lexis is normally more fundamental than changes in phonology, which can be achieved more easily. Therefore we may conclude that the original Karakhanid speech of the 10th-12th centuries has not survived in the Tian-Shan and Taklamakan, having been overrun, during the complex turmoil and ethnic disorder of the 13th-century Mongol invasion, by the new speech of newcomers from the northern foothills of the Tian-Shan Mountains who spoke a Kyrgyz-related dialect. (The only living direct descendant of Southern Karakhanid seems to be Khalaj, as shown below.)

A counter-argument that Karakhanid and Old Uyghur may be poorly attested and may perhaps possess some of the grammatical features described here as purely Great-Steppe is implausible, judging from the fact that these grammatical features are equally absent from the Oghuz-Seljuk languages (the closest modern sibling of Karakhanid), and still mostly belong to Proto-Kyrgyz-Kazakh.

Approximate glottochronological calculations suggest that the separation of Proto-Chagatai from Proto-Kyrgyz-Kazakh must have occurred at least a few centuries before the Mongol invasion, c. 1000 AD, so it is difficult to attribute Proto-Chagatai directly to the early Kyrgyz; rather, it could have been a slightly different Kyrgyz-related dialect, possibly such as Karluk, though the linguistic affiliation of the latter remains unknown.
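The phonological side of this conclusion can be tabulated mechanically. The sketch below is only an illustration: the forms are copied from the seven correspondences listed in the conclusion above, while the data layout and the function name are assumptions made here. Exact string identity is a deliberately crude stand-in for proper sound-correspondence analysis.

```python
# (gloss, Proto-Uzbek-Uyghur, Karakhanid, Kyrgyz-Kazakh side), as cited in the
# text. Asterisks marking reconstructed forms are omitted; for "neck" only the
# first Karakhanid variant (boyun) is used; "horn" cites Kyrgyz, not a proto-form.
CORRESPONDENCES = [
    ("horn", "müNüz", "müNüz", "müyüz"),
    ("mountain", "taG", "taG", "taw"),
    ("mouth", "aGïz", "aGïz", "awïz"),
    ("neck", "boyun", "boyun", "moyun"),
    ("dry", "quruq", "quruq", "qurGaq"),
    ("to eat", "ye-", "ye-", "je-"),
    ("thin", "yupqa", "yuvqa", "juqa"),
]


def count_identical(rows, column):
    """Count rows where the Proto-Uzbek-Uyghur form equals the form in `column`."""
    return sum(1 for row in rows if row[1] == row[column])


karakhanid_matches = count_identical(CORRESPONDENCES, 2)
steppe_matches = count_identical(CORRESPONDENCES, 3)

print(f"identical to Karakhanid: {karakhanid_matches} of {len(CORRESPONDENCES)}")
print(f"identical to Kyrgyz-Kazakh: {steppe_matches} of {len(CORRESPONDENCES)}")
```

The one pair that is not identical to Karakhanid ("thin", yupqa against yuvqa) still agrees with it in the initial y- and the retained cluster, so the count understates rather than overstates the similarity claimed in the text.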
Note: The formation of such "mixed" languages is a typical adstratic phenomenon occurring at the boundary of two ethno-geographical areas, sometimes involving a strong impact from a third or fourth superstratic component (in this case, Arabic and Persian). This interaction usually leads to remarkable, historically rapid changes in a language, and without a doubt deserves a separate detailed consideration elsewhere.

Additionally, Standard Literary Uzbek or its dialects could have picked up certain lexical and phonological elements from the Kimak-Kypchak-Tatar languages, but that process must have been fairly recent and less significant, and it did not affect the basic vocabulary of Uzbek to the same extent.

The term Karluk should not be directly conflated with the dialects of Chagatai, Uyghur and Uzbek as in Baskakov's classification. The Karluks were an early Turkic clan confederacy of unknown dialectal affiliation that lived near the Tian Shan between the 8th and 12th centuries. A suitable self-explanatory name for the Kyrgyz-Kazakh-Chagatai cluster could be Tian-Shan.

The Kimak subtaxon

The Kimak subtaxon, sometimes also designated herein as Kimak-Kypchak-Tatar, includes at least the following languages and dialects:
(1) the typical languages of the Golden Horde, which include Sibir Tatar, Bashkir, Kazan Tatar, Mishar Tatar, Nogai, Kumyk, Northern Crimean Tatar, Lithuanian Karaim, Crimean Karaim;
(2) Baraba Tatar (presumably separate);
(3) Karachay-Balkar.
The Kimak subtaxon does not include Kyrgyz or Kazakh. Below, we will try to demonstrate that the above-mentioned Kimak languages indeed share common innovative features.

Kimak history and geography

According to the work Zayn-al-Akhbar composed by Gardezi circa 1030, where he apparently cites the earlier writings of ibn Khordadbeh (820-912), there was the following legend about the Kimak origins: Once upon a time, there were two sons left after the death of a leader of the Tatars.
The younger son, named Shad, was envious of his elder brother, who was the heir to the kingdom, and attempted to kill him. Consequently, Shad had to run away with his slave concubine into the steppe near the Irtysh River, where they settled down in a yurt and lived happily for some time hunting squirrels and ermines. As a result, some of his Tatar relatives came over and joined them. These were the seven men named Imi, Imak (Yamak, Kimak), Tatar, Bayandur, Kipchak, Lanikaz, and Aj(a)lad. All of them also settled along the Irtysh, and finally formed the seven tribes named after these forefathers. See [Gosudarstvo kimakov IX-XI vv. po arabskim istochnikam (The Kimak State of the 9th-11th centuries according to the Arab sources), Kumekov, B.E.; Alma-Ata (1972)]. Most authors writing on the subject [Kumekov (1972), Marquart (1920)] date this legendary period to about 700 AD, which is also confirmed glottochronologically herein. For other details see On the origins of the ethnonym Tatar.

By the time of Gardezi (c. 1030) and Mahmud al-Kashgari (c. 1070), the seven clans of the Kimak confederacy were well established and had been described by several authors. Mahmud al-Kashgari cited an apt saying, "The snake has seven heads", referring to the Kimak clans. The Arab geographer al-Idrisi (1099-1165), who created his famous (though very convoluted by modern standards) map of the world known in Europe as the Tabula Rogeriana, also mentions the existence of 16 Kimak towns, apparently located in the upper Irtysh basin near Lake Zaysan [see figures below]. Therefore, the Proto-Kimak-Kypchak-Tatar tribes must have lived somewhere along the upper course of the Irtysh River, where they finally formed their own Kimak Kaganate.

The difference between the attested ethnonyms Kimak (Kimek) and Imak (Yemek) is poorly understood.
We can hypothesize that the original name could have been preserved in the ethnonym Kumyk, which may originate from a clan name; the initial reading could therefore be close to *Qïmïq, but this word was later misread or incorrectly recorded in the Arabic script with a different consonant.

In any case, the Kimak (Kimek) Confederacy / Kaganate / Khanate was a prominent medieval Turkic state in the area of the middle and upper Irtysh River. It existed as the Kimak Kaganate from approximately 743 to 1050 AD, and as the Kimak Khanate until the Mongol conquest in the early 13th century (?). Even though the Kimaks were essentially nomadic, they had many cities, mostly in the Irtysh basin, such as Imakiya, which was the summer seat of the Kimak kagan, and which is said to have had markets and temples. Note: the Arabic toponym Imakiya is probably a misspelling of Kimakiya /kee-mah-KEE-ya/, which is supposed to mean just "Kimak (City or Town)", for instance as in Arabic al-arabi:ya, al-injli:ziya, etc.

It can be inferred from the linguistic and ethnonymic evidence that during the 9th century CE, these Kimak tribes began to spread far away to the west. They were subsequently attested as (1) "Bashkirt" near the Southern Urals and the Volga River by Ibn-Fadlan in 921 and then as (2) "Tatar", "Bashkirt", "Kifchak", etc. by Mahmud al-Kashgari in 1073, as well as by other Arab authors. Consequently, they must have expanded as far as the Ural Mountains somewhere between the 750s and the 900s, or most likely after the fall of the Göktürk-Uyghur Kaganate, that is, after the 840s.

The period of the Kimak spread to the northwest is supported archaeologically: at some time between 700 and 900 CE, there was a wave of migrations into the Baraba Steppe that displaced the earlier Potchev culture in that area. The new culture was characterized by inhumations in burial mounds along with the horse, which is typically associated with the Turkic tribes.
[Arkheologija Zapadno-Sibirskoj ravniny (The Archaeology of the West Siberian Plain), Troitskaja, T.N., Novikov, A.V., Novosibirsk (2004), pp. 93-95].

Moreover, we may suppose that this migration must have proceeded along the northern and northeastern border of present-day Kazakhstan and Russia, because the Irtysh flows to the northwest, providing a natural route for travel in that direction. The migration along the Irtysh towards the confluence of the Irtysh and Tobol is also corroborated by the existence of the Baraba Tatars along the middle course of the Irtysh and the Sibir Tatars near the Tobol-Irtysh confluence. These ethnic groups share many common features both with each other and with the Bashkir and Kazan Tatars. Otherwise, if the migrating Kimak tribes had turned west or southwest, they would have run into the Karluk and Kyrgyz territory in the south near the Tian Shan, mentioned by al-Idrisi and in other historical sources. Also note that any direct migrations to the west across central Kazakhstan are unlikely due to geographic difficulties, such as the desert climate, highlands and the scarcity of water sources.

By following the Tobol and Yaik Rivers, and/or traveling across the Southern Urals, the Kimak tribes must have crossed into Eastern Europe and formed the ancestors of the early Bashkirs and Tatars. Following the upper Kama, some of them must have reached the confluence of the Kama and Volga, where Volga Bulgaria was located. These Kimak tribes must have become the precursors of what we presently know as the Kazan Tatar people.

The exact migration tracks of Proto-Northern-Crimean-Tatar, Proto-Karachay-Balkar, Proto-Nogai and Proto-Kumyk are harder to establish. At the time of their arrival in the Urals, all of these were almost linguistically indistinguishable, but they may well have belonged to different clans, so there still could have been some genetic or political distinctions.
Apparently, they split off from the rest of the Kimak, Tatar and Bashkir tribes near the Southern Urals. Then, these tribes migrated southwest by following the Ural (Jaik) River, first towards the Caspian Sea and the Caucasus Mountains, and finally as far as the Kievan Rus, where they soon became known as Kipchaks or Polovtsians.

Most of the Kimak groups under consideration (or at least Kazan Tatar, Sibir Tatar, North Crimean Tatar, Caspian Nogai, etc.) seem to have emerged as separate ethnicities with their own dialects only after the expansion and dissipation of the Golden Horde (1235-1502), and the formation of the localized post-Golden-Horde Khanates of the 16th century.

The spread of the Kimak and Tatar dialects (2012)

It should perhaps be explained that the Golden Horde (cf. ordu, orda "army") is a historiographic name for the basically Kimak-Kypchak-Tatar Empire (1226-1502) established after the Mongol invasion of Rus and ruled by the nominal descendants of Genghis Khan. It was mostly known either as just Orda (in Russian sources) or as the (Ulug) Ulus "the (Big) Country" or by the name of its current ruler, such as the Ulus of Jochi (in Turkic and Persian sources of that period). It was officially Islamized only in 1313. The Golden Horde exacted taxes from Russians, Armenians, Georgians, Circassians, Alans, Crimean Greeks, Crimean Goths, and other subjugated peoples along its borders.

The Golden Horde's capitals were (1) Sarai-Batu, meaning "the Palace built by Batu Khan", and (2) just Sarai, "the Palace", both of which were located along the Volga River and had many thousands of inhabitants. However, they were sacked, destroyed and dismantled after the fall of the empire.

The Golden Horde elite traced their descent from the Mongol clans and originally used the Middle Mongolian language as the main means of communication; however, the bulk of its population was apparently of Kimak-Kypchak-Tatar origin.
After the collapse of this powerful state by the end of the 15th century, the newly formed Kypchak-Tatar dialects and ethnic groups were for the most part vaguely known as "Tatars" to the Russians from the early 16th until the end of the 19th century. The word "Tatar" may still retain a somewhat negative connotation in Russian and other languages affected by the expansion of the Golden Horde, including some European languages where Tartar became a synonym of "fierce" and "violent".

It is conjectured herein that nearly all the Turkic languages presently located on the territory of the former Golden Horde (Kazan Tatar, Mishar Tatar, Bashkir, Karachay-Balkar, Kumyk, Nogai, North Crimean Tatar, etc.) are particularly close to each other, to the extent of mutual intelligibility.

The Kimak languages share a number of distinct innovations in phonology, grammar and lexis. Some of these innovations are also shared with the Oghuz-Seljuk languages, an interesting phenomenon that deserves a separate description below. On the other hand, these Kimak innovations are mostly absent from Kyrgyz-Kazakh, which did not belong to Kimak or the Golden Horde, given that Kyrgyz was locked far away in the Tian Shan Mountains, whereas Kazakh formed only after the middle of the 15th century, when the Golden Horde no longer formally existed.

Kimaks on the map of al-Idrisi

The location of the Kimak Confederacy was shown in the 12th-century atlas prepared by the Arab geographer Muhammad al-Idrisi, known in Europe as the Tabula Rogeriana. The Asian part of the map, which is extremely difficult to decipher, has been studied by several authors including Kumekov, B.E. in [Strana kimakov po karte al-Idrisi (The land of the Kimaks according to the al-Idrisi's map)// Strany i narody vostoka, vol.10, 1971, pp.194-198 (in Russian)].
Judging by the phonetically garbled toponyms and the typical contractions and doublings, such as "Dardan", "Lalan", etc., the Asian part was probably based on some Chinese sources, presumably on hearsay evidence provided by medieval Silk Road merchants. Consequently, the map is not grounded in astronomical measurements, and it has no such thing as scale or even orientation, so trying to link some of its features to modern geography can sometimes turn into a formidable task. However, we may presume that the map features match real-world geography to the extent that they would in a verbal account obtained from a medieval traveler, whereas the map toponyms sound as if they were reinterpreted from the heavy Kimak-Tatar pronunciation into medieval Chinese and then finally into al-Idrisi's Moroccan Arabic.

The Land of the Kimaks in the Tabula Rogeriana

The map ends abruptly near Mongolia, where traveling in the Altai-Sayan Mountains was most likely impossible.

Apparently, B.E. Kumekov made an error by attributing Lake Gagan to Lake Alakol (Ala-Köl). It all becomes clear as soon as one takes into consideration that, in a way similar to the letter g in English or Italian, the Arabic letter jīm can be pronounced as either /g/ or /J, zh/, depending on the dialect. In the Moroccan dialect of al-Idrisi it should be read as Jajan or even Zhazhan, which immediately brings to mind Lake Zaysan lying along the course of the Irtysh River. This allows us to identify the multiple Kimak settlements as being located on the shores of Lake Zaysan and along the Kara-Irtysh (presumably Gamash on the map, as if from a contracted pronunciation *qa...ash), where they were indeed supposed to be according to the legend.

This territory is designated on the map as Ard-al-Kimakiyya (The Land of the Kimaks).
In reality, it most likely extended further to the northeast than the map shows, but Chinese Silk Road merchants rarely visited the northern tracks, so we see only its southern part.

Similarly, in Mahmud al-Kashgari's sketchy drawing (c. 1072-74), we find the Yamaq Steppe positioned between the Ertish River and the Ili River (in the Tian Shan); therefore he too must have thought that the Kimak tribes lived somewhere between the Tian Shan and the Altai Mountains.

Kimak phonology, grammar and lexis

Consequently, a matter that should be discussed in detail is the difference between the Kimak-Kypchak-Tatar, Kyrgyz-Kazakh, and Altay subtaxa, which are all frequently mixed up and intermingled in other classifications. How do these subtaxa differ?

The following table shows that Proto-Kimak-Kypchak has undergone certain crucial transformations that made it phonologically very different from Kyrgyz-Kazakh and Altay, so they cannot be just blindly grouped together.
Evidently, this table demonstrates the differences between the Kimak-Kypchak-Tatar and Kyrgyz-Kazakh subtaxa, with Karakalpak being something of a secondary seam between the two.

Notes on other classifications and their positioning of Kimak

The table also shows why Kazakh should be included in the same subtaxon as Kyrgyz, whereas (Caspian) Nogai, on the contrary, has no direct bearing on either of them and should be placed in the same subtaxon as Kazan Tatar, unlike in Baskakov's older classification.

It is true, however, that Kazakh may exhibit some Kimak features, but these seem to stem from secondary contacts on the large territory of the Kazakh Steppe, which inevitably resulted in some intermingling of the early Kazakh speakers with the Kimaks. Naturally, even more Kimak influence may be found in Karakalpak, which is essentially something of a northwestern variety of Kazakh.

Also, consider again the above-mentioned lexicostatistical research by Dybo (2006), which demonstrates the close proximity of some of the other Kimak-Kypchak-Tatar languages that were omitted in the present publication. [Dybo, Anna, The Chronology of Turkic Languages and the Linguistic Contacts of Early Turks (2006)]

A similar classification had also been proposed at least as early as Bogoroditskiy (Kazan, 1934); unfortunately, it was later superseded by that of Baskakov. Bogoroditskiy's classification was based purely on geographical principles; nevertheless, it rather correctly differentiated (1) the many Khakas dialects; (2) the many Altai dialects; (3) the Siberian Tatars, e.g. Baraba; (4) Tatar, Bashkir; (5) Kazakh, Kyrgyz, Karakalpak, Uzbek, Uyghur; (6) the Seljuk and Oghuz languages. However, Baskakov (1960), apparently incorrectly, regrouped Kyrgyz with Altai, and Kazakh with Nogai, ignoring the obvious similarity between Kazakh and Kyrgyz, a view that lasted for about half a century.
Despite this and other similar drawbacks, Baskakov's classification was still the most detailed of its time.

For the above reasons, it is essentially incorrect to name both the Kyrgyz-Kazakh and the Kimak-Kypchak-Tatar subtaxa "Kypchak" (or "Kipchak" /keep-CHAHK/), as Baskakov and his followers tend to do. Initially, the term "Kypchak" seemed to refer only to a relatively small clan within the original Kimak confederacy. At a later stage, during the 11th-13th centuries, this clan was present in many different parts of Eurasia, but that is just a different meaning of the term. The term "Kypchak" in the sense of a tribal confederacy possibly referred to Cuman-Polovtsian or to some of the Kimak tribes in contact with the Kievan Rus or just situated nearby, see for instance [Gosudarstvo kimakov IX-XI vv. po arabskim istochnikam (The Kimak State of the 9th-11th centuries according to the Arab sources), Kumekov, B.E.; Alma-Ata (1972)]. It actually takes a thorough historical study to explain who the Kipchaks were, and Baskakov seems to omit this issue in his books. Therefore we should assume that the term "Kipchak" originally had a much narrower usage, until it was rather artificially extended to all of the Great Steppe languages and more during the second half of the 20th century.

Conclusions:

The Kimak languages originally constituted a single linguistic unity that formed near Lake Zaysan and the upper Irtysh River by about 700 AD. By c. 900 AD the Kimaks must have spread to the west across the Great Steppe territory, and by 1050 AD they reached the Kievan Rus.

The term Kimak (sometimes given as "Kimak-Kypchak-Tatar" to keep some compatibility with the older terminology) will hereinafter be applied only to those languages which share the features described in the table above, and which are therefore particularly close to Kazan Tatar, the latter being a typical example of the modern Kimak languages.
Other instances of Kimak languages include Bashkir, Sibir Tatar, Mishar Tatar, (Caspian) Nogai, North Crimean Tatar, Lithuanian Karaim, Crimean Karaim, Kumyk, possibly the extinct Cuman-Polovtsian, and some other closely related dialects and languages. The difficulties in the classification of the Baraba (and particularly Tomsk) Tatars result from the scarcity of available materials; however, Baraba seems to exhibit all the essential features of this Kimak subgroup just as well. A special position belongs to Karachay-Balkar (see below).

These languages exhibit innovative features which, as we shall explain in detail below, were mostly brought about by their interaction with the Oghuz adstratum.

On the other hand, Kyrgyz, Kazakh and Karakalpak are linguistically more archaic and belong to a different subtaxon of the languages of the Great Steppe, named herein the Tian-Shan languages.

One of the probable reasons why the Kimak languages finally grew so historically important may be connected to their original location close to the northern track of the Silk Road, where their speakers could interact culturally, linguistically and genetically with many different peoples and acquire certain knowledge and wealth that could have helped them to expand in the northwestern direction.

The relationship between Oghuz and Kimak

The Kimak and Oghuz secondary contact

Finally, we come to an interesting point mentioned above: the Oghuz-Seljuk subtaxon seems to share some innovations with Kimak-Kypchak-Tatar, namely:
(1) the incomplete J- to y- mutation, cf. Proto-Oghuz *Jedi "seven" attested by Mahmud al-Kashgari (see below), North Crimean Tatar Jedi, Kazan Tatar Jide, the intermingled allophonic use of J / y- in East Bashkir dialects, etc., as opposed to the clear-cut Karakhanid yeti;
(2) a sporadic t- to d- voicing, cf.
Gagauz, Turkish, Azeri, Turkmen dört, Kazan Tatar dürt, Nogai dört as opposed to the Karakhanid tört;
(3) the loss of -G / -Gaq as in Turkish kuru, Azeri Guru, Turkmen Gurï, Kazan Tatar korï, Nogai kurï, as opposed to the Karakhanid quruG and Kazakh qûrGaq;
(4) a contraction in "leaf", cf. Turkish yaprak, Azeri yapraG, Turkmen yapraG, Kazan Tatar yafrak, Nogai yapïrak, as opposed to the Karakhanid yapurGaq;
(5) the t : l transition named herein "the heavy eastern versus the light western Turkic consonantism", e.g. a "light" (lenited) -l- in the plural marker: -lar in Oghuz-Seljuk, Kimak-Kypchak-Tatar, Chagatai-Uzbek-Uyghur, Orkhon-Karakhanid, Khalaj, as opposed to the "heavy" (fortis) eastern pronunciation of -dar-/-tar-, for instance in the Kazakh-Kyrgyz, Baraba, Yugur and "Siberian" branches. Curiously, however, Kazan Tatar also preserves -nar, -ner, which can be seen as an intermediate form between -dar and -lar as far as the degree of lenition is concerned. The stronger -dar / -tar and other fortis suffixes are also preserved in the East dialect of Bashkir (which was least affected by Kazan Tatar) as well as in Baraba. This may imply that the Kimak-Kypchak-Tatar languages originally had some phonological fortition typical of the eastern language clusters, whereas their historically recent lenition was probably acquired from Oghuz;
(6) the use of *tegül instead of e(r)mes, cf. Turkish deGil, Azeri deyil, Turkmen del, Kazan Tatar tügel, Kumyk tügül as opposed to the Karakhanid ermes, Kazakh-Kyrgyz emes;
(7) the use of *aJak in the Future Tense, cf. Turkish -aJak-/-eJek-, Turkmen -Jak/-Jek, Kazan Tatar -achak-, Bashkir -asaq-, Nogai -ayak-/-eyek-, Crimean Tatar -aJaq-/-eJeq-, Kumyk -azhaq/-ezhek. The tense is also used in Karakalpak in the Aral-Caspian region, probably because of the Oghuz (Turkmen) presence there;
(8) the frequent use of -dïr/-tïr in the 3rd person singular, cf. Turkmen, Azeri, Turkish; Cuman-Polovtsian, Kazan Tatar -dïr/-tïr, etc.
as opposed to its absence in Kazakh and Kyrgyz, at least as far as the copula construction is concerned (e.g. Ol qazaq "He is a Kazakh"), etc.

On the other hand, despite this presumable relatedness, presently there is only poor mutual intelligibility between the modern Oghuz-Seljuk and Kimak-Kypchak-Tatar languages, with many differences in syntax, morphology and semantics. With about 70% average similarity between Turkmen and the modern languages of the Golden Horde, the present-day distance between even the most archaic and easternmost Oghuz languages and the Kimak-Kypchak-Tatar languages seems to be rather considerable. For instance, with 65% between Turkish and Tatar in Swadesh-215 (borrowings excluded), actual speech would normally be well beyond mutual comprehension. A few simple phrases from a Tatar-Turkish phrasebook may look as follows:
Kazan Tatar Sin kay-a bar-a-sïn cong? cf. Turkish Sen nere-ye gid-i-yor-sun?, literally "You where going-are-you?";
Kazan Tatar Salkïn su bir-egez-che cf. Turkish Souk su ver-in (lütfen), "Cold water give-please";
Kazan Tatar Gailê-biz-de öch bala: min, apa-m hêm ene-m, cf. Turkish Aile-miz-de üch chojuk (var): ben, abla-m ve (hem de) kardesh-im, "Family-my three child: me, sister-my and brother-my".

That does not mean, of course, that Kimak and Oghuz have nothing in common with each other; it is just that the described changes seem to be roughly consistent with at least 1500-2000 years of glottochronological separation, which makes the recent existence of an Oghuz-Kimak genetic unity an unlikely option. And indeed, as we will conclude below, the phonology, grammar and particularly the vocabulary of the Oghuz languages are in good correspondence with Karakhanid, given that Proto-Oghuz originally belonged to the same stock as Orkhon Old Turkic, Old Uyghur and Karakhanid, which seems to refute the above idea of an Oghuz-Kimak relationship.
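To give a feeling for how a shared-vocabulary percentage translates into a separation time, the following minimal sketch applies the classic Swadesh-Lees glottochronological formula t = ln(c) / (2 ln(r)), where c is the shared fraction and r the retention rate per millennium. This is the textbook formula only and not the calibration used in the separate lexicostatistical article, which is not reproduced here; the retention rate of 0.86 and all function names are assumptions made for illustration.

```python
import math


def divergence_time(shared_fraction, retention_rate):
    """Classic Swadesh-Lees estimate of the separation time, in millennia."""
    return math.log(shared_fraction) / (2 * math.log(retention_rate))


def implied_retention_rate(shared_fraction, millennia):
    """Retention rate per millennium that would yield the given separation time."""
    return math.exp(math.log(shared_fraction) / (2 * millennia))


TURKISH_TATAR = 0.65   # Swadesh-215 similarity quoted in the text
ASSUMED_RATE = 0.86    # assumption: a commonly quoted textbook value

# Forward direction: separation time under the assumed rate.
years = 1000 * divergence_time(TURKISH_TATAR, ASSUMED_RATE)

# Inverse direction: which rates correspond to 1500 and 2000 years?
rate_low = implied_retention_rate(TURKISH_TATAR, 1.5)
rate_high = implied_retention_rate(TURKISH_TATAR, 2.0)

print(f"{years:.0f} years at r = {ASSUMED_RATE}")
print(f"r between {rate_low:.3f} and {rate_high:.3f} for 1500-2000 years")
```

Under these assumptions, 65% corresponds to somewhat over 1400 years, and the 1500-2000 year range would require a retention rate of roughly 0.87-0.90 per millennium. The result is highly sensitive to the chosen rate, so the sketch shows the order of magnitude rather than a precise date.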
But if Oghuz and Kimak are not really close, where do these shared elements come from? We cannot suppose that they emerged independently in each subtaxon, since the probability of several simultaneous mutations coinciding is statistically negligible; a much more likely and interesting option is that they arose through secondary contact and mutual intermingling when, at some point in time, the early Oghuz tribes crossed the area of the Kimak tribes.

The hypothesis of linguistic exchange in northern Kazakhstan

The conclusion of a secondary relatedness between Kimak and Oghuz is in accordance with the historical records saying that Seljuk's clan separated from the Transoxanian (= Aral-Caspian) Oghuz tribes near the Syr-Darya in the Kazakhstan steppe, which seems to have been the traditional habitat of the Kimak-Kypchak-Tatar or Kazakh tribes. In other words, it is simple to assume that the Oghuz and the Kimaks, being so geographically close, might have formed a sort of linguistic area near the Aral Sea.

Curiously, Al-Kashgari claims that "Kirkiz, Kifzhak, Uguz, Tuxsi, Yagma, Jikil [the latter three tribes apparently were located near the Ili river in the Tian Shan], Ugrak, Jaruk all have one pure Turkic language. Close to them are the dialects of Yamak [= probably Kimak] and Bashkirt...", which evidently positions "Uguz" in the same geographic and linguistic row as Kyrgyz and Kypchak, together with several lesser medieval tribes.

We can also find multiple historical records mentioning a Kimak-Oghuz alliance in the 10th century. For instance, the Arab geographer Al-Masudi wrote c. 930 that the Kimaks and Oghuzes were coaching along the Emba and Yaik together. Note: the English word coach is from French, where it seems to go back to Hungarian, where it is probably from Bulgaro-Turkic *köch- "to migrate" [Webster's New World Dictionary (1986), Sevortyan's Dictionary (1980)]. Ibn Haukal c.
950 drew a map showing that Kipchak-Kimak tribes together with the Oghuz tribes were pasturing their cattle in the steppes north of the Aral Sea. Al-Biruni c. 1000 noted that Oghuz tribes quite often pastured in the country of the Kimaks [en.wikipedia.org]. However, this hypothesis does not explain why the above-listed features passed into nearly all of the Kimak languages, which implies that the actual interaction must have occurred much earlier, when both the Kimak and Oghuz tribes were still living in the same relatively small area, such as a passage between mountain ranges, so that their linguistic contacts must have been very intense and must have taken place at the proto-language level. For this reason, below we will consider another hypothesis that suggests a cultural and linguistic exchange near Lake Zaysan.
The hypothesis of linguistic interaction near Zaysan
Beginning in 552 AD, some of the Great-Steppe tribes were subdued by the western Göktürks, who must essentially have been speakers of an unidentified Orkhon-Oghuz-Karakhanid dialect, such as Old Uyghur or Oghuz, judging from their geographic position near Dzungaria. Presumably, this West Göktürk language-dialect must have acquired a high sociolinguistic status in many Turkic-speaking societies of the time. It is quite plausible to assume that Proto-Oghuz could have actually formed a considerable part of that West Göktürk dialect area, given its later tendency to migrate in the western direction along the same path. Initially, Proto-Kyrgyz was a conservative Turkic language apparently distributed either (1) along the Irtysh or (2) between the Irtysh and Ob rivers, essentially in the area known as the Baraba and Kulunda Steppe, or (3) in the area between the Altai and Tian Shan Mountains. Whereas Proto-Kyrgyz-Kazakh had occupied the area west of the Altai Mountains and east of the Tian Shan for many centuries, Proto-Oghuz was probably a recent arrival from Dzungaria brought by the expansion of the western Göktürks after 530-550 AD.
Consequently, we can infer that somewhere around 550-800 AD there occurred a strong linguistic exchange between Proto-Oghuz in Dzungaria and the early Kyrgyz dialects north of the Tarbagatai in the Great Steppe, which could have resulted in the formation of Proto-Kimak. In other words, the simplest and most plausible hypothesis that would explain all the relations among Proto-Oghuz, Proto-Kimak, and Proto-Kyrgyz-Kazakh is that the area of Proto-Kimak originally formed as a transitional region where the early Kyrgyz dialect overlapped and intermingled with Proto-Oghuz.
The map of the hypothetical exchange between Proto-Oghuz and Proto-Kyrgyz in 550-800 CE
The overlapping of the Oghuz and Kyrgyz areas soon resulted in the formation of a new transitional dialectal seam, which became known as Kimak. This Kimak area shared archaic linguistic features with Kyrgyz, on the one hand, and some innovative features with the early Oghuz, on the other. Furthermore, Oghuz too was affected by the Kimak and Kyrgyz dialect-languages; it absorbed some of their elements, to some extent even becoming part of the Great Steppe Sprachbund and deviating from its Orkhon-Karakhanid parent stem. On the other hand, the speakers of Kyrgyz were largely unaffected by the Göktürk dialect-languages because these were already absorbed and buffered in the Kimak zone. Consequently, the Proto-Kyrgyz-Kazakh-Uzbek-Uyghur language became locked in a sort of linguistic refugium near the foothills of the Tian Shan Mountains, where it was able to retain many of the archaic features from before the 6th century.
Conclusions: As the Western Göktürk tribes, apparently speaking a language similar to the early Old Uyghur, moved back from Mongolia into the upper reaches of the Irtysh river between 550-700 AD, they must have come into contact with the local Proto-Kyrgyz tribes.
This intermingling must have resulted in the formation of three local dialectal areas: (1) Proto-Kyrgyz (or Proto-Tian-Shan), possibly also including Proto-Karluk: this area, which was almost unaffected by the Göktürk language, ultimately led to the emergence of the now-extinct Karluk (uncertain), the Tian-Shan Kyrgyz, and finally, after the 15th century, the Kazakh and Karakalpak languages; (2) Proto-Kimak: this area was strongly affected by the Oghuz or Western Göktürk migration, but retained many older Kyrgyz elements, for instance -w- in bawïr "liver" and -w in taw "mountain", as opposed to the -G- and -G in the oncoming West Göktürk language, to name just a few typical features; (3) Proto-Oghuz: this area acquired certain features from Kimak, but otherwise remained relatively unaffected, retaining many Orkhon-Karakhanid archaisms from an older period.
On the origins and history of the ethnonym Tatar
For the earliest clear-cut attestation of the ethnonym Tatar, we should probably turn to the Orkhon Turkic inscription of Kul Tegin made in 732, which refers to the burial of Bumin Kagan in 552. The attestation consists of the following passage: "...Böküli Chölüg (=the Koreans), TabGach (=the Chinese), Avar, Rome (=the Byzantines), Kirgiz, Uc-Quriqan (=the Proto-Yakuts), Otuz-Tatar, QitaN (= the Khidans = the Mongolic peoples in the Greater Khingan Mountains) and Tatabi, this many people came..." [see Türük Bitig, a site dedicated to Orkhon-Yenisei inscriptions]. This suggests that by 550 AD the Tatars constituted a political or military confederacy made up of 30 (otuz) different clans or tribes, probably united as a single kaganate, though their exact location is unknown. Note: Herein we are trying to consistently exclude any early evidence from Middle Chinese records due to their ambiguity and the multiple difficulties with their verification and interpretation.
However, according to the Chinese version, the word ta-da or a similar one could have been initially used as the Chinese exonym applied to all of the foreign tribes beyond the Great Wall, similar to the barbarians of the Greeks. Moreover, and quite confusingly, the Tatars are described in the Secret History of the Mongols, circa the 1190s, as living somewhere near the modern-day border of Buryatia and Mongolia along the Onon River (a tributary of the Amur) and as being the sworn enemies of Genghis Khan. Those Mongolian Tatars had poisoned his father and waged war on Genghis Khan, but were finally exterminated in retaliation when he came to power. The History does not explain which language they spoke or whether they were Turkic or Mongolic; it only suggests that they were able to say at least a couple of phrases in Middle Mongolian. More curiously, the two names of Genghis Khan himself, the original one Temüjin, created after the name of a Tatar, Temüjin-üge (presumably from Turkic Temir-ji Aga "The Blacksmith Brother"), and the later one Jenghis Kagan, probably chosen after a certain Lake Tenghis mentioned in the first lines of the History (Turkic "The Sea", probably Lake Baikal), both indicate the existence of Turkic ethnonyms and toponyms in the area, which may finally mean that these Mongolian Tatars, vividly described by Genghis Khan and his court scribes, were indeed of Turkic origin [see the Secret History of the Mongols (1240), translation by F. W. Cleaves from the Mongolian original (1982)]. Judging from their location in the Trans-Baikalian region, we may suppose that these Tatars could in fact have been a lost extension of Proto-Sakha, most likely related to the Kurykans, who had integrated into the local Mongolic society (and possibly adopted the Mongolian language).
According to the legend cited by Gardezi (1030) and described in the chapter about The Kimak subtaxon, the ethnonym Tatar is also clearly traceable to a certain clan within the Kimak Confederacy situated along the Irtysh River circa 700 AD. Consequently, one may wonder about at least three different early mentions of Tatars in three different contexts: one before the formation of the Kimak confederacy, another as a part of it, and yet another in reference to the purported Turkic tribes of Mongolia and Trans-Baikalia. What is the difference among the three? As explained in the chapter about the Turkic ethnonymy, the most likely hypothesis about the Tatar origins is that the word Tatar was originally the name of a patrilineal clan, working as a sort of equivalent of a European surname. In other words, this hypothesis suggests that the word Tatar may originate in a personal name or alias of the Tatar clan's progenitor. (But what this name or alias could have initially meant is anybody's guess.) Consequently, when the legend teller says that the men named Tatar, Kimak, Kipchak, etc. came over to live with the man named Shan, he probably just means that these could be either their original first names in some cases or their preexisting clan surnames in others. Since the patrilineal clan of Tatars and the surname Tatar may have a merely genetic but not necessarily linguistic connection to its members, any men who belonged to that clan could have spoken a generally unknown Turkic dialect or even a Mongolic language and lived in unspecified parts of Eurasia. We cannot even exclude the possibility that some of the Tatars may have deliberately adopted their surname under generally unknown circumstances, even though they were not genetically connected to the original clan of Tatars. The existence of the Mongolian Tatars described in the Secret History of the Mongols is particularly interesting and questionable in this respect.
However, we should assume that most European and West Siberian Tatars, the ones ethnologists are usually familiar with, trace their patrilineal descent either (1) to the Tatar man of the Kimak Confederacy, who had no first name and who settled down with Shan of the Tatars circa 700 AD, or (2) to Shan himself, or (3) to both, if the two were the same person, the last option being the simplest and most plausible one. If the Mongolian Tatars were indeed of Proto-Sakha origin, then their separation from other Tatar clans could have occurred at the Proto-Turkic level, somewhere before 1000 BCE, because of the very early separation of Sakha, which would make the Tatars one of the earliest attested Turkic clans. As for the rest of it, the actual use of the word Tatar throughout history has been quite different and variable, rising from a limited, regional usage as a clan name to an all-encompassing Turkic and Mongolic exonym and then falling into disuse again. In 922, the "al-Bashkird" of Ibn-Fadlan were already attested near their present-day location west and southwest of the Urals; however, there is as yet no direct reference to the Kimak-related Tatars. Presumably, in the course of the 9th-10th centuries, during the period of the Kimak dissemination over the Great Steppe, the Kimak Tatars must have become the ruling clan among the Kimaks. As one may suppose, during that period the word Tatar must have gained the socially prestigious connotation of a leading clan's title, and many Kimaks might have attempted to trace their personal roots specifically to the Tatars. That honorific usage could have lasted well into the times of the Mongols in the 13th century, so that finally the Mongols themselves were frequently conflated with the Tatars. Giovanni da Pian del Carpine (1245), for instance, consistently names all the Mongols as Tatars despite his personal visit to Mongolia.
This ethnonymic confusion can also be explained from the military standpoint: the aristocracy of Mongolic descent constituted only a small part of the Golden Horde population, at least during its later stages, and the Mongolic tribes had initially been far too small to achieve the conquest of the enormous territory they acquired. Therefore, it is implausible that the Mongol generals were able to do without any help from the locals; they must have recruited the regional Turkic population into their armies, most of whom were evidently of Kimak-Kypchak-Tatar origin. The actual conquest and control over the land was thus probably achieved by means of the ruling Tatar clans. However, there are few specific historical documents that could corroborate this outlook. According to a different version [sources and details?], the name Tatar was brought only during the Mongolian period. The ethnonym Tatar was particularly widespread among the Golden Horde aristocracy, military and local officials [see for instance The Great Russian Encyclopedia (2004)]. The linguistic differentiation among the Turkic dialects of the Golden Horde was evidently small, so all of the Golden Horde peoples between the 13th and 17th centuries were collectively called Tatars in Russia and in many parts of Central Asia and Europe. In Latin-speaking Europe, the word Tatar was frequently changed to "Tartar", apparently due to the association with Tartarus, which, according to Greek mythology, was the underworld at the bottom of the abyss beneath the earth, where an anvil takes nine days to fall. After the dissolution of the Golden Horde, the term must have acquired negative connotations, whereas many post-Golden-Horde ethnicities came up with other newly-formed names, such as Noghai (=from the Noghai Khanate, after the name of a Mongol general), Mishar, Kazanly (=from the Kazan Khanate), etc.
For instance, in reference to the 18th-19th century, Carl Ritter, citing the research of the German ethnographer Julius Klaproth (1783–1835), notes the following: "But if you ask the so-called Kazan or Astrakhan Tatar, if he is a Tatar, he will answer negatively, for he names his dialect 'Turki' or 'Turuck', not 'Tatar'. Being aware that his ancestors were subdued by the Tatars and Mongols, he takes the word 'Tatar' as pejorative and meaning nearly the same thing as a bandit." [See Die Erdkunde im Verhältniss zur Natur und zur Geschichte des Menschen (Geography in Relation to Nature and the History of Mankind), written 1816–1859]
During the period of Ivan the Terrible (1530-84), who moved the imperial frontier beyond the Ural Mountains, the ethnonym Tatar was presumably carried further into Siberia by Russian Cossacks. Supposedly, this is how it came to be applied to the Sibir Tatars of the Tobol-Irtysh area, the Baraba Tatars, the Altay Turkic peoples and the Yenisei Kyrgyz tribes of the 17th century, though the presumable Russian origin of the Tatar self-reference among these people is disputable. In any case, until the beginning of the 20th century, the Altay-Sayan and other Turkic peoples were known under such names as Abakan Tatars, Chulym Tatars, Kuznetsk Tatars, Azerbaijani Tatars and so forth. The Kyrgyz and the Ottoman Turks were among the few that never received this exonym. By the 18th century, the name had become so overextended and overused that it began to include any people of East Asia. The French Sinologist Abel-Rémusat, for instance, used the term "Tartares" as a catch-all name for "des Mandchos, des Mongols, des Ouigours et des Tibetains" as late as 1820. Moreover, until the 19th century, Siberia was often designated as Tartaria (Magna) in Latin, Grande Tartarie in French or Tartary in English on most geographic maps; see, for instance, Nicolaes Witsen, Noord en Oost Tartarye... (1672).
In other words, the expression Tartaria (Magna) was used in the same way as Siberia is today. Hence also the name of the Strait of Tartary between mainland Russia and Sakhalin Island. The name was coined by La Perouse in 1787, even though no Turkic peoples had ever lived there. During the reign of Peter the Great (1682-1725), when Turkology began to rise as a distinct branch of science in the Russian Empire and Western Europe [see Baskakov, N.A. Vvedeniye v izucheniye tyurkskikh yazykov (An introduction to the study of Turkic languages), (1969); chapter The history of study of Turkic languages in Russia before the 19th century, p. 18], nearly all the known Turkic languages and dialects (outside Ottoman Turkish) became generally known as tatarskiye narechiya "Tatar dialects" in Russian. In some cases, that term indiscriminately included Mongolic, Tungusic, Tibetic, Samoyedic and other completely unrelated Siberian ethnic groups. Strahlenberg and Messerschmidt (1720-1730), the earliest European explorers of the Siberian peoples, were apparently a little unsure about the proper usage; however, Strahlenberg [Das Nord und Ostliche Theil von Europa und Asien, Stockholm, 1730] seems to use the word Tataren as a generic term for the Turkic-speaking peoples only, not Mongols or anyone else. The Brockhaus and Efron Encyclopedic Dictionary (1906), widely popular before and even after the Russian Revolution, openly protested against that overused terminology: "Tatars do not exist as a single ethnicity; the word 'Tatar' is nothing but a collective nickname for a number of peoples of [sometimes] Mongolic, but particularly Turkic descent, speaking Turkic languages, and of Quranic affiliation. [...]
From a scientific perspective, the name Tatar has presently been rejected when applied to Mongols or Tunguses, and retained only in reference to those linguistically Turkic ethnicities that form part of the Russian Empire, but excluding other Turkic nations with independent historical appellations (Kirghizes, Turkmens, Sarts, Uzbeks, Yakuts, etc). Certain scientists (Yadrintsev, Kharuzin, Shantr) have suggested modifying the appellation terminology of some of the Turco-Tatar ethnicities [...], for instance, by renaming Azerbaijani Tatars to Azerbaijanis, Altay Tatars to Altayans, etc., but that has not gained much acceptance, as yet [...]"
As a result, the indiscriminate term tatarskiye narechiya "Tatar dialects", generally accepted in the 19th century, was soon supplanted by the names of specific languages that appeared during the post-revolutionary renovation of the 1920s-30s, though in some cases such names as Uzbek, Uyghur and Khakas seem to have been made up on the spot and then adopted by consensus. Between the 1800s and the 1930s, including for some time after the revolution, "Turkish-Tatar languages", "Turkish languages" and "Turco-Tatars" were still variably used as generic terms by various authors. But after the rise of the Republic of Turkey (1923) and its frequent generalization of Türk as a comprehensive, far-reaching concept, the newly-formed term tyurkskije jazyki "Turkic languages" must have finally become widespread and generally accepted, even in reference to the ethnic groups that never called themselves Turks. Nevertheless, the older usage in such phrases as tataro-mongoly "Tatar-Mongols" or tataro-mongolskoye igo "Tatar-Mongol yoke", referring to the rise of the Golden Horde and its punitive raids against Rus, still exists in Russian historiography. Apparently, the extensive use of the term Kypchak popularized by Baskakov's classification (1950s-1980s) followed the same avoidance strategy by trying to get rid of the word Tatar.
As a result, in certain contexts, both names became nearly synonymous, the former being a sort of euphemism for the latter. At the beginning of the 21st century, the name Tatar is formally retained mostly just by the Kazan Tatars of Tatarstan (who sometimes object to its usage), the Crimean Tatars, the Mishar Tatars west of Tatarstan, the Sibir (Tobol-Irtysh) Tatars (whose language is poorly documented in the scientific literature), and the Baraba Tatars (on the verge of linguistic extinction, and often called just "Baraba"). It is also accepted as a generic self-appellation, Tadarlar, by various Khakas and Altay Turkic ethnicities, and can sometimes be applied to other smaller and lesser-known ethnic groups, such as the Astrakhan Tatars, Lithuanian Tatars, etc.
Bashkir is closely related to Kazan Tatar
Judging solely by a superficial look at the orthographic phonology, a casual onlooker may think that Bashkir is a strongly differentiated language within Turkic, no less than Chuvash or Sakha. However, on closer examination, one can find a remarkable lexical similarity of more than 95% between Kazan Tatar and Bashkir in Swadesh-215. A significant error in this figure is rather unlikely, given that the list was composed by proficient speakers at Wiktionary.org and then rechecked through dictionary search herein. The few clear-cut lexical and semantic discrepancies found in Swadesh-200 are as follows:
He found them near the large river named Etil [= supposedly, Ak-Etil or Belaya, the main river of Bashkortostan]... And to everything he wanted to tell them, they listened carefully, for their language was entirely Hungarian, and they understood each other... The Tatar people live near them. But the Tatars, when waging a war on them, could not overcome them, on the contrary, they were defeated in the first battle... In that country, the aforementioned friar found the Tatars and the messenger of their lord, who spoke Hungarian, Russian, Cuman, Teutonian, Saracen [=Arabic], and Tatar [and who said that behind the country of Tatars there were the "big-headed" people who wanted to start a war, perhaps the oncoming Mongols who must have reached West Siberia after 1207].
[Relatio fratris Ricardi, De facto Ungarie Magne a fratre Ricardo invento tempore domini Gregorii pape noni (On the existence of Magna Hungaria as related by Friar Ricardus), quoted from a translation by S.A. Anninskiy (1940)]
This implies that the unusual phonological features in Bashkir could in fact have been the result of Tatar-Hungarian intermingling, when the local South Mansi and Majar tribes (=usually Magyar in Hungarian spelling) switched to Kimak-Kypchak-Tatar languages.
"birch" with the loss of -ð- as opposed to the Karakhanid, evidently because of the Great-Steppe influence where the same transition is inherited from an earlier Proto-Central level.
In 605, [...] the Uyghur leader took his tribes to the Khangai Mountains [= in central Mongolia], where a separate group was created, known in Chinese historiographical sources as "the nine tribes". In the Orkhon inscriptions, this group was named Toquz-Oghuz.
[Stepnyye imperii: rozhdeniye, triumf i gibel (The Steppe Empires: birth, triumph and disintegration), Saint Petersburg (2005)].
Therefore, we may assume that Oghuz is nothing but a different pronunciation of Uyghur, which can easily be explained by the widespread usage of the liquid affricate in Mongolian (and most likely in the nearby early Turkic languages and dialects), where /r/-/l/-/s/-/z/ are in some cases pronounced as mere allophones of the same phoneme. In other words, it is not even necessary to add any evidence from the Bulgaric languages, where the /z/ to /r/ mutation is compulsory; the local Khalkha Mongolian data provide enough substantiation, since the -z to -r mutation could have arisen on the basis of incorrect Mongolic-based translations, transcriptions, reinterpretations, Sprachbund phonology, etc. In any case, the hypothesis that Oghuz and Uyghur may have originally been the same ethnonym seems quite plausible, albeit not clearly demonstrated.
For example, the Turks [=the Karakhanid Turks] call a traveler yalkin, whereas they [Oghuz and Qifchaq] call him 'alkin. The Turks call warm water yilig suw, whereas they say ilig with the 'alif. Likewise, the Turks call a pearl yinchu, whereas they call it Jinchu. The Turks call the long hair of a camel yigdu, whereas they call it Jugdu. [Diwanu l-Lugat al-Turk (c. 1073)]
Despite this quote, al-Kashgari also confusingly cites a good dozen Oghuz words beginning with y-, as if either what he had said earlier no longer applied to them, or the reader was supposed to make the y-to-J substitution for himself. The latter seems likely, given that this substitution was recommended by al-Kashgari at the beginning of his book.
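The word-initial y- to J- substitution that al-Kashgari describes can be sketched as a one-step rewrite rule. This is only a minimal illustration: the function name `y_to_j` is hypothetical, the capital `J` simply follows the transcription used in this article, and the rule deliberately covers only the regular y- to J- cases. It does not model the loss of y- (yalkin vs. 'alkin, yilig vs. ilig) or the additional vowel change in yigdu vs. Jugdu.

```python
def y_to_j(word: str) -> str:
    """Replace a word-initial y- with J- (hypothetical helper for illustration)."""
    if word.startswith("y"):
        return "J" + word[1:]
    return word


# Karakhanid forms taken from the al-Kashgari quotations in this section.
karakhanid_forms = ["yinchu", "yatti", "yundum"]

for form in karakhanid_forms:
    # Prints each Karakhanid form next to the form the rule predicts
    # for Oghuz and Qifchaq.
    print(form, "->", y_to_j(form))
```

Applied to the three quoted forms, the rule yields Jinchu, Jatti and Jundum, which are the Oghuz and Qifchaq forms al-Kashgari gives.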
The Uguz and Kifzhak say the words beginning with y- as J-: ul mani Jatti (he reached me) instead of yatti. At-turk say suvda yundum (I bathed in water), whereas they [Oghuz and Qifchaq] say Jundum. Amongst the Turks and the Turkman, there exists this constant rule. [Diwanu l-Lugat al-Turk (c. 1073)]
"One may note that this prothetic h- is very frequent before long vowels and before the following -j- and -v-. However, the rules are not strict, and in general the emergence of h- in Khalaj is unpredictable. The absence of h- in Khalaj is therefore an almost certain sign of *0- in Proto-Altaic, so its presence there may be either original or secondary. We shall thus continue to use Proto-Turkic forms without the initial *h-."
Furthermore, the hypothesis of h- being a unique survivor retained exclusively in Khalaj is simply not statistically viable. If Khalaj were so archaic, other languages would also exhibit similar traces of the Proto-Turkic *h-.
"The people of Khutan [= the city of Khotan along the southern ridge of the Taklamakan desert that still exists] and Kanjak (Känchäk) [= another city further to the east] substitute the 'alifs [= the word-initial hamza plus the letter "A"] by an h (ha:). That is why we do not consider them among the Turks [=pure Karakhanid Turks], for they introduce something foreign into the Turkic speech. For example, the Turks call the father 'ata, whereas they say hata, and the mother 'ana, whereas they say hana." [Diwanu l-Lugat at-Turk].
Surprisingly or not, this observation was made as early as Minorsky's original article (1940), with its first description of Khalaj, so the whole thing must have been evident right from the start but was then overridden by Doerfer's assumptions.
Tense | Yugur | Old Orkhon | Old Uyghur | Karakhanid | Khakas |
Future Tense | -Gu, -gu, -Go, -go; -Gï, -ge, -kï, -ke | -tachï, -dachï; Giy (rarely) | -Gay, -gey | -Gay, -gey, -qay, -kêy | Gai/gei, qai/kei = Optative Mood |
Perfect Tense | -Gan (usually Narrative Past); the -mïsh participle or tense seem to be entirely unknown | -mïsh-, -mish; -Gan- | -mïsh-, -mish- | -mïsh-, -mish; -Gan-, -gen-, -qan, -ken- | -Gan/gen, -xan/ken |
plural | -lar, -nar, -dar, -tar | -lar | -lar | -lar | -lar, -nar, -tar |
you | seler | siz | siz | siz | sirer |
copula | i:re | er- | ärür | ol (3rd pers. copula) | – |
The system of the Salar consonantism is so drastically different from the South Turkic (Oghuz) system, which was supposed to exist for the Salar language in the past, that one involuntary arrives at a conclusion of its secondary, posterior origin, and its dependence upon the neighboring languages, such as Chinese, Dongxiang, Tibetan. [E. Tenishev, Stroj salarskogo jazyka (The structure of the Salar language), Moscow, (1976)]
Tense | Yugur | Salar | Comment |
Present Progressive | ROOT+ïp+par | ROOT+por | This tense is rather innovative, probably from *par/var "there is", as it follows from the examples in the other Salar tense ROOT + Gan var as well as from par-dr "there is"; the relatedness to the verb *bar- "to go" has also been suggested, though Tenishev for some reason assumed that -par is from the Oghuz -yor-. |
Aorist | ROOT+ar (Future) | ROOT+ïr/er (Present-Future) | Common to all Turkic (no taxonomic value) |
The "Yugur" Future | ROOT+qïr | ROOT+qur | Apparently, a unique Yugur-Salar innovation |
The Simple Past | ROOT+te | ROOT+Je | Common to all Turkic languages, but still phonologically innovative, including the striking absence or degradation of personal endings. |
The Gan- Past | ROOT+Gan+tro | ROOT+Gan+dïr | Common outside of Oghuz-Seljuk, but the addition of -dïr or -tro is rather innovative. |
| Yugur | Salar | Comment |
copulas | er, ere, ire | ira, irar; iter, itïr, ider; ideroN (except the 1st person); tïr, dïr, tir, dir; shi, shê < Mandarin | Cf. Old Uyghur ärür, Khalaj är; according to Tenishev, the Salar itïr = ira + tïr (a double copula), just as in emes-tïr, emes-er (a negative copula) |
examples | xo p'er k'i:se i:re "[we] all one people are" | wu pirinige oy iter "this our house is"; men xon iter "I the-khan am"; inJi avu ira vu "a young(man) still he is"; putaGï pir ideroN "their roots one are" | Also used in Salar much in the same way as "right, it is" in English. Man ka'cha yanshaGanï idero? Ider! "What I said, is it right? It is." Men pichtigeni ira mu? Ira. "What I wrote, is it right? It is." |
(3) SOUTHERN (or ORKHON-OGHUZ-KARAKHANID)
(2.2.1) Tian-Shan (or alternatively, Kyrgyz-Kazakh-Uzbek-Uyghur or Kyrgyz-Kazakh-Chagatai or just Kyrgyz-Chagatai, according to the typical representatives).
The exact original homeland of this subtaxon and its temporal period are unclear, but it was probably situated somewhere between the Altai and Tian-Shan Mountains. By the 7th-8th century it must have moved to the foothills of the Tian Shan Mountains, hence the suggested appellation.
(2.2.1.1) Kyrgyz-Kazakh (including Kyrgyz, Kazakh, Karakalpak)
Kyrgyz was apparently affected by Altay Turkic ("Oirot") during the Dzungarian invasion of the 17th-18th centuries, hence its frequent misplacement in other classifications.
(2.2.1.2) Chagatai (including possibly the hypothetical Karluk (?), medieval Chagatai, modern Uzbek and Uyghur and their dialects)
The subgroup is essentially an admixture of the old Uyghur-Karakhanid substratum with the language of the Great-Steppe newcomers. It formed after the Mongol invasion of the Tian Shan in the 13th century. The name "Karluk" from Baskakov's classification is best avoided because our knowledge of the Karluks is rather limited, and their Turkic dialect has not been preserved. Chagatai, by contrast, was a significant and commonly used medieval koine in Central Eurasia, so its name is much more reasonable and recognizable as a taxonomic appellation.
(2.2.2) Kimak (or Kimak-Kypchak-Tatar, according to the most famous representatives of the Kimaks).
All of the ethnicities therein are thought to be descended from the Kimak Confederacy (Kaganate, Khanate) situated near Lake Zaysan. The Kimaks were strongly affected by the linguistic exchange with Oghuz near the Zaysan Passage in the 7th-9th centuries. Baskakov's older name "Kipchak" is best avoided due to the inaccurate and confusing inclusion of Kazakh and Karakalpak, the exclusion of Nogai, etc. Moreover, the actual Kypchaks constituted only a small part of the Kimak subtaxon, apparently concentrated near Kievan Rus, so overestimating their significance at the cost of the Kimaks, the original progenitors of the subgroup, seems rather unjustified.
(2.2.2.1) Karachay-Balkar (including Karachay-Balkar and its dialects)
A linguistically deviating subgroup in the Caucasus Mountains, still evidently of Kimak-Kypchak-Tatar origin.
(2.2.2.2) Golden-Horde (including Sibir Tatar, Bashkir, Kazan Tatar, Mishar Tatar, (Caspian) Nogai, Kumyk, North Crimean Tatar, Central Crimean Tatar, Crimean Karaim, Lithuanian Karaim and other closely related language-dialects)
The formation of most of these Kimak languages is clearly connected with the rise and expansion of the Golden Horde during the 13th-15th centuries. Having formed during a relatively recent period, the Golden-Horde languages still share many common features. Due to the large number of languages in this subgroup, it has been studied rather superficially in this work.
(2.2.2.3) Baraba-Tomsk (including Baraba and probably Tomsk Tatar)
A very special Kimak subgrouping that exhibits certain archaic features and is presently almost extinct. Tomsk Tatar has not been included in this study.
(3.3.1) Yugur (including (West) Yugur (Yughur))
(3.3.2) Salar (including the West and East Salar dialects)
türk dili © 2010