Introduction to the Database of the Inflectional Morphology of the Romance Verb

This Database has been constructed in connection with the Arts and Humanities Research Council-funded research project Autonomous Morphology in Diachrony: comparative evidence from the Romance languages (AH/D503396/1), carried out at Oxford University first in the Faculty of Medieval and Modern Languages, and latterly in the Faculty of Linguistics, Philology and Phonetics, between October 2006 and December 2010.

The Database offers a representation of the inflectional paradigms of the verb for some 80 Romance varieties. The data may be viewed by language variety, by lexical verb (labelled by etymon), by grammatical category, or by combinations thereof.

Table of Contents

  1. Nature and purpose of the Database
  2. Origins of the data, and coverage
  3. Geographical locations
  4. Transcription and special symbols
  5. Organization
  6. Using the Database
  7. People and contacts
  8. References

Nature and purpose of the Database

The Database is an interpretation of a wide range of published descriptions of Romance verb morphology. It does not purport to reproduce or supplant such descriptions: rather, the data here offered are our attempt to give an account of what others have said about the verb system for each variety described. The act of interpretation is inherently problematic (see particularly our remarks below under 'Origins of the data, and coverage' and 'Transcription and special symbols'), and users who wish to pursue points of particular interest in our data for any given variety should unfailingly consult the reference or references cited for that variety. Our aim is simply to offer to Romance linguists (and morphologists in general) a tool for the comparative analysis of the inflectional morphology of the Romance verb. This Database constitutes the most comprehensive available unified representation of this major subsystem of Romance inflectional morphology, presenting the constituent word-forms of the Romance verb system, and allowing (among other things) observation of distributional patterns of allomorphy and neutralization.

The Database sets out what are conventionally regarded as the 'synthetic' word-forms of the inflectional paradigm of the verb in a wide range of Romance languages. 'Analytic' constructions, for example of the type 'auxiliary have or be + past participle' present in varying measure in all Romance languages (cf., e.g., Ledgeway 2011:452-57), are not specified here, although their component word-forms may be. We are well aware that the boundary between the 'synthetic' and the 'analytic' is not always clear (see Ledgeway ib. 385), and that the components of apparent analytic structures show varying degrees of 'fusion'. In addition, Romance languages show varying degrees of fusion of verb-forms with (clitic) subject pronouns: in Romansh and Ladin varieties, for example, the verb system has a set of special forms with phonologically reduced word-final endings when used in interrogative constructions or in certain other types of construction usually requiring syntactic inversion of subject and verb (see, e.g., Alton and Vittur 1968:47f.; Minach and Gruber 1972:76-80); and for many Gallo-Romance and northern Italo-Romance varieties it could be argued (e.g., Rizzi 1986) that what are conventionally regarded as 'obligatory subject clitics' are analysable as part of the inflectional morphology of the verb. The (current) limitation of the forms in this Database to what are conventionally regarded as 'synthetic' word-forms is dictated not by any particular theoretical stance, but both by tradition and by present practical limitations.

This is a database of paradigms representing the various types of inflectional verb morphology. Existing descriptions rarely state how many verbs belong to any given inflectional type. This means, of course, that the Database can have little to say about the productivity of the types. Romance linguists know, however, that it is usually the verbs which directly continue the Latin inflectional class called ‘first conjugation’ which constitute the productive class, notably in respect of their inflectional endings; another historically productive inflection class is that which continues the Latin fourth conjugation, notably in Daco-Romance (see, e.g., Maiden 2011:212n76).

Origins of the data, and coverage

The data presented in this Database are an interpretation of the very considerable range of (mainly) published scientific studies offering detailed descriptions of Romance verb morphology. In the case of the major standard languages we specify no particular reference source, since the forms are extensively and (usually) uncontroversially established in numerous descriptive and prescriptive grammars. The number of varieties presented here (some 80) may strike some scholars as surprisingly low, all the more because the total number of published studies offering illustration of aspects of the inflectional morphology of the Romance verb exceeds 3000, to our certain knowledge. Very few of these, however, meet the criteria we have set for inclusion of data in this Database. A prime condition of such inclusion is that the reference in question should be a detailed and authoritative description of the variety. In addition, it should offer a comprehensive description of the inflectional morphology of the verb. True, we can rarely be sure that some important verb has not been omitted, but there is in all Romance languages a nucleus of verbs (see below) which habitually show some irregularity, and we have usually discarded sources in which not all of these are mentioned. We have also, with some regret, had to eschew the information provided by works which focus just on 'irregular' verbs, without offering a suitably comprehensive survey of 'regular' ones (e.g., Decurtins 1958).

We further require that the data we take into account should have been obtained from interrogation of one or more native speakers. It would have lain far beyond our current scope, and our practical capacity, to fill in all the geographical lacunae in our coverage by direct fieldwork in under-represented areas. In some cases, however, we have taken advantage of ready access to native-speakers in order to acquire data on certain dialects (see the entries for the Italo-Romance varieties of Mussomeli and Macerata). Needless to say, we welcome suggestions for new datasets meeting our criteria, and it is our intention continually to update and enrich the Database in that way.

Some published descriptions provide rich documentation only for a very limited number of lexemes. Others specify, or unambiguously indicate how to deduce, the forms for hundreds of lexemes. We have usually preferred to make reference to descriptions which allow us to document at least the full inflectional paradigm for the continuants of Latin ambulare, dare, esse, facere, habere, ire, posse, sapere, stare, tenere, uadere, uelle, uenire, uidere. These (semantically basic and also highly frequent) verbs are almost all present in all Romance languages, and are the locus of some major and idiosyncratic types of morphological structure. We also require documentation of the continuants of at least one member of each of the four major Latin conjugational classes (in addition to those given above). We have also tried to use as examples a common core of cognate Romance verbs. There is, therefore, a list of 'usual suspects', comprising reflexes of cantare, portare for the first conjugation; placere, tacere, tenere, uidere for the second; bibere, credere, perdere, trahere, uendere for the third; and audire, dormire, finire for the fourth. Other etyma whose reflexes we have endeavoured to include wherever possible are those of second conjugation ualere and third conjugation cadere, mori, nasci, fugere.

In the Romance domain, descriptions capable of meeting our selection criteria are scarcely to be found before the last quarter of the nineteenth century, with the birth of scientific Romance linguistics. The late nineteenth and the twentieth centuries witness a profusion of detailed dialect descriptions, many of which offer explicit and detailed accounts of the inflectional morphology of the verb. There do, indeed, exist structural descriptions of earlier stages of some Romance languages, but these are rarely the result of direct interrogation of native speakers; rather, and in the nature of things, they are usually derived from written sources and are therefore lacunary. Derivation of complete inflectional paradigms from such sources is usually impossible for all but the most frequently used verbs, and not always possible even then. This same difficulty, of course, applies to modern dialect studies derived from texts. The nature of the problems presented even by the most extensive textual corpora can be appreciated if we consider, for example, the following fact. In the Frantext database, the quarter of a million words representing the writings of Gustave Flaubert happen to offer no examples of certain forms of basic French verbs, such as: the second person plural preterite of avoir ‘to have’; the second person singular preterite of être ‘to be’; the first person plural imperfect subjunctive of aller ‘to go’; or the third person plural present subjunctive of voir ‘to see’.

While we have therefore not referred to text-based sources, we have felt it necessary to make one exception. In order to be able to include in our database the last remnant of the Dalmatian branch of the Romance languages, Vegliote, we have drawn on Bartoli (1906) whose grammatical analysis is mainly extracted from texts and particularly from the elicited narratives of the last (near-) native speaker. A glance at the Database entries for this variety will show how unsatisfactory and fragmentary is the use of such material, major lacunae remaining even for forms of the commonest verbs, but we have done so faute de mieux. Perhaps surprisingly, few of the major Romance linguistic atlases offer more than partial paradigms for the verbs they illustrate (the present indicative usually having pride of place), so that such atlases have been little used.

Where possible, we have sought to base our interpretation on studies dealing with varieties spoken in a particular village or town. Sometimes, however, we have found that the most comprehensive descriptions for the varieties of interest are studies which comport a degree of abstraction away from the local details, being descriptions of (to take two actual examples) 'Surselvan' or 'Friulian' in general. It has been our feeling, in such cases, that the disadvantages of abstraction from local reality, and a probable idealization of the data which we need to acknowledge, are counterbalanced by the breadth and detail of the description offered.

The criteria for inclusion of some variety in the Database have also posed some problems for geographical coverage, which while extensive remains very uneven. It is our overall impression that descriptions tend to decline in detail and quality the closer the variety described is to some standard language. The all too often tacit assumption is that forms that are not specified are the same as the corresponding forms in the standard, but in the absence of unambiguous statements to this effect, we cannot safely conclude anything. For example, for the southern half of the Iberian Peninsula (matters are different for the Pyrenean and Cantabrian domain) there exists very little by way of dialectological studies offering the kind of detail we require. This is probably because these 'southern' Ibero-Romance varieties really are relatively similar to standard Portuguese and standard Spanish, so that authors have felt it unnecessary to comment in any detail on the verb, but one cannot be sure. In this connection we might add that the Acadian varieties of French listed in the Database show that it is a by no means safe assumption that Romance varieties which one might assume a priori to be similar to the major Romance languages actually are so.

Sometimes it has simply been hard to find the right kind of description for a particular area. This, for some reason, is true of Friulian, one of the best descriptions of which pertains to dialects transplanted by émigrés into Romania (whence the rather surprising 'country' classification of some of the Friulian data as 'Romania'). Even in geographical areas for which we have good coverage, we have not necessarily captured every unusual or remarkable development in the verb morphology of those areas. Naturally, such developments are often presented in studies specifically dedicated to the phenomenon at issue, without accompanying illustration of the rest of the paradigm, and these studies cannot be used as the basis of data in this Database for the reasons explained earlier. As an example we might cite the inflectional marking of the gender of the subject as revealed for the Apennines between Modena and Bologna by Loporcaro (1996), or for Ripatransone in the Marche by Parrino (1967) and Harder (1998).

In a few cases (for example for Gascon), we are currently unable to make the data publicly available, but hope to do so in the near future.

Geographical locations

The Database provides an indication of geographical location for each variety presented. For data from single localities (villages or towns) we have provided the corresponding latitude and longitude, and a corresponding marker on a map. In the case of national or regional languages which cannot readily be assigned to any one geographical point, we have generally given the variety a map reference corresponding to that of the capital city or chief town of the country or region in which the variety originates (for well-known historical reasons, in the case of Italy we choose Florence). In the case of non-national varieties spoken over a continuous area in more than one country (e.g., Megleno-Romanian), or scattered over several countries (e.g., Aromanian), we have given the latitude and longitude of some locality in the area, which is either central or where the largest concentration of speakers is found.

Transcription and special symbols

The descriptions from which we draw our information range in their representation of phonetic detail from quite 'narrow' phonetic transcriptions to rather approximate representations using conventional orthography for the language concerned, or modifications of conventional orthography. There is no denying that the phonetic accuracy of our data varies considerably from variety to variety, depending on the nature of our information. However, to facilitate comparison we have attempted to give a broad IPA (International Phonetic Association) rendering for all the forms in the Database, paying careful attention to all indications in the material consulted about the value of the symbols there used. For the most part, we are confident that the data presented are at least a reliable phonemic representation of the data. In cases of substantial doubt, we have made a note to this effect. It bears repetition, however, that what is presented here is an interpretation of other material, and that the references provided here should always be regarded as the sole authority on phonetic representation.

Most Romance languages (with the exception of French and some other northern Gallo-Romance varieties) display phonologically distinctive stress at the level of the phonological word. Stress has therefore been indicated for all forms in the relevant varieties, including for monosyllabic verb-forms (which usually retain their stress within the phonological word in combination with clitic pronouns; but cf. Loporcaro 2011:80).

We do not always indicate all the forms for the inflectional paradigm of a given verb. Where we merely lack information, we specify Not given or, in cases where we have gaps for particular person-number combinations, we leave a blank. Where, however, we have explicit information (or firm grounds to believe) that no form or set of forms exists, we have specified No form; in case of a gap for a particular person-number combination, we give Ø. In a few cases where there exists a small degree of doubt over the correctness of a form thus deduced, we have made a note to this effect in the Comments box for the relevant verb.
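The four conventions just described distinguish "no information" from "known not to exist", at the level both of a whole set of forms and of a single person-number cell. A minimal sketch of how a user might keep them apart when processing exported data follows; the names `CellStatus`, `classify_cell` and `is_known_gap` are hypothetical and do not reflect the Database's own storage format.

```python
from enum import Enum


class CellStatus(Enum):
    """Hypothetical labels for the situations described in the text."""
    ATTESTED = "attested"     # a form is recorded
    NOT_GIVEN = "Not given"   # set of forms: the source gives no information
    BLANK = ""                # single cell: the source gives no information
    NO_FORM = "No form"       # set of forms: believed not to exist
    NULL_CELL = "Ø"           # single cell: believed not to exist


def classify_cell(value: str) -> CellStatus:
    """Map a displayed cell value to its status (illustrative only)."""
    text = value.strip()
    for status in (CellStatus.NOT_GIVEN, CellStatus.NO_FORM,
                   CellStatus.NULL_CELL, CellStatus.BLANK):
        if text == status.value:
            return status
    return CellStatus.ATTESTED


def is_known_gap(value: str) -> bool:
    """True only when the data positively indicate that no form exists."""
    return classify_cell(value) in (CellStatus.NO_FORM, CellStatus.NULL_CELL)
```

Keeping the two kinds of gap separate matters for any count of defective paradigms: only `No form` and `Ø` are evidence of absence.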

The set of IPA characters currently used in this Database is the following; the corresponding Unicode identifiers are also given:

Character (Unicode) Character (Unicode) Character (Unicode) Character (Unicode) Character (Unicode) Character (Unicode) Character (Unicode) Character (Unicode)
̃  (0303) ̩  (0329) ̯  (032F) ˈ  (02C8) ː  (02D0) ̞  (031E) ̝  (031D)    
a  (0061) æ  (00E6) ɐ  (0250) ɑ  (0251) ɒ  (0252) b  (0062) β  (03B2) c  (0063)
ç  (00E7) d  (0064) ð  (00F0) ʣ  (02A3) ʤ  (02A4) ɖ  (0256) e  (0065) ə  (0259)
ɘ  (0258) f  (0066) g  (0067) ɣ  (0263) h  (0068) i  (0069) ɨ  (0268) j  (006A)
ʲ  (02B2) ʝ  (029D) ɟ  (025F) k  (006B) l  (006C) ʎ  (028E) m  (006D) n  (006E)
ɲ  (0272) ŋ  (014B) o  (006F) ø  (00F8) œ  (0153) ɶ  (0276) ɔ  (0254) p  (0070)
ɸ  (0278) r  (0072) ʀ  (0280) ɹ  (0279) ɾ  (027E) ʁ  (0281) s  (0073) ʃ  (0283)
t  (0074) θ  (03B8) ʦ  (02A6) ʧ  (02A7) u  (0075) ʌ  (028C) ɥ  (0265) ʊ  (028A)
v  (0076) w  (0077) x  (0078) χ  (03C7) y  (0079) z  (007A) ʒ  (0292)  
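The four-digit identifiers in the table are hexadecimal Unicode code points. As a small illustration, the sketch below converts an identifier back into its character and looks up its official Unicode name using Python's standard `unicodedata` module; the helper names (`char_from_hex`, `describe`, `is_combining`) are our own and purely illustrative.

```python
import unicodedata


def char_from_hex(code: str) -> str:
    """Turn a hexadecimal identifier such as '0250' into its character."""
    return chr(int(code, 16))


def describe(code: str) -> str:
    """Return 'U+XXXX NAME' for a hexadecimal identifier from the table."""
    ch = char_from_hex(code)
    return f"U+{code.upper()} {unicodedata.name(ch)}"


def is_combining(code: str) -> bool:
    """True for diacritics that attach to the preceding character."""
    return unicodedata.combining(char_from_hex(code)) != 0


if __name__ == "__main__":
    for code in ("0303", "02C8", "0250", "014B"):
        print(describe(code), "(combining)" if is_combining(code) else "")
```

The distinction between combining and spacing characters is worth checking when searching the data, since a combining diacritic such as the tilde (0303) follows the vowel it modifies, whereas the stress mark (02C8) is a separate spacing character.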


Organization

For each of the Varieties there is specified a name, a linguistic subclassification, and geographical location, as well as the basis for our description, and the approximate 'linguistic date' at which the data were obtained, divided into late 19th to early 20th century, mid to late 20th century and late 20th to early 21st century. The Notes mention any problematic or especially noteworthy characteristics of the variety described.

The inflectional paradigms of individual lexical verbs are identified by the Etymon of the verb (there are well over 300 Etyma in this Database). The Etymon is the presumed historically underlying form of the lexeme displayed, and is categorically not intended as a definitive statement about the etymology of that lexeme (this would be a hazardous undertaking involving issues that, usually, have little bearing on morphology), but simply as an identifier which can serve to facilitate cross-linguistic comparison of lexically cognate verbs. It is, in effect, a means of signalling cognacy. In some cases, the cognacy of verbs is partly masked by the fact that in some, but not all, varieties the same etymon appears in a derived form, preceded (historically) by prepositions or other prefixes. In such instances, we have usually placed the preposition/prefix in parentheses, after the basic etymon (e.g., ‘SEDERE (AD+)’ for forms derivable from ADSEDERE).
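The labelling convention for prefixed etyma can be unpacked mechanically. The following is a minimal sketch, assuming that all such labels follow the pattern of the single example given above (base form, then the prefix and a plus sign in parentheses); the names `EtymonLabel`, `parse_etymon` and `full_form` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class EtymonLabel(NamedTuple):
    base: str               # the basic etymon, e.g. 'SEDERE'
    prefix: Optional[str]   # the historical prefix, e.g. 'AD', if any


# Pattern assumed from the example 'SEDERE (AD+)'; other label shapes
# that may occur in the Database are not handled here.
_LABEL = re.compile(r"^(?P<base>[^()]+?)\s*(?:\((?P<prefix>[^()+]+)\+\))?$")


def parse_etymon(label: str) -> EtymonLabel:
    """Split an etymon label into base and optional prefix."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized etymon label: {label!r}")
    return EtymonLabel(match.group("base"), match.group("prefix"))


def full_form(label: str) -> str:
    """Rebuild the prefixed form, e.g. 'SEDERE (AD+)' -> 'ADSEDERE'."""
    parsed = parse_etymon(label)
    return (parsed.prefix or "") + parsed.base
```

Grouping verbs by `base` rather than by the full label brings together prefixed and unprefixed cognates, which is the comparison the convention is designed to support.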

Overwhelmingly, the Language of the etymon is Latin; occasionally it is Germanic or Slavic (the latter especially for Romanian). In cases of doubt about the origin of some verb we have referred to Meyer-Lübke (1935) and to other major etymological dictionaries of the individual Romance languages. The Language of the etymon indicator 'Romance' is sometimes used to indicate the origin of lexemes which, while not attested in Classical Latin, are present in all or most Romance languages and have no external origin. A case is *passare, which appears to have no precedent in Latin, but is clearly derivationally created from Latin passus 'step', and extensively attested across the Romance languages. We have followed the convention of presenting attested Classical Latin forms in capitals, and all other etyma either in conventional orthography (if one exists) for the language in question, or in broad IPA. Etyma preceded by an asterisk (*) are hypothetical and unattested; a few also preceded by '?' are hypothetical, unattested, and open to serious question. In cases where we are wholly unsure of the etymon, or of the source language (doubt in one domain is usually accompanied by doubt in the other), we may give a very recent and 'local' form. This occurs sometimes, for example, in Surselvan Romansh, where there are a number of verbs having interesting morphological properties which seem to have expressive or onomatopoeic origins, and which we have simply marked as 'Romansh'. Certain verbs have more than one etymon, and are therefore 'suppletive' by 'incursion' (in the terminology of Corbett 2007). In these cases, we list all the etyma. The commonest cases involve the verbs 'to go' (e.g., ambulare / ire / uadere for French), and 'to be' (e.g., esse / stare for French).

In words derived from Classical Latin, we have also indicated (under Latin Conjugation) the Latin inflectional class membership of the etymon, following the traditional division into four major classes: the first (with the theme vowel /a/ and infinitive in -āre), the second (with the theme vowel /e/ and infinitive in -ēre), the third (arguably analysable in Latin as lacking a theme vowel, and with the infinitive in -ere) and the fourth (with the theme vowel /i/ and infinitive in -īre). The designation 'special' means that the Latin verb was idiosyncratic and could not be easily classified as belonging to any one of the major inflectional classes.
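The four-way division by infinitive ending can be expressed as a simple lookup. The sketch below is illustrative only: it assumes infinitives written with macrons (the etyma quoted in this introduction are given without them), it looks at nothing but the ending, and the function name `latin_conjugation` is hypothetical. Deponent verbs such as mori and nasci, and verbs that the Database labels 'special', would need a hand-made list in addition.

```python
import unicodedata

# Endings in the order they are tested; the plain '-ere' of the third
# conjugation never clashes with '-ēre' because the vowels differ.
_ENDINGS = (
    ("āre", "first"),
    ("ēre", "second"),
    ("īre", "fourth"),
    ("ere", "third"),
)


def latin_conjugation(infinitive: str) -> str:
    """Classify a macron-marked active infinitive purely by its ending."""
    form = unicodedata.normalize("NFC", infinitive.lower())
    for ending, label in _ENDINGS:
        if form.endswith(unicodedata.normalize("NFC", ending)):
            return label
    # Anything else (e.g. esse, uelle, posse) falls outside the four classes.
    return "special"
```

For example, `latin_conjugation("cantāre")` gives 'first' and `latin_conjugation("esse")` gives 'special'; a purely ending-based rule of this kind should be checked against the Latin Conjugation field actually recorded for each etymon.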

The Meaning specified for each lexeme is intended to be broadly indicative. It should be made absolutely clear that this is the meaning in the language described, and not the original meaning of the etymon. Thus the Latin etymon sapere 'taste, smell' is defined as 'know', since that is its primary meaning in the Romance languages where it survives. In some languages there are verbs which function solely as auxiliaries in analytic constructions, or which have variant morphology according to whether they are used as lexical verbs or auxiliaries (e.g., the reflexes of Latin habere in Spanish and Romanian respectively). In such cases, the Meaning of the specialized auxiliary form is specified as 'auxiliary'.

Grammatical categories

The data are organized according to fairly traditional and easily recognizable categories (but see our comments below on the labels used). Under Grammatical categories there appear the major tense and mood classes, and under Verb forms the actual word-forms of the inflectional paradigms as specified for 'finiteness': the 'non-finite' Verb forms are Infinitive, Gerund and Past Participle; the 'finite' Verb forms indicate person, number, tense and mood, subdivided according to the person-number combinations 1sg (first person singular), 2sg (second person singular), 3sg (third person singular), 1pl (first person plural), 2pl (second person plural), 3pl (third person plural). The descriptive labels used to identify these blocs are designed to facilitate cross-linguistic comparison between cognate forms. These labels may fall far short of describing the functions of the forms listed in the varieties consulted, but this is not our aim. Rather as with the Etyma, we have sought to offer indicators for classes of cognate forms originally united by a common grammatical function, not to offer through the labels a description of the functions of such forms in the variety described.

The notion that some set of forms is the 'continuant' of some Latin set of forms, or that sets of forms are cross-linguistically cognate, is often problematic. For example, it is unlikely that any Romance linguist would cavil at the assignment of a set of forms in each Romance language to the category ‘present indicative’; there is a substantial continuity of form (especially in regard of the inflectional endings) with the Latin ‘present indicative’. The problem typically arises where there are ‘partial gaps’, such that some person-number combinations for a particular category have been ‘imported’ from other categories. Thus in Romanian the ‘present subjunctive’ continues the Latin ‘present subjunctive’ forms only in the third person. First and second person forms of the modern Romanian present subjunctive have been taken over from the corresponding forms of the present indicative, the sole remaining exception being the verb ‘to be’, all of whose forms still continue Latin present subjunctive forms. Should we then state in our Database that, from a comparative-historical perspective, Romanian has a morphological ‘present subjunctive’ only in the third person, and leave the first and second person forms blank? Perhaps, but we have chosen not to do this because there remains within the sub-paradigm of the present subjunctive a set of forms which do continue the Latin present subjunctive and there has never been, as far as one can tell, any kind of historical ‘hiatus’ between the original first and second person present subjunctive forms and the (originally) present indicative forms that occupy those cells today. Apparently there has been no stage of ‘defectiveness’ in which the third person present subjunctive forms were paradigmatically isolated, the gaps subsequently being filled by the indicative. 
That there has been no ‘defective’ stage is equally indicated by the verb ‘to be’, which has kept all cells of the present subjunctive filled, with distinctively subjunctive forms. Rather, there appears to have been an organic and gradual transition, such that the present subjunctive forms (already fairly similar in form to those of the indicative) were replaced by their present indicative counterparts. In this sense, there is a genuine ‘continuity’ between the Latin ‘present subjunctive’ and the Romanian ‘present subjunctive’, even if it is one that is mediated not only by the regular effects of sound change, but also by extensive analogical adjustment. Yet there is a more extreme, and even more problematic, case. What are we to make of those Istro-Romanian dialects which have carried replacement by the indicative forms to the ultimate point, so that even the third person forms have finally ceded to replacement by their indicative counterparts? The fact that the present subjunctive of the verb ‘to be’ still preserves a full set of distinctive forms in these varieties, and that a system like that of modern standard Romanian and other Romanian dialects almost certainly underlies the Istro-Romanian situation, has led us to present the relevant data under ‘present subjunctive’. Undeniably, the choice is questionable, but we feel that it would be frankly misleading in our taxonomy to suggest that the ‘present subjunctive’ is simply discontinued in Istro-Romanian. There may be other places in our Database where such a treatment would also have been appropriate (cf. the treatment of the continuant of the Latin pluperfect subjunctive in the entry for French - Acadian - Pubnico), but where evidence of an ‘organic’ transition, in which some forms of category A are replaced by those of category B while maintaining an unbroken paradigmatic relationship with surviving forms of category A, is less readily forthcoming.

For an overview of the grammatical categories and corresponding forms described in this Database, and their uses see, for example, Lausberg (1976:§§787-948). Some of our labels are explicitly historical (e.g., ‘Continuant of Latin X’): these are appropriate where cognate forms have diverged so widely in respect of their functions that any label suggesting some shared cross-linguistic function would be misleading. In other cases, the same label (e.g., ‘present indicative’) is conventionally used to describe the relevant, and historically cognate, set of forms across Romance, and we have seen no reason to replace it with a more openly historical one. The labels used are:

  • Infinitive. This label is sanctioned by usage across Romance and also for the historically underlying set of Latin forms. It is a non-finite form (in Latin, usually comprising the verb stem + re) implicated in a range of grammatical functions and often having clausal functions. By convention among Romance linguists it is the 'citation form' for verb lexemes. Strictly speaking, the Romance infinitives continue the Latin imperfective infinitive, the old perfective infinitive being extinct.
  • Gerund. The Romance gerund is a (usually invariant) form continuing the Latin gerund, a form of verbal noun, and in Romance it usually has circumstantial clausal functions ('while / by doing something') compatible with an origin in the ablative.
  • Past participle. The Latin past participle was, broadly speaking, a verbal resultative adjective, with a meaning roughly paraphrasable as 'in the state resulting from the action / state expressed by the verb'. It is continued in Romance, although the functional load associated with it has greatly increased, as a consequence of the many Romance morphosyntactic innovations, including new perfective verb forms, comprising auxiliary verb + past participle, a type with only a very restricted distribution in Latin. Verbs which (usually for reasons associated with their lexical semantics) lacked a past participle in Latin have often come under pressure to create a past participle in order to provide forms for the innovatory 'auxiliary + past participle' constructions. Therefore, not every 'past participle' in Romance (for example, that of reflexes of esse 'be') has an antecedent in Classical Latin. As in Latin, all Romance past participles inflect for number and gender, although under circumstances which vary considerably from variety to variety. In so far as these inflectional properties belong with nominal morphology, we have normally specified only the 'masculine singular' form of the Romance past participles.
  • Present indicative. This label is consecrated by usage across Romance and also for the historically underlying set of Latin forms.
  • Present subjunctive. As for the 'Present indicative', except that the label is less appropriate for Daco-Romanian, where the relevant forms have no particular association with tense (there is one, tenseless, synthetic form of the subjunctive, continuing the Latin present subjunctive).
  • Imperfect indicative. This label is used across Romance for the past tense imperfective indicative forms deriving from the Latin past tense imperfective indicative forms.
  • Continuant of Latin pluperfect indicative. This form (more precisely, the 'continuant of the Latin past perfective indicative') survives or survived in many 'western' Romance varieties and Italo-Romance (possibly also in Dalmatian). It manifests a variety of functions in Romance (pluperfect in Portuguese and medieval Spanish, imperfect subjunctive in Spanish, conditional in some Italo-Romance and Gallo-Romance varieties), so that reference by means of an 'etymological' label seems most convenient.
  • Continuant of Latin pluperfect subjunctive. This form (more precisely, the 'continuant of the Latin past perfective subjunctive') survives or survived in virtually all Romance varieties. The outcome in most areas is a form known as the 'imperfect subjunctive', retaining functions associated with the subjunctive but abandoning the associations of perfectivity/anteriority of its Latin antecedent. In Daco-Romance, however, it is the subjunctive functions that are lost, while a meaning of past anteriority is broadly conserved. Again, reference by means of an 'historical' label seems most convenient cross-linguistically.
  • Continuant of Latin future perfect / perfect subjunctive. More precisely, this form is the 'continuant of Latin future perfective indicative / present perfective subjunctive'. It is generally accepted to be a conflation of two sets of Latin tense-aspect-mood forms which were already formally conflated, in part, in Classical Latin. They were homophonous, with the exception of the first person singular (cf. prs.pfv.sbjv and fut.pfv 3sg fecerit, dederit, portauerit vs prs.pfv.sbjv 1sg fecerim, dederim, portauerim; fut.pfv 1sg fecero, dedero, portauero). It is certainly the case that no Romance language shows any sign of differentiating the continuants of the future perfect from those of the perfect subjunctive, and the fact that the Spanish and Portuguese outcomes are a 'future subjunctive' suggests functional conflation as well. These Ibero-Romance forms are clearly cognate with the Vegliote future, and the Daco-Romance synthetic conditional (still surviving in Istro-Romanian and Aromanian). See further Maiden (2009).
  • Preterite. This form could equally have been labelled 'continuant of the Latin present perfective indicative'. 'Preterite' is the term used to describe this form in descriptions of a number of Romance languages, and is therefore used here. The form survives in most Romance varieties to this day and is attested in all varieties of which we have historical records. Its typical function across the Romance languages is that of indicating a completed ('perfective') action or state in the past; but in Acadian French, for example, the form labelled 'preterite' has assumed a different function.
  • Romance future and Romance conditional. These labels refer to two related classes of synthetic word-forms attested across many 'western' Romance languages together with much of Italo-Romance. They are without precedent in Classical Latin, but were formed historically from a reflex of the Latin infinitive followed by an originally auxiliary form of habere 'to have': in the present indicative in the case of the 'Romance future', and in a past tense form (usually imperfect indicative, but sometimes preterite in Italo-Romance) in the 'Romance conditional'. Both forms (at least originally) indicated futurity (respectively, in relation to the time of speaking and in relation to a reference time in the past). The conditional is also a form used in counterfactual constructions (e.g., 'If he were here he would do it').

We have not included the continuants of the Latin present participle. Traditional grammars of Romance languages often include this among the forms of the inflectional paradigm of the verb. The present participle was in Latin a kind of verbal adjective (usually with active value), characterized (outside the nominative singular, in -ns), by the formative -nt-: thus cantare, audire, present participle (accusative singular) cantantem, audientem. To the extent that remnants of the present participle survive in Romance, they generally do so in a lexically and semantically erratic way (see, for example, Maiden and Robustelli 2007:58-60 for Italian). Most verbs do not have a present participle, and in those that do its very formal and semantic idiosyncrasies indicate that it lies outside the domain of inflectional morphology as traditionally defined, and squarely within that of 'derivational' morphology. French verbs have a set of forms orthographically in -ant which may represent a historical conflation of the reflexes of the gerund and the present participle. We have listed these forms under Gerund.

Some varieties specify grammatical categories additional to those generally listed for all. Two cases in point are varieties of Sardinian and Galego-Portuguese. In the former it appears that, uniquely among modern Romance languages, there survives a Continuant of Latin imperfect subjunctive, and this category has accordingly been added for the relevant Sardinian varieties. The same claim has been made about the origins of what is usually known as the ‘Portuguese inflected infinitive’, but here there is considerable controversy (see our Note for Portuguese), and we have taken the more cautious line of classifying it as, simply, the ‘Portuguese “inflected infinitive”’.

Using the Database

Varieties gives access to the dataset for each of the varieties described in the Database. Each dataset contains information on the linguistic classification of the variety, its geographical location, the source of the data, and a list of verb lexemes by etymon. By clicking on the relevant lexeme, the inflectional paradigm for that verb can be accessed.

Simple search allows the user to search, within a given variety or across the Database as a whole, for particular grammatically or phonologically specified forms. These may be accessed by using the drop-down menus, or by specifying the items sought directly in the 'text' boxes. Advanced search offers additional distributional information, for example: ‘in which varieties does a particular etymon or verb-form occur?’; ‘for which etyma do particular verb-forms occur?’. Text search allows 'tailor-made' searches for complex patterns of intersection of 'grammatical categories' and 'verb forms' (for example, 'provide all the forms of the present subjunctive together with those of the first person singular and third person plural present indicative in Italo-Romance varieties'). Text search also contains a 'Help' section further exemplifying the procedure for making searches of this kind.
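
The example query quoted above for Text search can be pictured as a filter over records that pair a variety, an etymon, a grammatical category and a word-form. The following Python sketch is purely illustrative and is not the Database's own query syntax: the record layout, the category labels (prs.ind, prs.sbjv), the function name select_forms and the sample transcriptions are all assumptions introduced here, and the sample forms are not extracted from the Database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordForm:
    """One hypothetical record: a single inflected form of one verb."""
    variety: str
    group: str          # broad classification, e.g. "Italo-Romance"
    etymon: str         # verbs are labelled by etymon
    tense_mood: str     # e.g. "prs.ind", "prs.sbjv" (labels assumed)
    person_number: str  # e.g. "1sg", "3pl"
    form: str           # IPA transcription with stress marked


# Illustrative sample only; not data taken from the Database.
FORMS = [
    WordForm("Italian", "Italo-Romance", "uenire", "prs.ind", "1sg", "ˈvɛŋɡo"),
    WordForm("Italian", "Italo-Romance", "uenire", "prs.ind", "2sg", "ˈvjɛni"),
    WordForm("Italian", "Italo-Romance", "uenire", "prs.ind", "3pl", "ˈvɛŋɡono"),
    WordForm("Italian", "Italo-Romance", "uenire", "prs.sbjv", "1sg", "ˈvɛŋɡa"),
    WordForm("Italian", "Italo-Romance", "uenire", "prs.sbjv", "3pl", "ˈvɛŋɡano"),
    WordForm("Spanish", "Ibero-Romance", "uenire", "prs.ind", "1sg", "ˈbeŋɡo"),
]


def select_forms(records, group):
    """Return all present subjunctive forms, plus the 1sg and 3pl
    present indicative, for varieties belonging to the given group."""
    wanted_indicative = {"1sg", "3pl"}
    return [
        r for r in records
        if r.group == group
        and (
            r.tense_mood == "prs.sbjv"
            or (r.tense_mood == "prs.ind"
                and r.person_number in wanted_indicative)
        )
    ]


if __name__ == "__main__":
    for record in select_forms(FORMS, "Italo-Romance"):
        print(record.variety, record.tense_mood, record.person_number, record.form)
```

Here the intersection of 'grammatical categories' is expressed as a single boolean condition: the subjunctive is admitted in all persons, the indicative only in the two persons named in the query.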

Inputting data (for users with the appropriate level of access)

New data may be input by clicking on the 'edit' button (where visible) for the appropriate section of the Database. All verb-forms must be entered in characters of the International Phonetic Alphabet, and normally (but see Transcription and special symbols) stress must be marked for each word-form, including monosyllables.
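
A contributor might wish to check a batch of forms against the stress requirement before entering them. The following Python sketch shows one possible check; it is not part of the Database itself. It assumes that stress is written with the IPA primary stress mark (U+02C8), which the text above does not itself specify, and the function names has_stress_mark and unstressed_forms are hypothetical. The exceptions described under Transcription and special symbols are not modelled.

```python
PRIMARY_STRESS = "\u02c8"  # IPA primary stress mark (assumed convention)


def has_stress_mark(word_form: str) -> bool:
    """Return True if the transcription contains a primary stress mark."""
    return PRIMARY_STRESS in word_form


def unstressed_forms(word_forms):
    """Return the forms that lack a stress mark, in their original order.

    Monosyllables are not exempted: they too are expected to carry a mark.
    """
    return [form for form in word_forms if not has_stress_mark(form)]


if __name__ == "__main__":
    # Illustrative input: the second item is a monosyllable without a mark.
    batch = ["\u02c8fa", "fa", "ve\u02c8nite"]
    print(unstressed_forms(batch))
```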

People and contacts

The following have contributed to building this Database:

  • Silvio Cruschina
  • Maria Goldbach
  • Marc-Olivier Hinzelin
  • Martin Maiden
  • John Charles Smith

We are grateful also to Chiara Cappellaro, Louise Esher, Paul O’Neill, Stephen Parkinson, Mair Parry, Nicolae Saramandu, Andrew Swearingen, Francisco Dubert García and Tania Paciaroni for their assistance.

Any queries or suggestions regarding the Database should be directed to: and / or

References

  1. Alton, J.B. and Vittur, Franz 1968. L ladin dla val Badia: Beitrag zu einer Grammatik des Dolomitenladinischen. Bressanone: Weger.
  2. Bartoli, Matteo 1906. Das Dalmatische. Altromanische Sprachreste von Veglia bis Ragusa und ihre Stellung in der Apennino-Balkanischen Romania, 2 vols., Vienna: Hölder.
  3. Corbett, Greville 2007. 'Canonical suppletion, typology and possible words'. Language 83:8-42.
  4. Harder, Andreas 1998. 'La declinazione dei verbi in un dialetto di transizione nelle Marche'. In G. Ruffino (ed.) Atti del XXI Congresso internazionale di linguistica e filologia romanza. Sezione 5. Dialettologia, geolinguistica, sociolinguistica. Tübingen: Niemeyer, 389-99.
  5. Lausberg, Heinrich 1976 (2nd ed.). Linguistica romanza. Milan: Feltrinelli.
  6. Ledgeway, Adam 2011. 'Morphosyntactic persistence from Latin into Romance'. In M. Maiden, J.C. Smith and A. Ledgeway (eds) The Cambridge History of the Romance Languages. Cambridge: CUP, 382-471.
  7. Loporcaro, Michele 1996. 'Un caso di coniugazione per genere del verbo finito in alcuni dialetti della montagna modenese e bolognese'. Zeitschrift für romanische Philologie 112:458-77.
  8. Loporcaro, Michele 2011. 'Syllable, segment and prosody'. In M. Maiden, J.C. Smith and A. Ledgeway (eds) The Cambridge History of the Romance Languages. Cambridge: CUP, 50-108.
  9. Maiden, Martin 2009. 'Osservazioni sul futuro dalmatico (e guascone)'. Bollettino linguistico campano [2007] 11/12:1-19.
  10. Maiden, Martin 2011. 'Morphophonological persistence'. In M. Maiden, J.C. Smith and A. Ledgeway (eds) The Cambridge History of the Romance Languages. Cambridge: CUP, 155-215.
  11. Maiden, Martin and Robustelli, Cecilia 2007. A Reference Grammar of Modern Italian. London: Hodder Arnold.
  12. Meyer-Lübke, Wilhelm 1935. Romanisches Etymologisches Wörterbuch. Heidelberg: Winter.
  13. Minach, F. and Gruber, T. 1972. La rujneda de Gherdëina. Saggio per una grammatica ladina. Urtijëi: Typak.
  14. Parrino, Flavio 1967. 'Su alcune particolarità della coniugazione del dialetto di Ripatransone'. L’Italia dialettale 30:156-66.
  15. Rizzi, Luigi 1986. 'On the status of subject clitics in Romance'. In O. Jaeggli and C. Silva-Corvalán (eds) Studies in Romance Linguistics. Dordrecht: Foris, 137-52.

[Martin Maiden, 16th March MMXI]