A FALA DATABASE
The A Fala database contains about 225 000 tokens (words) documented in 156 texts. It was compiled from transcribed recordings, which contributed 110 315 words (49%), and from published and unpublished texts written in one of the three varieties of A Fala, which contributed the remaining 114 690 words (51%). However, because of copyright issues, 6 of the written texts had to be removed, so this public version has only 150 accessible texts, with over 222 900 words.
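The figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative and are not part of the database. The word count of the withheld texts (2 094) is taken from the written-text statistics further down in this description.

```python
# Counts as stated in this description (variable names are illustrative only).
spoken_words = 110_315    # words from transcribed recordings
written_words = 114_690   # words from written texts
withheld_words = 2_094    # words in the texts withheld for copyright reasons
total_texts = 156
withheld_texts = 6

total_words = spoken_words + written_words

# Share of each source, rounded to whole percentages.
spoken_share = round(100 * spoken_words / total_words)
written_share = round(100 * written_words / total_words)

# Size of the public version after removing the withheld texts.
public_texts = total_texts - withheld_texts
public_words = total_words - withheld_words

print(f"total words: {total_words}")
print(f"spoken: {spoken_share}%, written: {written_share}%")
print(f"public version: {public_texts} texts, {public_words} words")
```

The exact total comes to 225 005 words, which the description rounds to 225 000.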
The objective was to create a database that reflects both the spoken and the written language, taking several factors into account: equal representation of the three varieties (Lagarteiru, Mañegu and Valverdeñu); participation of both women and men; participation of speakers from different age groups, not only the oldest speakers; and a range of interview topics, from traditional ones such as local agriculture to European funds and their local use. The community of speakers contributed to all stages of the database compilation.
Community participation: approx. 180 participants, 4% of the population of the three villages.
You will need version 8 or 9 of FLEx to open the database.
FLEx download: https://software.sil.org/fieldworks/download/
The database is password-protected. It is available to everyone; to obtain the password, please contact:
- General note – this line is used for extended comments on usage, as the Usages line only offers pre-defined categories.
- Semantic domains – the only semantic domains marked so far are those related to Animals (1.6), Plants (1.5) and Tools (6.7). The categorization is simplified and will be corrected and completed in the future.
- Restrictions – this line reflects the frequency of words. This section is also still to be completed.
no mark = frequent words
A = less frequent words (not marked yet)
B = rare words (not marked yet)
C = very infrequent words – related to the traditional culture, often no longer used, e.g. corsetería
D = very infrequent words – related to Castilian, e.g. lasaña, paracetamol
E = adverbs in -menti; they will not be part of the dictionary, but they appear in the database
F = words that are not included in the dictionary; they may be added after verification
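For anyone processing exported entries outside FLEx, the legend above can be expressed as a small lookup table. This is only a sketch: it assumes the Restrictions value is available as a plain string, with an empty string standing for "no mark". The function name `describe_restriction` is hypothetical, and how FLEx exports this field is not covered here.

```python
# Legend for the Restrictions line, as listed above.
# Assumption: an empty string represents "no mark".
RESTRICTION_CODES = {
    "": "frequent word",
    "A": "less frequent word (not marked yet)",
    "B": "rare word (not marked yet)",
    "C": "very infrequent word, related to the traditional culture",
    "D": "very infrequent word, related to Castilian",
    "E": "adverb in -menti (in the database, not in the dictionary)",
    "F": "word not included in the dictionary, pending verification",
}


def describe_restriction(code: str) -> str:
    """Return the legend text for a Restrictions value, or flag an unknown code."""
    key = code.strip().upper()
    return RESTRICTION_CODES.get(key, f"unknown code: {code!r}")


print(describe_restriction("C"))
print(describe_restriction(""))
```

Unknown values are reported rather than silently dropped, which seems the safer choice while codes A and B are still unmarked in the database.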
- Total tokens/words registered: 110 315
- Total number of recordings: 63 (in 37 interview sessions)
- Total time: 705 min (11 hrs 45 min)
- Video recordings: 61 (94%)
- Audio only recordings: 2 (6%)
- Total number of participants: 67 (37 women, 30 men; 20 of them took part as interviewers, with limited participation)
Number of recordings: 16 (in 12 interview sessions)
Time: 238 min (3 hrs 58 min)
Tokens/words registered: 38 709
Participants: 22 (12 women, 10 men, 6 in the position of interviewers)
Number of recordings: 26 (in 12 interview sessions)
Time: 248 min (4 hrs 8 min)
Tokens/words registered: 37 703
Participants: 19 (11 women, 8 men, 4 in the position of interviewers)
Number of recordings: 21 (in 13 interview sessions)
Time: 219 min (3 hrs 39 min)
Tokens/words registered: 33 903
Participants: 26 (14 women, 12 men, 10 in the position of interviewers)
- Total tokens/words registered: 114 690
- Total number of written texts: 93
- Larger texts (books, theatre plays): 6 (49 305 words)
- Shorter texts (magazine articles, short stories, etc.): 80 (55 308 words)
- Translations: 5 (9 518 words)
- Web texts: 1 (408 words)
- Public announcements: 1 (151 words)
- Total number of authors: 71
- Lagarteiru authors: 33
- Mañegu authors: 12
- Valverdeñu authors: 26
- Texts not available in the public version of the database: 6 (2 094 words)