Multilingual Hispanic Speech in California (MuHSiC)

Languages and Applied Linguistics

WHO: Mark Amengual, Damaris García, Karen Hinojosa, Silvia Íñiguez, Rosario Méndez, Joscelyn Osorio, Cassandra Ponce Ventura, Claire Skelly, Dacia Van Wormer


WHAT: This research team contributes to developing a robust and linguistically rich corpus of bilingual Spanish-English speech samples from 200 hours of sociolinguistic interviews and naturalistic conversations among speakers of diverse social profiles and regional origins throughout California. The audio recordings will be available on an open website where researchers, teachers, students, and the public can access a linguistic map of California Spanish-English bilingual speech.

This project examines the speech patterns of speakers who grew up in a Spanish-speaking household and speakers who grew up in an English-speaking household. By comparing the linguistic features that make up the speech of Spanish-English bilinguals from diverse backgrounds, we will be able to offer a unique source of insight into the cognitive and linguistic abilities of a bilingual population that has traditionally been overlooked in language research despite the continuous increase of early and late Spanish speakers in California and the US. To this end, we address three main research questions:

Q1 - How are Hispanic and/or Latinx-based identities and ideologies negotiated through the usage of Spanish and English in California, and how do these strategies compare to those observed in other prominent US bilingual communities?

Q2 - What are the dynamics of language use amongst: (a) California bilinguals who acquired Spanish at home and English in school, (b) California bilinguals who acquired English at home and Spanish at school, and (c) first-generation Spanish-speaking immigrants who currently reside in California? Moreover, do these dynamics suggest the emergence of new bilingual norms for language use in the US?

Q3 - What linguistic features uniquely characterize California Spanish, and how do these relate to speakers’ conceptions of Spanglish, US Spanish, and prescriptive norms of correct Spanish?

To address the research questions above, this project is developing a robust and linguistically rich corpus of Spanish, English, and bilingual Spanish-English speech samples (Multilingual Hispanic Speech in California; MuHSiC) culled from sociolinguistic interviews and naturalistic conversations among speakers of diverse social profiles and regional origins throughout California. High-quality recording devices are being used to collect the speech samples. MuHSiC is a versatile corpus that can be used to analyze linguistic features at multiple levels of linguistic analysis (phonetics, phonology, morphosyntax, pragmatics, etc.).

The research plan’s primary components are (1) the creation of an oral speech corpus of Spanish and English speech in California and (2) the launch of a website interface that will meet the needs of researchers and educators who wish to analyze or teach with actual samples of California English and Spanish speech. The project aims to better understand the local varieties of Spanish and English used by Spanish-English bilinguals in California.

WHY: Just a few generations ago, many people believed that exposing children to more than one language could confuse them, resulting in delays in their language development or speech disorders. Such bias against bilingualism has had detrimental effects on immigrant families, on education policy, and, more broadly, on the linguistic and cultural diversity that enriches our society. For instance, in the early 20th century, psychological tests were administered to all immigrants to the US as a yardstick to “keep out those who are manifestly undesirable” for US society (Sweeney, 1922). Presently, accented or non-native speakers of English face bias with potentially harmful consequences in terms of employment, housing, racial discrimination, and even judicial prejudice (Baugh, 2005). Ironically, in California, one of the most linguistically and culturally diverse states in the country, Proposition 227 required that all public schools use only English as the language of instruction. Fortunately, this requirement was repealed in November 2016 by the California Non-English Languages Allowed in Public Education Act (Proposition 58).

With a larger Spanish-speaking population than many countries in which Spanish is the national and official language, such as Spain and Colombia, the United States now more than ever constitutes an ideal research site for investigations concerning language contact and bilingualism (Instituto Cervantes, 2015). Nonetheless, deep-rooted English monolingualism in US mainstream culture (Fuller, 2012), in combination with the more recent anti-immigrant politics of the US, has made it very difficult for children raised in a home with a language other than English to maintain their home language. For example, in California, the state with the most Spanish speakers (U.S. Census Bureau, 2020), foundational work by Silva-Corvalán (1994) on the Spanish of Los Angeles revealed an abrupt generational shift away from Spanish in favor of English. The youngest Spanish speakers were accordingly characterized as speakers of a non-proficient, simplified, and substandard variety of Spanish influenced by English. To what degree are these outcomes consistent with present-day language trends in California? Do California bilinguals consider their Spanish to be substandard or impoverished, and how are such ideologies expressed through their use of Spanish and English?

WHAT'S NEXT:

  1. Collecting more speech samples for creating our oral speech corpus: The material for the Multilingual Hispanic Speech in California Corpus (MuHSiC) is being gathered throughout California and coordinated in the Bilingualism Research Lab with our research assistants. The recordings feature speakers of diverse ages and social profiles engaged in naturalistic interviews and conversations with Spanish-English bilingual interlocutors. The sessions are audio recorded in high-quality, uncompressed digital formats so that the corpus can be used to study a wide range of linguistic features. Speech samples cover various topics, from (im)migration stories and traditional folktales to culinary recipes and instructions for speaking Spanglish. Participants provide non-identifying information on their social characteristics and are assessed for language proficiency, dominance, attitudes, and usage. The primary data collection method is the sociolinguistic interview, consisting of voice-recorded sessions of roughly 70 minutes each. Notably, the interviews elicit speech in both Spanish and English, permitting a comprehensive analysis of Spanish speech patterns in relation to English. This crucially obviates the need for monolingual benchmark comparisons (e.g., the typical and problematic comparison between a bilingual’s Spanish and that of a monolingual Spanish speaker from a different country) and more appropriately grounds the empirical study of bilingualism within a bilingual context.

  2. Creation and maintenance of a website that will house the MuHSiC Corpus: The audio files will be edited, coded for speaker information, and annotated with suppressible linguistic transcriptions that will allow users to quickly search the corpus based on discourse topics, grammatical features, or speaker demographics. The MuHSiC corpus will be a collection of 600 recordings of Spanish-English bilinguals living in California (200 collected by UC Santa Cruz and the rest by research teams at UCLA and UC Berkeley), thus providing authentic samples of the Spanish language that US students are likely to encounter in their everyday lives. Providing these materials to language classrooms, as well as to language scholars, also serves to reinforce the legitimacy of US Spanish as a bona fide dialect of Spanish.

  3. Analysis of California bilingual speech data: We will first focus on Spanish-English bilinguals' sound systems and later on other linguistic features (e.g., vocabulary, word and sentence structures). We will critically and empirically re-examine the Spanish of California through comparison across diverse types of bilinguals to offer a more comprehensive understanding of how the Spanish of the US constitutes a fully developed, systematic, and legitimate language. In so doing, we will also show how linguistic diversity in California Spanish, as in all Spanish varieties, is the natural product of unique linguistic and extra-linguistic conditions. Moreover, our research aims to identify the sound features of Spanish and English that are (or are not) indicative of cross-linguistic influence. By linking these (non-)sites of cross-linguistic influence with speakers’ sociolinguistic backgrounds (e.g., age, gender, immigrant generation, ethnicity, amount of language use, age of language exposure), we will shed light on the dynamics of linguistic variation and bilingualism.

THE WOW: This initiative will enhance the research capacity and leadership of UCSC to address the linguistic issues of language contact, language shift, and language maintenance in the Spanish-speaking population of California by developing a robust and linguistically rich corpus of bilingual Spanish-English speech samples (Multilingual Hispanic Speech in California; MuHSiC) culled from sociolinguistic interviews and naturalistic conversations among speakers of diverse social profiles and regional origins throughout California. The audio recordings will be made available on an open website housed in the Humanities Division at UCSC, where researchers, teachers, students, and the public can access a linguistic map of California Spanish-English bilingual speech. The project has already designed a targeted research strategy that gives underrepresented undergraduate students field research experience and engages with multiple cities and counties in our state, ultimately establishing UCSC as an international leader in bilingualism research.