Kick-start your AI project with licensable machine learning datasets
Appen's labeled machine learning dataset products help you get your project started quickly.

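Appen's licensable, pre-labeled machine learning datasets
Browse our extensive catalog of more than 400 audio, image, video, and text datasets covering more than 80 languages. The pre-labeled datasets are ready to use, so you can put high-quality training data to work in your machine learning projects right away.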

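Automatic speech recognition datasets
- More than 22,000 hours of ASR data in more than 64 languages
- Audio collected via mobile phone, landline, and microphone
- Scenarios including monologue, free conversation, and two-person dialogue
- A variety of recording environments, including quiet settings, offices and homes, and in-car
- All transcribed text is annotated, and some datasets come with a pronunciation dictionary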

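Text datasets
- Pronunciation dictionaries covering 98 languages and 5.23 million entries
- Part-of-speech dictionaries covering 22 languages and 3.26 million entries
- NER (named entity recognition) datasets covering 8 languages and more than 1 million entries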

Image datasets
- OCR datasets covering 3 languages, with 12,000 images in total
- Multi-label image datasets
- Multi-pose, multi-lighting portrait photo database

TTS (Text-to-Speech) datasets
- More than 400 native speakers from more than 20 countries

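Dataset use cases
Drive your AI projects forward to make faster, lower-cost decisions. With AI, you can achieve goals such as improving customer experience, strengthening customer loyalty, and reducing costs.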

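Safe, smart driving
Driver risky-behavior recognition datasets help detect dangerous behavior and driver fatigue.
Passenger safety monitoring datasets help identify children, pets, and hazardous objects left in the vehicle.
Autonomous driving datasets help recognize road lane lines, obstacles, and parking areas.
Data solutions for autonomous driving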

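Chatbots / customer service
NLP conversational datasets help build generative AI chatbots that support smart customer service.
TTS datasets help generate speech from text and deliver natural-sounding conversion services.
Data solutions for retail and e-commerce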

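Smart finance
Financial OCR datasets help build automated contract review systems for the insurance and financial industries.
In addition, efficient and accurate OCR data services support the rollout of office automation and invoice recognition.
Data solutions for the financial industry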

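Smart home / IoT
ASR and TTS datasets support IoT home appliances that interact with consumers.
Obstacle recognition datasets help Roomba and similar appliances recognize obstacles and navigate around them.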

Catalog
Dataset Name | Product Type | Common Use Cases | Recording Device | Unit | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
135 | Text | ASR, TTS, Language Modelling | N/A | 12,000 words | Add Quote | sqi_ALB_PHON | Appen Global | Pronunciation Dictionary | Albanian | Albania | N/A | N/A | N/A | N/A | 12000 | N/A | text | Albanian (Albania) Pronunciation Dictionary | ||
136 | Text | ASR, TTS, Language Modelling | N/A | 45,000 words | Add Quote | amh_ETH_PHON | Appen Global | Pronunciation Dictionary | Amharic | Ethiopia | N/A | N/A | N/A | N/A | 45000 | N/A | text | Amharic (Ethiopia) Pronunciation Dictionary | ||
141 | Text | ASR, TTS, Language Modelling | N/A | 11,000 words | Add Quote | ara_DZA_PHON | Appen Global | Pronunciation Dictionary | Arabic | Algeria | N/A | N/A | N/A | N/A | 11000 | N/A | text | Arabic (Algeria) Pronunciation Dictionary | ||
20 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 29 hours | Add Quote | EAR_ASR001 | Appen Global | Conversational Speech | Arabic | Algeria | Low background noise (home/office) | 496 | 2 | Available on request | 11327 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For the majority of calls, both speakers (in-line/out-line) were collected and transcribed; however, for a smaller number of calls, only one half of the conversation was collected and transcribed. | Arabic (Eastern Algeria) conversational telephony | |
137 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add Quote | ara_EGY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Egypt | N/A | N/A | N/A | N/A | 40000 | N/A | text | Arabic (Egypt) Pronunciation Dictionary | ||
114 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 352 hours | Add Quote | ARE_ASR001_CN | Appen China | Scripted Speech | Arabic | Egypt | Low background noise (home/office) | 627 | 1 | 128908 | 207576 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. | Arabic (Egypt) scripted smartphone | |
139 | Text | ASR, TTS, Language Modelling | N/A | 13,000 words | Add Quote | ara_IRQ_POS | Appen Global | Part of Speech Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 13000 | N/A | text | Arabic (Iraq) Part of Speech Dictionary | ||
138 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add Quote | ara_IRQ_PHON | Appen Global | Pronunciation Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 15000 | N/A | text | Person names | Arabic (Iraq) Pronunciation Dictionary | |
140 | Text | ASR, TTS, Language Modelling | N/A | 48,000 words | Add Quote | ara_LBY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Libya | N/A | N/A | N/A | N/A | 48000 | N/A | text | Arabic (Libya) Pronunciation Dictionary | ||
65 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 12 hours | Add Quote | MSA_ASR001 | Global Phone | Scripted Speech | Arabic | Tunisia | Low background noise (home/office) | 78 | 1 | 4908 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Arabic (Modern Standard Arabic) scripted microphone | |
112 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 33 hours | Add Quote | ARY_ASR001 | Appen Global | Conversational Speech | Arabic | Morocco | Low background noise | 180 | 2 | 80544 | 23836 | 8 | alaw | Each speaker participated in 1 to 4 conversations. Speakers are identified by a unique 4-digit speaker ID which is recorded in the demographic file. Transcription is available in original script and fully reversible Romanised version with accompanying pronunciation lexicon. English translation of product transcription is available (ARY_MT001, ARY_ASRMT001). | Arabic (Morocco) conversational telephony | |
113 | Text | MT, Chatbot, Conversational AI | N/A | 80,544 utterances | Add Quote | ARY_MT001 | Appen Global | Conversational Translation | Arabic | Morocco | N/A | 180 | N/A | 80430 | 23844 | N/A | text | Corresponding audio, transcription, fully reversible romanised transcription and pronunciation lexicon data are available (ARY_ASR001, ARY_ASRMT001) | Arabic (Morocco) conversational telephony translation | |
143 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add Quote | ara_MAR_PHON | Appen Global | Pronunciation Dictionary | Arabic | Morocco | N/A | N/A | N/A | N/A | 60000 | N/A | text | Arabic (Morocco) Pronunciation Dictionary | ||
144 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add Quote | arb_N/A_PHON | Appen Global | Pronunciation Dictionary | Arabic | N/A | N/A | N/A | N/A | N/A | 40000 | N/A | text | Arabic (N/A) Pronunciation Dictionary | ||
115 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 322 hours | Add Quote | ARS_ASR001_CN | Appen China | Scripted Speech | Arabic | Saudi Arabia | Low background noise (home/office) | 227 | 1 | 104574 | 156282 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. 300-1000 prompts per speaker covering general content including education, sports, entertainment, travel, culture and technology. | Arabic (Saudi Arabia) scripted smartphone | |
146 | Text | ASR, TTS, Language Modelling | N/A | 17,000 words | Add Quote | ara_SDN_PHON | Appen Global | Pronunciation Dictionary | Arabic | Sudan | N/A | N/A | N/A | N/A | 17000 | N/A | text | Arabic (Sudan) Pronunciation Dictionary | ||
145 | Text | ASR, TTS, Language Modelling | N/A | 75,000 words | Add Quote | ara_ARE_PHON | Appen Global | Pronunciation Dictionary | Arabic | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 75000 | N/A | text | Arabic (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
120 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 170 hours | Add Quote | ARU_ASR001_CN | Appen China | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise (home/office) | 133 | 1 | 42352 | 85775 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. | Arabic (United Arab Emirates (UAE)) scripted smartphone | |
70 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 48 hours | Add Quote | OrienTel United Arab Emirates MCA (Modern Colloquial Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 880 | 1 | 43000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control. | Arabic (United Arab Emirates (UAE)) scripted telephony | |
71 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 31 hours | Add Quote | OrienTel United Arab Emirates MSA (Modern Standard Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 24500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control. | Arabic (United Arab Emirates (UAE)) scripted telephony | |
9 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 86 hours | Add Quote | CGA_ASR001 | Appen Global | Scripted Speech | Arabic | United Arab Emirates (UAE) - Saudi Arabia | Low background noise (home/office) | 150 | 4 | 42000 | 19245 | 16 | raw PCM | Fully transcribed with acoustic event tagging derived from the SpeechDAT conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. All transcriptions fully vowelized. 280 prompts per speaker including 30 Person names (first name and family name) from a set of 15, 10 single isolated digits 0-10, 8-digit sequences (randomly generated), 200 phonetically balanced sentences, 30 x 10-word phonetically balanced word strings. | Arabic (United Arab Emirates (UAE)/ Saudi Arabia) scripted microphone | |
127 | Text | NER, Content Classification, Search Engines | N/A | 20,774 sentences | Add Quote | ARB_NER001 | Appen Global | News NER | Standard Arabic | N/A | N/A | N/A | N/A | 20774 | Available on request | N/A | text | Arabic NER news text | ||
147 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add Quote | asm_IND_PHON | Appen Global | Pronunciation Dictionary | Assamese | India | N/A | N/A | N/A | N/A | 40000 | N/A | text | Assamese (India) Pronunciation Dictionary | ||
121 | Audio | Baby Monitor, Security & Other Consumer Applications | Mobile phone | 3 hours | Add Quote | CRY_ASR001 | Appen China | Human Sound | N/A | China | Low background noise (home/office) | 100 | 1 | N/A | N/A | 16 | wav | Crying sound of babies 0-3 years old, each lasting around 2 minutes. | Baby crying audio | |
4 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 31 hours | Add Quote | BAH_ASR001 | Appen Global | Conversational Speech | Indonesian | Indonesia | Low background noise | 1002 | 2 | 30695 | 11480 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For a large proportion of calls, only one half of the conversation was collected and transcribed. | Bahasa Indonesia conversational telephony | |
150 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | eus_ESP_PHON | Appen Global | Pronunciation Dictionary | Basque | Spain | N/A | N/A | N/A | N/A | 10000 | N/A | text | Basque (Spain) Pronunciation Dictionary | ||
6 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 47 hours | Add Quote | BEN_ASR001 | Appen Global | Conversational Speech | Bengali | Bangladesh | Mixed (in-car, roadside, home/office) | 1000 | 2 | 108923 | 17922 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Bengali (Bangladesh) conversational telephony | |
151 | Text | ASR, TTS, Language Modelling | N/A | 29,000 words | Add Quote | ben_IND_PHON | Appen Global | Pronunciation Dictionary | Bengali | India | N/A | N/A | N/A | N/A | 29000 | N/A | text | Bengali (India) Pronunciation Dictionary | ||
7 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 38 hours | Add Quote | BUL_ASR001 | Appen Global | Conversational Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 217 | 2 | 86453 | 22342 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. | Bulgarian (Bulgaria) conversational telephony | |
152 | Text | ASR, TTS, Language Modelling | N/A | 55,000 words | Add Quote | bul_BGR_PHON | Appen Global | Pronunciation Dictionary | Bulgarian | Bulgaria | N/A | N/A | N/A | N/A | 55000 | N/A | text | Bulgarian (Bulgaria) Pronunciation Dictionary | ||
111 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 22 hours | Add Quote | BUL_ASR002 | Global Phone | Scripted Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 77 | 1 | 8674 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Bulgarian (Bulgaria) scripted microphone | |
268 | Image | Document Processing, Document Search | Camera, scan | 5,832 documents | Add Quote | IMG_OCR_B2B | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-business documents containing printed text. 38% Premium Quality images including Purchase Order, Payment Advice or Remittance Advice, Order Confirmation and Delivery note; 64% Standard Quality images in various challenging conditions in a wider range of categories including Complaints or Return, Delivery advice, Delivery note, Dunning, Goods receipt, Invoice, Offer, Order confirmation, Pay slip, Payment Advice or Remittance Advice, Purchase Order, Receipt, and Supplier load | Business-to-business printed text document OCR | |
269 | Image | Document Processing, Document Search | Camera, scan | 22,626 documents | Add Quote | IMG_OCR_B2C_Other | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-consumer and miscellaneous other category documents containing text: 37% invoices, 42% receipts, 1% documents with tables, 2% handwritten forms and documents, 2% menus, 11% product labels, 2% posters, 3% street signs. 6 Languages collected in 23+ locales: 11% Arabic, 43% English, 4% French, 4% German, 24% Spanish, 14% Russian | Business-to-consumer/other text document OCR | |
155 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | yue_HKG_POS | Appen Global | Part of Speech Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 10000 | N/A | text | Traditional | Cantonese (China) Part of Speech Dictionary | |
153 | Text | ASR, TTS, Language Modelling | N/A | 37,000 words | Add Quote | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 37000 | N/A | text | Simplified | Cantonese (China) Pronunciation Dictionary | |
154 | Text | ASR, TTS, Language Modelling | N/A | 40,000 words | Add Quote | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 40000 | N/A | text | Traditional | Cantonese (China) Pronunciation Dictionary | |
156 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | cat_ESP_PHON | Appen Global | Pronunciation Dictionary | Catalan | Spain | N/A | N/A | N/A | N/A | 10000 | N/A | text | Catalan (Spain) Pronunciation Dictionary | ||
157 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add Quote | ceb_PHL_PHON | Appen Global | Pronunciation Dictionary | Cebuano | Philippines | N/A | N/A | N/A | N/A | 20000 | N/A | text | Cebuano (Philippines) Pronunciation Dictionary | ||
265 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 200 hours | Add Quote | FOREIGNER_ASR001_CN | Appen China | Scripted Speech | Mandarin Chinese | China | Low background noise | 309 | 1 | | | 16 | wav | This database contains 200 hours of foreigners speaking Chinese from the following countries: Argentina, Egypt, Australia, Russia, the Philippines, Kazakhstan, Korea, Kyrgyzstan, Canada, Kuala Lumpur, Kenya, Laos, Malaysia, Mauritius, the United States, Mongolia, South Africa, Japan, Tajikistan, Thailand, Turkey, Hong Kong, Singapore, India, Indonesia, Vietnam. There is no data from South Korea, Brazil, or data recorded by minors. Each session lasts about an hour; sentence duration ranges between 3-10 seconds. The content is in the form of an individual reading while being recorded on a mobile phone in a home/office environment. Sensitive data and personal information has been scrubbed. | Chinese (multinational foreigner) scripted smartphone | |
10 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 39 hours | Add Quote | CRO_ASR001 | Appen Global | Conversational Speech | Croatian | Croatia | Low background noise (home/office) | 200 | 2 | Available on request | 23919 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. | Croatian (Croatia) conversational telephony | |
158 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add Quote | hrv_HRV_PHON | Appen Global | Pronunciation Dictionary | Croatian | Croatia | N/A | N/A | N/A | N/A | 20000 | N/A | text | Croatian (Croatia) Pronunciation Dictionary | ||
11 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 11 hours | Add Quote | CRO_ASR002 | Global Phone | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 94 | 1 | 4499 | 23929 | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Croatian (Croatia) scripted microphone | |
116 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 263 hours | Add Quote | CRO_ASR003_CN | Appen China | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 243 | 1 | 73467 | 136140 | 16 | wav | Dataset contains audio with corresponding text prompts | Croatian (Croatia) scripted smartphone | |
159 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add Quote | ces_CZE_PHON | Appen Global | Pronunciation Dictionary | Czech | Czech Republic | N/A | N/A | N/A | N/A | 50000 | N/A | text | Czech (Czech Republic) Pronunciation Dictionary | ||
12 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 31 hours | Add Quote | CZE_ASR001 | Global Phone | Scripted Speech | Czech | Czech Republic | Low background noise (home/office) | 102 | 1 | 12425 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Czech (Czech Republic) scripted microphone | |
13 | Audio | ASR, Virtual Assistant | Landline only | 93 hours | Add Quote | Czech SpeechDat(E) Dataset | Nuance | Scripted Speech | Czech | Czech Republic | Low background noise | 1000 | 1 | 52000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich words and sentences. | Czech (Czech Republic) scripted telephony | |
161 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add Quote | dan_DNK_POS | Appen Global | Part of Speech Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 100000 | N/A | text | Danish (Denmark) Part of Speech Dictionary | ||
160 | Text | ASR, TTS, Language Modelling | N/A | 107,000 words | Add Quote | dan_DNK_PHON | Appen Global | Pronunciation Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 107000 | N/A | text | Danish (Denmark) Pronunciation Dictionary | ||
90 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add Quote | Speecon Danish | Nuance | Scripted Speech | Danish | Denmark | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers. | Danish (Denmark) scripted microphone | |
15 | Audio | ASR, Automatic Captioning, Keyword Spotting | Microphone | 51 hours | Add Quote | DAR_BRC001 | Appen Global | Broadcast Speech | Dari | Afghanistan | Low background noise (studio) | N/A | 1 | Available on request | Available on request | N/A | wav | Dataset is fully transcribed and timestamped. Pronunciation lexicon not currently available but can be developed upon request. Dataset is largely speech only and does not include music or advertisements. Data types include: talk shows, interviews, news broadcasts (excluding news reading by anchors). | Dari (Afghanistan) broadcast | |
14 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 40 hours | Add Quote | DAR_ASR001 | Appen Global | Conversational Speech | Dari | Afghanistan | Low background noise | 500 | 2 | Available on request | 11168 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Dataset is largely speech only and does not include music or advertisements. | Dari (Afghanistan) conversational telephony | |
162 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add Quote | prs_AFG_PHON | Appen Global | Pronunciation Dictionary | Dari | Afghanistan | N/A | N/A | N/A | N/A | 30000 | N/A | text | Dari (Afghanistan) Pronunciation Dictionary | ||
163 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add Quote | luo_KEN_PHON | Appen Global | Pronunciation Dictionary | Dholuo | Kenya | N/A | N/A | N/A | N/A | 20000 | N/A | text | Dholuo (Kenya) Pronunciation Dictionary | ||
258 | Audio | ASR, Conversational AI, Speech Analytics | Recording pen/microphone | 84.6 hours | Add Quote | DONGBEI_ASR001_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 268 | 1 | | | 16 | wav | Audio only; transcription not included. Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information has been scrubbed. | Dongbei dialect (China) Conversational Speech | |
259 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 75.2 hours | Add Quote | DONGBEI_ASR002_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 185 | 1 | | | 8 | wav | Audio only; transcription not included. Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information has been scrubbed. | Dongbei dialect (China) Conversational Speech | |
91 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 47 hours | Add Quote | Speecon Dutch from Belgium | Nuance | Scripted Speech | Dutch | Belgium | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers. | Dutch (Belgium) scripted microphone | |
33 | Audio | ASR, Virtual Assistant | Microphone | 80 hours | Add Quote | Flemish SpeechDat(II) FDB-1000 (FIXED1FL) | Nuance | Scripted Speech | Dutch | Belgium | Low background noise | 1000 | 1 | 52000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control. | Dutch (Belgium) scripted telephony | |
19 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone and mobile phone | 27 hours | Add Quote | Dutch and Flemish SpeechDat-Car | Nuance | Scripted Speech | Dutch | Netherlands - Belgium | Mixed (in-car) | 302 | 5 | 15100 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report. 125 prompts per adult speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences and prompts for spontaneous speech. | Dutch (Netherlands & Belgium) scripted in-car | |
66 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 36 hours | Add Quote | NLD_ASR001 | Appen Global | Conversational Speech | Dutch | Netherlands | Low background noise | 200 | 2 | Available on request | 14964 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. | Dutch (Netherlands) conversational telephony | |
164 | Text | ASR, TTS, Language Modelling | N/A | 45,000 words | Add Quote | nld_NLD_PHON | Appen Global | Pronunciation Dictionary | Dutch | Netherlands | N/A | N/A | N/A | N/A | 45000 | N/A | text | Dutch (Netherlands) Pronunciation Dictionary | ||
92 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 68 hours | Add Quote | Speecon Dutch from the Netherlands | Nuance | Scripted Speech | Dutch | Netherlands | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers. | Dutch (Netherlands) scripted microphone | |
122 | Image | Facial Recognition | Camera | 14948 images | Add Quote | IMG_FACE_KEN_CN | Appen China | Human Face | N/A | Kenya | Mixed background and lighting conditions | 99 | N/A | N/A | N/A | N/A | jpg | Images contain all combinations of 9 different lighting conditions, 2 different distances between participant's face and smartphone, 7 different camera angles. A random 32 images per person include occlusions such as sunglasses, masks, wigs or hats. A random 36 shots include different facial expressions including stare, open mouth, pout mouth, smile and frown. Lighting conditions: indoor normal light, outdoor normal light, indoor backlight, outdoor backlight, indoor ordinary dark light, full black screen fill light, point light source (white light, street light), neon light, side glare. Camera angles: front, left 45°, right 45°, left 15°, right 15°, top 30°, bottom 30°. | East African facial images | |
21 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 28 hours | Add Quote | ENA_ASR001 | Appen Global | Conversational Speech | English | Egypt | Low background noise | 250 | 2 | Available on request | 5619 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Average length of calls: 10-15 mins. | English (Arabic - Levant/Egypt) conversational telephony | |
166 | Text | ASR, TTS, Language Modelling | N/A | 157,000 words | Add Quote | eng_AUS_PHON | Appen Global | Pronunciation Dictionary | English | Australia | N/A | N/A | N/A | N/A | 157000 | N/A | text | English (Australia) Pronunciation Dictionary | ||
2 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 92 hours | Add Quote | AUS_ASR001 | Appen Global | Scripted Speech | English | Australia | Low background noise (home/office) | 500 | 1 | 82500 | 35137 | 8 | alaw | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 162 prompts (read speech) per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items (from a set of 215), phonetically rich sentences and words. | English (Australia) scripted telephony | |
3 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 118 hours | Add Quote | AUS_ASR002 | Appen Global | Scripted Speech | English | Australia | Mixed | 1000 | 1 | 75000 | 18952 | 8 | alaw | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 75 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words. The prompts are a mixture of 'read' and 'elicited' items where 5 prompts per script are 'spontaneous free speech'. | English (Australia) scripted telephony | |
168 | Text | ASR, TTS, Language Modelling | N/A | 3,000 words | Add Quote | eng_CAN_POS | Appen Global | Part of Speech Dictionary | English | Canada | N/A | N/A | N/A | N/A | 3000 | N/A | text | English (Canada) Part of Speech Dictionary | ||
167 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add Quote | eng_CAN_PHON | Appen Global | Pronunciation Dictionary | English | Canada | N/A | N/A | N/A | N/A | 50000 | N/A | text | English (Canada) Pronunciation Dictionary | ||
22 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 144 hours | Add Quote | ENC_ASR001 | Appen Global | Scripted Speech | English | Canada | Mixed | 1000 | 1 | 99000 | 12483 | 8 | alaw or wav | Fully transcribed to SALA II/SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 99 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
English (Canada) scripted telephony | |
170 | Text | ASR, TTS, Language Modelling | N/A | 18,000 words | Add Quote | eng_HKG_PHON | Appen Global | Pronunciation Dictionary | English | Hong Kong | N/A | N/A | N/A | N/A | 18000 | N/A | text | English (Hong Kong) Pronunciation Dictionary | ||
271 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 143 hours | Add Quote | ENI_ASR003 | Appen Global | Conversational Speech | English | India | Mixed (home, car, public place, outdoor) | 272 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
English (India) conversational smartphone | |
25 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 67 hours | Add Quote | ENI_ASR002 | Appen Global | Conversational Speech | English | India | Low background noise | 540 | 2 | 77565 | 11646 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 271 telephony conversations are recorded for this project |
English (India) conversational telephony | |
172 | Text | ASR, TTS, Language Modelling | N/A | 13,000 words | Add Quote | eng_IND_POS | Appen Global | Part of Speech Dictionary | English | India | N/A | N/A | N/A | N/A | 13000 | N/A | text | English (India) Part of Speech Dictionary | ||
171 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add Quote | eng_IND_PHON | Appen Global | Pronunciation Dictionary | English | India | N/A | N/A | N/A | N/A | 60000 | N/A | text | English (India) Pronunciation Dictionary | ||
24 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 217 hours | Add Quote | ENI_ASR001 | Appen Global | Scripted Speech | English | India | Mixed | 2358 | 1 | 115541 | 9190 | 8 | alaw or wav | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 49 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
English (India) scripted telephony | |
173 | Text | ASR, TTS, Language Modelling | N/A | 12,000 words | Add Quote | eng_IRL_PHON | Appen Global | Pronunciation Dictionary | English | Ireland | N/A | N/A | N/A | N/A | 12000 | N/A | text | English (Ireland) Pronunciation Dictionary | ||
174 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add Quote | eng_NZL_PHON | Appen Global | Pronunciation Dictionary | English | NZ | N/A | N/A | N/A | N/A | 50000 | N/A | text | English (NZ) Pronunciation Dictionary | ||
23 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 53 hours | Add Quote | ENF_ASR001 | Appen Global | Conversational Speech | English | Philippines | Low background noise | 450 | 2 | 41602 | 7272 | 8 | alaw or wav | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins |
English (Philippines) conversational telephony | |
169 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add Quote | eng_PHL_PHON | Appen Global | Pronunciation Dictionary | English | Philippines | N/A | N/A | N/A | N/A | 5000 | N/A | text | English (Philippines) Pronunciation Dictionary | ||
165 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add Quote | eng_ARE_PHON | Appen Global | Pronunciation Dictionary | English | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 5000 | N/A | text | English (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
67 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 33 hours | Add Quote | OrienTel English as spoken in the United Arab Emirates | Nuance | Scripted Speech | English | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 25500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 51 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control |
English (United Arab Emirates (UAE)) scripted telephony | |
104 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 150 hours | Add Quote | UKE_ASR001 | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1175 | 2 | 298562 | 24193 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words |
English (United Kingdom) conversational telephony | |
255 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 50 hours | Add Quote | UKE_ASR001B | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1150 | 2 | Available on request | 13192 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words |
English (United Kingdom) conversational telephony | |
176 | Text | ASR, TTS, Language Modelling | N/A | 155,000 words | Add Quote | eng_GBR_POS | Appen Global | Part of Speech Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 155000 | N/A | text | English (United Kingdom) Part of Speech Dictionary | ||
175 | Text | ASR, TTS, Language Modelling | N/A | 195,000 words | Add Quote | eng_GBR_PHON | Appen Global | Pronunciation Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 195000 | N/A | text | English (United Kingdom) Pronunciation Dictionary | ||
99 | Audio | TTS | Headset microphone | 11 hours | Add Quote | TC-STAR female baseline voice Laura | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription |
English (United Kingdom) scripted microphone - single female | |
100 | Audio | TTS | Headset microphone | 7 hours | Add Quote | TC-STAR male baseline voice Ian | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription |
English (United Kingdom) scripted microphone - single male | |
272 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 50 hours | Add Quote | USE_ASR004 | Appen Global | Conversational Speech | English | United States | Mixed (home, car, public place, outdoor) | 94 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
English (United States - African American) conversational smartphone | |
266 | Text | Virtual Assistant, Chatbot | N/A | 952,677 messages | Add Quote | ENG_SMS001 | Appen Global | SMS text messages | English | United States | N/A | Available on request | N/A | 952677 | Available on request | N/A | text | This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
267 | Text | Virtual Assistant, Chatbot | N/A | 106,649 messages | Add Quote | ENG_SMS001A | Appen Global | SMS text messages | English | United States | N/A | 390 | N/A | 106649 | Available on request | N/A | text | This is a subset of ENG_SMS001. This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
270 | Text | Virtual Assistant, Chatbot | N/A | 351,826 messages | Add Quote | ENG_SMS002 | Appen Global | WhatsApp text messages | English | United States | N/A | Available on request | N/A | 351826 | Available on request | N/A | text | This dataset contains threaded text message conversations between 2 participants, using WhatsApp. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation WhatsApp - Threaded | |
107 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 1000 hours | Add Quote | USE_ASR003 | Appen Global | Conversational Speech | English | United States | Low background noise | 2000 | 1 | 500000 | 52586 | 16 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Conversations cover a wide variety of topics including: study/major/work, hometown, living arrangements, weather and seasons, punctuality, TV programs/film |
English (United States) conversational smartphone | |
178 | Text | ASR, TTS, Language Modelling | N/A | 263,000 words | Add Quote | eng_USA_POS | Appen Global | Part of Speech Dictionary | English | United States | N/A | N/A | N/A | N/A | 263000 | N/A | text | English (United States) Part of Speech Dictionary | ||
177 | Text | ASR, TTS, Language Modelling | N/A | 330,000 words | Add Quote | eng_USA_PHON | Appen Global | Pronunciation Dictionary | English | United States | N/A | N/A | N/A | N/A | 330000 | N/A | text | English (United States) Pronunciation Dictionary | ||
93 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add Quote | Speecon English (USA) database | Nuance | Scripted Speech | English | United States | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers |
English (United States) scripted microphone | |
106 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 62 hours | Add Quote | USE_ASR001 | Appen Global | Scripted Speech | English | United States | Low background noise (studio) | 200 | 2 | 80000 | 18318 | 48 | raw PCM or wav PCM | Dataset is fully transcribed and timestamped Dataset is formatted according to SALA II/SpeechDAT style conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words Each speaker read 400 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words |
English (United States) scripted microphone | |
128 | Text | NER, Content Classification, Search Engines | N/A | 22,768 sentences | Add Quote | ENG_NER001 | Appen Global | News NER | English | N/A | N/A | N/A | N/A | 22768 | Available on request | N/A | text | English NER news text | ||
132 | Text | NER, Content Classification, Search Engines | N/A | 19,584 sentences | Add Quote | FAR_NER001 | Appen Global | News NER | Iranian Persian | Iran | N/A | N/A | N/A | 19584 | Available on request | N/A | text | Farsi/Persian NER news text | ||
182 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | fin_FIN_POS | Appen Global | Part of Speech Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 10000 | N/A | text | Finnish (Finland) Part of Speech Dictionary | ||
125 | Image | Document Processing, Document Search | Camera | 7293 images | Add Quote | IMG_OCR_FIN_CN | Appen China | Document OCR | Finnish | Finland | Mixed lighting conditions | 4 | N/A | N/A | N/A | N/A | jpg | Images containing text, such as billboards / outer packaging / signage / magazines / menus, etc. | Finnish (Finland) printed text OCR | |
181 | Text | ASR, TTS, Language Modelling | N/A | 85,000 words | Add Quote | fin_FIN_PHON | Appen Global | Pronunciation Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 85000 | N/A | text | Finnish (Finland) Pronunciation Dictionary | ||
142 | Text | ASR, TTS, Language Modelling | N/A | 4,000 words | Add Quote | fra_DZA_PHON | Appen Global | Pronunciation Dictionary | French | Algeria | N/A | N/A | N/A | N/A | 4000 | N/A | text | Arabic script | French (Algeria) Pronunciation Dictionary | |
5 | Audio | ASR, Virtual Assistant | Landline only | 76 hours | Add Quote | Belgian French SpeechDat(II) FDB-1000 (FIXED1BF) | Nuance | Scripted Speech | French | Belgium | Low background noise | 1000 | 1 | 53000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control |
French (Belgium) scripted telephony | |
36 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 9 hours | Add Quote | FRC_ASR003 | Appen Global | Conversational Speech | French | Canada | Mixed | 68 | 2 | Available on request | 6022 | 8 | alaw | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins For the majority of calls, only one half of the conversation was collected and transcribed, however, for a smaller number of calls, both speakers (in-line/out-line) were collected and transcribed |
French (Canada) conversational telephony | |
183 | Text | ASR, TTS, Language Modelling | N/A | 67,000 words | Add Quote | fra_CAN_PHON | Appen Global | Pronunciation Dictionary | French | Canada | N/A | N/A | N/A | N/A | 67000 | N/A | text | French (Canada) Pronunciation Dictionary | ||
35 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 46 hours | Add Quote | FRC_ASR002 | Appen Global | Scripted Speech | French | Canada | Low background noise (home/office) | 150 | 1 | 22500 | 10755 | 16 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 150 prompts per speaker including digits, digit strings (randomly generated), addresses and phonetically rich sentences and words |
French (Canada) scripted microphone | |
34 | Audio | ASR, Virtual Assistant | Mobile phone | 131 hours | Add Quote | FRC_ASR001 | Appen Global | Scripted Speech | French | Canada | Mixed | 1000 | 1 | 100000 | 11697 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 100 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
French (Canada) scripted telephony | |
275 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 159 hours | Add Quote | FRF_ASR004 | Appen Global | Conversational Speech | French | France | Mixed (home, car, public place, outdoor) | 298 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
French (France) conversational smartphone | |
40 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 25 hours | Add Quote | FRF_ASR001 | Appen Global | Conversational Speech | French | France | Low background noise | 563 | 2 | Available on request | 11922 | 8 | alaw | Dataset is fully transcribed and time stamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words For the majority of calls, both speakers (in-line/out-line) were collected and transcribed, however, for a smaller number of calls, only one half of the conversation was collected and transcribed |
French (France) conversational telephony | |
39 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone and mobile phone | 113 hours | Add Quote | French SpeechDat-Car | Nuance | Scripted Speech | French | France | Mixed (in-car) | 300 | 5 | 37500 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report Approximately 125 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences and prompts for spontaneous speech 113.7 hours |
French (France) In-Car | |
185 | Text | ASR, TTS, Language Modelling | N/A | 95,000 words | Add Quote | fra_FRA_POS | Appen Global | Part of Speech Dictionary | French | France | N/A | N/A | N/A | N/A | 95000 | N/A | text | French (France) Part of Speech Dictionary | ||
184 | Text | ASR, TTS, Language Modelling | N/A | 112,000 words | Add Quote | fra_FRA_PHON | Appen Global | Pronunciation Dictionary | French | France | N/A | N/A | N/A | N/A | 112000 | N/A | text | French (France) Pronunciation Dictionary | ||
41 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 26 hours | Add Quote | FRF_ASR003 | Global Phone | Scripted Speech | French | France | Low background noise (home/office) | 98 | 1 | 10273 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) |
French (France) scripted microphone | |
37 | Audio | ASR, Virtual Assistant | Landline only | 41 hours | Add Quote | French SpeechDat(II) FDB-1000 | Nuance | Scripted Speech | French | France | Low background noise (home/office) | 1017 | 1 | 48000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
French (France) scripted telephony | |
38 | Audio | ASR, Virtual Assistant | Landline only | 305 hours | Add Quote | French SpeechDat(II) FDB-5000 | Nuance | Scripted Speech | French | France | Low background noise | 5040 | 1 | 237000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 47 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
French (France) scripted telephony | |
60 | Audio | ASR, Virtual Assistant | Landline only | 45 hours | Add Quote | Luxembourgish French SpeechDat(II) FDB-500 (FIXED1LF) | Nuance | Scripted Speech | French | Luxembourg | Low background noise | 614 | 1 | 32000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
French (Luxembourg) telephony | |
273 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 104 hours | Add Quote | DEU_ASR004 | Appen Global | Conversational Speech | German | Germany | Mixed (home, car, public place, outdoor) | 198 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
German (Germany) conversational smartphone | |
186 | Text | ASR, TTS, Language Modelling | N/A | 146,000 words | Add Quote | deu_DEU_PHON | Appen Global | Pronunciation Dictionary | German | Germany | N/A | N/A | N/A | N/A | 146000 | N/A | text | German (Germany) Pronunciation Dictionary | ||
16 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 16 hours | Add Quote | DEU_ASR001 | Appen Global | Scripted Speech | German | Germany | Low background noise (studio) | 127 | 2 | 12700 | 6826 | 48 | raw PCM | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Each speaker read 100 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words |
German (Germany) scripted microphone | |
18 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 25 hours | Add Quote | DEU_ASR003 | Global Phone | Scripted Speech | German | Germany | Low background noise (home/office) | 77 | 1 | 10085 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) |
German (Germany) scripted microphone | |
42 | Audio | ASR, Virtual Assistant | Landline only | 31 hours | Add Quote | German SpeechDat (II) FDB-1000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 988 | 1 | 43000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 44 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
German (Germany) telephony | |
43 | Audio | ASR, Virtual Assistant | Landline only | 268 hours | Add Quote | German SpeechDat(II) FDB-4000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 4000 | 1 | 160000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
German (Germany) telephony | |
61 | Audio | ASR, Virtual Assistant | Landline only | 33 hours | Add Quote | Luxembourgish German SpeechDat(II) FDB-500 (FIXED1LG) | Nuance | Scripted Speech | German | Luxembourg | Low background noise | 500 | 1 | 26500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
German (Luxembourg) telephony | |
187 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add Quote | deu_CHE_PHON | Appen Global | Pronunciation Dictionary | German | Switzerland | N/A | N/A | N/A | N/A | 15000 | N/A | text | German (Switzerland) Pronunciation Dictionary | ||
94 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 53 hours | Add Quote | Speecon German (Switzerland) database | Nuance | Scripted Speech | German | Switzerland | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers |
German (Switzerland) scripted microphone | |
68 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 31 hours | Add Quote | OrienTel German Spoken by Turkish | Nuance | Scripted Speech | German | Turkey | Low background noise | 300 | 1 | 15600 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words |
German (Turkey) telephony | |
188 | Text | ASR, TTS, Language Modelling | N/A | 5,000 words | Add Quote | ell_GRC_PHON | Appen Global | Pronunciation Dictionary | Greek | Greece | N/A | N/A | N/A | N/A | 5000 | N/A | text | Greek (Greece) Pronunciation Dictionary | ||
117 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 191 hours | Add Quote | GRE_ASR001_CN | Appen China | Scripted Speech | Greek | Greece | Low background noise (home/office) | 287 | 1 | 54113 | 68271 | 16 | wav | Dataset contains audio with corresponding text prompts | Greek (Greece) scripted smartphone | |
189 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add Quote | grn_PRY_PHON | Appen Global | Pronunciation Dictionary | Guarani | Paraguay | N/A | N/A | N/A | N/A | 35000 | N/A | text | Guarani (Paraguay) Pronunciation Dictionary | ||
190 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add Quote | hat_HTI_PHON | Appen Global | Pronunciation Dictionary | Haitian Creole | Haiti | N/A | N/A | N/A | N/A | 15000 | N/A | text | Haitian Creole (Haiti) Pronunciation Dictionary | ||
277 | Image | Document Processing, Document Search | Camera, scan | 964 images | Add Quote | IMG_OCR_Handwritten | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | This is a subset of IMG_OCR_B2C_Other. Scans and photographs of handwritten forms and handwritten documents. 6 Languages collected in 23+ locales: 8% Arabic, 41% English, 7% French, 2% German, 20% Russian, 22% Spanish | Handwritten text document OCR | |
45 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 33 hours | Add Quote | HAU_ASR002 | Appen Global | Conversational Speech | Hausa | Nigeria | Low background noise | 200 | 2 | Available on request | 7949 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers |
Hausa (Nigeria) conversational telephony | |
191 | Text | ASR, TTS, Language Modelling | N/A | 11,000 words | Add Quote | hau_NGA_PHON | Appen Global | Pronunciation Dictionary | Hausa | Nigeria | N/A | N/A | N/A | N/A | 11000 | N/A | text | Hausa (Nigeria) Pronunciation Dictionary | ||
44 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 20 hours | Add Quote | HAU_ASR001 | Global Phone | Scripted Speech | Hausa | Cameroon | Low background noise (home/office) | 103 | 1 | 7895 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary Developed in collaboration with the Karlsruhe Institute of Technology (KIT) |
Hausa scripted microphone | |
46 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 34 hours | Add Quote | HEB_ASR001 | Appen Global | Conversational Speech | Hebrew | Israel | Low background noise | 200 | 2 | Available on request | 19250 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. | Hebrew (Israel) conversational telephony | |
192 | Text | ASR, TTS, Language Modelling | N/A | 31,000 words | Add Quote | heb_ISR_PHON | Appen Global | Pronunciation Dictionary | Hebrew | Israel | N/A | N/A | N/A | N/A | 31000 | N/A | text | Hebrew (Israel) Pronunciation Dictionary | ||
48 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 32 hours | Add Quote | HIN_ASR002 | Appen Global | Conversational Speech | Hindi | India | Mixed | 996 | 2 | Available on request | 12266 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For the majority of calls, both speakers (in-line/out-line) were collected and transcribed; for a smaller number of calls, only one half of the conversation was collected and transcribed. | Hindi (India) conversational telephony | |
193 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add Quote | hin_IND_PHON | Appen Global | Pronunciation Dictionary | Hindi | India | N/A | N/A | N/A | N/A | 35000 | N/A | text | Hindi (India) Pronunciation Dictionary | ||
47 | Audio | ASR, Virtual Assistant | Mobile phone | 224 hours | Add Quote | HIN_ASR001 | Appen Global | Scripted Speech | Hindi | India | Low background noise | 1920 | 1 | 96000 | 9853 | 8 | alaw | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words. 50 prompts per speaker including digits, natural numbers, personal, business and place names, web addresses, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words. | Hindi (India) scripted telephony | |
126 | Video | Fitness Applications, Action Classification, Gesture Recognition | Mobile phone | 2000 videos | Add Quote | VED_HUMAN_BODY_CN | Appen China | Human Body | N/A | China | Mixed background and lighting conditions | 1000 | N/A | N/A | N/A | N/A | mp4 | Video clips are approximately 10-20 seconds long | Human body movement | |
194 | Text | ASR, TTS, Language Modelling | N/A | 500 words | Add Quote | hun_HUN_PHON | Appen Global | Pronunciation Dictionary | Hungarian | Hungary | N/A | N/A | N/A | N/A | 500 | N/A | text | Hungarian (Hungary) Pronunciation Dictionary | ||
118 | Audio | ASR, Virtual Assistant, Chatbot | Mobile phone | 286 hours | Add Quote | HUN_ASR001_CN | Appen China | Scripted Speech | Hungarian | Hungary | Low background noise (home/office) | 254 | 1 | 94031 | 201921 | 16 | wav | Dataset contains audio with corresponding text prompts | Hungarian (Hungary) scripted smartphone | |
49 | Audio | ASR, Virtual Assistant | Landline only | 65 hours | Add Quote | Hungarian SpeechDat(E) | Nuance | Scripted Speech | Hungarian | Hungary | Low background noise | 1000 | 1 | 48000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. | Hungarian (Hungary) scripted telephony | |
195 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add Quote | ibo_NGA_PHON | Appen Global | Pronunciation Dictionary | Igbo | Nigeria | N/A | N/A | N/A | N/A | 30000 | N/A | text | Igbo (Nigeria) Pronunciation Dictionary | ||
149 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | ind_IDN_POS | Appen Global | Part of Speech Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 10000 | N/A | text | Indonesian (Indonesia) Part of Speech Dictionary | ||
148 | Text | ASR, TTS, Language Modelling | N/A | 95,000 words | Add Quote | ind_IDN_PHON | Appen Global | Pronunciation Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 95000 | N/A | text | Indonesian (Indonesia) Pronunciation Dictionary | ||
262 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 100 hours | Add Quote | NMG_ASR001_CN | Appen China | Conversational Speech | Inner Mongolian | China | Low background noise | 200 | 1 | | | 16 | wav | Audio only; transcription not included. Audio recordings cover the following areas: Xilingol League, Tongliao, Hohhot. Each recording session contains about 30 minutes of free dialogue between 2 people. | Inner Mongolian (China) Conversational Speech | |
32 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 30 hours | Add Quote | FAR_ASR002 | Appen Global | Conversational Speech | Iranian Persian (Farsi) | Iran | Mixed | 1000 | 2 | Available on request | 12358 | 8 | wav | Dataset is fully transcribed and time stamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Iranian Persian (Farsi) (Iran) conversational telephony | |
31 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 85 hours | Add Quote | FAR_ASR001 | Appen Global | Scripted Speech | Iranian Persian (Farsi) | Iran | Mixed | 789 | 1 | 38400 | 8716 | 8 | alaw | Fully transcribed to OrienTel type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words. 48 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words. | Iranian Persian (Farsi) (Iran) scripted telephony | |
180 | Text | ASR, TTS, Language Modelling | N/A | 1,400,000 words | Add Quote | pes_IRN_POS | Appen Global | Part of Speech Dictionary | Iranian Persian | Iran | N/A | N/A | N/A | N/A | 1400000 | N/A | text | Iranian Persian (Iran) Part of Speech Dictionary | ||
179 | Text | ASR, TTS, Language Modelling | N/A | 80,000 words | Add Quote | pes_IRN_PHON | Appen Global | Pronunciation Dictionary | Iranian Persian | Iran | N/A | N/A | N/A | N/A | 80000 | N/A | text | Iranian Persian (Iran) Pronunciation Dictionary | ||
276 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone | 256 hours | Add Quote | ITA_ASR005 | Appen Global | Conversational Speech | Italian | Italy | Mixed (home, car, public place, outdoor) | 482 | 1 | Available on request | Available on request | 48 | wav | Two-person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request. | Italian (Italy) conversational smartphone | |
52 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 36 hours | Add Quote | ITA_ASR003 | Appen Global | Conversational Speech | Italian | Italy | Low background noise | 200 | 2 | Available on request | 18974 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. | Italian (Italy) conversational telephony | |
197 | Text | ASR, TTS, Language Modelling | N/A | 147,000 words | Add Quote | ita_ITA_POS | Appen Global | Part of Speech Dictionary | Italian | Italy | N/A | N/A | N/A | N/A | 147000 | N/A | text | Italian (Italy) Part of Speech Dictionary | ||
196 | Text | ASR, TTS, Language Modelling | N/A | 197,000 words | Add Quote | ita_ITA_PHON | Appen Global | Pronunciation Dictionary | Italian | Italy | N/A | N/A | N/A | N/A | 197000 | N/A | text | Italian (Italy) Pronunciation Dictionary | ||
50 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 44 hours | Add Quote | ITA_ASR001 | Appen Global | Scripted Speech | Italian | Italy | Mixed | 200 | 4 | 40000 | 7316 | 22 | raw PCM | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 prompts per speaker including 100 command and control type items and 100 phonetically rich sentences. | Italian (Italy) scripted microphone | |
53 | Audio | TTS | Microphone | 3 hours | Add Quote | ITA_TTS001 | Appen Global | Scripted Speech | Italian | Italy | Low background noise (studio) | 1 | 1 | 3300 | Available on request | 22 | raw PCM | Dataset is accompanied by a pronunciation lexicon containing all words spoken in the dataset. 3,300 prompts per speaker including phonetically rich sentences. | Italian (Italy) scripted microphone | |
51 | Audio | ASR, Virtual Assistant, In Car HMI & Entertainment | Microphone | 47 hours | Add Quote | ITA_ASR002 | Appen Global | Scripted Speech | Italian | Italy | Mixed (in-car) | 205 | 4 | 35875 | 10366 | 48 | raw PCM | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 350 prompts per speaker including digits, street names, generic command and control items, phonetically rich sentences and words. Each speaker recorded 1 or 2 sessions, including Session 1 in a parked vehicle with the engine running and Session 2 in a vehicle travelling at 60 mph (100 km/h). | Italian (Italy) scripted microphone in-car | |
54 | Audio | ASR, Virtual Assistant | Landline only | 38 hours | Add Quote | Italian Fixed Network Speech SpeechDat(M) Corpus | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 1000 | 1 | 39000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 39 prompts per speaker including isolated and connected digits, natural numbers, money amounts, spelled words, time and date phrases, yes/no questions, city names, common application words, application words in phrases and phonetically rich sentences. | Italian (Italy) telephony | |
55 | Audio | ASR, Virtual Assistant | Landline only | 228 hours | Add Quote | Italian SpeechDat(II) FDB-3000 | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 3040 | 1 | 134000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 44 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. | Italian (Italy) telephony | |
56 | Audio | ASR, Virtual Assistant | Mobile phone | 103 hours | Add Quote | Italian SpeechDat(II) MDB-250 | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 375 | 1 | 19000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 51 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. | Italian (Italy) telephony | |
89 | Audio | ASR, Virtual Assistant | Mobile phone | 13 hours | Add Quote | SpeechDat(M) Italian Mobile Network Speech Database | Nuance | Scripted Speech | Italian | Italy | Low background noise (home/office) | 342 | 1 | 13500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. | Italian (Italy) telephony | |
199 | Text | ASR, TTS, Language Modelling | N/A | 269,000 words | Add Quote | jpn_JPN_POS | Appen Global | Part of Speech Dictionary | Japanese | Japan | N/A | N/A | N/A | N/A | 269000 | N/A | text | Japanese (Japan) Part of Speech Dictionary | ||
198 | Text | ASR, TTS, Language Modelling | N/A | 262,000 words | Add Quote | jpn_JPN_PHON | Appen Global | Pronunciation Dictionary | Japanese | Japan | N/A | N/A | N/A | N/A | 262000 | N/A | text | Japanese (Japan) Pronunciation Dictionary | ||
57 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 33 hours | Add Quote | JPN_ASR001 | Global Phone | Scripted Speech | Japanese | Japan | Low background noise (home/office) | 144 | 1 | 13067 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Japanese (Japan) scripted microphone | |
95 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 57 hours | Add Quote | Speecon Japanese | Nuance | Scripted Speech | Japanese | Japan | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers. | Japanese (Japan) scripted microphone | |
133 | Text | NER, Content Classification, Search Engines | N/A | 20,629 sentences | Add Quote | JPY_NER001 | Appen Global | News NER | Japanese | Japan | N/A | N/A | N/A | 20629 | Available on request | N/A | text | Japanese NER news text | ||
200 | Text | ASR, TTS, Language Modelling | N/A | 20,000 words | Add Quote | jav_IDN_PHON | Appen Global | Pronunciation Dictionary | Javanese | Indonesia | N/A | N/A | N/A | N/A | 20000 | N/A | text | Javanese (Indonesia) Pronunciation Dictionary | ||
58 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 15 hours | Add Quote | KAN_ASR001 | Appen Global | Conversational Speech | Kannada | India | Mixed | 178 | 2 | Available on request | 15660 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Kannada (India) conversational telephony | |
109 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 57 hours | Add Quote | KAN_ASR001A | Appen Global | Conversational Speech | Kannada | India | Mixed | 1000 | 2 | Available on request | 15660 | 8 | alaw | Approx. 25% of the dataset sessions are transcribed and time stamped; full transcripts can be made available. Database is accompanied by a pronunciation lexicon containing all transcribed words. | Kannada (India) conversational telephony | |
201 | Text | ASR, TTS, Language Modelling | N/A | 49,000 words | Add Quote | kan_IND_PHON | Appen Global | Pronunciation Dictionary | Kannada | India | N/A | N/A | N/A | N/A | 49000 | N/A | text | Kannada (India) Pronunciation Dictionary | ||
202 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add Quote | kaz_KAZ_PHON | Appen Global | Pronunciation Dictionary | Kazakh | Kazakhstan | N/A | N/A | N/A | N/A | 30000 | N/A | text | Kazakh (Kazakhstan) Pronunciation Dictionary | ||
204 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add Quote | kor_KOR_POS | Appen Global | Part of Speech Dictionary | Korean | South Korea | N/A | N/A | N/A | N/A | 100000 | N/A | text | Korean (South Korea) Part of Speech Dictionary | ||
203 | Text | ASR, TTS, Language Modelling | N/A | 100,000 words | Add Quote | kor_KOR_PHON | Appen Global | Pronunciation Dictionary | Korean | South Korea | N/A | N/A | N/A | N/A | 100000 | N/A | text | Korean (South Korea) Pronunciation Dictionary | ||
59 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 20 hours | Add Quote | KOR_ASR001 | Global Phone | Scripted Speech | Korean | South Korea | Low background noise (home/office) | 100 | 1 | 8107 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Korean (South Korea) scripted microphone | |
129 | Text | NER, Content Classification, Search Engines | N/A | 25,830 sentences | Add Quote | KOR_NER001 | Appen Global | News NER | Korean | South Korea | N/A | N/A | N/A | 25830 | Available on request | N/A | text | Korean NER news text | ||
205 | Text | ASR, TTS, Language Modelling | N/A | 60,000 words | Add Quote | kur_TUR_PHON | Appen Global | Pronunciation Dictionary | Kurmanji | Turkey | N/A | N/A | N/A | N/A | 60000 | N/A | text | Kurmanji (Turkey) Pronunciation Dictionary | ||
206 | Text | ASR, TTS, Language Modelling | N/A | 9,000 words | Add Quote | lao_LAO_PHON | Appen Global | Pronunciation Dictionary | Lao | Laos | N/A | N/A | N/A | N/A | 9000 | N/A | text | Lao (Laos) Pronunciation Dictionary | ||
207 | Text | ASR, TTS, Language Modelling | N/A | 71,000 words | Add Quote | lit_LTU_PHON | Appen Global | Pronunciation Dictionary | Lithuanian | Lithuania | N/A | N/A | N/A | N/A | 71000 | N/A | text | Lithuanian (Lithuania) Pronunciation Dictionary | ||
208 | Text | ASR, TTS, Language Modelling | N/A | 19,000 words | Add Quote | mal_IND_PHON | Appen Global | Pronunciation Dictionary | Malayalam | India | N/A | N/A | N/A | N/A | 19000 | N/A | text | Malayalam (India) Pronunciation Dictionary | ||
209 | Text | ASR, TTS, Language Modelling | N/A | 10,000 words | Add Quote | msa_MYS_PHON | Appen Global | Pronunciation Dictionary | Malaysian | Malaysia | N/A | N/A | N/A | N/A | 10000 | N/A | text | Malaysian (Malaysia) Pronunciation Dictionary | ||
210 | Text | ASR, TTS, Language Modelling | N/A | 35,000 words | Add Quote | zho_CHN_PHON | Appen Global | Pronunciation Dictionary | Mandarin (Simplified) | China | N/A | N/A | N/A | N/A | 35000 | N/A | text | Mandarin (Simplified) (China) Pronunciation Dictionary | ||
211 | Text | ASR, TTS, Language Modelling | N/A | 50,000 words | Add Quote | zho_TWN_PHON | Appen Global | Pronunciation Dictionary | Mandarin (Traditional) | Taiwan | N/A | N/A | N/A | N/A | 50000 | N/A | text | Mandarin (Traditional) (Taiwan) Pronunciation Dictionary | ||
63 | Audio | ASR, Virtual Assistant, Chatbot | Microphone | 26 hours | Add Quote | MAC_ASR002 | Global Phone | Scripted Speech | Mandarin Chinese | China | Low background noise (home/office) | 132 | 1 | 10225 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Mandarin Chinese (China) scripted microphone | |
62 | Audio | ASR, Virtual Assistant | Mobile phone and landline | 323 hours | Add Quote | MAC_ASR001 | Appen Global | Scripted Speech | Mandarin Chinese | China | Mixed | 2000 | 1 | 200000 | 7145 | 8 | alaw | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words. 98 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items (from a set of 215), phonetically rich sentences and words. | Mandarin Chinese (China) scripted telephony | |
131 | Text | NER, Content Classification, Search Engines | N/A | 17,313 sentences | Add Quote | MAC_NER001 | Appen Global | News NER | Mandarin Chinese | China | N/A | N/A | N/A | 17313 | Available on request | N/A | text | Mandarin NER news text | ||
64 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 15 hours | Add Quote | MAR_ASR001 | Appen Global | Conversational Speech | Marathi | India | Mixed | 180 | 2 | Available on request | 11908 | 8 | alaw | Approx. 29% of the dataset sessions are transcribed and time stamped; full transcripts can be made available. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Marathi (India) conversational telephony | |
110 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 52 hours | Add Quote | MAR_ASR001A | Appen Global | Conversational Speech | Marathi | India | Mixed | 1000 | 2 | Available on request | 11908 | 8 | alaw | A portion of the dataset sessions is transcribed and time stamped; full transcripts can be made available. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Marathi (India) conversational telephony | |
212 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add Quote | mar_IND_PHON | Appen Global | Pronunciation Dictionary | Marathi | India | N/A | N/A | N/A | N/A | 30000 | N/A | text | Marathi (India) Pronunciation Dictionary | ||
213 | Text | ASR, TTS, Language Modelling | N/A | 30,000 words | Add Quote | mon_MNG_PHON | Appen Global | Pronunciation Dictionary | Mongolian | Mongolia | N/A | N/A | N/A | N/A | 30000 | N/A | text | Mongolian (Mongolia) Pronunciation Dictionary | ||
215 | Text | ASR, TTS, Language Modelling | N/A | 3,000 words | Add Quote | nor_NOR_POS | Appen Global | Part of Speech Dictionary | Norwegian | Norway | N/A | N/A | N/A | N/A | 3000 | N/A | text | Norwegian (Norway) Part of Speech Dictionary | ||
214 | Text | ASR, TTS, Language Modelling | N/A | 115,000 words | Add Quote | nor_NOR_PHON | Appen Global | Pronunciation Dictionary | Norwegian | Norway | N/A | N/A | N/A | N/A | 115000 | N/A | text | Norwegian (Norway) Pronunciation Dictionary | ||
264 | Image | Image label recognition training | Mobile phone and camera | 2196 images | Add Quote | IMG_TAG_CN | Appen China | Object Image | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | jpg | Multi-scene picture sample library of 2196 images, with the following categories: KTV: 50; Department store: 55; Office: 100; Museum: 63; Electrical appliances: 55; Marine: 191; Car: 50; Handbags: 35; Night view: 54; Sports equipment: 54; Convenience stores: 34; Restaurant: 54; Window scenery: 62; Pets: 82; Ship: 50; Zoo: 70; Clothing store: 53; Beach: 95; Airport: 65 tickets; Gym: 47; Attractions: 77; Crowd: 67; Desert: 73; Beach: 68; Mountain area: 54; Shopping mall: 55; Trees: 85; Sky: 102; Snow: 71; Snow Mountain: 53; Night view: 78; Playground: 94 | Object Image Collection | |||
216 | Text | ASR, TTS, Language Modelling | N/A | 15,000 words | Add Quote | ori_IND_PHON | Appen Global | Pronunciation Dictionary | Oriya | India | N/A | N/A | N/A | N/A | 15000 | N/A | text | Oriya (India) Pronunciation Dictionary | ||
80 | Audio | ASR, Conversational AI, Speech Analytics | Mobile phone and landline | 20 hours | Add Quote | PAP_ASR001 | Appen Global | Conversational Speech | Panjabi | Pakistan | Low background noise | 205 | 2 | Available on request | 7298 | 8 | alaw | Dataset is fully transcribed and time-stamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For 71% of calls, both speakers (in-line/out-line) were collected and transcribed; for the remaining 29% of calls, only one half of the conversation was collected and transcribed. | Panjabi (Pakistan) conversational telephony | |
74 |