EXTENSION OF CONVENTIONAL CO-TRAINING LEARNING STRATEGIES TO THREE-VIEW AND COMMITTEE-BASED LEARNING STRATEGIES FOR EFFECTIVE AUTOMATIC SENTENCE SEGMENTATION

dc.authorid: 0000-0002-7008-4778 [en_US]
dc.contributor.author: Dalva, Dogan
dc.contributor.author: Guz, Umit
dc.contributor.author: Gürkan, Hakan
dc.date.accessioned: 2021-03-20T20:13:26Z
dc.date.available: 2021-03-20T20:13:26Z
dc.date.issued: 2018
dc.department: BTÜ, Faculty of Engineering and Natural Sciences, Department of Electrical and Electronics Engineering [en_US]
dc.description: IEEE Workshop on Spoken Language Technology (SLT) -- DEC 18-21, 2018 -- Athens, GREECE [en_US]
dc.description.abstract: The objective of this work is to develop effective multi-view semi-supervised machine learning strategies for the sentence boundary classification problem when only small sets of sentence-boundary-labeled data are available. We propose three-view and committee-based learning strategies that incorporate co-training algorithms with agreement, disagreement, and self-combined learning strategies, using prosodic, lexical, and morphological information. We compare the experimental results of the proposed three-view and committee-based learning strategies with those of other semi-supervised learning strategies in the literature, namely self-training and co-training with agreement, disagreement, and self-combined strategies. The experimental results show that sentence segmentation performance can be improved considerably using the proposed multi-view learning strategies, since the data sets can be represented by three redundantly sufficient and disjoint feature sets. We show that the proposed strategies substantially improve the average performance for both Turkish and English spoken language when only a small set of manually labeled data is available. [en_US]
dc.description.sponsorship: Inst Elect & Elect Engineers, IEEE Signal Proc Soc [en_US]
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TUBITAK) / Turkiye Bilimsel ve Teknolojik Arastirma Kurumu (TUBITAK) [107E182, 111E228]; Isik University Scientific Research Project Fund [09A301, 14A201]; J. William Fulbright Post-Doctoral Research Fellowship [en_US]
dc.description.sponsorship: This material is based upon work supported by the Scientific and Technological Research Council of Turkey (TUBITAK) (Project Number: 107E182 and Project Number: 111E228) and Isik University Scientific Research Project Fund (Project Number: 09A301 and Project Number: 14A201) and J. William Fulbright Post-Doctoral Research Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agencies. [en_US]
dc.identifier.endpage: 755 [en_US]
dc.identifier.isbn: 978-1-5386-4334-1
dc.identifier.issn: 2639-5479
dc.identifier.scopusquality: N/A [en_US]
dc.identifier.startpage: 750 [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.12885/866
dc.identifier.wos: WOS:000463141800104 [en_US]
dc.identifier.wosquality: N/A [en_US]
dc.indekslendigikaynak: Web of Science [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.institutionauthor: Gürkan, Hakan
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.relation.ispartof: 2018 IEEE Workshop on Spoken Language Technology (SLT 2018) [en_US]
dc.relation.ispartofseries: IEEE Workshop on Spoken Language Technology
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/closedAccess [en_US]
dc.subject: Boosting [en_US]
dc.subject: Co-Training [en_US]
dc.subject: Sentence Segmentation [en_US]
dc.subject: Semi-supervised learning [en_US]
dc.subject: Prosody [en_US]
dc.title: EXTENSION OF CONVENTIONAL CO-TRAINING LEARNING STRATEGIES TO THREE-VIEW AND COMMITTEE-BASED LEARNING STRATEGIES FOR EFFECTIVE AUTOMATIC SENTENCE SEGMENTATION [en_US]
dc.type: Conference Object [en_US]
