Lightweight, Length Invariant Models and Dimensionality Reduction in Respiratory Disease Detection

Pál, Tamás [Pál, Tamás (Data Mining), author] Department of Information Systems (ELTE / IK); PhD Doctoral School of Informatics (ELTE / IK); Molnár, Bálint [Molnár, Bálint (Informatics), author] Department of Information Systems (ELTE / IK); Tarcsi, Ádám [Tarcsi, Ádám (Informatics-Media...), author] Department of Media and Educational Informatics (ELTE / IK)

English-language Abstract (Other conference contribution) Scientific
    Funding:
    • Integrated programme for training the next generation of researchers in informatics and computer science, disciplinary ...(EFOP-3.6.3-VEKOP-16-2017-00002) Funder: EFOP-VEKOP
    • TKP2020-NKA-06 (Thematic Excellence Programme TKP2020-NKA-06 (National Challenges Subprogramme)) Funder: NKFI
    Subject areas:
    • Machine learning, statistical data processing, and applications based on signal processing (e.g. speech, image, video)
    • Computer science, information science, and bioinformatics
    The detection of respiratory diseases is an important field of study, as respiratory illnesses are responsible for millions of deaths yearly. Machine learning offers a plethora of methods to preprocess, analyze, and classify respiratory sound recordings. Approaches with reduced computational demand are preferred because they shorten processing time. Two deep learning models are proposed that are length-invariant and have simple neural network topologies. Length invariance shortens processing time because the recordings no longer need to be split into equal-sized segments. Moreover, the spectrograms extracted from the recordings can be reduced in dimensionality by calculating aggregated values along the time axis and by using efficient methods such as PCA or t-SNE. Mel Frequency Cepstral Coefficient (MFCC) spectrograms were extracted. The first deep model is a lightweight dense network that receives feature vectors derived from the aggregated spectrograms as input; inputs of different dimensionality are compared. The second model is inspired by the 1D MaxPooling architecture of Phan, which introduces length invariance into the model through global max-pooling layers. An extra hidden layer and other minor modifications were added, which increased classification performance on this dataset. 2D spectrograms are used as input for this model. The respiratory sound database contains 920 annotated breathing recordings; the recordings either exhibit the symptoms of one of 7 classes of diseases or are labelled as healthy. The dataset was created by a Portuguese and Greek research group. The data were collected from 126 patients spanning all age groups: children, adults, and the elderly. The dataset is also heavily imbalanced. The proposed deep neural networks are systematically investigated on this dataset and analysed according to the metrics of the discipline.
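The two length-invariance ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The number of coefficients (13), the choice of aggregation statistics, the random stand-in data, and the helper names `aggregate_over_time`, `pca_reduce`, and `global_max_pool` are all assumptions made for illustration. The sketch shows that both time-axis aggregation and global max-pooling map recordings of different durations to vectors of the same size, so no segmentation into equal-sized pieces is needed.

```python
import numpy as np


def aggregate_over_time(spec: np.ndarray) -> np.ndarray:
    """Collapse a (n_coeffs, n_frames) spectrogram into a fixed-length vector.

    The statistics used here (mean, std, min, max) are an assumed choice.
    """
    return np.concatenate(
        [spec.mean(axis=1), spec.std(axis=1), spec.min(axis=1), spec.max(axis=1)]
    )


def pca_reduce(features: np.ndarray, n_components: int) -> np.ndarray:
    """Project (n_samples, n_features) data onto its top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def global_max_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Reduce (n_filters, n_steps) to (n_filters,) by taking the max over time."""
    return feature_maps.max(axis=1)


rng = np.random.default_rng(0)
# Random stand-ins for MFCC matrices of recordings with different durations.
specs = [rng.normal(size=(13, n_frames)) for n_frames in (120, 345, 80, 200, 512, 64)]

# Dense-network style input: one fixed-length vector per recording.
features = np.stack([aggregate_over_time(s) for s in specs])  # shape (6, 52)
reduced = pca_reduce(features, n_components=3)                # shape (6, 3)

# Pooling applied directly to the spectrograms, standing in for the
# convolutional feature maps a real network would produce.
pooled = np.stack([global_max_pool(s) for s in specs])        # shape (6, 13)

print(features.shape, reduced.shape, pooled.shape)
```

In a real pipeline the spectrograms would come from an audio feature extractor and the pooling would follow learned convolutional layers; the point here is only that the output shapes do not depend on the number of frames.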
    2022-09-27 22:30