Sentence embedding approach using LSTM auto-encoder for discussion threads summarization

Abdul Wali Khan1, Feras Al-Obeidat2, Afsheen Khalid1, Adnan Amin1 and Fernando Moreira3

  1. Center for Excellence in Information Technology, Institute of Management Sciences
    Peshawar, Pakistan
    abdulwalikhanafridi@gmail.com, (adnan.amin, afsheen.khalid)@imsciences.edu.pk
  2. College of Technological Innovation, Zayed University
    Abu Dhabi, UAE
    Feras.Al-Obeidat@zu.ac.ae
  3. REMIT, IJP, Universidade Portucalense; IEETA, Universidade de Aveiro
    Portugal
    fmoreira@uportu.pt

Abstract

Online discussion forums are repositories of valuable information where users interact, articulate their ideas and opinions, and share experiences on numerous topics. These internet-based communities let users ask for help and find solutions to problems. A new user, however, can become exhausted from reading the large number of replies in a discussion. An automated discussion thread summarization (DTS) system is therefore needed to give a concise view of the entire discussion of a query. Most previous approaches to automated DTS use the continuous bag of words (CBOW) model as a sentence embedding tool, which captures the overall meaning of a sentence poorly and cannot model word dependencies. To overcome this limitation, we introduce the LSTM Auto-encoder as a sentence embedding technique to improve the performance of DTS. Empirical results on two standard experimental datasets, measured by average precision, recall, and F-measure with respect to ROUGE-1 and ROUGE-2, show the effectiveness and efficiency of the proposed approach, which outperforms the state-of-the-art CBOW model in sentence embedding tasks and boosts the performance of the automated DTS model.

Key words

Sentence embedding, LSTM Auto-encoder, CBOW, Deep learning, Machine learning, NLP

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS221210055K

Publication information

Volume 20, Issue 4 (September 2023)
Year of Publication: 2023
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Khan, A. W., Al-Obeidat, F., Khalid, A., Amin, A., Moreira, F.: Sentence embedding approach using LSTM auto-encoder for discussion threads summarization. Computer Science and Information Systems, Vol. 20, No. 4. (2023), https://doi.org/10.2298/CSIS221210055K