SimAndro-Plus: On Computing Similarity of Android Applications
Department of Computer and Software, Hanyang University
Seoul, Korea, 04763
{masoud,wook}@hanyang.ac.kr
Abstract
In this paper, we propose SimAndro-Plus, an improved variant of the state-of-the-art method SimAndro, to compute the similarity of Android applications (apps) with respect to their functionalities. SimAndro-Plus differs from SimAndro in two major ways: 1) it exploits two additional features that benefit similarity computation and that SimAndro disregards entirely; 2) to compute the similarity score of an app-pair based on the strings and package name features, SimAndro-Plus considers not only the terms that appear in both apps but also the terms that appear in one app and are missing from the other. The results of our extensive experiments with three real-world datasets and a dataset constructed by human experts demonstrate that 1) each of these two differences is effective in achieving better accuracy and 2) SimAndro-Plus outperforms SimAndro in similarity computation by 14% on average.
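The second difference can be illustrated with a minimal sketch. It is not the scoring formula used by SimAndro or SimAndro-Plus, which the abstract does not give. As an assumption for illustration only, a score based solely on shared terms is compared with the standard Jaccard coefficient, which also penalizes terms found in only one app. The function names and the term sets are hypothetical.

```python
def overlap_only(terms_a, terms_b):
    """Score that looks only at co-appearing terms (illustrative assumption):
    shared terms divided by the size of the smaller term set."""
    a, b = set(terms_a), set(terms_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def jaccard(terms_a, terms_b):
    """Standard Jaccard coefficient: shared terms divided by all distinct terms,
    so terms present in one app but missing in the other lower the score."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical term sets extracted from the strings of two apps.
app_x = {"camera", "photo", "filter", "share"}
app_y = {"camera", "photo", "filter", "share",
         "bank", "payment", "loan", "transfer"}

print(overlap_only(app_x, app_y))  # 1.0: the unshared terms are ignored
print(jaccard(app_x, app_y))       # 0.5: the four unshared terms reduce the score
```

In this toy case the shared-terms-only score treats the two apps as identical, while the score that accounts for missing terms separates a photo app from one that also carries banking functionality.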
Key words
Android applications, apps data mining, feature extraction, API calls, manifest information, similarity computation
Digital Object Identifier (DOI)
https://doi.org/10.2298/CSIS210208036H
Publication information
Volume 18, Issue 4 (September 2021)
Year of Publication: 2021
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium
Full text
Available in PDF
How to cite
Hamedani, M. R., Kim, S.: SimAndro-Plus: On Computing Similarity of Android Applications. Computer Science and Information Systems, Vol. 18, No. 4, 1219–1238. (2021), https://doi.org/10.2298/CSIS210208036H